NSSpain 2020 — Q&A about platform teams

Larissa Barra
8 min read · Nov 19, 2020

Michel Bueno and I presented a talk today (19 November 2020) about scaling your app with platform teams. We’re super happy about all the interaction and the questions we received, so we decided to answer all of them here!

The link to our talk is coming soon. In the meantime, here are all our answers:

  1. Do you remove feature toggles after a feature has been in production for several weeks/months? Don’t you have a problem with hundreds of feature toggles in the app?
    No, we didn’t remove them. We never really needed to, mainly because each toggle controlled a whole flow of a particular part or product. Once a flag existed, we kept using it to control that flow in production, especially during incidents. We also separated local toggles (used for development only) from remote toggles (controlled via Firebase Remote Config), and the local toggles took priority over the remote ones. To organise the remote toggles on Firebase, we used the group feature available in Remote Config. Inside the application, .plist files configured the local toggles, one file per environment.
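A minimal sketch of that "local beats remote" rule. The names `ToggleSource`, `DictionaryToggleSource` and `ToggleResolver` are made up for illustration, and plain dictionaries stand in for both the per-environment .plist file and the Remote Config values, so none of this is the actual implementation:

```swift
import Foundation

// Hypothetical abstraction: a source may or may not know a given toggle.
protocol ToggleSource {
    func value(for key: String) -> Bool?
}

// Stand-in for a per-environment .plist or a snapshot of remote values.
struct DictionaryToggleSource: ToggleSource {
    let values: [String: Bool]
    func value(for key: String) -> Bool? { values[key] }
}

// Asks the sources in priority order; the first one that knows the key wins.
struct ToggleResolver {
    let sources: [ToggleSource]   // highest priority first

    func isEnabled(_ key: String, fallback: Bool = false) -> Bool {
        for source in sources {
            if let value = source.value(for: key) { return value }
        }
        return fallback
    }
}

// Assumed toggle names, for illustration only.
let local = DictionaryToggleSource(values: ["new_checkout": true])
let remote = DictionaryToggleSource(values: ["new_checkout": false, "payments_v2": true])
let toggles = ToggleResolver(sources: [local, remote])

print(toggles.isEnabled("new_checkout"))  // prints "true": the local value overrides the remote one
```

In a production build the local source could simply be empty, so only the remote values apply.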
  2. Did you split the code base in multiple repositories or use a mono repo?
    We initially had a mono repo and transitioned to one core app plus modularised features. Without getting into which approach is better, both models need a platform team: either to coordinate the modularisation or to optimise the mono repo so that everyone can build, test and run in a timely manner.
  3. Feature toggles tend to make the code more cluttered & complex. Do you have / can you share some best practices to “do feature toggles the right way”?
    The separation between local and remote toggles was very helpful for us here. In production we relied only on the remote ones, so we ended up not having too many toggles on Firebase Remote Config. Another important way to reduce cluttered and complex code is to have a pattern/architecture that lets you insert toggles without too much friction. The details depend on which design pattern you are using, so this varies a little.
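One common way to keep toggles from spreading through the code (a general sketch, not necessarily what this app did) is to read the toggle in a single place and hide both variants behind one protocol. All names below are hypothetical:

```swift
// Hypothetical feature boundary: both flows implement the same protocol.
protocol CheckoutFlow {
    func start() -> String
}

struct LegacyCheckout: CheckoutFlow {
    func start() -> String { "legacy checkout" }
}

struct NewCheckout: CheckoutFlow {
    func start() -> String { "new checkout" }
}

// The only place that looks at the toggle; everything downstream is toggle-free.
func makeCheckoutFlow(isNewCheckoutEnabled: Bool) -> CheckoutFlow {
    if isNewCheckoutEnabled { return NewCheckout() }
    return LegacyCheckout()
}

let flow = makeCheckoutFlow(isNewCheckoutEnabled: true)
print(flow.start())
```

With an architecture like VIPER, a router or module builder would be a natural home for this kind of decision.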
  4. What architecture did you use in such a big app? MVVM? VIPER?
    Our app was born without a defined architecture and then we decided to use VIPER, for a few reasons:
    - we were able to have a very similar structure in Android and iOS
    - the testability of VIPER was important to us
    - the clearly separated responsibilities of each layer allowed fast onboarding for new team members
  5. Is there a best practice to coordinate the work when Team A needs to step into the area of Team B? E.g. Team A handles Payment, Team B handles Booking (that also needs payment integration).
    The platform team supported the integration between different modules and teams by standardising the navigation between them. Each feature’s payload was encapsulated in its own globally available DTO, and the message exchange between modules went through a centralised navigator that accepts calls from every module. That gave us a smooth integration between different features.
    If team B needed payment and team A still hadn’t released a version, the navigation also handled the interaction between modules and webviews, so that fallback was always available if necessary.
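The idea can be sketched roughly as follows. This is an illustration under assumptions, not the real navigator: `PaymentRequest`, `Destination`, `Navigator` and the example.com URL are all invented, and a real implementation would present view controllers instead of returning a value.

```swift
import Foundation

// Hypothetical payload (DTO) that the payment feature exposes to every module.
struct PaymentRequest {
    let orderID: String
    let amountInCents: Int
}

// What the navigator decided to open.
enum Destination: Equatable {
    case nativeModule(name: String)
    case webview(URL)
}

// Central navigator: modules register a handler per DTO type, callers only send DTOs.
final class Navigator {
    private var handlers: [ObjectIdentifier: (Any) -> Destination] = [:]
    private var webFallbacks: [ObjectIdentifier: (Any) -> URL] = [:]

    func register<Payload>(_ type: Payload.Type, handler: @escaping (Payload) -> Destination) {
        handlers[ObjectIdentifier(type)] = { handler($0 as! Payload) }
    }

    func registerWebFallback<Payload>(_ type: Payload.Type, url: @escaping (Payload) -> URL) {
        webFallbacks[ObjectIdentifier(type)] = { url($0 as! Payload) }
    }

    // Prefers the native module; falls back to a webview if none is registered yet.
    func open<Payload>(_ payload: Payload) -> Destination? {
        let key = ObjectIdentifier(Payload.self)
        if let handler = handlers[key] { return handler(payload) }
        if let fallback = webFallbacks[key] { return .webview(fallback(payload)) }
        return nil
    }
}

let navigator = Navigator()
navigator.registerWebFallback(PaymentRequest.self) { request in
    URL(string: "https://example.com/pay?order=\(request.orderID)")!  // placeholder URL
}

let request = PaymentRequest(orderID: "A42", amountInCents: 1999)
let beforeRelease = navigator.open(request)   // no native module yet: webview fallback

navigator.register(PaymentRequest.self) { _ in .nativeModule(name: "Payment") }
let afterRelease = navigator.open(request)    // native module now takes over
```

The Booking team only ever builds a `PaymentRequest` and hands it to the navigator; it never needs to know whether Payment is a native module or a webview at that point.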
  6. Do all the teams merge into develop branches regularly and leave a feature under a feature flag or do you merge everything together when a feature is almost ready? How do you deal with merge conflicts?
    We used trunk-based development (not quite by the book, but pretty close), so pretty much everything went to master, and from there we generated lots of alpha versions per day: basically one new version every time a push to master occurred. As for the feature flags, the teams had full responsibility over their own toggles, which means they were the ones deciding when to enable or disable them. Knowing the release train schedule, they could control their toggles accordingly.
    Conflicting merges in this context were quite rare, since the separation of teams by product was very strict.
  7. How can you protect the platform team so they don’t get pulled into delivering features?
    The platform team works very closely with the app core team, which takes care of features that aren’t yet owned by any feature team. It’s also very important to set expectations clearly from the moment the platform team is created: with the company, management, any bosses (haha) and every team that integrates into the app. Communication is key!
  8. What’s the difference between a ‘platform team’ and the ‘core team’ that appeared in one of the first slides?
    The core team is responsible for the main structure of the application that other products plug into. In practical terms, this could be the home screen of the application, where all the other flows start from. The core team is the one providing ways for the products to communicate with each other and to access parts of the main application.
    The platform team, though, is the one taking care of more transversal and shared tools/components. Basically, it tries to accelerate the development and incorporation of other products/teams. This team can also coordinate the release process and promote the integration between all of the products, technically speaking.
  9. Did you have all the meetings separate in the teams? You’ve mentioned daily so I guess you used Scrum, so planning, refinement, retro, daily: were they all separate?
    Yes. All the teams had their own ceremonies. They could choose the model they preferred to work in, like Scrum or Kanban, but we had a few technical sessions with all the teams together when necessary, like tech huddles.
    There was also a meeting to coordinate the release, called Release Sync, in which we would discuss the release itself, get everyone on board with what was going to production and try to anticipate possible integration problems.
  10. How do you make it work when there are a large number of features and fewer team members? Is it possible to be part of several teams?
    We would only transfer the ownership of a feature if there was a team to receive it. That made sure the piece of code we decided to modularise was going to be maintained. Each feature had its own roadmap, and once that roadmap was complete the team could receive a new feature, or we could distribute the developers among existing teams. If there were no people to work on a specific product or feature, the core team would take it over and maintain it until we formed a new team and handed the product over to it.
  11. How does it work when the work on a feature is done for now (no more work to do on it for some months)? Is this feature team dispatched across other teams? (What if there are bugs in it?)
    In general the team would be disbanded and the members re-allocated to other teams. In case of a bug or some fix in that part, the core team would take it on. If necessary, the core team could bring in someone from the original team to help out. Moving people to other teams was great because they’d already have the context of the big application and be aware of the processes, architecture, patterns and tools used.
  12. Did you ever run into the scenario where there is too much for the platform team to do, like the feature teams needing a bunch of core things? What was your approach?
    In the first few weeks of the platform team’s existence it was absolutely overwhelming, exactly because of that. We had to list all the transversal functionalities (existing and still needed) and shared processes like the release, then prioritise and create a roadmap. The platform’s roadmap was “public knowledge”, so feature teams would know what we had planned and for when, and we would frequently revisit and reprioritise based on demand and criticality. Good communication between the platform and the other teams also played a very important role here, because we had to make sure that everybody was on board with where the platform was going, and we needed to hear the needs and feedback from the teams.
  13. Does the number of Platform Teams control the size of the Core Team, due to the demand that they can create? Wouldn’t the Core Team grow big enough that it might also need splitting into smaller Core Teams?
    Having a big app core team was an actual problem we faced. It didn’t really come from the platform team, as we worked closely together and tried to take as much off their hands as we could, whether by modularising and distributing features or by improving navigation and other transversal functionalities that lived in the app core’s code. Our way around it was working with the business to prioritise: we had a well-defined roadmap and shared our Product Owner, so they had a holistic view of both teams’ work and responsibilities. Besides, our Project Managers worked as a pair on tasks that fell in a grey area between teams. As for the platform team’s size, it grew over time: we started with 3 people and ended up with 10 after a couple of months, exactly because of the amount of work we discovered once the platform team was created.
  14. I’m curious how your team chose to handle merging features from multiple teams. Squash PRs? Rebase on a common integration branch? Merge commits?
    (followup if squash: how do you keep important context about why a change was made if a feature gets lumped into a single commit?) Thanks!
    Please check question 6! :)
  15. How are hotfixes in the modules handled?
    Before the app went to production we had a full week for testing: each team was responsible for testing the integration between the app and their modules, and the app core team tested the general parts of the app plus a few test cases of the most used flows of every module. Then, over the following week, we did phased releases and monitored performance, crashes, usage, etc., pausing the release if anything critical was found.
    If a bug managed to get to prod after all that, we had to evaluate various aspects before making the decision of releasing a hotfix:
    - would it be quick to fix?
    - how many people already had that version (i.e. how far into the rollout, or how close to the next release, were we)?
    - how many users were actually affected by that bug?
    - did it affect both Android and iOS?
    We’d combine the answers to these questions, place the bug on a matrix of probability vs impact and decide what to do.
Probability vs Impact matrix to evaluate a bug's criticality

Usually, if a bug was considered trivial, it would only become tech debt. A minor bug would become a card in the backlog and a major bug would be prioritised for the next release. Critical and blocker bugs would demand more radical measures, and could end up demanding an emergency release.
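To make that triage concrete, here is a small sketch. The severity names and the actions come from the paragraph above, but the three-level scales and the scoring are assumptions made for illustration; the cells of the real matrix may be assigned differently.

```swift
// Hypothetical three-level scales for both axes of the matrix.
enum Level: Int { case low = 1, medium, high }

enum Severity { case trivial, minor, major, critical, blocker }

enum Action: Equatable { case techDebt, backlogCard, nextRelease, considerEmergencyRelease }

// Assumed scoring: probability x impact, bucketed into the five severities.
func severity(probability: Level, impact: Level) -> Severity {
    switch probability.rawValue * impact.rawValue {
    case 1: return .trivial
    case 2: return .minor
    case 3, 4: return .major
    case 6: return .critical
    default: return .blocker   // 9: high probability and high impact
    }
}

// Mapping from severity to what happens next, as described in the text.
func action(for severity: Severity) -> Action {
    switch severity {
    case .trivial: return .techDebt
    case .minor: return .backlogCard
    case .major: return .nextRelease
    case .critical, .blocker: return .considerEmergencyRelease
    }
}

let decision = action(for: severity(probability: .medium, impact: .high))
print(decision)
```

The useful part is less the exact numbers than having one shared, explicit rule that every team's bug goes through.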
It is important that the platform team makes this evaluation, because it has a broad view of the whole app. Every team naturally thinks their bug is very important, but it might not be as critical as they think in the context of the whole app and its usage. As a platform team member, though, be prepared: you’ll have to explain that to some unhappy business people.

Big thanks to everyone for watching and talking to us! See you :D

Larissa Barra

Engineer at Spotify, enthusiastic about many things, curious about everything