The Microservices vs Conway Test
Following on from my article on Mescoservices back in 2015, this article expands on an idea I had in September on how monoliths, mescoservices, and microservices fit into organisation design. The microservices vs Conway test encodes a common piece of advice into a first-draft formula for testing your architecture against your organisation.
Microservices offer several benefits, and also some costs. In return for increased complexity, you get to mix different technologies and scale up the number of autonomous teams working on the platform. A simple way to look at this is to imagine a successful team working on a monolith who need to either broaden the scope of their application, or divide the work between themselves and another team. If they can find a seam that allows them to divide the monolith, each team can work autonomously on one of the new parts that have been created.
Not only does each team get to work how they want, using whatever tech stack they choose; they also get to work at their own pace without tripping over or impacting the other team.
This is just one example of the Inverse Conway Manoeuvre. Whereas Conway’s Law states that any application’s architecture will end up looking a lot like the organisation’s communication structure, the Inverse Conway Manoeuvre utilises organisation design to take advantage of Conway’s Law. Putting it bluntly, you fix the communication structure of the organisation to ensure that when Conway’s Law strikes, it results in the software architecture you intended.
So, when microservices crop up, it’s pretty common for people to give advice that includes organisation design, along with warnings about the complexity trade-off.
Microservices vs Conway Test
If we use m to represent the number of microservices you have, and t to represent the number of teams you have, we can use the following test to determine how well microservices fit into our organisation by testing the resulting complexity, which we’ll call c.
If things go well, complexity will scale linearly. When this is the case, the score will be zero. If you have too many services compared to teams, or too many teams compared to services, current wisdom says that things will be more complex. Where services minus teams is negative (more teams than services), the complexity comes from multiple teams tripping over each other as part of the delivery pipeline, for example multiple teams attempting to service a monolith. Where it is positive (more services than teams), complexity is being introduced without benefit.
For example, one team on a monolith will score zero. Three teams working on three services will score zero. This doesn’t mean zero-complexity; it means the complexity and benefit are likely to be balanced.
To look at some less balanced cases, four teams working on a monolith will score 9, as will one team working on four services.
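The worked examples above can be reproduced with a small sketch. The scoring rule used here, c = (m - t) squared, is an assumption: it is the simplest rule that matches the four example scores quoted in this article, and it should not be read as the definitive formula. The function name conway_score is also hypothetical.

```python
def conway_score(services: int, teams: int) -> int:
    """Score the fit between service count (m) and team count (t).

    ASSUMPTION: c = (m - t) ** 2. This rule is inferred from the worked
    examples in the text (0, 0, 9, 9); it is not confirmed by the article.
    """
    if services < 1 or teams < 1:
        raise ValueError("services and teams must both be at least 1")
    return (services - teams) ** 2


# (services, teams) pairs taken from the examples in the text.
examples = [(1, 1), (3, 3), (1, 4), (4, 1)]
for services, teams in examples:
    score = conway_score(services, teams)
    print(f"{services} service(s), {teams} team(s): c = {score}")
```

Squaring the gap means the sign of the mismatch is lost in the final score, so the direction (too many teams or too many services) has to be read from the inputs rather than from c.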
The relationships can be described with the following examples.
The relationship for a single team is shown below. Complexity increases as more services are added to a single team. The more you hope to break Conway’s Law, the more the complexity hurts.
The relationship for a larger number of teams is illustrated below. Where we have five teams, we can survive give-or-take two services either way. But if there are too few services, or too many, we add to the otherwise linear complexity.
The complexity curve follows the assertion that the further you deviate from the team-per-service organisation design, the more complex things will become; no matter whether it is too many teams for the number of services, or vice versa.
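As a rough illustration of that curve, the sketch below tabulates scores for a fixed number of teams as the number of services varies. It assumes the score is the squared difference between services and teams, which matches the example scores quoted earlier but is not confirmed as the exact formula. The helper names are hypothetical.

```python
def conway_score(services: int, teams: int) -> int:
    # ASSUMPTION: squared gap between services and teams, inferred from
    # the article's worked examples rather than stated by it.
    return (services - teams) ** 2


def score_curve(teams: int, max_services: int) -> dict[int, int]:
    """Map each service count from 1 to max_services to its score."""
    return {m: conway_score(m, teams) for m in range(1, max_services + 1)}


# Five teams, with between one and nine services.
curve = score_curve(teams=5, max_services=9)
for services, score in curve.items():
    # A crude text bar chart: one '#' per point of complexity.
    print(f"{services} services: {'#' * score} {score}")
```

Under this assumption the curve is symmetric around the point where services equal teams, and it climbs faster the further you move from that point in either direction.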
Comparing these figures to real-world examples, I would say that single-digit complexity is desirable unless you can take additional action to limit complexity.
Other Complexity Limiting Techniques
I’ll add more examples as they emerge, but Monzo (who have more microservices than most organisations) undertook an exercise to limit the connections between services. By making connections explicit, they prevented a situation in which any service could talk to any other. This massively reduces the total complexity. For example, if you have 100 services all able to talk to each other, you have the classic n × (n − 1) ÷ 2 problem (4,950 possible connections), but if you review and limit each service to a maximum of five dependencies, you cap the total at just 500. Once connections are explicit, you can also gauge how much complexity a team carries from the exact number of connections it must manage.
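The arithmetic behind those two figures can be checked with a short sketch. The function names are hypothetical, and the capped figure is treated as an upper bound on declared dependencies, assuming every service uses its full allowance.

```python
def unrestricted_connections(services: int) -> int:
    """Possible pairwise connections when any service may talk to any other."""
    return services * (services - 1) // 2


def capped_connections(services: int, max_dependencies: int) -> int:
    """Upper bound on declared dependencies when each service is limited.

    A service cannot depend on more than (services - 1) others, so the
    cap is clamped for small systems.
    """
    per_service = min(max_dependencies, services - 1)
    return services * per_service


print(unrestricted_connections(100))                 # 4950
print(capped_connections(100, max_dependencies=5))   # 500
```

The unrestricted figure grows with the square of the number of services, while the capped figure grows linearly, which is why the gap widens so quickly as a platform grows.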
Having been careful to consider Conway’s Law, I have avoided designing teams and architecture in isolation from each other. I believe this is the only way to ensure the design of both is successful. If you don’t balance the design on both sides (the technology on one side and the people on the other), the complexity is damaging to both.