This article on costly deployments is part of a series on How to Choose Your First Microservices.
Switching to a new software architecture style is costly. I advocate choosing where to employ microservices first by using them to solve existing problems in the team. This helps the organisation extract value from the new architecture with each step towards the new world. This series is working through a set of problems that your org may have, and how microservices might help solve them.
Tricky Deployments
Some components of your system will be trickier to deploy than others. This is true regardless of whether you’ve already got microservices or you’re deploying a monolith. Some components might be in use by thousands of customers 24×7, while others might only be used once a month. Upgrading a component might require other components to be placed into a certain state, while others can be upgraded independently of the rest of the system. Coordination may be required between multiple teams in order to effect a trouble-free deploy of some components, whereas others might be actioned by a single person.
Ultimately, we’d like all our components to fall into the second half of those categorisations. The reality of prioritisation and resource constraints means we’re unlikely to reach deployment nirvana for all artefacts.
Let’s think about a monolithic architecture which, in this context, refers specifically to a monolithic deployment. In this type of architecture, deploying a change to one component means deploying everything – you’re unable to deploy components separately. That means that whatever is the trickiest step of the trickiest component to deploy becomes a necessary step in every deploy.
Costly Deployments
I’m going to flip the language now from “tricky” to “expensive”. Usually, when something is tricky to deploy, it’s expensive to deploy. It may be expensive because there are manual steps to make it lower risk; or there’s coordination required that takes people’s time away from other, more valuable work; or maybe you literally lose revenue every time a component is taken down for a few minutes. Imagine if each component were able to be deployed independently. It’s easy to conceive that the deployment of any component has an average cost to the organisation. And now we can see clearly a huge downside of monolithic deployments: the cost of any deploy is at least the cost of the most expensive component to deploy.
Actually, the situation might be even worse. If these “costs” of deployment can’t be significantly mitigated by the fact that the deploy happens all at once, the cost of deploying any component may be the cost of all components combined.
So, do you have a component in your monolith which costs you significantly more to deploy than other components? If you do, you can reduce the cost of deploying the monolith by extracting that component from the monolithic deploy.
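To make that concrete, here is a minimal sketch of the cost argument in Python. The component names and cost figures are invented purely for illustration. The lower bound assumes deployment costs overlap completely when everything ships together, and the upper bound assumes they don't overlap at all.

```python
# Hypothetical per-deploy costs in arbitrary units (e.g. person-hours).
# None of these figures come from a real system.
component_costs = {
    "reporting": 1.0,
    "terminal_config": 2.0,
    "bank_links": 10.0,
}


def monolith_cost_bounds(costs):
    """Return (lower, upper) bounds on the cost of one monolithic deploy."""
    lower = max(costs.values())  # costs overlap fully: the priciest component dominates
    upper = sum(costs.values())  # costs don't overlap at all: they simply add up
    return lower, upper


def without(costs, name):
    """Costs left in the monolith after extracting one component."""
    return {k: v for k, v in costs.items() if k != name}


before = monolith_cost_bounds(component_costs)
after = monolith_cost_bounds(without(component_costs, "bank_links"))
print(before)  # (10.0, 13.0)
print(after)   # (2.0, 3.0)
```

With these made-up numbers, pulling the one expensive component out drops the floor on every routine deploy from 10 units to 2. The extracted component still costs 10 units to deploy, but you only pay that when it actually changes.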
Here’s an Example
At Tyro, we had a piece of software which routed financial transactions from payment terminals to banks and card schemes. It also had many corollary responsibilities. Most notable of those were reporting the results of transactions to backend systems and providing payment terminals with their configuration.
As well as doing the routing, this software maintained the actual links to the banks and schemes. That included Layer 7 protocols for connection establishment, keep-alive and security; multiplexing of message requests and responses; and marshalling/unmarshalling of messages. As much of this code was based off specifications used by hundreds of banks across the world, it was very static. The big international schemes typically issued changes which we might incorporate once every six months. The local banks were even less likely to change.
Now, there was an expectation from our partners about these links. The expectation was that the links remain up – that is, the TCP connections should remain active for a long time. Long time here = months. If a connection went down, they expected to see it re-established almost instantly. If the link wasn’t going to be re-established instantly, e.g. because of scheduled maintenance, they wanted us to tell them in advance. From memory, they wanted to be given a week’s notice.
The code for operating the links rarely changed, but other code in this mini-monolith changed regularly. We were constantly adding new features to the terminals, so the corollary functions were changing constantly. We were rolling out these changes on a two-week schedule.
So, we had a piece of software which contained components that were hardly ever changed, but which were deployed every two weeks because their deployment was tied to other frequently changing components. There was an expense of having to advise third parties of “scheduled maintenance” every other week. More often than not, though, the message about the maintenance wouldn’t get delivered to the right people in one of these partners. A panicked call would then come in during the deployment from one of the partners. They would see the link had been down for 1 minute, their incident management process would trigger and we would then need to respond to that.
Reducing our Costly Deployments
To resolve our costly deployment issue, we decided to move the links to the third parties out into their own piece of software. In fact, we moved each link into its own individual service. This meant that, going forward, each link would only be taken down if the code for that specific partner changed. The links are now able to meet the partners’ expectations of staying online for months. Consequently, the expense for us to coordinate with the other parties was constrained to the few times a year we did an upgrade.
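As a rough sketch of why this works, the Python below models which links get taken down by a given change under the two layouts. The service and component names are hypothetical stand-ins, not the names we actually used, and the model assumes that deploying a service restarts every component inside it.

```python
# Hypothetical component and service names, for illustration only.
MONOLITH = {
    "router": {"routing", "reporting", "terminal_config",
               "link_bank_a", "link_scheme_b"},
}
EXTRACTED = {
    "router": {"routing", "reporting", "terminal_config"},
    "link-bank-a": {"link_bank_a"},
    "link-scheme-b": {"link_scheme_b"},
}


def services_to_deploy(layout, changed_components):
    """Return the services that contain at least one changed component."""
    changed = set(changed_components)
    return sorted(
        service for service, components in layout.items()
        if components & changed
    )


def links_taken_down(layout, changed_components):
    """Return the link components restarted as a side effect of the deploy."""
    restarted = set()
    for service in services_to_deploy(layout, changed_components):
        restarted |= {c for c in layout[service] if c.startswith("link_")}
    return sorted(restarted)


change = ["terminal_config"]  # a typical fortnightly feature change
print(links_taken_down(MONOLITH, change))   # ['link_bank_a', 'link_scheme_b']
print(links_taken_down(EXTRACTED, change))  # []
```

In the monolithic layout a terminal configuration change drops every link. Once each link lives in its own service, only a change to that partner's link code takes that one link down.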
Some years later, the work to separate these components into their own services also paid extra dividends. As the company’s business scaled and transaction traffic increased, the transaction routing software was able to be scaled independently of the links.
Gotchas
The first caveat to this approach is to ensure that deployments of the component you intend to break out will be infrequent. The key payoff comes from just deploying the expensive-to-deploy component less frequently. If the component is actually in active development, then you’re not going to see much saving by moving it out. There’s a relationship between how much non-valuable work is created by deploying the component and the number of deploys you’ll need to avoid in order to get a positive ROI.
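One way to think about that relationship is as a break-even calculation. The sketch below is a deliberately simple model with made-up figures: it assumes a fixed cost per deploy and a one-off extraction cost, and ignores any other benefits or running costs of the new service.

```python
def break_even_years(cost_per_deploy, deploys_before, deploys_after, extraction_cost):
    """Years until the extraction pays for itself, or None if it never does.

    All costs must be in the same unit (e.g. person-hours); deploy counts are per year.
    """
    saving_per_year = cost_per_deploy * (deploys_before - deploys_after)
    if saving_per_year <= 0:
        return None  # no deploys avoided, so there is nothing to recoup
    return extraction_cost / saving_per_year


# Hypothetical figures: a two-week schedule gives 26 deploys a year. After
# extraction the component ships twice a year. Each deploy costs 5 person-hours
# of non-valuable work and the extraction takes 480 person-hours.
print(break_even_years(5, 26, 2, 480))   # 4.0

# The same component while still in active development: 24 deploys a year.
print(break_even_years(5, 26, 24, 480))  # 48.0
```

The second call shows the gotcha: if the extracted component still needs deploying almost as often as before, the payback period stretches out to the point where the extraction isn't worth doing for this reason alone.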
The second caveat is that it’s a good idea to first look for ways to reduce the cost of deploying the expensive component before extracting it. Think about how you can remove the coordination with other teams, or make it less time-consuming, or automate any manual steps associated with costly deployments. Extracting a component from a monolith into its own service is often a costly exercise itself. Reducing the pain associated with deployment may be a cheaper investment in many cases. However, also keep in mind that having components moved into their own services can have many other advantages in the short and long term.
Previous article in this series: Use Microservices to Solve Developers Stepping on Each Other’s Toes
Image credit:
‘US Navy 070720-N-8119R-040‘ by Gretchen M. Roth (U.S. Navy)