An Agile Approach to Technical Debt


We’ve all been there.

“Product owner, I really think we should pay down this piece of technical debt.”

“Well, how long is that going to take and how much time is it going to save?”

“Erm, I’m not really sure…”

“Hmm, can you find out?”

“Not very easily. It’s definitely causing a lot of pain though.”

“Alright, we’ll put it in the backlog, but without a good idea of the return on investment it’s going to be difficult to prioritise above what we have in the roadmap.”

Needless to say, that debt won’t be paid any time soon. Of course, sometimes you can put a number on things. Performance-related issues, such as slow tests, are easy to quantify in terms of time spent. But the hard stuff: the poorly designed data schemas, the tight coupling across distant parts of the codebase, the mystifying architectural choices of ancient history? By the time you’ve worked out how to measure the time lost to issues like those, you could have fixed them ten times over. And so they never get resolved, and we plough on adding features on top of them until the team grinds to a halt, unable to touch one part of the system without something combusting in a far-distant corner of the codebase.

And who, really, can argue with the product owner from the conversation above? Wanting to know what the return on investment for the effort is going to be is a perfectly reasonable request, isn’t it?

Actually, no. It isn’t.

There aren’t many product owners, more than twenty years after the publication of the agile manifesto, who would be comfortable giving substantial guarantees about the cost and impact of a complex new feature. These days, we recognise that software development is a process of constant learning and iteration in an uncertain environment. A good product owner is constantly looking for opportunities to improve the product based on customer feedback. When they identify a potentially valuable piece of work, they organise it as a series of experiments, with the goal of learning as much as possible whilst committing minimal resources. This allows them to quickly assess the potential value of the work to customers, and therefore to gauge its probable impact on the business’s bottom line. If we’re ready to take a risk based on customer feedback in order to unlock value for the business, why are we so reluctant to do the same with feedback from our developers? Remediating a complex piece of technical debt is no less experimental a process than exploring a significant new feature, so why do we insist on a quantifiable business case for one but not the other?

We know why, of course. Devs aren’t the customer. They don’t pay the bills. And in any case, devs are always complaining about technical debt. If it were up to them, they’d spend all their time playing with shiny new JavaScript frameworks and doing endless refactorings, and nothing would ever be delivered. I’m caricaturing, but there’s some truth in that argument. Some developers, and earlier in my career I was one of them, do obsess about code quality to an unprofitable degree. Fortunately, you (probably) wouldn’t invest in a new feature based on feedback from a single customer, and the same applies to technical changes suggested by developers.

On the other hand, if a large part of the development team is saying that there’s value to be gained from paying down technical debt, then they’re usually worth listening to. Good engineers understand that getting value into the hands of customers is the number one goal, and they will only advocate for working on technical debt if they really believe it will be beneficial to the product. It is important to ensure that they have lots of exposure to both customers and the business so that they can effectively make such judgements. Once those feedback loops are in place, teams can start identifying debt which is having a negative impact on business success. However, if we require them to devote the effort required to get accurate estimates of savings and costs before starting work on remediation, then nothing will change.

By contrast, once we start approaching technical debt in an agile manner, instead of as a waterfall-style “deliver x by y” proposition, we can apply the same proven techniques we use in product management and increase our chances of success. For example, we assess product development effectiveness via a number of metrics: customer satisfaction surveys, net promoter scores, retention rates and recurring revenue are common choices. It’s important to note that these metrics measure outcomes, not deliverables, and that they are lagging indicators. Just because you’ve beaten the competition to market with a killer new feature doesn’t mean you’re instantly going to see a massive jump in your NPS.

When it comes to measuring technical effectiveness, we have the DORA metrics identified by Dr. Nicole Forsgren, Jez Humble and their team. Substantial research has shown that performance on the DORA metrics is a strong predictor of business success. They also share key characteristics with our product metrics: they measure outcomes, and they are lagging indicators. You can’t predict exactly how they will be affected by any particular change. But we already accept that we can’t know the impact of a new feature before we release it to customers; we should accept the same of attempted technical improvements. It is also widely acknowledged that it’s worth investing significant resources in capturing accurate, useful product metrics, and the same is true of measuring engineering effectiveness.

Once we’ve got our metrics in place, we can feed them into our prioritisation decisions. We might see that our customer-facing metrics are positive but that, for example, our cycle time is increasing, slowing the flow of value to customers. In that situation, we can run an experiment to reduce the complexity of our codebase and see what impact it has on cycle time. On the other hand, if our customer retention rate is dropping because our competitors have stolen a march on us, then that refactor will have to wait. The important thing is that we evaluate the priority of work to resolve technical debt using the same criteria as any other work, i.e. “which work would have the biggest positive impact on our business according to our (carefully) selected metrics?”.

So here’s my request to the tech industry. When we’re talking about technical debt, let’s stop talking about business cases and returns on investment. Instead let’s talk about experimenting, measuring, learning, and iterating, just as we do every day we spend improving our products for customers. Paying down technical debt is still software development. The usual rules apply.

© Ryan Brown.