The full cost of supporting software is often overlooked. It's relatively easy to track the time it takes to fix bugs that crop up, but that isn't the full picture. The problem is that when so much of the support process goes unnoticed, it's difficult to know where you could be saving time.
The cost of dealing with a bug always consists of more than meets the eye. Here are a few questions to ask, and how to make things more efficient.
How often does your application fail?
This may seem like an obvious question, but you'd be surprised how many teams don’t know how well their application is doing in production. I’ve worked on projects for users who were sitting less than a mile away, and yet we had absolutely no idea what the experience of using the software was like.
There are many ways to gather feedback: surveys, chat tools, and automated bug reporting, among others. But the principle is simple: Talk to your users and find out whether your application works. The biggest cost of supporting a system is not knowing which bugs to fix, because you'll lose users and never know why.
What's the impact when your application fails?
People are probably not using your application just for fun. They want to do things such as book their holiday, invest their money, or apply for a social grant. How badly can things go wrong for them? The worse the impact on your users, the higher the cost to your team. A bad bug can seriously damage your reputation.
Find ways to make your system more robust in the event of failure. A simple example: try abandoning a payment process just before the payment stage and check that nothing goes wrong, such as the payment being taken anyway.
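As a rough illustration of that payment check, here is a minimal Python sketch. Everything in it is hypothetical: the `Checkout` class, its stages, and the `FakePaymentGateway` stand-in are invented for the example and are not taken from any real payment library. The idea is simply to walk a user partway through the flow, have them leave, and confirm that no money moved.

```python
from dataclasses import dataclass, field


@dataclass
class FakePaymentGateway:
    """Hypothetical stand-in for a payment provider; records every charge."""
    charges: list = field(default_factory=list)

    def charge(self, order_id: str, amount_cents: int) -> None:
        self.charges.append((order_id, amount_cents))


class Checkout:
    """Hypothetical checkout flow: basket -> details -> done (or abandoned)."""

    def __init__(self, order_id: str, amount_cents: int, gateway: FakePaymentGateway):
        self.order_id = order_id
        self.amount_cents = amount_cents
        self.gateway = gateway
        self.stage = "basket"

    def enter_details(self) -> None:
        self.stage = "details"

    def pay(self) -> None:
        # Only charge when the user has actually reached the payment step.
        if self.stage != "details":
            raise RuntimeError(f"cannot pay from stage {self.stage!r}")
        self.gateway.charge(self.order_id, self.amount_cents)
        self.stage = "done"

    def abandon(self) -> None:
        # Leaving early must never trigger a charge.
        if self.stage != "done":
            self.stage = "abandoned"


def abandon_before_payment_takes_no_money() -> bool:
    """Exit the flow before paying and check that nothing was charged."""
    gateway = FakePaymentGateway()
    checkout = Checkout("order-123", 4999, gateway)
    checkout.enter_details()
    checkout.abandon()
    return gateway.charges == [] and checkout.stage == "abandoned"
```

In a real system the same scenario would run against your actual checkout code with the payment provider's sandbox or a test double in place of the fake gateway.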
Focus on improving your quality practices (such as automated testing and pair programming) and performing preventative maintenance (such as upgrading dependencies) when working on high-impact areas of your system.
How quickly do you find out?
Do you need to wait for an angry phone call or tweet before you know something has gone wrong? The later you find out about a bug, the bigger the hit to your reputation, and the more likely you are to fix it under pressure and introduce new problems along the way.
Set up some dashboards for your team to look at daily—humans are remarkably good at noticing patterns and deviations. Look into automated alerting, so your team is notified when something bad happens in production. With these techniques, you can often find out about issues before a user is sure that something has gone wrong, and fix those issues before they affect more people.
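To make the idea of automated alerting concrete, here is a minimal sketch of one common approach: track the outcome of the most recent requests and raise a flag when the error rate crosses a threshold. The class name, the window size, and the 5% threshold are all assumptions chosen for illustration. A real setup would more likely lean on your monitoring tool's own alert rules than on hand-rolled code.

```python
from collections import deque


class ErrorRateAlert:
    """Hypothetical sliding-window error-rate check."""

    def __init__(self, window_size: int = 100, threshold: float = 0.05):
        # Keep only the most recent `window_size` outcomes.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, succeeded: bool) -> None:
        self.outcomes.append(succeeded)

    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def should_alert(self) -> bool:
        # Alert only when the rate is strictly above the threshold.
        return self.error_rate() > self.threshold
```

You would call `record()` once per request and check `should_alert()` on a schedule, sending a message to the team's chat or paging tool when it returns true. The right threshold depends on what normal looks like for your system, which is another reason to watch the dashboards first.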
How long does it take to diagnose an issue?
You've noticed that something’s not right, perhaps because an alert fired or one of your graphs dipped unusually low for a Monday morning. But without the right diagnostic information at the ready, figuring out what went wrong becomes a time-consuming and frustrating task.
The trick is to ensure that the right information is available when an error happens. To do that, you must have good logging in place. When an error occurs, it's useful to log the unique ID of the item the user was interacting with (e.g., an order number or a user ID) so you can look in your database to see what state it's in. Ensure that you log a stack trace too, so you've got a starting point in the code.
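Here is a small sketch of what that can look like using Python's standard `logging` module. The `process_order` and `handle_order` functions and the `orders` logger name are made up for the example. The part that matters is the `logger.exception` call, which logs at error level and attaches the stack trace of the exception being handled.

```python
import logging

logger = logging.getLogger("orders")


def process_order(order_id: str, quantity: int) -> int:
    """Hypothetical business logic that can fail."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity * 10


def handle_order(order_id: str, quantity: int):
    try:
        return process_order(order_id, quantity)
    except Exception:
        # Include the order ID so the record can be matched to the database,
        # and let logger.exception append the stack trace automatically.
        logger.exception(
            "Failed to process order order_id=%s quantity=%s", order_id, quantity
        )
        return None
```

Writing the ID in a consistent `key=value` form makes it easier to search for later, whichever log viewer you use.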
Make it easy to find and view your logs by using tooling such as the ELK Stack. Another trick is to set up audit collections so it's easy to see a list of relevant interactions.
If an error occurs and you don’t have the right information in your logs, don’t panic! Do your best with what you have, and then add the logging you wish you had so it will be available next time.
How long does it take to fix an issue?
Surprisingly, this is usually the fastest part. Once engineers find an issue, they’re good at spotting a quick fix or workaround they can put in place. Don't forget to look again after applying the initial patch, because further investigation sometimes reveals a more fundamental problem (or a better fix).
Making an investment in the engineering practices of your team really pays off when it comes to fixing bugs. How easy is your code to navigate? Does everyone on your team know the system well enough to fix a bug, or do you have key-person dependencies? Are there automated tests to tell you whether you've accidentally broken something else?
How long does it take to ship a fix?
It’s extremely frustrating to have a bug fix ready and not be able to get it in front of users because your release cycle is too slow. For many issues, the ability to ship code quickly means engineers can release an initial workaround to stop the worst of a bug—making sure a broken web page renders, or stopping bad data from being created—and calmly spend time investigating what else needs to be done.
When releases take too much time and effort, I've seen panic set in, which makes it much harder for people to come up with sensible fixes.
Make sure you have an automated deployment pipeline and that you adopt some DevOps techniques to make your releases faster and less stressful.
How do you stop it from happening again?
It's easy to forget this last phase, but it's important to take some time to reduce the chance of similar issues in the future. Have a postmortem with your team to unpack what happened and find ways to improve. Write automated tests for defect scenarios wherever possible, and take small steps to make your code easier to work with.
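As an example of a test written for a defect scenario, suppose a production bug turned out to be a crash when a basket was empty. The function below and the bug itself are invented for illustration. The sketch uses Python's built-in `unittest` module to pin down the fixed behavior so the same defect can't quietly return.

```python
import unittest


def average_item_price(prices: list) -> float:
    """Hypothetical function that once crashed on an empty basket."""
    if not prices:
        # The fix: this case previously raised ZeroDivisionError.
        return 0.0
    return sum(prices) / len(prices)


class AverageItemPriceRegressionTest(unittest.TestCase):
    def test_empty_basket_does_not_crash(self):
        # Reproduces the exact scenario from the (imagined) incident.
        self.assertEqual(average_item_price([]), 0.0)

    def test_normal_basket_unchanged(self):
        # Guards against the fix breaking the ordinary case.
        self.assertEqual(average_item_price([10.0, 20.0]), 15.0)
```

Naming the test after the scenario it covers makes it clear to the next person why it exists, which helps when someone is tempted to delete it.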
One step at a time
Supporting your application costs more than you think. By asking yourself these questions, you can break the process into its components and find ways to reduce the overall time and frustration involved in dealing with production issues.