Do software initiatives such as maintenance, upgrades and refactoring fall under the umbrella of Technical Debt? It is something of a trick question. When this body of work is planned proactively, it never reaches the Tech Debt category, even though it is often ‘confounded’ with Tech Debt because it is another grouping of work that usually doesn’t deliver significant new feature functionality to the end customer.
This post is part of a series on the ongoing management of application health and on strategies for understanding and planning how to be a good steward of any product. Our first post defined and discussed Tech Debt. The second covered the corollary of Tech Debt: errors, omissions and mistakes, which are often confounded with Tech Debt as well.
But these should be managed in their own bucket of work, separate from Tech Debt, which is taken on consciously, usually for time-to-market (TTM) or experimentation reasons (e.g. TTM is the most important driver and/or we want to wait and see how and if the feature is used before we improve its implementation).
In this post we discuss other types of work that constitute good system stewardship but are not Tech Debt (Part I) or errors and mistakes (Part II). These are generally types of work that are needed to keep the application running well on an ongoing basis but are not delivering significant new feature/functionality for the product itself.
The main types of work in this final stewardship-related bucket are the more proactive or pre-planned maintenance items, including:
- Currency upgrades (e.g. moving to the next version of Angular)
- Security-focused fixes based on emerging threats (i.e. even if you follow secure coding practices, new security issues will STILL arise that must be addressed and that are not the result of mistakes or bad practices)
- Ongoing refactoring to address entropy/decay as many people are working on an application simultaneously
- Framework changes that are broad/impactful in terms of effort (e.g. moving the front end from Angular to React)
These types of work don’t fit neatly into either of the previous two categories discussed but are sometimes grouped incorrectly into the Tech Debt bucket.
You can think of good housekeeping as being similar to maintaining a house. If you change the air filters and clean the eavestroughs, the house will fare better in the long term. Similarly, a major framework upgrade is analogous to a roof replacement: you can delay it for a while, but eventually the delay becomes a disaster and presents high risk to the home’s occupants if not addressed in time.
These types of maintenance, care and feeding are different from errors and omissions, which by definition are something that you can NOT plan in advance, since they result from ‘something you did not know or forgot completely’.
As a technology leader, you will find this work relatively straightforward to define and plan for, but there can be a few hazards to actually getting it done. Although this application stewardship bucket of work is generally proactive, the rise of public cloud computing can make it challenging to track, catalogue and understand the implications of some upgrades in the infrastructure.
A best practice is to track all dimensions of the system in an application inventory and to monitor for upgrades at a regular cadence. It also helps to work closely with DevOps teams (depending on your team topology) to keep the inventory current, including clarity on who is accountable for stewardship of EACH part of the system and on the process for proactively determining when upgrades are coming and if and when you will adopt them. I have seen it work well to use a standard quarterly operational review forum to scan for upgrades across the FULL stack inventory and determine what is to be addressed in the coming 2-3 quarters.
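To make the inventory idea concrete, here is a minimal sketch of what one could look like and how a quarterly review might scan it. Everything in it is an assumption for illustration: the `Component` fields, the `upgrades_to_plan` helper, the component names, versions, owners and dates are all hypothetical, and a real inventory would likely live in a CMDB or tracking tool rather than in code.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Component:
    """One entry in the application inventory (all fields are illustrative)."""
    name: str
    layer: str            # e.g. "frontend", "backend", "infrastructure"
    owner: str            # team accountable for stewardship of this part
    current_version: str
    latest_version: str
    support_ends: Optional[date] = None  # None when no end-of-support date is known


def upgrades_to_plan(inventory: List[Component], review_date: date,
                     horizon_quarters: int = 3) -> List[Component]:
    """Flag components that are behind the latest version or that lose
    support within the planning horizon, soonest end-of-support first."""
    horizon_days = horizon_quarters * 91  # rough quarter length in days
    flagged = []
    for c in inventory:
        behind = c.current_version != c.latest_version
        expiring = (c.support_ends is not None
                    and (c.support_ends - review_date).days <= horizon_days)
        if behind or expiring:
            flagged.append(c)
    # Components with a known end-of-support date sort first, earliest date first.
    return sorted(flagged,
                  key=lambda c: (c.support_ends is None, c.support_ends or date.max))


# Hypothetical inventory used in a quarterly operational review.
inventory = [
    Component("frontend-framework", "frontend", "web-team", "15", "17", date(2025, 6, 30)),
    Component("database-engine", "data", "platform-team", "14", "14"),
    Component("container-runtime", "infrastructure", "devops", "1.0", "1.0", date(2025, 3, 31)),
    Component("auth-library", "backend", "api-team", "2.1", "2.4"),
]

for c in upgrades_to_plan(inventory, review_date=date(2025, 1, 1)):
    print(f"{c.name} ({c.layer}, owner: {c.owner}): "
          f"{c.current_version} -> {c.latest_version}")
```

The point of the sketch is less the code than the shape of the data: every part of the stack has a named owner, and the review forum works from one list sorted by urgency.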
Another common challenge I have experienced in staying on top of this backlog of work is simply aligning the cross-functional team on prioritization, engineering capacity and timing to address each item. Although currency and other platform or framework upgrades sometimes do add functionality to a product, they are often more about staying current with an underlying component’s security capabilities, or about gains in speed of development, application performance or similar benefits that are more indirect from an end customer perspective. Especially when working with Product Managers who do not have a technology background, it is critical to explain the upgrade or refactoring work in terms they can weigh:
- Use objective and measurable evaluation criteria wherever possible
- Tie the work to clear end customer benefits that are aligned across teams
- Enumerate the risks of delaying, again in objective terms tied to business and customer outcomes
It is also helpful to catalog this grouping of work in whatever system is used to manage the backlog, so that it is straightforward to quantify the engineering capacity spent on care and feeding of the system on an ongoing basis and to roughly understand the split of work between new feature functionality and the groupings covered in this series: Tech Debt, errors and omissions, and proactive stewardship.
Having an understanding of these groupings of work helps in multiple ways. First, it shows how healthy (or unhealthy) the application is overall, as we would expect new feature functionality to be at least 60% of the total effort for a going-concern application that has been and will remain in service for the foreseeable future. Second, it adds objectivity to planning this fiduciary work with Product on an ongoing basis, since the amount of time it currently takes is commonly understood and quantifiable.
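As a rough illustration of quantifying that split, the sketch below sums effort by grouping and checks the new feature share against the 60% guideline mentioned above. The category labels, the `effort_split` and `meets_feature_target` helpers, and the sample numbers are all assumptions for illustration; in practice the effort figures would come from whatever labels or fields your backlog tool exposes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical labels for the groupings of work discussed in this series.
CATEGORIES = ("new_feature", "tech_debt", "errors_omissions", "stewardship")


def effort_split(items: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Return each category's share of total effort as a fraction from 0 to 1.

    `items` is a sequence of (category, effort) pairs, where effort can be
    story points, hours or any other consistent unit.
    """
    totals = defaultdict(float)
    for category, effort in items:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += effort
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: totals[c] / grand_total for c in CATEGORIES}


def meets_feature_target(split: Dict[str, float], target: float = 0.60) -> bool:
    """True when new feature work is at least `target` of total effort."""
    return split["new_feature"] >= target


# Made-up effort for one quarter, in story points.
quarter = [
    ("new_feature", 40), ("new_feature", 15),
    ("tech_debt", 15),
    ("errors_omissions", 10),
    ("stewardship", 20),
]

split = effort_split(quarter)
for category in CATEGORIES:
    print(f"{category}: {split[category]:.0%}")
print("meets 60% feature target:", meets_feature_target(split))
```

Tracking this number quarter over quarter matters more than any single reading, since the trend is what tells you whether care and feeding is crowding out feature work.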
Under normal circumstances, Product pushes to spend more time on New Feature Functionality and less time on ‘basic care and feeding’. This is a typical push/pull between product and engineering, and the prioritization gap is often larger in older, more established companies with Product leaders from varying backgrounds and mixed levels of technical understanding. Having care and feeding work defined, inventoried and tracked on an ongoing basis goes a LONG way toward creating a common understanding of the work to be done, even if prioritization is not easy to agree upon.
On the flip side, if you don’t define, inventory and quantify this work on an ongoing basis, this bucket starts to merge with the bucket of Errors/Omissions/Mistakes. There is a difference between ignorance and neglect, but the poor quality outcome is the same. That is, if there is no system for defining, tracking and prioritizing this work, it will soon become errors and omissions: the team will likely miss upcoming currency needs (creating security and other compatibility risks), and the application will decay as new features are added without ongoing refactoring, modernization and cleanup.
So now we have come full circle: while this work is NOT Tech Debt, if you ignore these needs or don’t proactively manage them, they ultimately become errors/omissions in your application stewardship and need to be added to your Tech Debt ledger once they are finally identified.
In my experience, it is worth the effort to understand and closely manage this work, jointly with Product Management, and keep your system as healthy as possible on an ongoing basis. Contrast this approach with the other extreme of starving application health and digging a deep hole whose momentum cascades over time into poorer quality, higher risk and a trend that is hard to recover from, just like financial debt when the interest exceeds the team’s ability to manage both the interest and the capital drawdown. Don’t let ongoing basic maintenance move from being an opportunity to optimally manage your system risk to being part of your Tech Debt ‘baggage’!