Dear Rob – 2
Rob, as some readers may recall from the first post, is a made-up character, representing a typical senior business executive, responsible at a high level for the Agile initiative — among his many other duties. Perhaps the CEO (although probably not a CEO at the Jack Welch level at GE).
In this post, we talk about Quality and Technical Debt.
It is interesting that you want to focus on quality first — this is actually a very good thing.
In Agile, we start to do professional testing in the very first Sprint. This is much earlier than in Waterfall, as you probably know. There are many reasons for this. Among them: we can measure progress by the amount of completed, ‘fully’ tested code.
In Scrum, we have a Definition of Done for the team. It defines, for the typical ‘story’ the team is working on, how ‘done’ the story is expected to get within the Sprint. Think of the meat metaphor: rare, medium rare, medium, medium-well and well done.
Fully ‘done, done, done’ means the product (including the newly finished stories) is in production and being used by the customer with no problems. At first, most teams have too many impediments to get to ‘done, done, done’ in one Sprint (a one- to four-week timebox).
The Definition of Done should also make clear, especially for the Product Owner, what is not being done in the Sprint. For example, often the team can’t do a full regression test in a fully integrated environment. Often (full) performance testing can’t be done, etc.
As Ken Schwaber suggests, this ‘undone’ work needs to be added to the Product Backlog somewhere.
There are two main reasons we don’t like anything but ‘done, done, done’ product increments. (There are other, related reasons.)
- The bad news is getting better with age.
Meaning, of course, worse. For example, all the undiscovered bugs are quickly becoming harder and harder to fix.
- The 90% problem is growing.
The 90% problem is very common in software development. For example, a manager goes to a team and asks them how complete they are (using waterfall). They say 90%, and then he (in this case) asks, ‘how much longer to be 100% done?’ And they say, ‘oh, it took us about nine months to get to here — ummm, that will be another nine months.’ Yogi Berra summarized this problem by saying: “It ain’t over ’til it’s over.”
So, any time the Product Owner sees things in the Definition of Done that are not getting done in the Sprint, he should worry about the 90% problem and discuss those things.
What does the term ‘technical debt’ mean? Well, there is no single, simple definition. I say it is anything in or around the system that makes the system harder to change.
Examples include: duplicate functions or modules in the code, spaghetti code, lack of automated tests, zero or poor documentation (in the code, perhaps), code written by George (when George has left the firm), the bug list, any upgrade (e.g., to the database or the language or some middleware) that we have put off, etc., etc.
Every system that is at least six months old has some technical debt, and by then it is starting to be obvious that the debt is hurting us. Older systems have more technical debt.
If your product is older than two years, I can just about guarantee that technical debt is a serious problem in your shop, and that real Velocity is lower than it should be.
In simple terms, technical debt decreases the velocity of Scrum Teams.
We (seemingly purposefully) grow technical debt by saying “we have to deliver features now; let’s put that stuff off until later,” and then we do that. ‘That stuff’ is building automated tests, upgrading the XYZ, fixing bugs, etc.
Every manager and every team must start to understand technical debt, and must fight to help the team dig out of it, because low technical debt is key to business agility. Meaning: we can adapt and change with the market and customers faster than our competition if we minimize technical debt.
As we discuss these matters, you will find that your shop (and every shop) will discover impediments. It is tempting to say: This is Scrum’s fault! In fact, Scrum is only making the problems around quality and technical debt (and other issues) more visible. To blame Scrum is only to blame the messenger.
So, don’t be surprised that the group identified a bunch of impediments, that they cost money to fix and that you will get real business results (better quality, eventually lower cost, etc.) from fixing them.
How do we dig out of Technical Debt?
How do we improve the Definition of Done over time?
Mainly by removing or reducing (certain kinds of) impediments.
One way of minimizing and maybe reducing technical debt is to focus on quality.
Now, quality is itself a complex subject, and understanding it fully requires looking at a specific product and specific customers.
But in simplistic terms, one might say it is the lack of technical debt. Or one might say that it means the customer gets a perfectly fitting product (to her problem), with no added ‘features’ (e.g., bugs), no extra delays, no extra effort, and it is a beautiful solution. (I think most products require a certain level of beauty — not necessarily Monet level, but some level. So, to me, quality also involves beauty or elegance, or something like that.)
As relatively low-level indicators of quality in Scrum, we can measure:
- No bugs escaped the Sprint this week. Meaning: For all the stories we say we completed in the Sprint, we did a professional (automated) testing effort (e.g., unit and functional, at least), and any bugs identified were fixed and retested green in the Sprint.
- X number of new automated tests were built this week (maybe one number at the unit level and another at the functional level).
- Y number of automated tests are passing in the regression tests this week.
- Our Definition of Done is relatively strong, meaning that, in areas one through five, when a story is done, it means that no technical debt was built in those areas (at least), or at least we made a serious effort to minimize the new technical debt being built.
- The list of (pre)existing bugs is prioritized, and shrunk by Z items or A% this week.
- We have prioritized the work around an increase of code coverage by the automated tests, and it increased B percentage points in the past week. (And these were meaningful, useful tests, not just baloney tests to make the coverage metrics look better.)
- We saw that in the last release, field issues decreased by C%, comparing first month to first month.
- Our Velocity has increased D percent. This is in part due to less technical debt and in part due to less effort per unit of work, thanks to better configuration management, better continuous integration, better testing tools and faster testing servers. (If you don’t understand some of these terms, and how they are inter-linked, we should talk. You need to understand them a bit, since they are key to each (Scrum) Team’s success.)
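To make a few of these indicators concrete, here is a minimal Python sketch of how a team might compute them from two end-of-Sprint snapshots. This is only an illustration: the `SprintSnapshot` record, its field names and the sample numbers are all made up for this example, and a real team would pull such figures from its own bug tracker and test tooling. Note that the bug list shrinks by a percent (relative change), while coverage grows by percentage points (a simple difference), as in the list above.

```python
from dataclasses import dataclass


@dataclass
class SprintSnapshot:
    """Hypothetical end-of-Sprint numbers; all field names are illustrative."""
    escaped_bugs: int     # bugs in 'done' stories that were not fixed in the Sprint
    open_bugs: int        # size of the (pre)existing bug list
    coverage_pct: float   # code coverage by automated tests, in percent
    velocity: float       # e.g. story points completed in the Sprint


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("cannot compute a relative change from zero")
    return (after - before) * 100.0 / before


def quality_report(prev: SprintSnapshot, curr: SprintSnapshot) -> dict:
    """Compare two Sprints on a few of the low-level quality indicators."""
    return {
        "no_bugs_escaped": curr.escaped_bugs == 0,
        # Bug list: absolute shrinkage (Z items) and relative shrinkage (A%).
        "bug_list_shrunk_by": prev.open_bugs - curr.open_bugs,
        "bug_list_shrunk_pct": -percent_change(prev.open_bugs, curr.open_bugs),
        # Coverage: a difference of two percentages, i.e. percentage points.
        "coverage_gain_points": curr.coverage_pct - prev.coverage_pct,
        "velocity_change_pct": percent_change(prev.velocity, curr.velocity),
    }


# Made-up sample data for two consecutive Sprints.
previous = SprintSnapshot(escaped_bugs=2, open_bugs=80, coverage_pct=40.0, velocity=20.0)
current = SprintSnapshot(escaped_bugs=0, open_bugs=72, coverage_pct=42.5, velocity=22.0)

report = quality_report(previous, current)
for name, value in report.items():
    print(f"{name}: {value}")
```

A report like this is a conversation starter, not a scorecard. It says nothing about whether the new tests are meaningful or whether the customer sees higher quality.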
Measuring quality is complicated, and my comments above are over-simplified, but, if the business and technology folks both know you care about quality, that will help a lot. A reasonable, and fairly frequent, discussion around quality metrics can help focus their attention on that. Be careful about getting too focused on one aspect of quality, one metric. The keys are (a) higher quality for the customer, as the customer defines quality and (b) lower technical debt.
Feedback is welcome.