I consult, write, and speak on running better technology businesses (tech firms and IT captives) and the things that make it possible: good governance behaviors (activist investing in IT), what matters most (results, not effort), how we organize (restructure from the technologically abstract to the business concrete), how we execute and manage (replacing industrial with professional), how we plan (debunking the myth of control), and how we pay the bills (capital-intensive financing and budgeting in an agile world). I am increasingly interested in robustness over optimization.

I work for ThoughtWorks, the global leader in software delivery and consulting.

Sunday, September 30, 2012

The Quality Call Option

On time. On scope. On budget. The holy grail of industrial, cost-center IT. Plan your work and work your plan. Be predictable.

As a set of goals for software development, this troika is incomplete because it neglects quality. That is a costly omission. Poor quality will ultimately steamroll the other three project variables: we'll be late, we'll remove features, and we'll spend a lot more money than we want to when we are forced to chase quality. We assume quality at our own risk.

Why do so many otherwise smart people take quality for granted? Mainly because they can't see how easily quality is compromised in development. It's easy to assume that requirements and specifications are sufficiently thorough, and that the developer just needs to work to the spec. It's easy to assume that we're hiring competent developers who write high-quality code; most buyers don't know what technical quality looks like, let alone how to look for it. It's easy to assume our Quality Assurance phase will guarantee solution quality by inspecting it in, instead of agonizing over how to build it in through every action we take in development.

And we tend not to realize that quality is in the eye of the beholder. When we have long feedback cycles - when it could be days, weeks or even months before one person's work is validated by another - the person responsible for creation is de facto the person responsible for validation. When "on time / on budget / on scope" is the order of the day, nobody is going to cast a critical eye over their own work.

It takes a while before we find out that one person's "done" is another person's "work-in-progress". The moment when a quality problem makes itself known is the moment when our implicit assumptions about quality start to become expensive.

There are a few simple things managers can do to give quality the same prominence as time, cost and scope. The first is to reduce the number of hand-offs in the team, and to strengthen the hand-offs that remain.

Every hand-off injects incremental risk of misunderstanding. The greater the number of hand-offs, the greater the risk that we will have integration problems, understand the problem differently, or solve the wrong problem. Development teams have a lot of specialists, both by role (business analyst, developer, user experience designer, quality assurance analyst) and by technology (mobile client developer, server-side Java developer, web client developer, etc.). Our ideal team is staffed by generalists who can each create a whole business solution. If we don't have people with generalist skills today, we must invest in developing those capabilities. There's no way around this. And there's no reason to accept technical specialization: we don't need mobile client developers as well as services developers and database developers. We can invest in people to become mobile solution developers who can write both client and server code. Generalization versus specialization is our choice to make.

It takes time to professionalize a development team that has too many industrial characteristics. Until we've got a team of generalists, we have to make do with our specialists. As long as we do, we have to improve the fidelity of our hand-offs.

Rather than performing technical tasks in isolation, we can have specialists collaborate on delivering functional requirements. Have developers - even if they are fluent in different technologies - actively work together on completing an entire functional solution. That's easy to write but difficult to do. We must expect that we'll need to facilitate that collaborative behavior, particularly in the early going: collaborating on delivery of a functional solution is not natural behavior for people accustomed to going flat out to complete code by themselves.

We can increase collaboration across roles, too. We can have business analysts and developers perform a requirement walk-through before coding starts. We can have a business analyst or QA analyst desk-check software before a developer promotes it as "development complete". We can insist on frequent build and deployment, and on iterative QA. All of these shorten our feedback cycles and give us independent validation of quality nearer to the time of creation.

This type of collaboration might seem matter-of-fact, but it doesn't come naturally to people in software development. A manager has to champion it, coach people in it, witness it first-hand to assess how effectively people are performing it, and follow through until it takes place on its own.

Stronger hand-offs will slow down coding. The act of creating code will take longer when we insist that BAs do walkthroughs with developers, that developers collaborate on business solutions, and that QA desk-checks before code is promoted. But the velocity at which a team can code is not the same as the velocity at which a team can deliver software at an acceptable level of quality. It's easy to cut code. It's not easy to cut code that is functionally and technically acceptable. That means the former is a lousy proxy for the latter, and our management decisions are wildly misinformed if we focus on "dev complete".
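
A toy example of the difference, with made-up numbers (none of these figures come from the post; they only illustrate why "dev complete" misleads):

```python
# Hypothetical iteration data; all numbers are illustrative only.
stories_marked_dev_complete = 20   # what a raw coding-velocity report counts
stories_rejected_in_review = 4     # failed the BA walk-through or QA desk check
stories_reopened_after_test = 3    # "done" that turned out to be work-in-progress

# Coding velocity: how fast the team can cut code.
coding_velocity = stories_marked_dev_complete

# Delivery velocity: how fast the team produces functionally and
# technically acceptable software - the only number worth planning with.
delivery_velocity = (stories_marked_dev_complete
                     - stories_rejected_in_review
                     - stories_reopened_after_test)

print(coding_velocity)    # 20
print(delivery_velocity)  # 13
```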

Knowing the velocity at which a team can deliver a quality product gives us far more useful managerial information. For one thing, we have a much better idea of how long it will likely take to deliver our desired scope at an acceptable level of quality. This makes us less vulnerable to being blindsided by an overrun caused by lurking quality problems. We also have a better idea of the impact we can expect from adding or removing people. A team focused just on delivering code might deliver more code in less time if it has more people. But quality problems tend to grow much faster than headcount. When quality is not a consideration in project planning, we tend to forget the paradox that a larger team can need more time than a smaller one to complete software of acceptable quality.
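
One back-of-the-envelope way to see why: pairwise communication channels - Brooks's old observation - grow quadratically with headcount, and every channel is another chance for a hand-off to go wrong. A minimal sketch (the function is mine, purely illustrative):

```python
def communication_paths(team_size: int) -> int:
    # Every pair of people is a potential channel for misunderstanding:
    # n * (n - 1) / 2 pairs, so channels grow quadratically with headcount.
    return team_size * (team_size - 1) // 2

for n in (4, 8, 16):
    print(n, communication_paths(n))   # 4 -> 6, 8 -> 28, 16 -> 120
```

Quadruple the team and the opportunities for misunderstanding grow twenty-fold; the coordination burden comes straight out of quality unless we strengthen the hand-offs.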

Another thing we can do is make our quality statistics highly visible. But visibility of the data isn't enough; we have to visibly act on it. It's easy enough to produce information radiators that profile the state of our technical and functional quality. But an information radiator that portrays decaying technical quality, with no response from leadership, will have a negative effect on the team: it will communicate that management doesn't really care. Something we're prepared to broadcast is something we have to be prepared to act on. When we expose quality problems we have to insist on remediation, even if that comes at the cost of new requirements development. That puts us face-to-face with the "quality call option": developing less stuff of good quality is more valuable than developing more stuff of bad quality. That might seem obvious, but it doesn't fit into an "on time / on scope / on budget" world. It takes courage to exercise the quality call option.

Finally, we have to look not just at project status data, but at flow data. Projects throw off copious amounts of flow data - trends over time - that most managers ignore. These are a rich source of information because they indicate where there are collaboration - and therefore quality - problems. For example, we can easily track a build-up of defects that developers have tagged as ready for retest but that QA has not retested in a timely fashion. We can look for a pattern of defects raised that are dispositioned as "not a defect". We can monitor defects that are reopened after retest. And we can look at the overall ratio of defects to test activity as an indicator of fundamental asset quality. The patterns of change in this data over time will indicate performance problems that undermine quality.
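
As a minimal sketch of what computing these signals might look like, assuming defect records that carry a status and a reopen count (the field names and status values are hypothetical, not from any particular tracking tool):

```python
from dataclasses import dataclass

# Hypothetical defect record; field names and statuses are illustrative.
@dataclass
class Defect:
    status: str          # e.g. "open", "ready-for-retest", "not-a-defect", "closed"
    reopen_count: int    # times the defect failed retest and was reopened

def flow_signals(defects: list[Defect], tests_executed: int) -> dict:
    """Snapshot the four flow signals named above for one iteration."""
    total = max(len(defects), 1)   # guard against an empty defect list
    return {
        # A growing backlog here means fixes aren't retested in a timely fashion.
        "awaiting_retest": sum(d.status == "ready-for-retest" for d in defects),
        # A high ratio here means BAs, developers and QA read requirements differently.
        "not_a_defect_ratio": sum(d.status == "not-a-defect" for d in defects) / total,
        # Reopens mean code is being promoted before it actually works.
        "reopen_rate": sum(d.reopen_count > 0 for d in defects) / total,
        # Defects per test executed: a crude proxy for fundamental asset quality.
        "defects_per_test": len(defects) / max(tests_executed, 1),
    }

defects = [Defect("ready-for-retest", 0), Defect("not-a-defect", 0), Defect("closed", 2)]
print(flow_signals(defects, tests_executed=40))
```

Run something like this once per iteration and plot the results: as argued above, it is the trend in these numbers over time, not any single snapshot, that points at collaboration problems.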

As people who understand how to deliver software, we have to insist on quality being on an equal footing with cost, scope and time. Getting that footing is one thing; keeping it is another. We need to measure both asset quality and team performance in terms of results of acceptable quality. We need to scrutinize our project data for threats to quality. We need to foster individual behaviors that will reduce self-inflicted quality problems. And we need the courage to broadcast our state of quality and to act decisively - to exercise the quality call option - to defend it.