I consult, write, and speak on running better technology businesses (tech firms and IT captives) and the things that make it possible: good governance behaviors (activist investing in IT), what matters most (results, not effort), how we organize (restructure from the technologically abstract to the business concrete), how we execute and manage (replacing industrial with professional), how we plan (debunking the myth of control), and how we pay the bills (capital-intensive financing and budgeting in an agile world). I am increasingly interested in robustness over optimization.

Friday, May 31, 2013

Self-Insuring Software Assets

When we buy a house, we also buy insurance on the house. Catastrophic events can not only leave the homeowner without shelter, they can be financially ruinous. In the event of a catastrophe, the homeowner has to be able to respond quickly to contain any problems, and be able to either enact repairs herself, or mobilize people and materials to fix the damage. Repairs require building materials and a wide variety of skills (electrician, mason, carpenter, decorator). Most homeowners don't have the construction skills necessary to perform their own repairs, so they would have to set aside large capital reserves as a "rainy day fund" to pay for such repairs. But homeowners don't have to do this, because we have homeowner's insurance. In exchange for a little bit of cash flow, insurance companies make policyholders whole in the event of a catastrophe by providing temporary shelter, and providing the capital and even the expertise to make repairs in a reasonable period of time.

When we build software, we don't always buy insurance on the software. For software we build, we underwrite the responsibility that it will be technically sound and functionally fit, secure and reliable, and that it will work in the client and server environments where it needs to operate. As builder-owners, we have responsibility for these things during the entire useful life of the software. This obligation extends to all usage scenarios we will encounter, and all environmental changes that could impair the asset. If regulation changes and we need to capture additional data, if a nightly data import process chokes on a value we hadn't anticipated, if our stored procedures mysteriously fail after a database upgrade, if the latest release of Firefox doesn't like the JavaScript generated by one of our helper classes, it's our problem to sort out. There is nobody else.

In effect, we self-insure software assets that we create. When we build software, we underwrite the responsibility for all eventualities that may befall it. Self-insuring requires us to retain people who have the knowledge of the technology, configuration and code; of the integration points and functionality; of the data and its structures; and of the business and rules. It also requires us to keep sufficient numbers of people so that we are resilient to staff turnover and loss, and also so that we can be responsive during periods of peak need (the technology equivalent of a bad weather outbreak). Things may be benign for most of the time, but in the event of multiple problems, we must have a sufficient number of knowledgeable people to provide timely responses so that the business continues to operate.

The degree of coverage that we take out is a function of our willingness to invest in the asset to make it less susceptible to risk (preventative measures), and our willingness to spend on retaining people who know the code and the business to perpetuate the asset and to do nothing else (responsiveness measures). This determines the premium that we are willing to pay to self-insure.

In practice, this premium is a function of our willingness to pay, not of the degree of risk exposure that we are explicitly willing to accept. This is an important distinction because this is often an economic decision made in ignorance of actual risk. Tech organizations are not particularly good at assessing risks, and usually take an optimistic line: software works until it doesn't. If we're thorough, previously unforeseen circumstances are codified as automated tests to protect against a repeat occurrence. If we're not, we fix the problem and assume we'll never have to deal with it again. Even when we are good at categorizing our risks, we don't have much in the way of data to shed light on our actual exposure since most firms don't formally catalogue system failures. We also have spurious reference data: just as a driver's accident history excludes near-miss accidents, our assessments will also tend to be highly selective. Similarly, just as an expert can miss conditions that will result in water in the basement, our experts will misjudge a probable combination of events that will lead to software impairment (who in 2006 predicted the rise in popularity of the Safari browser on small screens?) And on top of it all, we can live in a high risk world but lead highly fortunate lives where risks never materialize. Good fortune dulls our risk sensitivity.

The result is that the insurance premium we choose to pay in the end is based largely on conjecture and feeling rather than being derived from any objective assessment of our vulnerability. Most people in tech (and outside of tech) are not really cognizant of the fact that we're self-insuring, what we're self-insuring against, the responsibility that entails, and the potential catastrophic risks that it poses. Any success at self-insuring software assets has little to do with thoughtful decision making, and more to do with luck. If operating conditions are benign and risks never manifest themselves, our premium looks appropriate, and even like a luxury. On the other hand, if we hit the jackpot and dozens of impairments affect the asset and we haven't paid a premium for protection, our self-insurance decision looks reckless.

Insuring against operating failures is difficult to conceptualize, more difficult to quantify, and even more difficult to pay for. We struggle to define future operating conditions, and the most sophisticated spreadsheet modeling in the world won't shed useful light on our real risk exposure. Willingness to pay a premium typically comes down to a narrative-based decision: how few people are we willing to keep to fix things? This minimal-cost approach is risk ignorant. A better first step in self-insuring is to change to an outcome-based narrative: what are the catastrophes we must insure against, and what is the income-statement impact should they happen? This measures our degree of self-insurance by outcomes, not costs.
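The outcome-based framing lends itself to simple arithmetic. As a sketch (every scenario, probability and dollar figure below is hypothetical), we can compare the probability-weighted income-statement impact of the catastrophes we must insure against with the cost of the people we retain to respond to them:

```python
# Illustrative only: hypothetical impairment scenarios for a software asset,
# each with an assumed annual probability and income-statement impact.
scenarios = [
    ("regulatory change breaks data capture", 0.10, 500_000),
    ("database upgrade breaks stored procedures", 0.05, 250_000),
    ("browser release breaks client UI", 0.20, 100_000),
]

# Expected annual loss: sum of probability-weighted impacts.
expected_loss = sum(p * impact for _, p, impact in scenarios)

# Annual "premium": cost of retaining people who know the code and business,
# e.g., two knowledgeable engineers at a hypothetical loaded cost each.
premium = 2 * 150_000

print(f"Expected annual loss: ${expected_loss:,.0f}")
print(f"Self-insurance premium: ${premium:,.0f}")
```

Even this crude comparison moves the conversation from "how few people can we keep?" to "what exposure are we accepting for the premium we pay?"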

Tuesday, April 30, 2013

Zombie Businesses

The Lehman bankruptcy is best known as the event that triggered a financial crisis. For many firms, it also sowed the seeds of an operating crisis.

Revenues plummeted at the end of 2008. Companies retrenched by laying people off. Managers coped with smaller staffs by asking employees to perform multiple jobs and to work longer hours. With remaining employees grateful to have kept their jobs, and with the economy leveling off rather than staying in freefall, corporate profitability rebounded as early as 2009.

Smart business leaders knew this wouldn't last, because you can't run a business for very long by running people into the ground. True, jobs weren't plentiful, and a depressed housing market meant employees weren't going to chase dream jobs. Plus, economic indicators gave no reason to believe things were going to improve any time soon. Still, the employer's risk of losing the people who held the business together increased every day that the "new normal" set in. Smart leaders got in front of this.

From early 2009, healthy companies boosted their capital spending.1 They used their capital in three ways.

The first was defensive. Managers classified as much work activity as "capital improvement" as they could. Doing so meant labor costs could be capitalized and amortized over as long as five years. This let businesses retain people without eroding profitability, and kept them from losing employees with systemic knowledge of the business and intimate knowledge of customers.
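A rough sketch of the first-year profit effect (figures hypothetical, and a simplification of actual capitalization rules):

```python
# Illustrative arithmetic, not accounting guidance: the first-year P&L effect
# of expensing $1M of labor versus capitalizing it and amortizing it over
# five years.
labor_cost = 1_000_000
amortization_years = 5

expensed_hit_year1 = labor_cost                          # full cost hits profit now
capitalized_hit_year1 = labor_cost / amortization_years  # only one-fifth hits profit

print(f"Expensed: ${expensed_hit_year1:,.0f} against this year's profit")
print(f"Capitalized: ${capitalized_hit_year1:,.0f} against this year's profit")
```

The cash goes out the door either way; what capitalization buys is a reported income statement that can absorb the payroll.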

The second was offensive, investing in core business operations. Companies invested in technology2 to lock in those post-layoff productivity gains, and to improve customer self-service offerings since they had fewer employees to service customers. These investments made operations a source of competitiveness by lowering costs, increasing efficiency, and making businesses more responsive to customers. This made these firms better able to compete for market share - essential in a slow-growth world.

The third use was financial restructuring and building reserves. This meant issuing debt and retiring equity. Debt was cheap, as interest rates hit record lows during the crisis. Debt was also in demand, as capital was making a "flight to quality" and many corporations sported high credit ratings and large cash balances that could comfortably cover interest payments. Debt-for-equity restructuring lowered the total cost of capital. It also benefited boards and CEOs by concentrating ownership of the company in fewer people's hands.

Smart business leaders responded to the financial crisis not only by protecting operations, but by improving and reforming them, taking advantage of cheap capital during the crisis to pay for it.

But many small businesses, high-risk businesses, and poorly capitalized businesses had neither capital cushions nor creative accounting to protect their operations. All they could do was cut costs and hope for the best. And cut they did. The people who got salary bumps in the boom years from 2006 through 2008 became "high-salary outliers" in 2009. It didn't matter that those were the people at the core of the company's capability and drive. When facing financial ruin, the CFO calls the shots, and the pricey people are the first to go regardless of the impact to operations.

These cuts may have staved off bankruptcy, but set the stage for an operating crisis by depleting firms of core operating knowledge and contextual business understanding. Cuts made at the beginning of the crisis left few people - and often no single person - fluent in the details of business operations. Those who remain can mechanically perform different tasks, but don't understand why they do the things they do. The business has continued to run, but it runs on momentum. It doesn't initiate change. It erodes a little bit here and there, as employees exit and clients find the offerings stale and go elsewhere. Costs increase as salaries and stay bonuses are showered on those with the most experience in the mechanics of the business. Pricing power decreases as employees lose the ability to articulate the value of what they provide to their customers. As more time passes, the more acute this crisis becomes: margins get squeezed while the business itself becomes operationally sclerotic.

Just as there are zombie loans (banks keep non-performing loans on their books because they don't want to take the writedown), there are zombie businesses. They transact business and generate revenue. They have years of history and long-standing client relationships. Such a business may look like it can be successful with some investment, but lacking that core operating knowledge, it's a zombie: animated but only semi-sentient.

These firms will only have attracted risk capital late in the post-Lehman investment cycle. Because they haven't been making investments in efficiency or customer service, their first use of fresh capital will be to hire new operational support people in an attempt to get caught up. That's costly and inefficient, and it merely adds capacity: more people who know how to perform the same mechanical tasks. It won't change the fact that austerity depleted the firm of fundamental operating knowledge. New managers brought in with this investment will struggle to unwind the piled-on (and often undocumented) complications of the business, while new people in execution roles will get no further than replaying the mechanical processes for running the business.

Resuscitating these businesses - bringing much-needed innovation and structural reform to a firm that has been starved of both for a long time - is a time-consuming and costly proposition. First, nobody has time to spare: they're constantly fighting fires, trying to control operations and contain the costs of the fragile machinery of the business. Second, they don't know what to do: because nobody knows the business context very well, there isn't anybody who can competently partner on initiatives to reform the business. Third, middle managers lack the will: the trauma of the cuts and years of thin investment will have rendered decision makers reactive (keep operational flare-ups under control) instead of aggressive (reinvent the business and supporting systems). Fourth, those same middle managers can only conceptualize the business as what it has always been; they'll lack the imagination to see what it could be.

Surviving a prolonged downturn is not necessarily the mark of a strong business. As Nassim Taleb pointed out in The Black Swan, a lab rat that survives multiple rounds of experimental treatment (say, exposure to high dosages of radiation) isn't "stronger" than rats that do not. The survivor will be in pretty bad shape for the experience. The heroic story of the tough and resourceful survivor isn't necessarily applicable to the business that survives tough times. The story of the zombie is a better fit.

1 Many businesses reported a significant uptick in capital spending from 2009-2011 compared to 2006-2008.

2 Strong corporate IT spending is one reason why the tech sector was counter-cyclical from 2009-2011.

Monday, March 18, 2013

The Return of Financial Engineering - and What it Means for Tech

Soon after the financial crisis began in 2008, companies shifted attention from finance to operations. Finance had fallen out of favour: risk capital dried up, asset prices collapsed and capital sought safe havens like US Treasurys. The crisis also ushered in a period of tepid growth undermined by persistent economic uncertainty. This limited the financial options companies could use to juice returns, so they focused on investing in operations to create efficiencies or fight for market share. In the years following the crisis, the "operations yield" - the return from investing in operations - outstripped the "financial yield" - the return from turning a business into a financial plaything.

Since the start of the crisis, central banks have pumped a lot of money into financial markets, principally by buying up debt issued by governments and financial assets held by banks. This was supposed to spur business activity, particularly lending and investing, by driving down the cost of both debt (lower lending rates) and equity (motivating capital to seek higher returns by pursuing riskier investments).

Whether lending and investing have increased as a result of these policies is debatable. One thing they have done is encourage companies to change their capital structure: low interest rates have made debt cheaper to issue than equity, so companies have been busy selling bonds and buying back their own stock. There are financial benefits to doing this, namely lowering capital costs. It benefits the equity financiers by concentrating ownership of the company in fewer hands. But beyond reducing the hurdle rate for investments by a few basis points - and not very much at that given low interest rates - this doesn't provide any operational benefit.
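The capital-cost arithmetic behind the swap can be sketched with the standard weighted-average-cost-of-capital formula. All rates and proportions below are hypothetical, and the sketch deliberately ignores the way higher leverage pushes up the cost of equity:

```python
# Illustrative sketch of why a debt-for-equity swap lowers the weighted
# average cost of capital (WACC) when debt is cheap and interest is
# tax-deductible. All figures are hypothetical.
def wacc(equity, debt, cost_of_equity, cost_of_debt, tax_rate):
    total = equity + debt
    return ((equity / total) * cost_of_equity
            + (debt / total) * cost_of_debt * (1 - tax_rate))

# Before: 80/20 equity/debt. After: issue bonds, buy back stock, move to 60/40.
before = wacc(equity=800, debt=200, cost_of_equity=0.10, cost_of_debt=0.04, tax_rate=0.35)
after = wacc(equity=600, debt=400, cost_of_equity=0.10, cost_of_debt=0.04, tax_rate=0.35)

print(f"WACC before: {before:.2%}, after: {after:.2%}")
```

The hurdle rate drops by a point or so - real money to financiers, but nothing that changes what the operation can actually do.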

What's not debatable is that it's brought about a return of financial engineering. CLOs and hybrids are back. Messrs. Einhorn and Buffett are engineering what amount to ATM withdrawals through preferred share offerings from Apple and Heinz, respectively. The OfficeMax / Office Depot merger is addition through subtraction: projected synergies exceed the combined market cap of the two firms.

When financing yields are higher than operating yields, business operations take a back seat to financial philandering. Consider Dell Computer. Dell's business lines are either in decline (services) or violently competitive (PCs and servers). Does it matter whether Dell is funded with public or private money? Will being a private firm make Michael Dell better able to do anything he cannot do today? It is hard to see how it does. But it does let him play tax arbitrage.

This suggests a bearish environment for investment in operations. The shift from equity to debt, the rise in share buybacks, dividend payouts and M&A suggest that companies have run out of ideas for how to invest in themselves. Cash flow that would otherwise be channeled into the business is pledged to financiers instead. Debt yields are kept low and roll-overs made easier by stable cash flows, and equity is easier to raise when cash flows from operations are strong and consistent. This compels executives to set a "steady as she goes" agenda - not an aggressive investment agenda - for business operations.

A reduction in business investment is not particularly good news for tech firms selling hardware, software or services to companies. But it's not all bad news. Rewarding financiers by starving a business of investment makes a company sclerotic. That clears the way for innovators to disrupt and grow quickly. But until those innovators rise up, finance is positioned to stymie - not facilitate - business innovation.

Thursday, February 28, 2013

Investing in Software Through Experts, Analysis or Discovery

Whether investing in equities or in software, there are three distinct approaches to how we make an investment decision: based on our intuition or experience, based on our analysis of an opportunity, or based on our ability to rapidly discover and respond to market feedback. Let's look at each in turn.

When we invest based on experience, our decisions are rooted in the expertise we've gained from things we've done before. When based on intuition, our decisions are based on strongly held convictions. An equity investor familiar with retail might recognize demographic changes in specific geographic areas (e.g., migration of families with young children to Florida and Texas) and intuitively invest in firms exposed to those changes (such as commercial construction businesses operating in those areas). In the same way, somebody who has first hand experience can invest in developing a technology solution: the inspiration for Square, Inc. came from a small firm that couldn't close a sale because it couldn't accept credit cards. Although expert investing will likely involve a little bit of analysis, for the most part the investor relies on gut. Because we invest primarily on the courage of our convictions, capital controls on intuitive investments tend not to be very strict. In absolute terms, the capital commitment is generally small, although in relative terms the committed capital may be quite large especially if it is a private placement. As a result, the capital tends to be impatient: trust in an expert lasts only so long; investors will get cold feet quickly if there aren't quick results.

We can invest based on research and analysis. Value investors in equities study company fundamentals, looking for firms with share prices that undervalue the assets or the durability of cash flows. In the same way, we can look for value gaps in business operations or market opportunities and identify ways that technology can deliver value to fill those voids. The analysis is founded on things such as value-stream mapping, or competitive analyses of solutions developed by sector and non-sector peers. From this, we can produce a financial model and, ultimately, a business case. We need expertise to develop a solution, but by and large we make our investment decision based on the strength of our analysis. In absolute terms, the amount of committed capital can be quite large. But, having rationalized our way to making an investment, the capital controls tend to be strict, and the capital tends to be patient.

Finally, we can invest based on our ability to discover and adjust based on market or user feedback. Traders move in and out of positions, adjusting with changes in the market and hedging based on how the market might change. Over a long period of time, the trader hopes to end up with large total returns even if any given position is held for only a short period of time. We can do something similar with software, using approaches like Continuous Delivery and Lean Startup. In this approach, we aren't just continuously releasing, but rapidly and aggressively acquiring and interpreting feedback on what we've released. We can also use things like A/B testing to hedge investments in specific features. When we invest this way, we do so based not so much on our expertise or analysis, but on our willingness and ability to explore for opportunities. Capital controls are strict because we have to explain what features we're spending money on and how we're protecting against the downside risk of making a mistake. The capital backing a voyage of discovery will be impatient, wanting frequent assurances of positive feedback and results. But at any given time, the amount of committed capital is small, because investors continually evaluate whether to make further investment in the pursuit.
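As an illustration of the feedback mechanics (all counts hypothetical), an A/B hedge on a feature ultimately reduces to comparing conversion rates between variants; a two-proportion z-test is one simple way to judge whether the observed difference is more than noise:

```python
import math

# Illustrative sketch of the comparison behind an A/B hedge: given conversion
# counts for two feature variants, a two-proportion z-test approximates
# whether the difference is more than random variation. Numbers are made up.
def two_proportion_z(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Variant A: 120 conversions in 1,000 visits; variant B: 150 in 1,000.
z = two_proportion_z(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}")  # |z| > ~1.96 suggests a real difference at ~95% confidence
```

The point is not the statistics; it is that the investment decision (keep, kill, or double down on the feature) is made from observed results rather than upfront conviction.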

Each of these approaches makes it easy to answer "why" we are making a particular investment. Why should we part with cash for this particular feature? Go ask the expert, or see how it fits in the business case, or go get user feedback on it. "Why" should drive every decision and action made in pursuit of an investment. Without the "why" there is no context for the "what". In its absence, the "what" will suffer.

No approach is universally superior to another. The approach we take has to play to our strengths and capabilities. Either we have people with expertise or we don't. Either we have people with analytic minds and access to data or we don't. Either we have the ability to rapidly deliver and interpret feedback or we don't. The approach we take must also be suitable to the nature of the investment we're making. A voyage of discovery is well suited for developing a product for a new market, but not for an upgrade to a core transactional system. The business case for investing in a customer self-service solution is going to be much more compelling than a business case for developing a product for an emerging market segment.

Just because we take one of these approaches is no guarantee of success. Not all investments are going to pay off: our experts may turn out to have esoteric tastes that aren't appealing to a mass audience. Our thoroughly researched market analysis might very well miss the market entirely. We might deliver lots of features but not find anybody compelled to use them.

Worse still, each of these approaches can be little more than a veneer of competency to unprofessional investing. A hired-in expert may be a charlatan. Many a manager has commissioned a model that inflates benefits to flatter an investment - only for those benefits to never be realized. Just because we can get continuous feedback from users does not mean that we can correctly interpret what that feedback really means.

Most of the time, of course, we take a hybrid approach to how we invest. We supplement expertise with a business case, or we charter an investment with a business case but use Continuous Delivery to get feedback. However we go about it, we need to get the essential elements of the approach right if we're to have any chance of success. Otherwise, we're just unprofessional investors: investing without experience, thoughtful analysis or an ability to respond quickly is reckless use of capital.

Entirely too much software investing fits this bill.

Thursday, January 31, 2013

Sector Seven Is Clear

Many years ago, there was a television ad that showed an intruder being chased through a building by two security guards. The guards chase him from room to room, and ultimately down a long hallway. At the mid-point of the hallway, there's a line painted on the floor and wall. On one side of the line is a large number 7, on the other is the number 8. The intruder runs down the hallway, over the line. The two security guards come to a sudden stop right at the line. They watch as the intruder continues to run down the hallway into some other part of the building. After a pause, one of the security guards grabs his walkie-talkie and announces: "sector seven is clear". The intruder is still in the building, but the security guards no longer consider him to be their responsibility.

Every now and again I reference this ad in a meeting or presentation, in the context of Industrial IT. I've been reminded of it again recently.

Industrial IT encourages specialization: it is easier for HR to hire, staff and train people in specialist roles, and for procurement to contract for them. And specialization is comfortable for a lot of people. Managers - particularly managers who have a poor grip on the work being done - like specialization because it is easier to assign and track tasks done by specialists. People in execution roles take comfort in specialization. It's easy to become proficient in a narrow field, such as a specific database or programming language. Given the slow rate of change of any given technology, you don't have to work too hard to remain acceptably proficient at what you do. The only threat you face is obsolescence, and a commercial technology with sufficient market share and investment mitigates that risk to the individual.

Specialization means the most critical information - systemic understanding of how a solution functions and behaves from end-to-end - will be concentrated in a few people's heads. This knowledge asymmetry means those few people will be overwhelmed with demands on their time, creating a bottleneck while others on the team are idle. There will be a lot of hand-offs, which increases the risk of rework through misunderstanding. Because no single specialist can see a solution through to completion, nobody will have ownership for the problem. At least, not beyond making sure Sector 7 is clear.

I've written about it many times before, but Industrial IT prioritizes scale over results, specialists over professionals, predictability over innovation, and technology over value. Industrial IT is large but fragile: it struggles to get anything done, there aren't enough heroes to go around, its delivery operations are opaque, and it produces high-maintenance assets.

Even when there is executive commitment to change, it takes a long time and concentrated effort to change the industrial mind-set at a grass roots level.

We have to reframe problems above the task level. Everything we do should be framed as a meaningful business result or outcome, complete with acceptance criteria against which we can verify success in the business context. For example, the problem isn't to fix the payload of a specific webservice; the problem is to allow multiple systems to integrate with each other so that sales transactions can be exchanged. Agile Stories are particularly helpful for defining actions this way, whether for a new feature or a defect. Stories make it possible for each person to explain why something is important, why something is valuable, why they are working on it. Back to our example: I'm fixing this webservice because until I do, there won't be order flow from a principal commercial partner. Stories are also helpful because they let us measure the things we do in terms of results, and not effort.

But there's more to this than process. Each person must feel personal ownership for the success of their actions. The job isn't to code to a specification, or to test against a test case. The job is to create a solution that enables a business outcome. Each person must ask questions about the completeness of the solution, and be motivated to verify them in the most complete manner possible.

Which makes the point that this is, fundamentally, a people challenge. We're asking people to change how they understand the problems they're solving and what "done" means. We're asking them to change their behaviours in how they investigate, test and verify what they do. More broadly, we're asking them to build a contextual understanding for the work they do, and more importantly why they are doing it. And we are asking them to take responsibility for the outcome: I will know this is complete when ...

Do not under-estimate the challenge this represents. The transition from industrial to professional is amorphous. There are false positives: people who sign up for this too quickly, without understanding what's going to be required of them. It isn't long before the never-ending chorus of "I don't" starts: I don't know how to do something, I don't have that information, I don't know how to find that out. And we can't take anything for granted. We must constantly challenge people's contextual understanding: can they explain, in a business context, why they are working on something, who it benefits, and why it is important?

Not everybody will make this transition. For some, because this isn't their calling: not all assembly line workers can become craftsmen. Others will self-select out, preferring the comforts afforded by a specialist's cocoon.

All of these things - changes in process, practice, behaviours and people - require a tremendous amount of intestinal fortitude. The would-be change agent must be prepared for a frustratingly slow rate of change, and to invest copious amounts of time into people to help them develop new context and new muscle memories. On top of it, leaders are in short supply and mentors are even scarcer in Industrial IT shops. Legacy assets and systems (and their legacy of patchwork integration, bandaged maintenance and situational knowledge) will slow the rate at which you can make new hires productive.

The benefits of changing from industrial to professional are obvious. But while the destination is attractive, the journey is not - be under no illusions otherwise. Still, who we work with, how we work, and what we get done in a professional IT business make it worth doing.

Monday, December 31, 2012

Engaging Auxiliary Forces for Strategic Software Solutions (Part II)

I wrote previously that "if one is to compete, one has no choice but to rely on auxiliaries or mercenaries" when you have to respond quickly to a competitive threat. Before looking further at engaging auxiliaries, it's worth considering the "if one is to compete" statement. As unattractive as it may sound, a firm can opt out of competition. Sometimes, the most compelling business option is to run the business for cash. How have late-moving legacy firms fared in industries that have undergone strategic shift? E.g., will RIM and Nokia destroy more value than they will create by being late followers of Apple and Android?

Opting out is strategically unattractive. (In fact, it's downright Machiavellian.) But it should never be ruled out. And there isn't much to be said for "putting up the good fight" if you're ill prepared to bring it. It's simply a very public form of corporate seppuku that vapourizes equity and destroys careers.

However, if a firm chooses to compete, and has neither the capability nor the luxury of time to create a capability, it has no choice but to rent that capability by entering into an agreement with an auxiliary or mercenary force. As I pointed out in Part I, this relationship favours the seller.

How can the buyer mitigate the risks? By knowing when and how he or she wants to exit the relationship.

Buyers of auxiliary forces are tempted by what appears to be the best of both worlds: contracting allows the buyer to get an asset developed with minimal effort on the buyer's part. But the economics work against the buyer as time passes: the longer a supplier relationship lasts, the greater the buyer's dependency on the seller, and the stronger the seller's negotiating position becomes. And development doesn't end with a single act of delivery: there is considerable activity required at the margins, ranging from data migration to usage analytics. These costs are the buyer's to underwrite. Suppliers are thus able to expand their range of services and derive more cash from the relationship. This increases the costs to the buyer, which erodes the viability of the sourcing strategy.

Anticipating the terms and conditions of the exit is therefore of prime importance to the buyer.

If the buyer has no alternative but to engage auxiliaries - if, for example, the buyer is purely a marketing company taking a flyer on a technology investment - it faces a long-term relationship with a supplier. The buyer's best bet is to engage auxiliaries as long-term equity holders with minority rights in the relationship. This aligns the seller with the buyer and reduces (but most likely will not eliminate) the cash bled from the buyer to the seller. By contrast, if the buyer intends to derive significant benefit from the intangible (technology) assets and, by extension, leverage its capability in technology, the buyer must engage auxiliaries for a short period on a fixed-income basis, all the while preparing to transition away from the seller to "one's own" forces.

Supplier relationships are economically sticky. Switching from one supplier to another is generally a poor exit strategy. Equity relationships are difficult to unwind amiably (that is, without attorneys). Fixed-income relationships come at the cost of the buyer, who will be bled to death pumping cash into multiple suppliers who will not underwrite the cost of a transition.

Thus the onus is on the buyer to make quick but considerate decisions when engaging an auxiliary. In the case of an equity relationship, the buyer must convince the seller to accept a minority equity position, and determine the viability of the investment quickly (reward or wind it up) so that minority position doesn't languish. In the case of a fixed-income relationship, the buyer needs to be able to exit the supplier relationship for a team of "one's own" forces, relegating the supplier to a minimal role at a transitional moment.

But there are often circumstances that muddle a buyer's judgment. With a gullible or desperate supplier, a buyer can prolong a supplier relationship in the hope that an investment will prove viable. By playing labour arbitrage, a buyer can defer the difficult task of building one's own forces. But whether equity or fixed income, the buyer has to remember that the economics of an auxiliary relationship are in decline for the buyer the minute the ink is dry on a contract. When engaging auxiliaries, the buyer must make quick investment decisions and take quick action. The longer a supplier relationship lasts, the more the power in the relationship shifts to the seller.

No matter the nature of the relationship - equity or fixed income - the buyer must not enter into an "arms length" relationship with the seller. The buyer must be engaged with the seller, constantly monitoring and auditing the deliverables over the life of the relationship. A buyer must be capable of competently auditing the work being done by the supplier before entering into a supplier relationship. If he can't, he is not a qualified buyer and must be regarded as a principal source of risk to the capital being invested. Investors in strategic software are wise to challenge the capability of the buyer as well as the seller.

Entering into a supplier relationship buys time for the buyer to build his or her own forces. These forces must be of equivalent or superior capability to those of the supplier. They might not be of equal size - an indigenous team may be smaller than a rented team - but they must be of equal capability. A force of one's own that is inferior to the auxiliaries will be manipulated by them to the disadvantage of the buyer. With a team of equal capability, the buyer can quickly eclipse the influence of the supplier in the fulfillment chain. A buyer can preserve credibility with a slower velocity from an indigenous team, but not with lower quality.

Auxiliaries can be useful to buyers to compensate for a short-term vulnerability, but only if the buyer has an exit strategy. Sellers, not buyers, benefit from long-term supplier relationships for strategic solutions. Buyers must make quick decisions about investment viability, and take competent actions in building the forces necessary to sustain them.

Friday, November 30, 2012

Engaging Auxiliary Forces for Strategic Software Solutions (Part I)

At one point or another, most firms will engage "auxiliary forces" - contract with firms for development teams - to develop software for them. If Machiavelli were sourcing software projects, he wouldn't approve.

"These arms [auxiliaries] may be useful and good in themselves, but for him who calls them in they are always disadvantageous; for losing, one is undone, and winning, one is their captive."

Niccolo Machiavelli, The Prince, Chapter XIII

Machiavelli counsels the prince against the use of auxiliary forces in battle. An auxiliary is not the same thing as a mercenary. A mercenary is a hired gun, loyal only to himself or herself. A prince engages an auxiliary when he arranges for another principality to supply forces. The members of an auxiliary force will be loyal to their own leader. The risk to the warring prince is that the kingdom with which he has entered into partnership will change terms during or after the battle.

Thus Machiavelli favours the use of one's own forces for important work. It is easier for the prince to align his force's interests with his own, and consequently to count on their loyalty and service, because everybody stands to gain the same things for the same risks. In the heat of battle, mercenaries are likely to flee, while an auxiliary is likely to seek negotiations that minimize losses. After the battle, the auxiliary is likely to seek negotiations with the prince who invited them into the campaign. (More about this, below.)

Of course, business isn't warfare. But there are some business lessons we can learn just the same.

In software development, auxiliary forces have their place, particularly for utility services sourced for utility solutions. Consider an ERP implementation, complete with code customization. There is a large, diversified sell-side labour market, many alternative sources of that labour, pricing is more formulaic, risks and success criteria are known to some degree of specificity, and the work is not deemed strategically critical (not when everybody has an ERP). This commoditizes the sell-side offering, which favours the buyer.

The sell side has power in utility work only in times of acute labour shortage, which gives the sell side pricing power. But tight supply doesn't make a commodity into a strategic resource; it simply makes it a more expensive commodity. As with any commodity, shortages tend not to last very long: buyers will be priced out of the market (curtailing demand), while suppliers will find ways to create new labour capacity such as training new people (increasing supply). And sellers of services deal in a perishable commodity: time. Only in periods of very high demand will sellers press forward a negotiating advantage borne of tight supply. A billing day lost forever is strong incentive to a seller to do a deal. Whether a buyer engages for a utility service by contracting mercenaries (hired guns under the direction of the buyer) or by hiring an auxiliary (a partnership for delivery of the solution as a whole), the buyer has the upper hand when buying utility services. Or, perhaps, it is more accurate to say that the advantage is the buyer's to lose.

On consideration, Machiavelli wouldn't mind auxiliary forces when they're deployed for utility purposes. It is foolish to distract your best people with non-strategic pursuits.

But he would not approve the use of auxiliaries to achieve strategic solutions. The buyer of a strategic, unique, competitively differentiated solution must put more scrutiny on suppliers, navigate custom pricing, underwrite general and ambiguous risks, and accept less certainty of the outcome and the business impact. The buyer will also be weighing intangible advantages in potential sellers, such as business domain or advanced technology knowledge. This makes for a small, opaque and poorly diversified market for any given strategic investment. This favours the seller.

By engaging auxiliary forces, the buyer can get a head start on a strategic solution. But then, it doesn't matter how you start, but how you finish.

A territory conquered by an auxiliary leaves the auxiliary with a strong negotiating position: as the saying goes, possession is nine-tenths of the law. The auxiliary force, in possession of the conquered land, can set terms for turning it over to the prince. The same can be true for strategic software investments. Strategic software doesn't end with a single act of delivery; it will need further investment and development. The auxiliary's knowledge of the created software - the business domain, the code, the deployment scripts, and so forth - gives it a familiarity with "the terrain" that the buyer simply doesn't have. This gives the auxiliary firm an outsized position of power to renegotiate terms with the buyer (e.g., usurious terms for ongoing development and maintenance), or to negotiate with firms who are competitors to the buyer to create a similar solution.

But what about intellectual property and contract laws? Don't they protect the buyer? When the focus shifts to contracts, it means both parties have brought in the lawyers. Lawyers are simply another type of armed force, one that can be both highly destructive and very costly. Lawyering-up simply escalates an arms race between the buyer and seller (or in Machiavelli's parlance, the prince and the auxiliary).

It's no surprise that Machiavelli concluded that auxiliary forces were not worth the risk in strategic undertakings. Anything of strategic importance to the prince must ultimately be done by the prince and his own forces.

This is an easy rule to write, but not an easy rule to live. Building a force (or in software terms, a team) capable of creating a strategic software solution requires world-class knowledge, experience, and facilities. It requires trained personnel and the ability to continue training. It requires the ability to recognize greatness and potential, and the ability to hire both. It requires innovation in tools, techniques and results.

It was Peter the Great's aspiration for Russia to have an ocean-going navy. But he didn't start by emptying his treasury on local contractors to build his boats and hiring people off the streets to serve as his officers. He studied English shipbuilding and hired German naval officers to train his officers and enlisted ranks. He invested in developing a capability. He played the long game to allow Russia to become a capable naval power, and not simply a nation with a floating armed force. When time permits, when one has the luxury of playing the long game of disruption, one can own, rather than rent, capability.

Time does not always permit the long game, especially in businesses whose models are being stampeded by software. If the competition has moved first, if one has to respond swiftly, one has no choice but to rely on auxiliaries or mercenaries if one is to continue to compete.

How best, then, for the buyer to engage an auxiliary or mercenary force, given Machiavelli's counsel? We'll look at the terms for this kind of an engagement in Part II.

Wednesday, October 31, 2012

Patterns for Buying and Selling Software Services

How we contract for software services has a big impact on how successful a software team will be.

There are three common ways of buying development and related services: we can contract for bodies, experts or a solution.

Contracting for Bodies

The simplest way to contract for services is a body shop or staff-augmentation contract. We're developing software in Java, we need Java developers, so we ring up a staff-aug firm and rent Java developers. Fortune 100 companies rent people by the tens of thousands on staff-aug contracts. Staff aug represents a substantial amount of revenue to firms ranging from the largest IT consultancies to local boutique shops. Body shopping is a big business: the margins are thin, but the volumes more than make up for it.

A volume buyer will have a Vendor Management Office that consolidates buying requests from across the company, validates the time for which people will be needed (either project-based or BAU-based), makes the requests consistent with position specs it puts out to tender (Senior Developer, Lead QA, that sort of thing), and sources a body from the cheapest provider. This gives the scale buyer the appearance of quality control over the bodies they rent. It also eliminates all negotiation: sellers are given a position to fill at a price point; the negotiation is simply "take it or leave it" for the seller.

In staff-aug, the seller isn't underwriting delivery risk (that is, of a project); they're only underwriting the competency risk of the person they staff. Competency is established relatively quickly, within the first month or so: the staffed body is found to be either acceptable or, at the minimum, not unacceptable. For the seller, although the margins aren't very good, body shop contracts are low-touch, the cash flow is consistent, and the easy money is hard for many to resist. Best of all for the seller, delivery risk is completely underwritten by the buyer: the buyer is hiring people on a contract basis to execute the tasks it has determined need to be done. If the project tanks, the buyer has to orchestrate and cover the costs of the rescue, whereas the seller stands to gain through an extension or expansion, and faces no downside risk of a clawback. Terms are often unfavourable to the seller: net 60 isn't uncommon. But cash flows are temporal: once or twice a month, just like payroll. You can set your watch by it. Body shop contracts are like ATMs to sellers.

Contracting for Experts

Another way to contract for services is the expert or consulting contract. A team wants an expert to solve a particular problem, something quite difficult or esoteric: we need a build guy, so we contract for a build guy. The contract may or may not be results-based: in general terms, the buyer knows the outcome they want (a reliable, consistent build), but doesn't know exactly what that should look like (a build triggered with every commit, a build pipeline to stage automated test execution) or how to achieve it (developers need to code to keep the build working, not change the build so their code compiles). Execution tends to be highly collaborative: the buyer wants the expert rubbing elbows with the people in the team so that his knowledge gets transferred.

Risk is shared between buyer and seller. If the buyer sees things improve, or if the expert gives insight that the team was never going to come up with on its own, they believe they got value for money and they pay. If the buyer doesn't see things improve, or if the expert tells them things that they already knew, they don't believe they got value for money and they don't pay. The seller runs the risk that their expertise won't be sufficient for the situation, that personality conflicts will undermine their effectiveness, and that the buyer won't be sophisticated enough to consume the seller's expertise. Thus the risk is shared, and it's up to both parties to get to an acceptable outcome.

Expert consulting is appealing to independent contractors or boutique firms that are essentially collections of branded independent contractors. It is also appealing to large sellers if they believe leasing out an expert will lead to a large body shop or solution contract later.

Buyers pay dearly for experts, which means sellers get fat margins off the work. But sellers charge a premium for a lot of reasons. It is mercenary work to large firms because it doesn't build the seller's business (renting experts generally doesn't scale) and comes at high risk that the expert will quit to join or contract directly with the buying firm. Not everybody is cut out for expert consulting: in addition to needing marketable expertise, a good consultant requires a high EQ and a high degree of situational awareness. Contracts tend to be short duration, but take a long time to define (specify deliverables) and negotiate (intense rate negotiation). They are also high-touch, since the price attracts scrutiny from a lot of people on the buy side. On small contracts (measured in days or weeks), contracts are fixed price and cash flows are end-of-contract. When they're low six figures (such as a large team for a short time), it's money down with the remainder at completion. They're temporal if the engagement lasts for a long time.

Contracting for Solutions

The third form of contract is a project or solution contract. We're implementing SAP, so we need both the core modules and customization done to those modules. We know accounting, but we don't know how to implement SAP, or migrate our data into it, or code in ABAP. So we contract with an implementation and development firm. The ideal firm knows our line of business: e.g., if we're a retailer, we want a solution firm that knows how to configure SAP for retail, knows some really neat shortcuts and tricks, and some obscure 3rd party vendors with some interesting tools targeted at retail. That vertical expertise isn't strictly necessary, but at the least the buyer expects the seller will come to grips with the business domain pretty quickly.

In solution contracts, the sell side underwrites the risk of delivery. Of course, that's true only as far as the contract is concerned: in the event of a project failure, even if the buyer can claw back money from the supplier, the buying business is denied the working software it wanted at the time it wanted it. And the buyer will still have some people involved in development - people who know legacy systems, and various SMEs. But within the context of the services contract itself, the buyer turns over responsibility for development entirely to the supply firm: project management, user experience, requirements analysis, development, testing and sometimes even deployment and post-implementation services. If the project craters, it is the seller who must orchestrate and cover the costs of the rescue. Whatever marginal amounts the seller can squeeze out of the buyer during the rescue won't be enough to cover the entirety of the downside. Thus the seller underwrites the risk.

To the sell side, the margins are generally good but depend on the seniority of who gets staffed. The temptation is there to juice margins by staffing the delivery team with less experienced people (skill leverage is no different from financial leverage this way). But by the same token, sellers can be talked into "buying the business" - agreeing to enter a solution contract at a lower amount - if they believe that an initial solution sale will lead to subsequent (and significant) revenue. Cash flows can be either lump sums that start with prepayment, or time & material. Very large T&M projects will have a hold-back on each invoice and a rate discount at specified spend thresholds. Sellers can be rewarded for early delivery.
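To make the cash-flow mechanics concrete, here is a minimal sketch of a T&M invoice with a hold-back and a rate discount triggered at cumulative spend thresholds. The percentages, tiers and figures are hypothetical illustrations, not terms from any actual contract:

```python
# Sketch: T&M invoicing with a hold-back and volume rate discounts.
# All figures (10% hold-back, the discount tiers) are hypothetical examples.

def invoice(hours, base_rate, cumulative_spend,
            holdback_pct=0.10,
            discount_tiers=((1_000_000, 0.05), (500_000, 0.02))):
    """Return (amount_due_now, amount_held_back) for one billing period.

    discount_tiers: (spend_threshold, discount) pairs, sorted high-to-low;
    the first threshold the cumulative spend has crossed sets the discount.
    """
    rate = base_rate
    for threshold, discount in discount_tiers:
        if cumulative_spend >= threshold:
            rate = base_rate * (1 - discount)
            break
    gross = hours * rate
    held = gross * holdback_pct  # retained by the buyer, released on acceptance
    return gross - held, held

# A team billing 1,600 hours at $120/hr, with $600k already spent:
# the 2% tier applies, so the effective rate is $117.60.
due, held = invoice(hours=1600, base_rate=120, cumulative_spend=600_000)
```

The hold-back gives the buyer leverage over acceptance of each deliverable, while the spend-threshold discount passes some economies of scale back to the buyer as the engagement grows.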

The Right Contract for the Right Situation

Each of these types of contracts has its place. A buyer who is managing and staffing a project team can successfully rent people, especially if the buyer has the means of thoroughly screening and assessing candidates and the willingness to reject people who are not up to scratch. A buyer who recognizes an acute need can contract an expert for a diagnosis and a solution path. Expert engagements are successful when the buyer understands what they're buying, is willing to accept that their self-diagnosis may be wrong and that the expert may steer them in a completely different direction than expected, and has the intestinal fortitude to follow through. Solution contracts work when the supplier firm has the expertise in end-to-end delivery, a full set of capabilities, and the depth of experienced personnel to come to their own rescue if the project ends up in trouble.

Commercial relationships hit the skids when buyer or seller - or both - fail to recognize the appropriate commercial relationship and the nature and responsibility for the risk. A seller who only employs developers - no UX, BAs, PMs, QA, etc. - cannot responsibly enter into a solution contract because they are not contributing enough to the entire solution. A buyer who wants to cherry pick people from multiple suppliers to form a single project team cannot hold any one supplier firm responsible for delivery. A seller supplying experts or bodies cannot compensate for inadequacies of the buyer, whether that's subject matter expertise or technical knowledge.

Sellers have to have enough self-awareness to know the types of contracts they can responsibly enter into. Solution firms struggle with body shop contracts because the firm's unique value (in what they know or how they work) is neutered in a "rent a body" commodity sale. Solution firms also have low tolerance for long-running expert engagements, as they tie up expertise needed in high-risk solution contracts. Body shop firms tend to have very narrow experts who struggle in consulting engagements. They also lack management expertise or any "greater than the sum of our parts" value to be able to provide a business solution.

Buyers need to understand what's appropriate for them to buy. A buyer with little experience managing software development can't rent bodies because they can't manage them effectively. A buyer with a large, complex captive IT organization that contracts with a solution firm will suffer endless headaches caused by turf battles and disputes over "how to do things" among the buyer's and seller's staff.

Commercial relationships work best when both buyer and seller clearly understand the extent of each party's obligations, the risks that each are assuming, and how they're each mitigating those risks. In the end, a seller may not make a sale, and a buyer has to look for another supplier, because what's good for one is not good for the other. But it's a far, far better thing, over short term and long, that buyer and seller do good business than bad business together.

Sunday, September 30, 2012

The Quality Call Option

On time. On scope. On budget. The holy grail of industrial, cost-center IT. Plan your work and work your plan. Be predictable.

As a set of goals for software development, this troika is incomplete because it neglects quality. That is a costly omission. Poor quality will ultimately steamroll the other three project variables: we'll be late, we'll remove features, and we'll spend a lot more money than we want to when we are forced to chase quality. We assume quality at our own risk.

Why do so many otherwise smart people take quality for granted? Mainly because they can't see how easily quality is compromised in development. It's easy to assume that requirements and specifications are sufficiently thorough, and that the developer just needs to work to the spec. It's easy to assume that we're hiring competent developers who write high quality code, when most buyers don't know what to look for or how to look for technical quality. It's easy to assume our Quality Assurance phase will guarantee solution quality by inspecting it in, instead of agonizing over how to build it in through every action we take in development.

And we tend not to realize that quality is in the eye of the beholder. When we have long feedback cycles - when it could be days, weeks or even months before one person's work is validated by another - the person responsible for creation is de facto the person responsible for validation. When "on time / on budget / on scope" is the order of the day, nobody is going to cast a critical eye over their own work.

It takes a while before we find out that one person's "done" is another person's "work-in-progress". The moment when a quality problem makes itself known is the moment when our implicit assumptions about quality start to become expensive.

There are a few simple things managers can do to give quality the same prominence as time, cost and scope. The first is to reduce the number of hand-offs in the team, and to strengthen those hand-offs.

Every hand-off injects incremental risk of misunderstanding. The greater the number of hand-offs, the greater the risk that we will have integration problems, understand the problem differently, or solve the wrong problem. Development teams have a lot of specialists, both by role (business analyst, developer, user experience designer, quality assurance analyst) and by technology (mobile client developer, server-side Java developer, web client developer, etc.). Our ideal team is staffed by generalists who can each create a whole business solution. If we don't have people with generalist skills today, we must invest in developing those capabilities. There's no way around this. And there's no reason to accept technical specialization: we don't need to have mobile client developers as well as services developers and database developers. We can invest in people to become mobile solution developers who can write client and server code. Generalization versus specialization is our choice to make.

It takes time to professionalize a development team that has too many industrial characteristics. Until we've got a team of generalists, we have to make do with our specialists. As long as we do, we have to improve the fidelity of our hand-offs.

Rather than performing technical tasks in isolation, we can have specialists collaborate on delivering functional requirements. Have developers - even if they are fluent in different technologies - actively work together on completing an entire functional solution. That's easy to write but difficult to do. We must expect that we'll need to facilitate that collaborative behavior, particularly in the early going: collaborating on delivery of a functional solution is not natural behavior for people accustomed to going flat out to complete code by themselves.

We can increase collaboration across roles, too. We can have business analysts and developers perform a requirement walk-through before coding starts. We can have a business analyst or QA analyst desk-check software before a developer promotes it as "development complete". We can insist on frequent build and deployment, and iterative QA. All of these shorten our feedback cycles and give us an independent validation of quality nearer to the time of creation.

This type of collaboration might seem matter-of-fact, but it doesn't come naturally to people in software development. A manager has to champion it, coach people in it, witness it first hand to assess how effectively people are performing it, and follow through to make sure that it takes place on its own.

Stronger hand-offs will slow down coding. The act of creating code will take longer when we insist that BAs do walkthroughs with developers, that developers collaborate on business solutions, and that QA desk-checks before code is promoted. But the velocity at which a team can code is not the same as the velocity at which a team can deliver software at an acceptable level of quality. It's easy to cut code. It's not easy to cut code that is functionally and technically acceptable. That means the former is a lousy proxy for the latter, and our management decisions are wildly misinformed if we focus on "dev complete".

Knowing the velocity at which a team can deliver a quality product gives us far more useful managerial information. For one thing, we have a much better idea of how long it's likely going to take to deliver our desired scope to an acceptable level of quality. This makes us less vulnerable to being blindsided by an overrun caused by lurking quality problems. We also have a better idea of the impact we can expect by adding or removing people. A team focused just on delivering code might deliver more code in less time if it has more people. But quality problems tend to grow exponentially as teams grow. When quality is not a consideration in project planning, we tend to forget the paradox that a larger team can need more time to complete software of acceptable quality than a smaller team.

Another thing we can do is make our quality statistics highly visible. But visibility of the data isn't enough; we have to visibly act on that data. It's easy enough to produce information radiators that profile the state of our technical and functional quality. But an information radiator that portrays a decaying technical quality picture, with nothing done about it, will have a negative effect on the team: it will communicate that management doesn't really care. Something we're prepared to broadcast is something that we have to be prepared to act on. When we expose quality problems we have to insist on remediation, even if that comes at the cost of new requirements development. That puts us face-to-face with the "quality call option": developing more bad stuff is less valuable than developing less good stuff. That might seem obvious, but it doesn't fit into an "on time / on scope / on budget" world. It takes courage to exercise the quality call option.

Finally, we have to look not just at project status data, but at flow data. Projects throw off copious amounts of flow data - trends over time - that most managers ignore. These are a rich source of information because they indicate where there are collaboration - and therefore quality - problems. For example, we can easily track a build-up in the number of defects that developers have tagged as ready for retest, but are not retested in a timely fashion by QA. We can look for a pattern of defects raised that are dispositioned as "not a defect". We can monitor defects that are reopened after retest. And we can look at the overall ratio of defects to test activity as an indicator of fundamental asset quality. The patterns of change in this data over time will indicate performance problems that undermine quality.
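As an illustration, the flow signals described above can be derived from even a rudimentary defect log. The record layout and status names here are hypothetical, not taken from any particular tracker:

```python
# Sketch: deriving defect-flow indicators from a defect event log.
# The record structure and status values are hypothetical examples.

defects = [
    {"id": 1, "status": "ready-for-retest", "retested": False, "reopened": 0},
    {"id": 2, "status": "closed",           "retested": True,  "reopened": 1},
    {"id": 3, "status": "rejected",         "retested": False, "reopened": 0},  # "not a defect"
    {"id": 4, "status": "closed",           "retested": True,  "reopened": 0},
]

# Fixes awaiting retest: a build-up here signals a dev/QA collaboration lag.
awaiting_retest = sum(1 for d in defects
                      if d["status"] == "ready-for-retest" and not d["retested"])

# Defects dispositioned "not a defect": a signal of shared-understanding problems.
not_a_defect = sum(1 for d in defects if d["status"] == "rejected")

# Defects reopened after retest: a signal of incomplete or low-quality fixes.
reopened = sum(1 for d in defects if d["reopened"] > 0)

print(awaiting_retest, not_a_defect, reopened)
```

A single snapshot of these counts tells us little; it is the trend in each number over successive iterations that indicates whether collaboration, and therefore quality, is improving or decaying.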

As people who understand how to deliver software, we have to insist on quality being on an equal footing with cost, scope and time. Getting that footing is one thing; keeping it is another. We need to measure both asset quality and team performance in terms of results of acceptable quality. We need to scrutinize our project data for threats to quality. We need to foster individual behaviors that will reduce self-inflicted quality problems. And we need the courage to broadcast our state of quality and act decisively - to exercise the quality call option - to defend it.

Friday, August 31, 2012

Strategic Software Investing: Chase the Ends, Not the Means

When a company engages in custom software development, it is making an investment into itself. Company executives decide to convert capital into software because they believe that the resulting software will make the business better.

How we make that investment decision is crucial to the success of the investment.

Our first investment decision is based solely on an idea. We spend some money to research an idea that looks promising to see if there is a business case for making further commitment. That business case comes down to the expected impact from the features we want at an expected cost by an expected time. We also want to understand the investment strategy: How will it be developed? By whom? Where will they be working? How well do they know these technologies? How well do they understand our business domain? What is the release schedule? How will we assess quality of the asset? We also want a risk analysis of both our business case and investment strategy, so that we understand the many different forms of risk there are to the capital the firm is committing.

This gives us enough information to make our second (and usually larger) investment decision: either we have an opportunity that meets our investment criteria that can be fulfilled at reasonable risk, or we don't.

Even if we secure financing and approval, sign the contracts, and get on with development, that's not the end of our investment decision-making. The investment rationale needs to be revisited continuously over the course of the investment. Once we begin executing on an investment, our objective is to maintain its viability. If reality turns out to be significantly different from the strategy (people quit, scope turns out to be larger than anticipated, business needs change), we either go back to our investment committee to ask for an adjustment to the investment (more money, rescope functionality, recast the expected business impact), or we have to recommend that they scuttle the investment.

We do this with any financial investment we make. Before we invest in a company, we do our research into the industry, the business, and the management team. We scrutinize quarterly reports for results and indicators. If results disappoint and we are not confident that management can turn things around, we withdraw our investment. This is classic investing behaviour, and it is applicable to strategic software investments.

IT projects tend to lack investment rigour. Managers beholden to an annual budget cycle will concoct wildly optimistic cost forecasts based on flimsy details in the hope of getting a project funded in the annual budget lottery. Instead of using project inception as an investment qualifying activity, it's used strictly as a way to refine scope, time and cost estimates as a precursor to development. And once development is under way, the management mantra is to be "on time / on budget / on scope"; management will do whatever they can to bend reality to match the project plan to give the appearance of control.

What we end up with is budget money chasing any means of development that pledges it can fit under the spending cap, rather than corporate capital chasing a business outcome. Development isn't invited into the process of shaping the investment parameters; it's expected to live with them. That leads to compromise in team construction (availability over capability), vendor partners (low bid), and standards of performance (showing progress is more important than verifying completeness).

When money is chasing the means and not the ends, quality is the first casualty, followed quickly by scope. Costs will rise to meet budget, at which point a project sponsor is forced either to dial up more money or dial down expectations.

Software is an investment, not a cost. Good investment outcomes require good investment rigour.

Tuesday, July 31, 2012

Trust, but Verify

“If a builder builds a house for a man and does not make its construction firm, and the house which he has built collapses and causes the death of the owner of the house, that builder shall be put to death.”
Clearly, the Babylonians understood that the builder will always know more about the risks than the client, and can hide fragilities and improve his profitability by cutting corners—in, say, the foundation. The builder can also fool the inspector (or the regulator). The person hiding risk has a large informational advantage over the one looking for it.
-- Nassim Taleb and George Martin, How to Prevent Other Financial Crises, SAIS Review, Winter-Spring 2012

The interests of buyer and builder of any asset are fundamentally divergent. The builder, who has to keep workers engaged and cash flowing into the business, has a shorter horizon than the buyer, who enjoys the fruits of the asset for a long period of time. The buyer knows only whether the asset appears to do what it is expected to do after it is delivered, whereas the builder has more knowledge of the care that has gone into the asset: the quality of raw materials and the thoroughness of the workmanship. By necessity, there is a fundamental trust relationship between buyer and seller. The Babylonians understood the criticality of this trust relationship to society and thus set a very high expectation that it would not be violated.

In software development, information is highly asymmetric between buyer and seller. The buyer knows a particular business such as selling toys or flying people from place to place, while the seller knows how to develop and deliver software. A buyer knows whether software is functionally complete (can I transact business using this software?) but most buyers are not experts at software or software development. Few know the questions to ask relative to non-functional requirements, or how to validate that the answers to those questions are satisfactory. How does a buyer know that a solution is being developed so that it is highly secure from attack? or will perform reliably and satisfactorily? or will scale? Buyers assume the asset will have these characteristics. Yet buyers tend to find out whether they do only after the fact, and it is the buyer who is stuck with the outcome: delays due to unforeseen technical work, or higher than expected hardware costs. Most buyers of custom software are underwriting risks which they do not fully appreciate.

There is no equivalent in software development to the ancient Roman practice of bridge engineers spending nights sleeping under their construction. The closest things we have are commercial clawbacks and the threat of long and painful slogs to rescue a distressed investment. Clawbacks can be painful for a selling firm, but unless both parties contractually agreed a definition of a possible "rescue" state and set as a term that the seller is responsible for all costs incurred during rescue, the full force of the clawback is easily blunted through negotiation. Rescue operations are painful for the individuals involved in them, but are not often led or staffed by the people who created the mess in the first place. Thus the builders responsible for slipshod quality rarely feel the full force of the consequences.

Therefore, Mr. Kay sensibly suggests that active investors should have concentrated portfolios. This encourages long-term thinking, and detailed research. Such a concentrated investor will be an active steward of the company, making life difficult for the management if it errs. In short, such managers may behave like "owners" of the stocks in the portfolio.
-- John Authers, writing in the FT

If we can't get builders more closely aligned with buyers, perhaps we can get buyers more closely aligned with builders. Investing in custom software is active by nature. There are no passive IT investments, no index funds or ETFs through which we can invest in software. Investing in custom software is a business of picking winners. The things that Mr. Authers and Mr. Kay point out about financial investing are applicable to IT investing. We benefit when buyers or buyer's agents (e.g., IT portfolio managers) behave as "active stewards" of the investments they make.

To be an effective steward, a buyer must do their research on the things they are investing in. Part of that research is knowing the domain of software and software development: the technologies the team is using, the level of confidence behind progress reports (can we validate results, or can we merely validate that effort has been expended?), the fullness of the solution (what specific NFRs matter to us, and how specifically are we validating that they are satisfied?), and above all the people responsible for delivery (do we have people with sufficient capability on the team?). Where the buyer lacks the knowledge, they can assemble a board to oversee their investments, consisting of vested stakeholders and independent directors with expertise in areas where the buyer's knowledge and experience are lacking.

This puts the buyer closer to the builder while the asset is being produced. Not in the weeds of technical decisions, but in a position to set and reaffirm expectations with the team, and to validate that results are consistent with those expectations. The buyer trusts, but verifies.

As Mr. Authers points out, such a buyer will "[make] life difficult for the management if it errs." Better for an errant management to be corrected while there is value in an investment to the buyer, not after that value has been lost by the builder.

Saturday, June 30, 2012

Resiliency, Not Predictability

A couple of months back, I wrote that it is important to shorten the results cycle - the time in which teams accomplish business-meaningful things - before shortening the reporting cycle - the time in which we report progress. Doing the opposite generates more heat than light: it increases the magnification with which we scrutinize effort while doing nothing to improve the frequency or fidelity of results.

But both the results cycle and reporting cycle are important to resiliency in software delivery.

A lot of things in business are managed for predictability. Predictable cash flows lower our financing costs. Predictable operations free the executive team to act as transformational as opposed to transactional leaders. Predictability builds confidence that our managers know what they're doing.

The emphasis on predictability hasn't paid off too well for IT. If the Standish (and similar) IT project success rate numbers are anything to go by, IT is at best predictable at underperforming.

When we set out to produce software, we are exposed to significant internal risks: that our designs are not functionally accurate, that our task definitions are incomplete, that our estimates are informationally deficient, that we lack the necessary skills and expertise within the team to develop the software, and so forth. We are also subject to external risks. These include micro forces, such as access to knowledge of upstream and downstream systems, and macro forces, such as technology changes that render our investments obsolete (e.g., long-range desktop platform investments make little sense when the user population shifts to mobile) and labor market pressures that compel people to quit.

We can't prevent these risks from becoming real. Estimates are informed guesses and will always be information deficient. Two similarly skilled people will solve technical problems in vastly different ways owing to differences in their experience. We have to negotiate availability of experts outside our team. People change jobs all the time.

Any one of these can impair our ability to deliver. More than one of these can cause our project to crater. Unfortunately, we're left to self-insure against these risks, to limit their impact and make the project whole should they occur. We can't self-insure through predictability: because these risks are unpredictable, we cannot be prepared for each and every eventuality. The pursuit of predictability is a denial of these risks. We need to be resilient to risks, not predictable in the face of them.

This brings us back to the subject of result and reporting cycles: the shorter they are, the more resilient we are to internal and external risks.

Anchoring execution in results makes us significantly less vulnerable to overstating progress. Not completely immune, of course: even with a short results cycle we will discover new scope, which may mean we have more work to do than we previously thought. But scope is an outcome, and outcomes are transparent and negotiable. By comparison, effort-based execution is not: "we're not done coding despite being at 125% of projected effort" might be a factual statement, but it is opaque and not negotiable.
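The contrast can be sketched in a few lines of code. This is purely illustrative - the feature names and numbers below are invented - but it shows why one measure is opaque and the other is transparent and negotiable:

```python
# Illustrative sketch: effort-based vs results-based progress measures.
# All feature names and figures are hypothetical.

def effort_based_progress(hours_spent, hours_budgeted):
    """Percent of budgeted effort consumed -- says nothing about outcomes."""
    return 100.0 * hours_spent / hours_budgeted

def results_based_progress(features):
    """Percent of scoped features that are done AND verified by the buyer."""
    done = sum(1 for f in features if f["delivered"] and f["accepted"])
    return 100.0 * done / len(features)

features = [
    {"name": "search",    "delivered": True,  "accepted": True},
    {"name": "checkout",  "delivered": True,  "accepted": False},  # coded, never verified
    {"name": "invoicing", "delivered": False, "accepted": False},
    {"name": "returns",   "delivered": False, "accepted": False},
]

# "125% of projected effort" is opaque; "1 of 4 features accepted" invites
# a negotiation about scope.
print(effort_based_progress(500, 400))   # 125.0
print(results_based_progress(features))  # 25.0
```

The effort number can only ever be argued about; the results number can be acted on, feature by feature.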

In addition, a short results cycle makes a short reporting cycle more information rich. That, in turn, makes for more effective management.

But to be resilient, we need to look beyond delivery execution and management. A steady diet of reliable data about the software we're developing, and about how delivery of that software is progressing, allows a steering committee to continuously and fluidly perform its governance obligations. Those obligations are to set expectations, invest the team with the authority to act, and validate results.

When project and asset data are based in results and not effort, it is much easier for a steering committee to fulfill its duty of validation. It helps with the other two obligations as well. We can scrutinize which parts of the business case are taking greater investment than we originally thought, and whether they remain prudent to pursue. We can also see whether we are taking technical shortcuts in the pursuit of results, and assess the long-term ramifications of those shortcuts nearer the time they are made. We are therefore forewarned that an investment is in jeopardy long before financial costs and technical debt rise, and can change or amplify our expectations of the team as we need to. This, in turn, gives us information to act on the last governance responsibility, which we do by changing team structure, composition and even leadership, and by working with the investment committee to maintain the viability of the investment itself.

A short results cycle, reporting cycle and governance cycle make any single investment more resilient. They also enable a short investment cycle, which makes our entire portfolio more robust. From an execution perspective, we can more easily move in and out of projects. Supported by good investment flow (new and existing opportunities, continuously reassessed for timeliness and impact), hedging (to alleviate risks of exposure to a handful of "positions" or investments), and continuous governance assessing - and correcting - investment performance, we can make constant adjustments across our entire portfolio. This makes us resilient not only at an individual investment level, but at a portfolio level, to micro and macro risks alike.

IT has historically ignored, abstracted or discounted risks to delivery. Resiliency is the antidote to this starry-eyed optimism, which is at the core of IT's chronic underperformance. That makes IT a better business partner.

Thursday, May 31, 2012

Rules versus Results

Assessments are based, not on whether the decisions made are any good, but on whether they were made in accordance with what is deemed to be an appropriate process. We assume, not only that good procedure will give rise to good outcome, but also that the ability to articulate the procedure is the key to good outcomes.
-- John Kay, writing in the Financial Times
A common error in both the management and governance of IT is an over-reliance on rules, process and "best practices" that purport to give us a means of modeling and controlling delivery. We see this in different ways.
  • Project managers construct elaborately detailed project plans that show business needs decomposed into tasks which are to be performed by specialists in a highly orchestrated effort. And detailed they are, often down to the specific task that will be performed by the specific "resource" (never "person", it's always "resource") on a specific day. This forms a script for delivery. The investor without a tech background cannot effectively challenge a model like this, and might even take great comfort in the level of detail and specificity. They have this broken down into precise detail; they must really know what they're doing.
  • Companies invest in "process improvement" initiatives, with a focus on establishing a common methodology. The belief is that if we have a methodology - if we write unit tests and have our requirements written up as Stories and if we have a continuous build - we'll get better results. Methodology becomes an implicit guarantor of success. If we become Agile, we'll get more done with smaller teams.
  • "Best practices" are held out as the basis of IT governance. Libraries of explicit "best practices" prescriptively define how we should organize and separate responsibilities, manage data centers, contract with vendors and suppliers, and construct solutions. If there is a widely accepted codification of "best practices", and we can show that we're in compliance with those practices, then there's nothing else to be done: you can't get better than "best". We'll get people certified in ITIL - then we know we'll be compliant with best practices.



* * *

To see how stultifying such behaviour can be, imagine the application of this emphasis on process over outcome in fields other than politics or business. Suppose we were to insist that Wayne Rooney explain his movements on the field; ask Mozart to justify his choice of key, or Van Gogh to explain his selection of colours. We would end up with very articulate footballers, composers and painters, and very bad football, music and art.
-- John Kay
It is tempting to try to derive universal cause-and-effect truths from the performance of specific teams. Because this team writes unit tests, they have higher code quality. Or, Because we ask users to review security policies and procedures at the time their network credentials expire, we have fewer security issues. Does unit testing lead to higher quality? Does our insistence on policy result in tighter security? It may be that we have highly skilled developers who are collaborative by nature, writing relatively simple code that is not subject to much change. It may be that our network has never been the target of a cyber attack. A "rule" that dictates that we write unit tests or that we flog people with security policy will be of variable impact depending on the context.
There are no such things as best practices. There are such things as practices that teams have found useful within their own contexts.
@mikeroberts
Suppose one team has low automated unit test coverage but high quality, while another has high automated unit test coverage but low quality. A rule requiring high unit test coverage is no longer indicative of outcome. The good idea of writing unit tests is compromised by examples that contradict the lofty expectations for going to the trouble of writing them.

It isn't just the rules that are compromised: rule makers and rule enforcers are as well. When a team compliant with the rules posts a disappointing result, or a team ignorant of the rules outperforms expectations, rule makers and rule enforcers are left to make ceteris paribus arguments from an unprovable counterfactual: had we not been following the rule that we always write unit tests, quality would have been much lower. Maybe it would. Maybe it would not.

Rules are not a solid foundation for management or governance, particularly in technology. For one thing, they lag innovation, rather than lead it: prescriptive IT security guidelines circa 2007 were ignorant of the risks posed by social media. Since technology is a business of invention and innovation, rules will always be out of date.

For another, rules create the appearance of control while actually subverting it. Rules appear to be explicit done-or-not-done statements. But rules shift the burden of proof of compliance to the regulator (who must show that the regulated isn't following the rule) and away from the regulated (who gets to choose the most favourable interpretation of the rules that apply). Tax law which explicitly defines how income or sales transactions are to be taxed encourages the individual to seek out the most favourable tax treatment: e.g., people within easy driving distance of a lower-cost sales tax jurisdiction will go there to make large purchases. The individual's tax obligation is minimized, but at a cost to a society starved for public revenues. In IT terms, the regulated (that is, the individual responsible for getting something done) holds the power to determine whether a task is completed or a governance box can be ticked: I was told to complete this technical task; I have completed it to a point where nobody can tell me I am not done with it. It is up to the regulator (e.g., the business as the consumer of a deliverable, or a steering committee member) to prove that it is not. The individual's effort is minimized, but at a cost to the greater good of the software under development, which is starved for attention.

When the totality of the completed technical tasks does not produce the functionality we wanted, or the checkboxes are ticked but governance fails to recognize an existential problem, our results fall short of our expectations despite the fact that we are in full compliance with the rules.



* * *

Instead, we claim to believe that there is an objective method by which all right thinking people would, with sufficient diligence and intelligence, arrive at a good answer to any complex problem. But there is no such method.
-- John Kay
Just because rules-based mechanisms are futile does not mean that all our decisions - project plans, delivery process, governance - are best made ad hoc.

Clearly, not every IT situation is unique, and we can learn and apply across contexts. But we do ourselves a disservice when we hold out those lessons to be dogma. Better that we recognize that rigorous execution and good discipline are good lifestyle decisions, which lead to good hygiene. Good hygiene prevents bad outcomes more than it enables good ones. Not smoking is one way to prevent lung cancer. Not smoking is not, however, the sole determinant of good respiratory health.

Principles give us an important intermediary between prescriptive rules and flying by the seat of our pants. Principles tend to be less verbose than rules, which makes them more accessible to the people subject to them. They encourage behaviours which reinforce good lifestyle decisions. And although there is more leeway in the interpretation of principles than of rules, it is the regulator, not the regulated, who has the power to interpret results. Compliance with a principle is a question of intent, which is determined by the regulator. If our tax law is a principle that "everybody must pay their fair share", it is the regulator who decides what constitutes a "fair share", and does so in a broader, societal context. Similarly, a business person determines whether or not software satisfies acceptance criteria, while a steering committee member assesses the competency of people in delivery roles. Management and governance are not wantonly misled by a developer deciding that they have completed a task, or an auditor confirming that VMO staffing criteria are met.

"We always write unit tests" is a rule. "We have appropriate test coverage" is a principle. "Appropriate" is in the eye of the beholder. Coming to some conclusion of what is "appropriate" requires context of the problem at hand. Do we run appreciable risks by not having unit test coverage? Are we gilding the lily by piling on unit test after unit test?
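The difference can be sketched in code. The thresholds and the notion of a reviewer-judged risk score below are invented for illustration; the point is who holds the context:

```python
# Hypothetical contrast between a rule and a principle as quality gates.

def rule_gate(coverage):
    """Rule: 'coverage must always be >= 80%' -- context-free, and gameable
    by whoever writes the tests."""
    return coverage >= 80.0

def principle_gate(coverage, risk):
    """Principle: 'appropriate coverage' -- the regulator judges risk in
    context (risk is a reviewer-assigned score in [0, 1]) and the bar
    moves with it."""
    required = 50.0 + 40.0 * risk   # riskier code warrants more coverage
    return coverage >= required

print(rule_gate(85.0))            # True: box ticked, context ignored
print(principle_gate(60.0, 0.1))  # True: low-risk code, modest coverage is fine
print(principle_gate(85.0, 1.0))  # False: highest-risk code needs more
```

The rule can be satisfied without satisfying its intent; the principle forces a judgment about intent every time it is applied.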

We are better served if we manage and govern by result, not rule. Compliance with a rule is at best an indicator; it is not a determinant. Ultimately, we want people producing outstanding results on the field, not to be dogmatic in how they go about it. Rules, processes, and "best practices" - whether in the form of an explicit task order that tells people what they are to do day in and day out or a collection of habits that people follow - do not compensate for a fundamental lack of capability.

Adjudication of principles requires a higher degree of capability and situational awareness by everybody in a team. But then we would expect the team with the best players to triumph over a team with adequate players in an adequate system.

Monday, April 30, 2012

Shorten the Results Cycle, not the Reporting Cycle

A big software development project collapses just weeks before it is expected to be released to QA. According to the project leadership team, they're dogged by integration problems and as a result, the software is nowhere close to being done. It's going to take far more time and far more money than expected, but until these integration problems get sorted out, it isn't clear how much more time and how much more money.

The executive responsible for it is going to get directly involved. He has a background in development and he knows many of the people and suppliers on the team, but he doesn't really know what they're doing or how they're going about doing it.

The team situation is complicated. There are consultants from a product company, consultants from two other outsourcing firms, several independent contractors, plus a few of our own employees. QA is outsourced to a specialist firm and used as a shared service. All these people are spread across the globe, and even where people are in the same city they may work on different floors, or in different buildings, or simply work from home. Teams are organized by technology (e.g., services developers) or activity (analysts, developers, QA). Project status data is fragmented: we have a project plan, organized by use cases we want to assemble and release for staged QA. We have the developer tasks that we're tracking. We have the QA scripts that need to be written and executed. We have test data that we need to source for both developers and QA. And we have a defect list. Lots of people, lots of places, lots of work, lots of tracking, but not a lot to show for it.

The executive's first action will be to ask each sub-team to provide more status reports more frequently, and to report on the finest details of what people are doing and how long it will be before they're done. New task tracking, daily (perhaps twice-daily) status updates, and weekly reports to stakeholders to show progress.

* * *

The common reaction to every failure in financial markets has been to demand more disclosure and greater transparency. And, viewed in the abstract, who could dispute the merits of disclosure and transparency? You can never have too much information.

But you can.

So wrote John Kay in the Financial Times recently. His words are applicable to IT as well.

Gathering more data more frequently about an existing system merely serves to tell us what we already know. A lot of people are coding, but nothing is getting done because there are many dependencies in the code and people are working on inter-dependent parts at different times. A lot of use cases are waiting to be tested but lack data to test them. A lot of functional tests have been executed but they're neither passed nor failed because the people executing the tests have questions about them. The defect backlog is steadily growing in all directions, reflecting problems with the software, with the environments, with the data, with the requirements, or just mysterious outcomes nobody has the time to fully research. When a project collapses, it isn't because of a project data problem: all of these things are in plain sight.

If getting more data more frequently isn't going to do any good, why do the arriving rescuers always ask for it? Because they hope that the breakthrough lies with adjusting and fine tuning of the existing team and organization. Changing an existing system is a lot less work - not to mention a lot more politically palatable and a lot less risky - than overhauling a system.

* * *

There are costs to providing information, which is why these obligations have proved unpopular with companies. There are also costs entailed in processing information – even if the only processing you undertake is to recognise that you want to discard it.

More reporting more often adds burden to our project managers, who must now spend more time in meetings and cobbling together more reports. Instead of having the people close to the situation look for ways to make things better, the people close to the situation are generating reports in the hope that people removed from the situation will make it better. It yields no constructive insight into the problems at hand. It simply reinforces the obvious and leads to management edicts that we need to "reduce the number of defects" and "get more tests to pass."

This reduces line leaders to messengers between the team (status) and the executive (demands). As decision-making authority becomes concentrated in fewer hands, project leadership relies less on feedback than on brute force.

* * *

[M]ost of the rustles in the undergrowth were the wind rather than the meat.

Excessive data can lead to misguided action, false optimizations and unintended outcomes. Suppose the executive bangs the drum about too many defects being unresolved for too long a period of time. The easiest thing for people to do isn't to try to fix the defects, but to deflect responsibility for them, which they can do by reassigning those in their queue to somebody else. Some years ago, I was asked by a client to assess a distressed project that among other things had over 1,000 critical and high priority defects. It came as no surprise to learn that every last one of them was assigned to a person outside the core project team. The public hand wringing about defects resulted in behaviours that deferred, rather than accelerated, things getting fixed.

* * *

The underlying profitability of most financial activities can be judged only over a complete business cycle – or longer. The damage done by presenting spurious profit figures, derived by marking assets to delusionary market values or computed from hypothetical model-based valuations, has been literally incalculable.

Traditional IT project plans are models. Unfortunately, we put tremendous faith in our models. Models are limited, and frequently flawed. Financial institutions placed faith in Value at Risk models, which intentionally excluded low-probability but high-impact events, to their (and to the world's) detriment. Our IT models are similarly limited. Most IT project plans don't include impact analysis of reasonably probable events like the loss of key people, mistakes in requirements, or changes in business priority.

In traditional IT, work is separated into different phases of activity: everything gets analyzed, then everything gets coded, then everything gets tested, then it gets deployed. And only then do we find out if everything worked or not. It takes us a long, long time - and no small amount of effort - to get any kind of results across the finish line. That, in turn, increases the likelihood of disaster. Because effort makes a poor proxy for results, interim progress reports are the equivalent of marking to model. Asking project managers for more status data more frequently burns the candle from the wrong end: reducing the project status data cycle does nothing if we don't shorten the results cycle.

* * *

It is time companies and their investors got together to identify information [...] relevant to their joint needs.

We put tremendous faith in our project plans. Conventional thinking is that if every resource on this project performs their tasks, the project will be delivered on time. The going-in assumption of a rescue is that we have a deficiency in execution, not in organization. But if we are faced with the need to rescue a troubled project, we must see things through the lens of results and not effort. Nobody sets out to buy effort from business analysts, programmers and QA analysts. They do set out to buy software that requires participation by those people. This is ultimately how any investment is measured. This and this alone - not effort, not tasks - must be the coin of the realm.

Since projects fail for systemic reasons more than for reasons of execution, project rescues call for systemic change. In the case of rescuing a traditionally managed IT project, this means reorganizing away from skill silos into teams concentrating on delivering specific business needs, working on small business requirements as opposed to technology tasks, triggering a build with each code commit, and immediately deploying each build to an environment where it is tested. If we do these things, we don't need loads of data or hyper-frequent updates to tell us whether we're getting stuff done. Day in and day out, either we get new, runnable software with new features, or we don't.

It is irrational to organize a software business such that it optimizes effort but produces results infrequently. Sadly, this happens in IT all the time. If we believe we can't deliver meaningful software in a short time frame - measured in days - we've incorrectly defined the problem space. Anything longer than that is a leap of faith. All the status reports in the world won't make it anything but.