I consult, write, and speak on running better technology businesses (tech firms and IT captives) and the things that make it possible: good governance behaviors (activist investing in IT), what matters most (results, not effort), how we organize (restructure from the technologically abstract to the business concrete), how we execute and manage (replacing industrial with professional), how we plan (debunking the myth of control), and how we pay the bills (capital-intensive financing and budgeting in an agile world). I am increasingly interested in robustness over optimization.

I work for ThoughtWorks, the global leader in software delivery and consulting.

Thursday, December 31, 2020

There Might Be No Grand Lessons, But There Are Plenty of Darn Good Ones

Janan Ganesh wrote in the FT this past week that “COVID offers no grand lesson”. His point is that no political and economic system in any nation has consistently outperformed all others as far as COVID policy is concerned. While his inconclusiveness appears to be just a few column inches filled to meet a deadline at a time of the year when most aren’t reading their news sources too closely, it reinforces something that I wrote last month: that the “grand question” analyses show the bankruptcy of the question itself.

Just the same, there are some pretty good lessons from COVID-19.

Corporate income statements are bloated with expenses, and the beneficiaries of that bloat are in for a reckoning. Bloomberg reported this week: “Scars inflicted on travel are looking permanent. Companies are shifting away from massive expense accounts and the experiential lifestyle has become a memory.”

Corporate balance sheets are bloated and the writedowns are going to be painful. The WSJ reported this week: “Oil industry has written down about $145 billion in assets this year amid an unprecedented downturn and long-term questions about oil prices.”

Devolved decision making is superior to command and control. In the FT this week: “Devolution rules.” “[S]enior executives had to accept that management decisions were best taken on the front line and not in head office - often reversing a more traditional top-down line of command.”

Municipalities are changing in extraordinary ways as companies and their employees leave high cost of living (and high-tax) cities and states for less expensive ones. In the WSJ this week: “Accelerated Growth Strains Austin.” Rents are up, commercial areas are being significantly redeveloped, and residential prices are skyrocketing as the population surges. And with it, the culture and character of the town will change, too. “What happened in San Francisco with the tech boom was something nobody saw coming until it was too late.”

COVID-19’s lessons are that corporate spending was too high, a lot of valuations were too great, decision rights too concentrated, and many companies don’t need to be based in high cost geographies. Activist investors have already been pushing companies on the second point, challenging oil majors for overstating their reserves. Activists will scrutinize P&Ls for expense bloat and taxes that can be cut to free cash to return to investors, and will challenge command and control management practices that stifle local efficiencies and innovation.

Societies, economies and businesses don’t evolve through grand design in response to big questions; they evolve through the crowdsourced effort of their members’ ability to muddle through. In benign times - stable monetary policy, low inflation and slow growth - the muddling through isn’t an obvious phenomenon. In extraordinary times, it is. So it turns out that corporate spending was too high, valuations were too great, control was too concentrated, and companies chose to locate in labor bubbles. The shock of this realization raises big questions. Those questions will be resolved through millions of small answers, not the narrative fallacy of a big narrative.

Monday, November 30, 2020


For two decades now, we’ve heard about the threat of tech disruption to established industries and incumbent firms. Yet it isn’t the tech that disrupts, it’s socio-economic change that creates conditions that a technology can exploit. Tech isn’t the catalyst, but it can be the beneficiary.

COVID may turn out to be the greatest global catalyst of socio-economic change since the middle of the 20th century. As the pandemic has continued and the numbers have risen, the chattering classes are now asking what the lasting changes will be. These can be useful exercises, certainly to the business leaders who’ve got to find their customers or compete against rivals with slimmed down cost structures. Not to mention, the acceleration of innovation - a WSJ article recently cited a McKinsey study that had suggested 10 years of innovation was compressed into a 3 month window - has created opportunities that were not practical just a year ago.

No surprise that the analyses range from the very narrow to the very broad. The narrow ones are easy to comprehend and useful for specific industries. For example, I’ve read projections that anywhere from 15% to 50% of all business travel isn’t going to return. Although a wide range, it suggests that airlines and hotels will have to appeal to leisure travelers to fill seats and beds. Leisure travelers are more price sensitive and less brand loyal than business travelers, so even if volume recovers, revenue will lag, which portends more cost cutting or in-travel sales or on-demand activations (you have to swipe a credit card to get the TV to work in coach on the airline, why not require a customer to swipe a credit card to get the TV to work in the discounted-rate hotel room?). It also suggests that a startup airline with a clean balance sheet, a fleet of fresh planes requiring little maintenance (there’s a desert parking lot loaded with low mileage 737MAX jets), and the ability to draw on a large experienced labor force of laid off travel workers could create significant heartache for incumbents.

At the other end of the extreme are the macro analyses asking The Big Questions. Are cities dead? Is cash dead? Is brick-and-mortar retail dead? These are less useful. The Big Questions are too big. They require far more variables and data than can be acquired let alone thoughtfully considered in a coherent analysis. The authors traffic in interesting data, but either lack the courage to draw any conclusion beyond Things Might Change But Nobody Knows (thanks for that, so helpful), or use the data selectively to present defenses for their preference of what the future will be.

In the middle are Big Question headlines with narrow questions posed, even if not answered. Analyses on “the future of work” cite specific employer examples to posit what is now possible (e.g., specific roles that gain nothing from being in an office and lose nothing by being distributed) and broad employee survey data to suggest their potential scale (e.g., 25% of employees in such-and-such industry want working from home to be a permanent option on a part- or full-time basis). These are useful analyses when they highlight future challenges on management and supervision, collaboration and communication. Economically, employer and employee alike win when a person chooses to relocate to a lower cost-of-living area for quality of life purposes. But that only works if the physical separation causes minimal, if any, impact to career growth, skill acquisition, productivity and participation, and corporate culture. A company believing it can espouse even moderately aggressive distributed workforce policies must be aware that these are specific problems to be solved.

What I’ve yet to see is an analysis of how the institutions that are benefitting and the institutions that are suffering will influence the micro-level trends and, by extension, influence the answers to The Big Questions.

Consider a large universal bank that employs hundreds of thousands of people in cities round the world. One way its retail bank makes money is by converting deposits into loans. One way its commercial bank makes money is by making mortgage loans to businesses. One way its investment bank makes money is by underwriting debt issued by municipalities. It may look as if the bank can reduce its operating costs by institutionalizing a work-from-home policy for a large portion of its workforce. But doing so is self-destructive to its business model. Fewer employees in office towers means fewer people to patronize the businesses to which the bank lends, fewer public transport and sales tax receipts to the municipalities whose debt they underwrite, and less demand for construction and renovation of mixed use commercial properties. The bank stands to lose a lot more in revenue than it would gain in reduced costs, so as a matter of policy, a universal bank will want its employees back in their offices in full numbers. The bank will set the same expectation to vendors, particularly those supplying labor.

But other companies are benefiting from this change and will want permanency of these new patterns. Oracle provides the cloud infrastructure that Zoom operates on. More Zoom meetings not only means more revenue for Oracle’s cloud business (investors will pay a premium for a growth story in cloud services), it gives their cloud infrastructure business a powerful reference case as they pursue new clients. It comes as no surprise that Oracle’s executive chairman Larry Ellison is a vocal proponent of lasting change.

And, of course, nobody knows what public policy will look like, which will play a huge role in what changes are permanent and what reverts to the previous definition of normal. State and municipal governments are facing significant tax receipt shortfalls as a result of COVID policies. Many have also suffered a depletion of residents and small businesses. They may offer aggressive tax incentives to encourage new business formation or expansion as well as commercial property development. At the same time, there are states that have received an influx of population and cities that have seen residential property price increases. They will be reluctant to see their newly arrived neighbors leave, so they, too, will offer incentives for them to stay.

It isn’t difficult to imagine there will be aggressive new forms of competition. Suppose firm A is adamant about employees returning to the office. If the employee survey data is to be believed, it’s possible that as much as 25% of firm A’s labor force prefers to work from home a majority of the time. Firm B can aggressively use that as a recruiting wedge to not only lure away firm A’s talent, but offer them relocation packages to lower cost-of-living areas, expanding and potentially upgrading their talent pool at a lower price.

Or, suppose that city C imposes punitive taxes on companies employing a distributed workforce. It’s not unprecedented. Several cities already charge a “commuter tax” (also known as a “head tax”) on employers with workers who travel into the city. This would instead be a “can’t-be-bothered-to-commute tax” levied on employers in a city whose workers do not travel into the city. Meanwhile, near-west suburb D of city C entices a WeWork-like firm to develop a property that can house several businesses with partially distributed workforces, offering a smaller physical office space with fully secure physical and digital premises. This would lure midsized employers whose labor force lives largely in the western suburbs, reducing not only their rents but avoiding the “headless tax” imposed by city C.

The analyses of what will or will not change and why it will or will not change are only going to increase in the coming months. And, because some stand to lose significantly from change while others stand to benefit handsomely, the debate will only intensify. For those without the balance sheet and political clout to write the future, a firmly held opinion about the future isn’t worth very much. But the ability to study, process, absorb, investigate and prove ways of exploiting heretofore unrealizable opportunities is priceless.

Saturday, October 31, 2020

Playing the Cards You're Dealt

Some years ago, I was working with a company automating its customer contract renewal process. It had licensed a workflow technology and contracted a large number of people to code and configure a custom solution around it. This was no small task given the mismatch between a fine granularity of rules on the one hand and a coarse granularity of test cases on the other. The rules were implemented as IFTTT statements in a low-code language that did not allow them to be tested in isolation. The test cases consisted of clients renewing anywhere from one to four different types of contracts, each of which had highly variable terms and interdependencies on one another.

At the nexus of this mismatch was the QA team, which consisted almost entirely of staff from an outsourcing firm. The vendor had sold the company on QA capacity at a volume of 7 test scripts executed per person per day. They had staffed 50 total people to the program team, while the company had staffed four QA leads (one for each contract team). The outsourcing vendor was reporting no less than 350 test scripts executed by their staff every day, yet the QA managers were reporting very low test case acceptance and the development team was reporting the test case failures could not be replicated.

A little bit of investigation into one of the four teams exposed the mismatch. The outsourcing staff of this one team consisted of 10 people, contractually obligated to execute 70 test scripts. The day I joined, the team reported 70 test scripts executed, of which 5 passed and 6 failed.

Eleven being a little short of seventy, I wanted to understand the discrepancy. The answer from the contracted testers was, "we have questions about the remaining 59." The lead QA analyst - an employee, not a contractor - spent the entire day plus overtime investigating and responding to the questions pertaining to the 59. And then the cycle would start all over again. The next day it was 70 executed with 3 passed and 4 failed. The day after it was 70 executed with 1 passed and 9 failed. And the lead QA would spend the day - always an overtime day - responding to the questions from the outsourced team.

Evidently, this cycle had been going on for some time before I arrived.

We investigated the test cases that had been declared passed and failed. Turns out, those tests that were reported as having passed hadn't really passed: the tester had misinterpreted the results and reported a false positive. And those reported as failed hadn't actually failed for the reason stated: the tester had misinterpreted those results as well. On some occasions, it was the wrong data to test the scenario; in others, it had failed, but it was because a different rule should have executed. In just about every circumstance, it was false results. The outsourced testers were expending effort but yielding no results whatsoever. A brief discussion with the QA lead in each of the other three teams confirmed that they were experiencing exactly the same phenomenon.

After observing this for a week and concluding that no amount of interaction between the QA lead and the outsourced staff was going to improve either the volume of completions or fidelity of the results, I asked the one lead QA to stop working with the outsourced team, and instead to see how many test cases she could disposition herself. The first day, she conclusively dispositioned 40 test scripts (that is, they had a conclusive pass or fail, and if they failed it was for reasons of code and not of data or environment). The second day, she was up to 50. The third, she was just over 50. She was able to produce higher fidelity and higher throughput at lower labor intensity and for lower cost. And she wasn't working overtime to do so.

The outsourced testing capacity was net negative to team productivity. That model employed eleven people to do less than the work of one person.
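The arithmetic behind that conclusion can be laid out in a few lines. This is only a sketch of the effort-versus-results comparison, using the figures quoted above. The function names are hypothetical, the zero for conclusive results under the outsourced model is an assumption drawn from the finding that just about every reported result was false, and the third day of the lead working alone is left out because "just over 50" is not an exact figure.

```python
def open_questions(executed: int, passed: int, failed: int) -> int:
    """Scripts reported as executed but not reported as passed or failed."""
    return executed - passed - failed


def per_person(count: int, headcount: int) -> float:
    """Daily count divided by the number of people it took to produce it."""
    return count / headcount


# Figures from the story: (executed, reported passed, reported failed) per day.
outsourced_reports = [(70, 5, 6), (70, 3, 4), (70, 1, 9)]
unresolved = [open_questions(*day) for day in outsourced_reports]

# Effort measure: what the contract counted (ten testers, 70 scripts a day).
effort_rate = per_person(70, 10)

# Results measure for the whole model: ten testers plus the QA lead.
# ASSUMPTION: zero conclusive results per day, since on review just about
# every reported pass or fail turned out to be false.
outsourced_result_rate = per_person(0, 11)

# The QA lead working alone: conclusive dispositions on days one and two.
lead_alone = [40, 50]
lead_result_rates = [per_person(n, 1) for n in lead_alone]

print("unresolved per day:", unresolved)
print("effort per tester per day:", effort_rate)
print("conclusive per person per day, outsourced model:", outsourced_result_rate)
print("conclusive per person per day, lead alone:", lead_result_rates)
```

Measured by effort, the outsourced team hit its contracted rate every day. Measured by results, the same model rounds to nothing, which is the gap the program office's utilization reporting could not see.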

This wasn't the answer that either the outsourcing vendor or the program office wanted. The vendor was selling snake oil - the appearance of testing capacity that simply did not exist in practice - and was about to lose a revenue stream. The program office was embarrassed for managing the maximization of staff utilization rather than outcomes (that is, relying on effort as a proxy for results).

The reactions of both vendor and program office weren't much of a surprise. What was a surprise was the fact that nobody had called bullshit up to that point. Experimenting with change wasn't a big gamble. The program had nothing to lose except another day of frustration rewarded by completely useless outputs from the testing team. So why hadn't anybody audited the verifiable results? Or made a baseline of testing labor productivity without the participation of the outsourcing team?

This wasn't a case of learned helplessness. The QA leads knew they were on the hook for meaningful testing throughput. The program office believed they had a lot of testing capacity that was executing. The vendor believed the capacity they had sold was not properly engaged. Nobody was going through the motions, and everybody believed it would work. The trouble was, they were playing the cards they'd been dealt.

Some years later, I was working with a corporate IT department trying to contain increasing annual spend on ERP support. Although they had implemented SAP at a corporate level and within a number of their subordinate operating companies, they still had some operating companies using a legacy homespun ERP and all business units still relied on decades of downstream data warehouses and reporting systems. Needless to say, there were transaction reconciliation and data synchronization problems. The corporate IT function had entered into a contract with a vendor to resolve these problems. In the years following the SAP implementation, vendor support costs had not gone down but had gone up, proportional to the increase in transaction volume. The question the company was asking was why the support labor couldn't respond to more discrepancies given they had so many years' experience with resolving them.

It didn't take a stroke of genius to realize that the vendor stood to gain from their customer's pain: the greater the volume of discrepancies, the more billing opportunities there were for resolution. Worse still, the vendor benefited from the same type of failure recurring again and again and again. The buyer had unwittingly locked themselves into a one-way contract: their choices were to live with discrepancies or pay more money to the vendor for more labor capacity to correct them. The obvious fix was to change the terms of the contract, rewarding the vendor for resolving the discrepancies at their root cause rather than rewarding the vendor for solving the same problem over and over and over. This they did, and the net result was a massive reduction of recurring errors, and a concomitant reduction in the contract labor necessary to resolve errors.

This was, once again, a problem of playing the cards that had been dealt. For years, management defined the problem as one of containing spend on defect / discrepancy resolution. They hadn't seen it as a problem of continuous improvement, in which their vendor was a key partner in that improvement rather than a cost center to be contained.

There are tools that can help liberate us from constraints, such as asking the Five Whys. But such tools are only as effective as the intellectual freedom we're allowed in pursuing them in the first place. If the root question is "why is test throughput so low given the high volume of test capacity and the high rate of test execution", or "how can the support staff resolve defects more quickly to create more capacity", the exercise begins with confirmation bias, in this case that the operating model (the test team composition, the defect containment team mission) is correct. The Five Whys are less likely to lead to an answer that existentially challenges the paradigm in place if the primary question is too narrowly phrased. When that happens, the exercise tends to yield no better than "less bad."

It's all well and good for me to write about how I saw through a QA problem or a support problem, but the fact of the matter is we all fall victim to playing the cards that we're dealt at one time or another. A vendor paradigm, a corporate directive, a program constraint, a funding model, an operating condition limits our understanding of the actual problem to be solved.

But reflecting on it is a reminder that we must always be looking for different cards to play. Perhaps now more than ever, as low contact and automated interactions permanently replace high contact and manual ones in all forms of business, we need to be less intellectually constrained before we can be more imaginative.

Wednesday, September 30, 2020

All In

Immediately after World War II, Coca-Cola had 60% of the soft drinks market in the United States. By the early 1980s, it had about 25%. Not only had Coca-Cola been outmaneuvered in product marketing (primarily by Pepsi), consumer concerns over sugar and calories drove consumers to diet soft drinks and to refreshment options outside of the soft drink category. The fear in the executive ranks was evidently so great that Coca-Cola felt it necessary to change its formula. Coca-Cola did the market research and found a formula that fizzy-drink consumers preferred over both old Coke and Pepsi. Coca-Cola launched the new product as an in-place replacement for their flagship product in 1985.

New Coke flopped.

Within weeks, consumer blowback was comprehensive and fierce (no small achievement when media was still analogue). Turns out there is such a thing as bad publicity if it causes people to stop buying your product. Sales stalled. Before three months were out, the old Coca-Cola formula was back on the shelves.

Why did Coca-Cola bet the franchise? The data pointed to an impending crisis of being an American institution that would soon be playing second fiddle to a perceived upstart (ironically an upstart founded in the 19th century). Modern marketing was coming into its own, and Pepsi sought to create a stigma among young people who would choose Coke by using a slogan and imagery depicting their cohort as "the Pepsi generation." Coca-Cola engineered a replacement product that consumers rated superior to both the classic Coke product and Pepsi. The data didn't just indicate New Coke was a better Coke than Coke, the data indicated New Coke was a better Pepsi than Pepsi. New Coke appeared to be The Best Soft Drink Ever.

Still, it flopped.

There are plenty of analyses laying blame for what happened. One school of thought is that the testing parameters were flawed: the sweeter taste of New Coke didn't pair as well with food as classic Coke, nor was a full can of the sweeter product as satisfying as one sip. Another is sociological: people had an emotional attachment to the product that ran deeper than anybody realized. Most of it is probably right, or at least contains elements of truth. There's no need to rehash any of that here.

New Coke isn't the only New thing that flopped in spectacular fashion. IBM had launched the trademarked Personal Computer in 1981 using an open architecture of widely available components from third-party sources such as Intel and the fledgling Disk Operating System from an unknown firm in Seattle called Microsoft. Through sheer brand strength, IBM established dominance almost immediately in the then-fragmented market for microcomputers. The open hardware architecture and open-ended software licensing opened the door for IBM PC "clones" that were less expensive, equally (and sometimes more) advanced, and of equal (if not superior) quality. IBM created the standard but others executed it just as well and evolved it more aggressively. In 1987, IBM introduced a new product, the Personal System/2. It used a proprietary hardware architecture incompatible with its predecessor PC products, and a new operating system (OS/2) that was only partially compatible with DOS, a product strategy not too dissimilar to what IBM did in the 1960s with the System/360 mainframe. IBM rolled the dice that it could achieve not just market primacy, but market dominance. They engineered a superior product. However, OS/2 simply never caught on. And, while the hardware proved initially popular with corporate buyers, the competitive backlash was fierce. In a few short years, IBM lost its status as the industry leader in personal computers, had hundreds of millions of dollars of unsold PS/2 inventory, laid off thousands of employees, and was forced to compete in the personal computer market on the standards now set by competitors.

These are all-in bets taken and lost, two examples of big bets that resulted in big routs. There are also the all-in bets not taken and lost. Kodak invented digital camera technology but was slow to commercialize it. The threat of lost cash flows from their captive film distribution and processing operations in major drug store chains (pursuit of digital photography by a film company meant loss of lucrative revenue to film distributors and processors) was sufficient to cow Kodak executives into not betting the business. Polaroid was similarly a leader in digital cameras, but failed to capitalize on their early lead. Again, there have been plenty of hand-wringing analyses as to why. Polaroid had a bias for chemistry over physics. Both firms were beholden to cash flows tied to film sales to distributors with a lot of power. While each firm recognized the future was digital, neither could fathom how rapidly consumers would abandon printed pictures for digital.

We see similar bet-the-business strategies today. In the early 2000s, Navistar bet on a diesel engine emission technology - EGR, or exhaust-gas-recirculation - that was contrary to what the rest of the industry was adopting - SCR, or selective catalytic reduction. It didn't pan out, resulting in market share erosion that was both substantial and rapid, while also resulting in payouts of hundreds of millions of dollars in warranty fees. Today, GM is betting its future on electric vehicles: the WSJ recently reported that quite a few internal-combustion based products were cut from the R&D budget, while no EV products were.

All in.

The question isn't "was it worth betting the business." The question is, "how do you know when you need to bet the business."

There are no easy answers.

First, while it is easy to understand what happened after the fact, it is difficult to know what alternative would have succeeded. It isn't clear that either Kodak or Polaroid had the balance sheet strength to withstand a massive erosion in cash flows while flopping about trying to find a new digital revenue model. The digital photography hardware market was fiercely competitive and services weren't much of a thing initially. Remember when client/server software companies like Adobe and SAP transitioned to cloud? Revenues tanked and it took a few years for subscription volume to level up. It was, arguably, easier for digital incumbents to make a digital transition in the early 2010s than it was for an analogue incumbent to make the same move in the late 1990s. Both firms would have been forced to sacrifice cash flows from film (and Kodak in film processing) in pursuit of an uncertain future. As the 1990s business strategy sage M. Tyson observed, "Everyone has a plan until they're punched in the face."

To succeed in the photography space, you would have needed to anticipate that the future of photography was as an adjunct to a mobile computing device, twined with as-of-yet unimagined social media services. Nobody had that foresight. Hypothetically, Kodak or Polaroid execs could (and perhaps even did) anticipate sweeping changes in a digital future, but not one that anticipated the meteoric rise in bandwidth, edge computing capabilities, AI and related technologies. A "digital first" strategy in 1997 would have been short-term right, only to have been proven intermediate- and long-term wrong without a pivot to services such as image management and a pivot a few short years after that to AI. It's difficult to believe that a chemistry company could have successfully muddled through a physics, mathematics and software problem space. It's even more difficult to imagine the CEO of that company could successfully mollify investors again and again and again when asking for more capital because the firm is abandoning the market it just created because it's doomed and now needs to go after the next - and doing that three times over the span of a decade. In theory, they could have found a CEO who was equal parts Marie Curie, Erwin Schrödinger, Isaac Newton, Thomas Watson, Jr., Kenneth Chenault, and Ralph Harrison. In practice, that's a real easy short position to take.

Second, it's all well and good when the threat is staring you in the face or when you have the wisdom of hindsight, but it's difficult to assess a threat let alone know what the threats and consequences really are, and are not. A few years ago, a company I was working with started to experience revenue erosion at the boundaries of their business, with small start-up firms snatching away business with faster performance and lower costs. It was a decades-old resource-intensive data processing function, supplemented with labor-intensive administration and even more labor-intensive exception handling. Despite becoming error-prone and slow, they had a dominant market position that was, to a certain extent, protected by exclusive client contracts. While both the software architecture and speed prevented them from entering adjacent markets with their core product, the business was a cash cow and financed both dividends and periodic M&A. They suffered from an operational bias that impaired their ability to imagine the business any differently than it was today, a lack of ambition to organically pursue adjacent markets, and a lack of belief that they faced an existential threat from competitors they saw as little more than garage-band operators. Yet both the opportunities and the threats looked very plausible to one C-level exec, to a point that he believed failure to act quickly would mean significant and rapid revenue erosion, perhaps resulting in there not being a business at all in a few years. Unfortunately, it was all unprovable, and by the time it would be known whether he was prophet or crazy street preacher, it would be too late to do anything about it: remaining (depleted) cash flows would be pledged to debt service, inhibiting any re-invention of the business.

Third, even the things you think you can take for granted that portend future change aren't necessarily bankable on your timeline. Some governments have already created legislation that all new cars sold must be electric (or perhaps more accurately, not powered by petroleum) by a certain date. A lot of things have to be true for that to be viable. What if electricity generation capacity doesn't keep up, or sufficient lithium isn't mined to make enough batteries? Or what if hydrocarbon prices remain depressed and emissions controls improve for internal combustion engines? Or what if foreign manufacturers make more desirable and more affordable electric vehicles than domestic ones can? If any of these were to happen, it would increase the pressure that legislatures would feel to postpone the date for full electrification. For a business, going all-in too late will result in market banishment, but too early could result in competitive disadvantage (especially if a company creates the "New Coke" of automobiles... or worse still, The Homer). These threats create uncertainty in allocating R&D spend, risk of sales cannibalization of new products by old, and sustained costs for carrying both future and legacy lines for an extended period of time.

Is it possible to be balance sheet flexible, brand adaptable, and operationally lean and agile, so that no bet need be a bet of the business itself, but near-infinite optionality? A leader can be ready for as many possibilities as that person can imagine. Unfortunately, that readiness goes only as far as creditors and investors will extend the confidence, customers will give credibility to stretch the brand, and employees and suppliers can adapt (and re-adapt). To the stars and beyond, but if we're honest with ourselves we'll be lucky if we reach the Troposphere.

Luck plays a bigger role than anybody wants to acknowledge. The bigger the bet, the more likely the outcome will be a function of being lucky than being smart. The curious thing about New Coke is that it might have been the Hail Mary pass that arrested the decline of Coca-Cola. Taking away the old product - that is, completely denying anybody access to it - created a sense of catastrophic loss among consumers. Coca-Cola sales rebounded after its reintroduction. In the end, it proved clever to hold the flagship product hostage. Analyst and media reaction was cynical at the time, suggesting it was all just a ploy. Then-president Donald Keough responded aptly, saying "we're not that dumb, and we're not that smart."

And that right there is applied business strategy, summed up in 9 words.

Monday, August 31, 2020

Legacy Modernization

I've worked with quite a few companies for which long-lived software assets remain critical to day-to-day operations, ranging from 20-year-old ERP systems to custom software products that first processed a transaction way back in the 1960s. In some cases, if only a very few, these assets continue to be used because they still work very well, were thoughtfully designed, and have been well cared for over the years. The vast majority of the time, though, they continue to be used because the cost to replace them is prohibitive. Decades of poor architecture guidelines and lax developer discipline resulted in the commercial-off-the-shelf components of an ERP becoming inseparable from the custom code built around it. Decades of upstream and downstream systems resulted in point-to-point and often database-level integrations to legacy systems. Several turns through the business cycle starved those legacy applications for investment at critical junctures, and the passage of time has left few people remaining with intimate knowledge of how they work. The assets are not current with business needs, it will require a long time to re-acquire the knowledge of how they work, and it will require a fair bit of investment if they are to be leveled up to meet current needs. Operations suffer an increased labor intensity needed to compensate for feature gaps and repetitive, systemic errors.

It may seem like common sense that this is an asset that must be replaced for the betterment of the business, but the economics don't necessarily justify doing so. It's cheaper to hire administrative labor to make up for systemic shortcomings than it is to replace all the features and functions of an asset that has been decades in development. Risk may be legitimate, but risk is a threat that merits cost for preparation and containment, not necessarily cost for elimination. Justifying a pricey legacy migration principally on an expectation of risk realization is a huge career gamble, especially if there is plausible deniability (e.g., annual independent audits that flag the exposure "high" but the probability "low") to the decision-maker should the risk actually materialize. By and large, until the asset fails in spectacular fashion, there's no real motivation for a business to invest in a replacement.

Enter the "legacy modernization" initiative. A basic premise is that there are alternatives to traditional "lift and shift" strategies by retiring legacy assets in a piecemeal fashion. Several things stand out about these initiatives.

The first and most obvious is that they are long on how to start, but short on how to finish. The exercise of assessing, modeling and dispositioning the landscape does offer valuable new ways of looking at legacy assets. Collectively mapping an ERP, CRM and a handful of custom applications to logical domains as opposed to traditional thinking that portrays them as closed systems with territorial control over data offers a different perspective on software capabilities. The suitability of the investment profile (sustain, evolve or upgrade) of underlying assets can also change when seen in the light of how those assets enable or impair the evolution of those domains in response to business need. Code translators (for example, from COBOL to J2EE) can make legacy code accessible to a new generation of developers to aid with reverse-engineering. All useful stuff, but it's still just assessment work. A lot has to be true for a piecemeal strategy to be viable, and that isn't knowable until coding begins in anger.

The one (and as near as I can tell, only) alternative pattern to "lift and shift" is strangulation in one form or another. Or perhaps more accurately, asset capture through strangulation. The preferred form is the gradual retirement of code through encapsulation, interception and substitution. When it is viable, it is pretty straightforward: isolate key functionality, decouple it from things like direct database calls, wrap it in an API, create a battery of automated tests around it, have old and new systems invoke the API, rewrite the functionality in a modern architecture using a modern language, and redirect the API to the new code from the old once ready. Truly incremental retirement of the legacy code, however, requires that the legacy asset itself be both decomposable and (at least to some extent) recomposable. The integration of orchestration logic with functional logic common in legacy code makes decomposition very difficult. Tight coupling of code to data (think COBOL programs to VSAM files, or ABAP programs to customized tables in SAP) makes code difficult to recompose into new structures. Plus, both decomposition and recomposition require people fluent in the legacy code, with the skills to re-engineer it to do things such as redirect to an abstraction layer that enables functionality to be migrated, and with the confidence to do so without causing any damage. Of course, this can be side-stepped, at least to a degree, by building agnostic user interfaces that invoke APIs over robotic process automation, but the lack of malleability of the underlying code will preclude eclipsing a legacy asset in a finely-grained and highly tunable manner. It can only be strangled in a more coarsely-grained manner. By way of example, a credit-card processor supporting both house-branded cards and white-labeled third party cards that wants to replace its legacy transaction processing assets might be able to migrate volume brand-by-brand. This is preferable to migrating all-or-nothing, but it lacks the granularity of migrating function-by-function within the asset itself. Excluding any card-specific custom capabilities, 100% of the legacy codebase will remain in production until all transactions are redirected toward the replacement and all in-flight transactions are fully processed.
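To make the coarse-grained, brand-by-brand variant concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the handler functions, the `StranglerFacade` class and the brand names are assumptions, not a description of any real processor. It shows a single API that old and new callers both invoke, with volume redirected one brand at a time, and it deliberately ignores in-flight transactions, which in practice would keep the legacy asset alive a while longer.

```python
from typing import Callable, Iterable, Set

Handler = Callable[[dict], str]


def legacy_authorize(txn: dict) -> str:
    """Hypothetical stand-in for a call into the legacy transaction processor."""
    return f"legacy:{txn['brand']}:{txn['amount']}"


def modern_authorize(txn: dict) -> str:
    """Hypothetical stand-in for the rewritten functionality behind the same API."""
    return f"modern:{txn['brand']}:{txn['amount']}"


class StranglerFacade:
    """One entry point for all callers; routes each transaction by card brand."""

    def __init__(self, legacy: Handler, modern: Handler) -> None:
        self._legacy = legacy
        self._modern = modern
        self._migrated: Set[str] = set()

    def migrate(self, brand: str) -> None:
        """Redirect all new volume for one brand to the modern code."""
        self._migrated.add(brand)

    def rollback(self, brand: str) -> None:
        """Send a brand back to legacy if the new code misbehaves."""
        self._migrated.discard(brand)

    def authorize(self, txn: dict) -> str:
        handler = self._modern if txn["brand"] in self._migrated else self._legacy
        return handler(txn)

    def legacy_retirable(self, all_brands: Iterable[str]) -> bool:
        """Legacy can only be switched off once every brand has moved."""
        return set(all_brands) <= self._migrated


facade = StranglerFacade(legacy_authorize, modern_authorize)
facade.migrate("house")
print(facade.authorize({"brand": "house", "amount": 10}))    # handled by modern
print(facade.authorize({"brand": "partner", "amount": 5}))   # still legacy
print(facade.legacy_retirable(["house", "partner"]))         # False
```

The routing key here is the brand, not the function, which is exactly the limitation described above: the whole legacy codebase stays in production until the last brand has moved.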

Something often overlooked is that the tight coupling of software assets is a mirror of the tight coupling of business functions. A company with a legacy procurement system integrated to a legacy ERP may find that it can modernize purchase orders but not purchase order receiving because of tight coupling of receiving to inventory, accounts payable, manufacturing and supply chain management. It is one thing to strangle purchase order functionality or migrate purchase order volume from legacy to modern systems, but it is entirely another to do so over core accounting functions that are, by their nature, tightly coupled capabilities regardless of the fact that those capabilities are the purview of different departments or functions. Integrated finance really does have its benefits.

Another challenge to modernization is finding relevant reference cases. The conditions surrounding one company's legacy assets - sophistication and quirks of the business model, erosion of asset and / or business process fluency, complications from prior modernizations that started but stalled, complications from acquisitions, complications from regulatory restrictions, and on and on - are not identical to those surrounding another. Finding a success story is great, but applicability of any methodology is a function of the similarity of conditions. As was written many decades ago by the great IT philosopher L. Tolstoy, "all IT organizations are dysfunctional in their own way." There isn't a legacy migration playbook. At best, there are individual plays, or more likely things to be learned from individual plays.

Perhaps the biggest challenge is that the value proposition for legacy modernization is thin. A quick survey of consulting firms pitching legacy modernization services reveals the primary benefits to be things like "reduce IT complexity and costs", "improve flexibility and collaboration", "increase data consistency." These are woolly statements, difficult to quantify, and not necessarily true. Replacing the long-since-paid-for on-prem mainframe with IaaS will bulge the expense line of the income statement as accountants treat IaaS as an annually-funded subscription, not an asset that can be bought and paid for and capitalized. While there may indeed be a step-by-step path to legacy modernization, the cost of not completing the journey is additional layers of code that need to be maintained, redundant production systems and infrastructure, and additional layers of transaction reconciliation. Reduction of IT costs requires legacy system retirement. Retirement requires feature parity of core cases and edge cases and a complete migration of volume. While any individual step of modernization may be inexpensive, the enterprise must still sign up for the entire journey if the justification is "reduce IT complexity and costs".

This makes the value case that much more important. Legacy modernization performed in pursuit of digital modernization - that is, in pursuit of changing the fundamental business model - can be a path to that value. My colleague Ranbir Chawla pointed out to me a couple of years ago that a company we were working with was steadfast in espousing technology paradigms abandoned a long time ago by the vendors who sold them. Those paradigms - even technological paradigms - are blinders that lock people's understanding of how their company transacts business. Re-imagine the business - goodness, it doesn't even have to be so ethereal, just look at what competitors are doing today - and it is possible to expose the opportunities those legacy paradigms self-select you out of doing.

However, recasting "legacy modernization" as "digital modernization" is no easy task. It's one thing to redefine an operation from linear batch processes performed in an on-premise mainframe to parallel event queues in an elastic cloud. There is bound to be lift in capacity, efficiency, and recoverable revenue for the architectural change, and this is easily projected against current state. Unfortunately, the benefits for digital modernization are harder to prove. For example, it sounds great that the company could pursue adjacent markets by exposing APIs to those event queues and selling subscriptions with variable pricing to 3rd party providers of ancillary services. There may even be a large addressable market of 3rd party providers of ancillary services. Potential is valueless without a plausible path to conversion. As it is highly resource- and labor-intensive to get hard evidence on the nature of those adjacent markets given the current state of the business, those opportunities are just conjecture. A CFO being asked to approve multi-million-dollar spend to modernize legacy assets is not a CFO who will do so based on somebody's conjecture.

Still, this does not negate the role that digital modernization has to play in making the case for legacy modernization. If the legacy modernization provides just enough lift to make a case on its own merits (and in the absence of a bloated cost base or significantly eroded revenue that can be won back quickly that is a best-case scenario) it is a cost-neutral to marginally cost-positive investment that opens the door to digital modernization and the multitude of potential benefits that lie beyond. Restated, it doesn't hurt the business to invest in modernization, and the odds are that it will be better off for it.

Most corporate IT departments would prefer not to start from the position they're currently in, so "legacy modernization" will find a willing audience. Unfortunately, there are no modern technological silver bullets that make legacy modernization any less onerous than it was the last 10 times IT proposed doing so. Not to mention, the financial analyses that dictate how capital is allocated in legacy firms are geared heavily toward "certainty" and not at all toward "speculative". What a legacy modernization initiative needs is a legitimate path to paying for itself. Provided that there is one, ceteris paribus, spending $100 today for $102.76 of value in two years leaves the firm no better off but also no worse off than had it spent $0. But no companies operate in a static world and every executive knows as much today. That means it's a pretty safe bet for the CFO that $100 on a cost neutral legacy modernization is a free call option on a lot of upside gain, if not outright survival of the business itself.
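The break-even arithmetic can be checked with a short sketch. The post does not state a hurdle rate, so the sketch assumes that "no better off but also no worse off" means the $102.76 is discounted at the firm's cost of capital, and it backs that rate out of the two figures given. The function names are illustrative.

```python
def implied_annual_return(cost: float, value: float, years: float) -> float:
    """Annual compound rate at which `cost` today grows to `value` in `years`."""
    return (value / cost) ** (1 / years) - 1


def present_value(value: float, rate: float, years: float) -> float:
    """Discount a future value back to today at an annual compound rate."""
    return value / (1 + rate) ** years


# Figures from the text: $100 spent today, $102.76 of value in two years.
rate = implied_annual_return(100.0, 102.76, 2)
npv = present_value(102.76, rate, 2) - 100.0

print(f"implied annual rate: {rate:.2%}")  # implied annual rate: 1.37%
print(f"NPV at that rate is zero: {abs(npv) < 1e-9}")
```

Under that assumption the modernization clears an annual rate of roughly 1.37% and nets out to zero, which is what makes any digital upside beyond it look like a free option.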

Friday, July 31, 2020

The Innovator's Conundrum

Even as the pandemic hits sales, [automakers] need to pour vast sums into developing electrical vehicles - with absolutely no guarantee of success.

-- FT Lex, July 29, 2020

Socioeconomic systems in transition are impossible to navigate. At the same time that the old order is in collapse, the keys to the new order are very difficult to forge. Plenty of people bet and lost - in many cases quite tragically - during the 1917 Russian revolution, the Great Depression that began in 1929, the 2008 financial crisis, as well as in many other transitions in between. It is easy to realize that things are changing, but what they are changing from is easier to identify than what it is they are changing into.

Let's consider a specific case. For about a decade now, governments around the world have mandated that automobiles change from hydrocarbon-powered internal combustion engines to battery-powered electric motors. Yet the availability of electric vehicles for sale has far outstripped the demand for the vehicles themselves. Plenty of automakers are building EVs at volume. Unfortunately, that volume is staying on manufacturer and dealer balance sheets, because legacy automakers are designing and building EVs that few want to buy.

The future is right in front of every legacy automaker, yet the legacy automakers don't really know how to crack the nut. Build EVs, check. And build EVs they have. Unfortunately, it's hard to make money on a product when the unit volume is in the four- or low-five-digit range. In 2019, Chevrolet sold 4,915 Volts and 16,313 Bolts in the United States, for a combined sales volume of 21,228 EVs. That same year, Chevy sold:

If at first you don't succeed, try, try again. That sounds easy enough for the legacy automakers to do: keep funneling cash flows from the lucrative legacy business of internal combustion trucks to finance more R&D of EV products (including, of course, an electric pickup truck) until they get it right. Regulators have spelled out the future, and neither the debt burden nor equity holders' appetite for dividends precludes a healthy R&D spend. Tesla figured it out from scratch. How hard can it be?

Tesla the automotive company (ignoring the solar panel company) has (in that context) one mission - to make money manufacturing, marketing, selling and servicing a line of electric transportation products. Sole mission and sole purpose make a clear investing proposition: this is an all-in wager. By way of comparison, legacy automakers have multiple missions: satiate bondholders and shareholders of a multi-line mass-transportation company. All-in wagers are not so appealing to investors in legacy firms.

This is especially true since it isn't clear just how quickly the clock is ticking on this transformation. EVs are the future, but when is that future? Proven hydrocarbon reserves are vast, while demand for refined hydrocarbons (lubricants, jet fuel, automotive fuel) is down across all sectors and will remain depressed for years. Excess and untapped supply portend cheap refined petroleum product prices for a decade. Regulation could change that, but regulation is just as subordinate to economic needs as it is to environmental ones. The electric vehicle business doesn't employ as many people (a.k.a. "voters") as the internal combustion engine vehicle business does. Cheap energy - be it electric or hydrocarbon - drives household productivity and therefore household leisure. Lots of entities stand to lose if the migration to EVs is too quick. EVs are the future, but that future might very well now be anywhere from one year to ten years out.

The standard playbook to navigate this dynamic is to do continuous market testing. The modern tech playbook would have that be done in the form of user need surveys, MVPs that gauge user interaction via instrumentation, and user satisfaction surveys, all in lock step. But if the market is in a volatile state, historical data is useless and real-time data only has value if the right filters are applied. Good luck with that.

During times of transition, there is no right policy, but there is wrong policy. The macro strategy risks are almost too obvious to point out. Bet too heavily on the future and you will lose. Cling too tightly to the past and you will lose. That's great policy, but utterly useless, unless you're JC Penney a decade ago standing at the city on the edge of forever with the ability to step back in time to alter the present. The micro strategy is where the transition is won or lost. The successful strategy will be one of muddling through.

In 1959, Dr. Charles Lindblom published a paper entitled "The Science of Muddling Through". The point was that policy change was most effective when incremental and not wholesale: evolutionary, not revolutionary. Dr. Lindblom was mocked by the intelligentsia of the day, who subscribed to the grand strategy theory that was fashionable at the time: that all outcomes could be forecast as in a chess match, that all plays could be anticipated and their outcome maximized toward a grand plan. The theory sounded great, but it assumed a static future path, muted all feedback loops, and was willingly ignorant of statistically improbable but highly significant events. Experience teaches us that the world is not so much chess as it is Calvinball. The grand strategy proved intellectually bankrupt as exhibited through its applications ranging from companies such as ICI Chemicals to the United States military prosecution of the Vietnam conflict. The former resulted in monumental destruction of employee and shareholder value; the latter resulted in societal implosion.

Grand strategies will be all the rage in response to great challenge because they appear to have all of the answers. History teaches us that most, and likely all, will fail. In times of great transitions, Dr. Lindblom was right. Grand schemes and grand hypotheses will not win the day. Micro-level attentiveness, situational awareness, and adaptability will.

Tuesday, June 30, 2020

If there was ever a good time to be sure you have good governance, this is it

Since 2006, I've written multiple blog posts and a few articles, and self-published an e-book, on governing investments in strategic software. Software development is unlike most every other office-based occupation because it is the conversion of capital into intangible assets by way of human effort, at a level of effort that remains labor intensive to this day. Although other white-collar occupations such as back-office accounting can be labor intensive, their costs flow through to the income statement as SG&A rather than depreciation. Blue-collar occupations such as manufacturing labor can be capitalized (the labor that goes into building a car is a cost that is factored into finished goods inventory, which is a balance sheet phenomenon), but decades of capital investment in robotics have reduced the labor intensity of most manufacturing functions. True, many firms (particularly nascent tech firms) expense their software development labor costs; yet development is still a capital allocation function if they're spending investor capital or retained earnings rather than cash flow from operations to finance the development. Restated, software development is discretionary spend of balance sheet cash that could be directed to other investments or returned to investors.

Being an act of capital allocation, it has always struck me as odd that most corporate captive IT organizations and tech firms self-govern the transformation of capital into software. In practice, software development governance is clubby, even more clubby than most corporate boards where the CEO has hand-picked the directors. For purposes of ceremony, IT governance is heavy-handed in reporting, but light touch in execution, loaded as it typically is with vendor reps with sales revenue on the line, delivery managers with bonuses on the line and business sponsors with promotions on the line. This isn't an environment where hard questions are tolerated, let alone asked. Some years ago, I was working with a regulated healthcare company replatforming its customer-facing solutions at all locations worldwide. Its target state one month after launch was to have 5 locations running the new software in parallel with legacy, processing 0.15% of corporate transactions, with another 5 locations ready for go-live the following month. In the event, the software had to be shut down after two locations were live for a few days because the new platform was dispensing client instruction incompatible with and contradictory to the healthcare products being provided. Sixty days after go-live, no locations were operating on the new platform, there was no resolution for this potentially life-endangering flaw, and therefore no path to production for the initial 2, or 5, or 10 locations, let alone any of the thousands of locations beyond that. The PMO's self-reported program status to the governance board? Green.

"We're all ok here" has been the modus operandi for IT governance for the past two decades. A lot of this is a function of the fact that despite the dot-com bust and the 2008 financial crisis, the first two decades of the 2000s have been characterized by overall global economic growth. Mid-way through 2020, we are now in a period of economic uncertainty. COVID-19 has initiated socioeconomic change that will have lasting effect on consumer behaviors, business practices and public policy. Uncertainty remains because the nature and trajectory of the containment and combative policies to defeat the virus remain unknown. A lengthy period of containment and combat won't simply harden new business patterns such as work-from-home policies and consumer preference for take-out dining: it will unleash entirely new patterns as socioeconomic pressures build concomitant with human frustration at restrictions and seasonal change incompatible with policy relaxation.

The uncertainty factor matters because it creates new pressures for captive IT and tech firms alike to show diligence with every dollar spent. No matter the exposure to socioeconomic disruption your company now faces and balance sheet strength your company had entering into this crisis, the CEO was not and is not too keen on capital spend right now. While there are good cases to be made for investments in strategic software, particularly those that are reasonable preparations for how a society and economy function in the future, the tolerance for both wasteful allocation and misallocation is low.

Which brings us to the question of software delivery governance. If governing software delivery was hard during stable economic times, it is that much harder during unstable economic times. Weak governance is highly vulnerable to threats old and new that cannot be swept under the rug worn threadbare by sudden economic convulsions. There are three threats worth highlighting: overspend, vendor fraud, and employee misalignment.

The first threat is little to no tolerance for overspending. I've written previously that CFOs earn their stripes on Wall Street by creating the appearance of control. Control arises from operational predictability: predictable operations create stable cash flows that finance healthy dividends and buoy the credit rating. In good times, perhaps the chief accountant was able to reclassify a lot of maintenance work as capital improvement to bury that $2m IT project overrun a few years ago, and a late-year sales flourish buried that $4m overrun last year. In a sustained downturn, there's no accounting trickeration or sales windfall to come to the rescue. Not to mention, the CEO really wants to tell Wall Street that despite the economic upheaval, the company results were so good and the indicators so overwhelmingly positive, we're increasing the dividend. No CIO in their right mind will jeopardize the CEO's victory lap.

The second threat is an increase in fraud. Vendors have always exported overhead labor from their income statement to that of their customers. At a tactical level, vendor overhead employees are written into client contracts in loosely-defined roles that permit them to bill a few hours every billing cycle. Vendor direct labor rates are slightly inflated as a means of deflecting attention from the fact that vendor direct labor hours are significantly inflated. Surcharges such as "vendor administration services" - which amount to vendors charging their customers to make certain that timesheets are entered, invoices are generated and payments are collected - are added as service fees. And, of course, vendors cheerfully enter into one-way contracts to deliver CEO vanity projects of "re-imagination" and "market disruption" with no clawback clauses for results or outcomes. There is also the more pedestrian version where vendors enter into one-way contracts that create disincentive for improvement and (perversely) create incentive for perpetual failure: e.g., a vendor that charges per bug fix incident has no incentive to remediate the underlying cause, and actually has incentive to perpetuate the underlying cause.

These are all phenomena that happen in good times, the product of lax contract administration, asymmetric knowledge of software delivery that favors the vendor over the buyer, and the starry-eyed "true north" ambitions of buyers. The boss wants to do this, we're not a software firm, procurement was told to get the deal done, and the contract gives little room for pushback, so everybody goes along for the ride. In unstable times, the pressures for vendors to inflate contracts to meet revenue targets, stave off layoffs, and sign obligation-light contracts to create flexibly tappable cash flow intensify that much more. Vendors have many more problems to try to export to their customers.

The third threat to look out for is the mismatch between individual and organizational goals. I wrote about this in 2013 in a two-part series on conflict. The phenomenon materializes as contrived complexity: obfuscated domain knowledge, jealously protected relationships, and unnecessarily complicated code are all examples of acts of individual self-preservation of employment that undermine effective governance of a strategic software investment. The board member who everyone knows is beholden to the knowledge of the governed will be played. If there is nobody with the domain knowledge, relationships, and algorithmic know-how to call bullshit on the executors, then the bullshitters will rule the day every day.

Today, governance is increasingly being put to the test, specifically at a time when governance standards have become institutionally lax and have been so for a long period of time. Governance will tighten, as described by John Kenneth Galbraith in his book The Great Crash 1929:

In depression all this is reversed. Money is watched with a narrow, suspicious eye. The man who handles it is assumed to be dishonest until he proves himself otherwise. Audits are penetrating and meticulous. Commercial morality is enormously improved.

Galbraith's statement that "all this is reversed" is all well and good, but the fact is that regulators always lag the regulated, and governance is a regulatory phenomenon. By the time regulators catch up with the regulated, the damage is done to regulators' reputations for control, scrutiny and alignment. Rest assured that your successor will bask in the glory of being a more diligent investor, a more scrupulous auditor, and a more competent leader than you were.

If you prefer not to allow your career gravestone to be a platform for your successor to dance upon, what does it take to know that your governance practices are up to the task?

It's worth pointing out that you never will. Governance is something that is only evident by its absence. In addition to being intangible, there is no infallibility to governance: in practice, even the most prudent governance structures fail to detect problems, while the most lax governance structures suffer no damage from problems that do not metastasize. Governance is not a precise science. Governance is only obvious when it succeeds as a counterfactual (hindsight makes clear a crisis was intentionally avoided, such as passing on an investment in Bernie Madoff's fund in the summer of 2008) or when it fails (a fund piles all of its money into Madoff's fund in the summer of 2008).

That said, what should IT be doubling-down on to increase the effectiveness of its governance practices?

  1. Strengthen the cost audit function. Ask people everywhere from the accounting fraternity to the program management office to procurement to increase the scrutiny of every dollar that goes out the door for a strategic software investment, whether for badged employee payroll, vendor services, cloud services, or licensed product. The objective is to identify non-value-generative chargebacks. Either the person responsible for the charge can answer "what value did we get for this" or they cannot. The point isn't to spot administrative errors, but inexplicable costs and patterns of charges that point to budget leakage or legitimately high costs that point to accelerated budget drainage, and to do so while threat to budget can be contained.
  2. Strengthen the solution audit function. Engage a captive technical audit function or an external vendor with no connection whatsoever to a program of work to perform audits and peer reviews of tests, code, and team practices. Is the code written to a degree of simplicity that it is comprehensively testable, and easily maintainable by other developers? Can the piece-parts of the solution be tested in isolation and without an environment, or do they require expansive infrastructure and consistent data fixtures? Are there transitive dependencies that create false testing positives or negatives? Are integrations validated through contract tests before arriving in an environment? Is the team working in very narrow silos of domain knowledge or technical expertise? Are some people acting as bottlenecks? Have the number of hand-offs increased to complete requirements or fix defects? Have the number of defects increased, or has the time to resolution of them increased? Keeping a close eye on both the what and the how provides an early warning to delays in delivery or higher cost of maintenance than planned.
  3. Increase the threshold of acceptance of management status reports. In medicine, patient testing is frequently done to confirm or deny a threshold of "no evidence of disease." This involves taking a sample of cells and testing them to ascertain the percentage of diseased cells present in a statistically significant sample, and the trend of those counts across samples taken over time. A higher standard of acceptance is "evidence of no disease", but that's an impractical threshold as 100% testing of cells would be fatal to the patient. In strategic software investing, the lower threshold of "no evidence of failure" is a proxy of convenience to the board: if the program manager says the status is green and the summary indicators support that, the program manager has given no reason not to believe the status is green, but no reason to believe that it well and truly is. Program governance must raise the threshold on program management to provide "evidence of no failure". That means program management conclusively demonstrates that effort is not passing for results, tasks are not masquerading as requirements, and work is not being deferred in ways that conflate "developer complete" with "user acceptance".
  4. Be an activist investor. As a board member, get your own data, form your own contra hypotheses, and engage in direct, constructive interrogation at board meetings. As a board member, you can talk to users, shadow teams, analyze delivery data, look at code and run analyses over that code. Nothing else will provide more effective leadership in these times from a governance role than investor activism that respectfully challenges those on the line creating a strategic software asset.
  5. Have at least one independent director on the governing board. Break up traditional governance structures by having independent directors with no connection to solution sponsorship, vendor representatives, or delivery managers. With no political, financial or emotional investment in any one or any thing, independent directors are better positioned to provide critical insight into the state and trajectory of the investment, as well as clinical recommendations for potential change.
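
For a board member inclined to get their own data, the trend questions in the solution audit (are defects rising, is time to resolution rising) lend themselves to a few lines of analysis. What follows is a minimal sketch in Python, not a prescribed method: the record layout, the field names, the figures, and the rule that "worsening" means a rise in every successive period are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical delivery data, one record per reporting period.
# Field names and figures are illustrative, not from any real program.
periods = [
    {"period": "2020-09", "defects_opened": 14, "days_to_resolve": [2, 3, 1, 4]},
    {"period": "2020-10", "defects_opened": 19, "days_to_resolve": [3, 5, 4, 4]},
    {"period": "2020-11", "defects_opened": 27, "days_to_resolve": [6, 7, 5, 8]},
]


def worsening(values):
    """True when there are at least two values and each exceeds the one before."""
    if len(values) < 2:
        return False
    return all(later > earlier for earlier, later in zip(values, values[1:]))


def early_warnings(periods):
    """Return plain-language warnings for trends a board might want to question."""
    defect_counts = [p["defects_opened"] for p in periods]
    resolution_times = [mean(p["days_to_resolve"]) for p in periods]

    warnings = []
    if worsening(defect_counts):
        warnings.append("defect count rising every period")
    if worsening(resolution_times):
        warnings.append("time to resolution rising every period")
    return warnings


if __name__ == "__main__":
    for warning in early_warnings(periods):
        print(warning)
```

A warning here is not a verdict. It is a prompt for the direct, constructive interrogation described above: the person closest to the work either can explain the trend or cannot.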

Governance of strategic software investments does well to apply the standard of the US-Soviet arms reduction treaties of the 1980s: "trust, but verify." Creating a no-trust governance environment is counterproductive. Low-trust environments are extremely dysfunctional and chaotic, and at best offer an anti-pattern for governance. That clubby team of vendor reps and ambitious managers doesn't perform well when at each other's throats like the crew of the ISS Enterprise in Star Trek: TOS. But a board driven by an activist chair and flanked by an independent director with a penchant for healthy skepticism creates an environment where trust is earned and re-earned, rather than tacitly given or outright kicked to the curb. That sets the tone for the board, delivery leaders and delivery partners, a tone especially valuable during a time when control is at a premium and trust is in rather short supply.