Metrics and Goals

Apr 07, 2020

Good morning,

Best wishes to everyone observing holidays this week. Without question, family celebrations will deviate from their traditional format; hopefully the unusual situation will help create some positive memories.

‑Edward

The Other Side

Right now, many of us are experiencing some phase of government-mandated (or at least encouraged) lockdowns. Everybody sequestered in their homes is asking the same two questions:

WHEN will the restrictions end?
WHAT will happen when the restrictions end?

With optimism, many people are pointing to the fact that China, Singapore, and Taiwan have already “come out on the other side.”

But I spent some time last week talking to friends in Hong Kong and Singapore, and their descriptions of “relaxed social distancing” are actually MORE — not less — stringent than what I am currently encountering in Seattle.

Plus: China might actually be slipping backwards. A Washington Post article from March 31 reported that Wuhan documented ZERO new cases for almost a month, and yet, their citizens still face stringent conditions:

People are allowed out of their residential complexes only if they have a return-to-work pass issued by their employer, and only if the government-issued health code on their cellphone glows green — not orange or red — to show that they are healthy and cleared for travel.
In the malls that opened this week, people must stand five feet apart on escalators, and clothes that customers have tried on must be sprayed with disinfectant. Subway passengers must wear masks and sit two seats apart.

To be clear, these rules are actually looser than the government’s previous orders (which shut down the subways entirely, etc.).

Although we might envy China for being “over COVID,” how true is that perception? China’s insistence on prolonged social distancing raises questions about their government’s confidence in having controlled the outbreak.

Officially, China has recorded 81,000 positive cases of COVID-19 (fewer than the USA, Italy, Spain, and Germany) and approximately 3,300 COVID-related deaths (fewer than the USA, Italy, Spain, France, and the UK). China is currently reporting ~30 new cases per day, along with fewer than five deaths. Many experts have expressed skepticism about China’s official numbers.

If China really “beat COVID,” then why maintain the firm lockdown on Wuhan? One possible answer: metrics.

Some key context: in early March, district governments in Wuhan announced plans to reward communities — like rural villages and apartment complexes — with a financial incentive of 500,000 yuan ($72,000 USD) for reporting no new cases of infection.

In previous letters, I highlighted the importance of testing and tracking the coronavirus. By all accounts, China allocated significant resources toward testing and tracking. A possible problem, though, occurs when the metric BECOMES the goal.

If the government offers $72,000 to maintain a virus-free apartment building, there are two ways for the complex to receive that money:

Actually eradicate all presence of the virus.
Report that you eradicated the virus.

Without metrics, you can never determine if your actions are working. But once metrics become the goal, you contaminate the clarity of your evaluation process. With a $72,000 prize on the line, every village and facility has a clear incentive to fraudulently report a “virus-free” status.

Therein lies the problem.

We can find examples of the “do whatever it takes to hit the goal” problem for other government-created initiatives, like the one reported by a Bloomberg article:

The pressure to get China back to work after the coronavirus shutdown is resurrecting an old temptation: doctoring data so it shows senior officials what they want to see.
This phenomenon is playing out in Zhejiang province, an industrial hub on the east coast, in the form of electricity usage. At least three cities there have given local factories targets to hit for power consumption because they’re using the data to show a resurgence in production, according to people familiar with the matter. That’s prompted some businesses to run machinery even as their plants remain empty.

At first, China was tracking the right metrics — the number of new coronavirus cases revealed patterns in infection, and the consumption of electricity indicated activity in factories. But as soon as the government changed those metrics into goals, the participants — lured by incentives — shifted their behavior. We can no longer trust that the metrics can measure what they were designed to measure.

Not Just a China Problem

Of course, the problem of misused metrics extends far beyond China. You can see this phenomenon play out in any country, organization, or company. Whenever you turn a metric for assessment into a goal for rewards (or punishments), you shift the incentives for the participants — and they will find ways to game the system.

Economist Arnold Kling wrote an insightful blog post explaining how all “systems invite gaming.” Here are some highlights (I encourage reading the entire post):

If you run a sales force and give bonuses based on a formula, your sales people will game the system. If you reward volume, they will make concessions on price. If you reward sales revenue this year, they will invest less in building long-term relationships.
There are many stories and jokes about gaming under central planning. If the planners reward the number of nails manufactured, they get a lot of little nails. If they reward the weight of nails manufactured, they get one giant nail.

The idea that all incentive systems will be “gamed” is not new. You might be familiar with Goodhart’s Law (a simplification of an idea described by economist Charles Goodhart back in 1975): “When a measure becomes a target, it ceases to be a good measure.”

In the world of methodological research, Donald Campbell developed a similar theory.

Campbell’s Law states that “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”

Specifically, Campbell was referring to ways that tests can be (mis)used in the education system:

Achievement tests may well be valuable indicators of general school achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.

If tests are used to identify what students are (and aren’t) learning, then that’s great. But with the introduction of incentives — like school funding, teacher bonuses, student awards, etc. — people shift their focus from “learning” to “winning.” You can find ample criticism of the increasing “testification” of education from the National Education Association, the mainstream media, and scholarly publications. In the most damning illustration of Campbell’s Law, think about the details from Operation Varsity Blues — the charges brought against parents and administrators involved in widespread corruption of the college admission process.

Even without bribing school officials and forging application documents, the college admission process relies on a skewed perspective of assessment metrics. In 1983, US News and World Report published their first “America’s Best Colleges” report, topped by Stanford, Harvard, Yale, Princeton, and UC Berkeley. As one key metric, the report considered “selectivity” (the percentage of applicants who were accepted); in many ways, selectivity was viewed as a proxy for the overall quality of the school.

Over the years, the “America’s Best Colleges” report grew in importance, influencing where employers would focus their recruitment attention and where students would set their aspirational goals. Low-ranked schools struggled to attract talented students; without talented cohorts of new students, schools were rarely able to improve their reputational standing.

So what’s the solution for schools sitting at the bottom? Follow the strategy of the cities in China’s Zhejiang province: start running empty factories in order to drive up electrical usage. Or at least the educational equivalent.

Last fall, the Wall Street Journal investigated tactics used by colleges to manipulate the metric of selectivity:

Colleges rise in national rankings and reputation when they show data suggesting they are more selective. They can do that by rejecting more applicants, whether or not those candidates ever stood a chance. Some applicants, in effect, become unknowing pawns.
The [company that runs the SATs] is using the SAT as the foundation for another business: selling test-takers’ names and personal information to universities. That has helped schools inflate their applicant pools and rejection rates.

According to the WSJ, some schools knowingly deceive potential students:

Schools purchase the names and email addresses of SAT test-takers who have scores SO LOW that they would never be accepted.
Schools email those students, encouraging them to apply.
If any of these students do apply, the school rejects them.
By rejecting more students, schools inflate their selectivity rate, which in turn increases their score on the “America’s Best Colleges” report.
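
To see why this tactic works, here is a minimal sketch of the arithmetic. Every number below is invented for illustration (none comes from the WSJ reporting), and `acceptance_rate` is a hypothetical helper name. The only assumption is that selectivity is measured as admitted students divided by total applicants.

```python
def acceptance_rate(admitted, applicants):
    """Percentage of applicants who were admitted (lower looks 'more selective')."""
    return 100 * admitted / applicants

# Hypothetical school: all figures are made up for illustration.
admitted = 2_000
genuine_applicants = 10_000
recruited_then_rejected = 5_000  # low scorers encouraged to apply, then rejected

before = acceptance_rate(admitted, genuine_applicants)
after = acceptance_rate(admitted, genuine_applicants + recruited_then_rejected)

print(f"Acceptance rate before outreach: {before:.1f}%")  # 20.0%
print(f"Acceptance rate after outreach:  {after:.1f}%")   # 13.3%
```

Nothing about the admitted class changed in this example; the school looks more selective purely because the denominator grew.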

Bottom line: after the metric of selectivity evolved over time into the goal, we can no longer trust its ability to measure anything of value (let alone what it was designed to measure).

We Appreciate your Feedback

Suppose I wanted feedback on reader satisfaction with this newsletter.

I might ask something like “How likely are you to recommend Marketing BS to a friend?”, followed by a scale from 0–10.

Once I collect that data, I can calculate my Net Promoter Score; according to Satmetrix Systems, “This proven metric transformed the business world and now provides the core measurement for customer experience management programs the world round.” (Net Promoter Scores are calculated by taking the percentage of customers who rate your business with a 9 or 10 and then subtracting the percentage of customers who rate your business with a 0–6. A Net Promoter Score can range from a low of -100 to a high of 100.)
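
The calculation described above can be sketched in a few lines. This is only an illustration of the standard promoters-minus-detractors formula; the function name and the sample ratings are hypothetical, not real survey data from this newsletter.

```python
def net_promoter_score(ratings):
    """Return the Net Promoter Score (-100 to 100) for a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    # Ratings of 7 or 8 count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical responses, invented for illustration.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
print(net_promoter_score(sample))  # 5 promoters, 3 detractors, 10 responses -> 20.0
```

Notice that the score discards a lot of information: a reader who answers 6 counts exactly the same as one who answers 0, and moving a handful of respondents from 8 to 9 changes the result without any real change in loyalty.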

Once you determine your Net Promoter Score, you can compare it to just about any other major company on the planet. Over 47? You are more loved than Apple! What if you scored a ZERO? Don’t worry, that’s where Coke sits (remember scores can go as low as -100). Among top-50 companies, Starbucks holds the highest score (77), while Facebook occupies the bottom (-21).

Companies love to collect NPS scores and CEOs love to brag about them.

The intention behind NPS is rational: companies should find ways to improve customer enthusiasm and loyalty. In practice, though, NPS does not accurately reflect real-world outcomes, like sales, revenue, or growth. The Wall Street Journal identified the limitations with NPS:

Two 2007 studies analyzing thousands of customer interviews said NPS doesn’t correlate with revenue or predict customer behavior any better than other survey-based metrics. A 2015 study examining data on 80,000 customers from hundreds of brands said the score doesn’t explain the way people allocate their money.

Even the creator of NPS is disillusioned with its ubiquitous adoption by major companies:

...the metric has taken on a life of its own, so much so that the inventor, Fred Reichheld, said he is astonished companies are using NPS to determine bonuses and as a performance indicator. “That’s completely bogus,” Mr. Reichheld, who still consults for Bain, said in an interview. “I had no idea how people would mess with the score to bend it, to make it serve their selfish objectives.”

Steve Bennett, the former CEO of Intuit, played an important role in popularizing the use of NPS. Back in 2003, he “attributed the company’s growth in self-prepared taxes to its net promoter efforts.” Bennett later assumed the CEO role at Symantec, where he eliminated the collection of NPS scores. When asked why he changed his philosophy, Bennett replied, “When organizations manage to the metric, they find ways to game the system.”

Little Lies, Big Problems

When a company uses Net Promoter Scores to reward performance, it can inadvertently push employees toward counterproductive behavior. Wells Fargo discovered the dangers of turning cross-selling metrics into goals: its employees gamed the system, costing the company $3 billion. MIT’s Sloan Review provides some details of the Wells Fargo fiasco:

When the bank decided to actively track daily cross-sales numbers, employees rationally responded by working to maximize them. Throw in financial incentives, a permissive culture, and intense demands for performance, and they might even illegally open some unauthorized accounts, all in the name of advancing the ‘strategy’ of cross-selling.

Wells Fargo employees created MILLIONS of fake accounts for their customers, but not to defraud them; in fact, most of the fake accounts did not impact the customers in any way.

So why create the fake accounts? Because the bank rewarded employees with bonuses that grew depending on the NUMBER of new accounts. Employees quickly identified a shortcut: create new accounts — even fake ones — and receive a bigger reward. The employees were defrauding their employer. And the managers who knew about the scheme were defrauding their shareholders.

Wells Fargo deserves attention for the audaciousness of their employees’ actions, but many organizations are plagued by this type of conduct. In most cases, the deception is not as blatant as running empty factories to provide a false sense of industrial activity, but the impacts can be just as wasteful.

Mistaking Metrics for Reality

Back in early March, the Grand Princess cruise ship anchored in limbo off the coast of San Francisco. Twenty-one of the passengers and crew had tested positive for COVID-19. Experts recommended bringing the ship to shore, so that everyone on board could be tested and quarantined. President Trump disagreed, saying:

I like the numbers being where they are. I don't need to have the numbers double because of one ship that wasn't our fault.

Let’s be clear: the coronavirus outbreak was already starting to spread across the US, and that fact would not be altered whether the ship’s passengers remained on board or disembarked into quarantine.

But for President Trump, that fact was not his concern. The instant the COVID-positive passengers set foot on American soil, they would be immediately counted as American cases — hurting his “metrics.” Once again: when the metrics become the goal, they cease to measure what they were designed for AND they cause (otherwise) irrational decision making.

In the case of the Grand Princess, Trump did NOT get his way; the passengers were ultimately evacuated from the ship. But the presidential obsession with metrics did not end there, as noted by the San Francisco Chronicle:

Despite assurances from Vice President Mike Pence that all Grand Princess cruise ship passengers quarantined at Travis Air Force Base would be tested for COVID-19, The Chronicle has learned that two-thirds of them have declined, often at the encouragement of federal health officials. As of Wednesday, 568 of the 858 passengers screened while confined turned down the test, a federal official familiar with the Travis quarantine and testing told The Chronicle.

Think about the negative consequences of this chain of events:

Federal officials encouraged passengers to decline testing for COVID-19.
The disembarking passengers were at a high risk of infection (because of proximity to so many positive cases), yet they were allowed to freely travel.
Almost certainly, some of those passengers were infected but asymptomatic; they likely spread the virus to other people without knowing it.

By not testing passengers, the government artificially suppressed the “number of COVID-19 infections in America.” Once the metric became the goal, federal officials engaged in actions that were not merely unhelpful, but actually compounded the problems.

Hacking Life

The stories about factories in Zhejiang, Wells Fargo employees, and the Grand Princess seem fantastical. How could anyone make such irrational and/or immoral decisions?

And yet, the term “growth hacker” is discussed with reverence in the marketing world. We gravitate toward the idea of using tricks to exceed our growth targets. We shouldn’t be surprised by our collective interest in “gaming the system” — our education system cemented this idea in our brains.

Paul Graham, cofounder of Y Combinator, wrote an incisive essay about a central dysfunction in our society: grades.

Getting a good grade in a class on x is so different from learning a lot about x that you have to choose one or the other, and you can't blame students if they choose grades. Everyone judges them by their grades — graduate programs, employers, scholarships, even their own parents.
Anyone who cares about getting good grades has to play this game, or they'll be surpassed by those who do.

In addition to grades, this “education hacking” continues with the college admissions process. For many high school students (and their parents), getting into the right college is the ultimate goal. As you might recall from personal experience, the best strategy for gaining acceptance is not “be smart” or “learn a lot”; instead, students need to understand how the testing process works, and then “hack the test.” Paul Graham again:

In practice, the freakishly specific nature of the stuff ambitious kids have to do in high school is directly proportionate to the hackability of college admissions. The classes you don't care about that are mostly memorization, the random “extracurricular activities” you have to participate in to show you're “well-rounded,” the standardized tests as artificial as chess, the “essay” you have to write that's presumably meant to hit some very specific target, but you're not told what.
As well as being bad in what it does to kids, this test is also bad in the sense of being very hackable. So hackable that whole industries have grown up to hack it. This is the explicit purpose of test-prep companies and admissions counsellors, but it's also a significant part of the function of private schools.

Almost every traditionally successful person has spent a significant part of their life identifying “tricks” and “hacks” — instead of pursuing genuine intellectual curiosity.

Final Thoughts

In the Marketing BS newsletters, I regularly revisit a few key themes. In particular, I highlight my belief that misguided incentives are responsible for many of the problems in the marketing industry.

CMOs are incentivized to take any actions necessary to build their budgets and teams — NOT the more critical goal of improving their companies’ marketing strategies. Likewise, agencies are motivated to excel at sales and customer service — NOT the delivery of actual marketing materials and campaigns.

Breaking away from these misplaced incentives takes a strong sense of ethics and a determination to do the right thing — a focus on learning the material instead of hacking the test.

As always, I hope to steer people away from chasing shiny objects and gaming the system. I truly believe that “doing well” rather than “doing well on the test” will increase professional achievements in the long run (not to mention improving personal happiness).

In the meantime, do not trust the COVID infection and death numbers coming out of China (or the US, for that matter). Whenever we turn a metric into a goal, we distort the metric’s ability to measure anything of value.

Keep it simple and stay safe,

Edward

If you are enjoying Marketing BS, the best thing you can do is share it with friends and recommend they subscribe. The next best thing you can do is comment on this piece or click the heart (which makes this article more visible for other readers).

Edward Nevraumont is a Senior Advisor with Warburg Pincus. The former CMO of General Assembly and A Place for Mom, Edward previously worked at Expedia and McKinsey & Company. For more information, including details about his latest book, check out Marketing BS.

Nick Roberts

Apr 7, 2020

Another excellent piece, Edward. It is so easy to become blinded by the ever-important growth hacking mentality that anyone involved with marketing at a growth-stage company typically has to endure. Finding the time and space to work on improving the things that matter the most is incredibly important; it's even more important to vocalise this to the rest of the company. What can you do, however, when OKRs are set at the top level for growth with targets clearly reliant on the CMO to pursue? (Also the college admissions thing was pretty wild to read about!)
