Alex Moog ·

May 6, 2022

Measuring success: The road to hell is paved with good intentions


Snakes and Incentives

In the 1800s, when Britain occupied India, the British officers in Delhi wanted to reduce the number of venomous cobras slithering through the streets. Wanting a simple solution, they devised a standard economic incentive for the Indian public: a cash payout for every cobra head brought forth. The logic behind this scheme is clear: if you want to encourage a behaviour, in this case killing cobras, make that behaviour more valuable and market forces will solve the problem.

In some ways this scheme was very successful; it produced a huge number of cobra heads, indicating that people in Delhi were in fact killing a huge number of cobras. But the British officers’ reliance on dead cobras as a measure of success meant that there was an obvious way to game the system. Anyone who could get their hands on lots of cobras could sell them to the British for a small fortune.

Finding an efficient way to produce and kill cobras was now a viable career opportunity. As a result, many Indian citizens became cobra farmers, raising and slaughtering cobras to sell to the British. When the officers discovered that people were farming the snakes they promptly ended the programme, eliminating the incentive for the farmers to continue keeping the cobras in captivity. The result was a mass release of cobras into the streets of Delhi, creating a problem much larger than before.

The British officers’ programme is an example of a perverse incentive, an incentive that results in an unintended or undesirable outcome. Perverse incentives demonstrate how choosing the wrong metric with which to measure a goal can cause a misalignment between intention and consequence. In the case of the British cobras-for-cash programme the outcome was a clear result of the incentive, but this is not always the case. Sometimes focusing on a single, narrow outcome can obscure the impact of an incentive or intervention on the system as a whole.

Plastic Bags and the Behavioural Ecosystem

Behaviour is always part of a larger system. The economic, social, and cultural forces that shape a behavioural ecosystem are complex and interrelated. A change in one behaviour can, and often does, cause related changes in different behaviours somewhere else in the system, a process known as spillover. Take, for example, one of the most celebrated recent examples of behavioural science implemented in public policy, the plastic bag tax.

The prevalence of single-use plastic products like straws, cups, and plastic shopping bags is a persistent environmental problem. Reducing the use of single-use plastics is therefore an important and worthwhile goal. To help curb the use of single-use plastic bags, many municipalities, cities, and countries have implemented programmes to incentivise shoppers to bring their own reusable shopping bags to the grocery store.

The initial idea for encouraging people to reduce their reliance on plastic bags was to give them a small discount on their grocery purchase if they didn’t take one. This turned out not to be particularly effective, as people tend not to respond enthusiastically to small monetary gains. Drawing instead on the principle of loss aversion, whereby people are more motivated to avoid losses than to earn equivalent gains, governments implemented a small fee for each plastic bag used at stores. One of the best-known implementations of this programme reduced the uptake of single-use plastic bags by 42% (Homonoff, 2013).

For years the plastic bag tax has been held up as an example of the success of behavioural science, and to some extent this is true. Within the context of single-use plastic bag usage, it successfully improved the metric by which it was measured. However, it turns out that many people also use these bags as liners in small trash cans in their homes, a behaviour you might recognise in yourself. By motivating people to reduce the number of plastic bags they bring home from the grocery store, the tax also reduced the use of these ‘single-use’ bags for this secondary purpose. As a consequence, people now needed to find another way to line their small trash cans. A recent study has found that in cities where there are taxes or bans on plastic grocery bags there is a related increase in the sale of similarly-sized plastic trash bags (Huang and Woodward, 2022).

While the bag tax showed the power of behavioural science to influence behaviour, measuring its success through a narrow focus on one kind of plastic obscured the fact that its effect on plastic consumption in general was smaller than it appeared. An understanding of the larger system at play is an important part of ensuring that the behaviour you aim to change is really going to have the intended impact on the outcome that matters.
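To make the gap between the narrow metric and the outcome that matters concrete, here is a minimal sketch in Python. The function name, the bag counts, and the per-bag weights are all invented for illustration; none of them come from the studies cited in this article.

```python
def net_plastic_change(grocery_bags_avoided, extra_trash_bags_bought,
                       grocery_bag_grams=5.0, trash_bag_grams=8.0):
    """Return (narrow, net) plastic saved, in grams.

    The default weights are illustrative assumptions, not measured values.
    """
    # What the narrow metric sees: grocery bags no longer handed out.
    narrow = grocery_bags_avoided * grocery_bag_grams
    # What it misses: trash bags bought to replace the reused grocery bags.
    spillover = extra_trash_bags_bought * trash_bag_grams
    return narrow, narrow - spillover


# Hypothetical household: 100 fewer grocery bags, 30 extra trash bags.
narrow, net = net_plastic_change(grocery_bags_avoided=100,
                                 extra_trash_bags_bought=30)
print(f"Narrow metric: {narrow:.0f} g saved; net of spillover: {net:.0f} g saved")
```

Under these made-up numbers the intervention still reduces plastic overall, but by noticeably less than the narrow metric suggests, which is the pattern spillover produces.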

Presumably, the designers of the cobra programme and the plastic bag tax had positive motivations, wanting safer streets and less environmental waste. However, as a consumer it is important to recognise when a system, and what it chooses to measure, are designed without your best interests at heart.

Daily Streaks and Satisfaction

Sometimes the misalignment resulting from an improper measurement is not between intention and consequence, but between the motivation of the designer and the desires or welfare of the recipient. Often, businesses measure the success of their products through metrics that do not align with what is best for their customers, clients, or users.

Take, for example, Duolingo – the language learning app. Learning a language is difficult work and requires consistent practice, a fact of which Duolingo is well aware, and which they push repeatedly to their users. The app encourages its language learners to complete at least one lesson every single day, and their primary motivational tool for doing so is the daily streak. Duolingo’s daily streaks keep track of the number of days in a row that a user completes a lesson. This tool is highly motivating, shown by the fact that some users have maintained streaks hundreds, or even thousands, of days long.

Seemingly this is a brilliant success for both Duolingo and its users – Duolingo’s metric of daily active users remains high, while their users practise languages consistently. The problem is that maintaining a daily streak on Duolingo doesn’t actually require the user to improve their understanding of a language, only to engage with the app. In order to tick your daily streak one notch higher you can repeat the same basic lesson every day, never progressing further in competency.

Therein lies Duolingo’s misalignment. The company is incentivising and encouraging daily engagement at the cost of helping their users learn a language. If a daily streak can keep someone coming back day after day, it doesn’t matter to Duolingo whether that streak leads to their users’ self-improvement.

Duolingo’s reliance on a feature detached from actual language learning can leave users with a sense of pointlessness and dissatisfaction, and can even lead them to drop the app completely once their streak is broken (Mogavi et al., 2022). This isn’t to say it is impossible to learn a language using Duolingo, only that the motivational tools that they use can encourage behaviours that do not align with their users’ best interests.
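The difference between a streak and actual progress can be sketched in a few lines of Python. This is a toy model, not Duolingo’s implementation: the log format, the function names, and the use of distinct lessons as a stand-in for progress are all assumptions made for illustration.

```python
from datetime import date, timedelta


def current_streak(lesson_log):
    """Count consecutive days with a lesson, ending at the latest logged day."""
    days = {day for day, _ in lesson_log}
    if not days:
        return 0
    day = max(days)
    streak = 0
    while day in days:
        streak += 1
        day -= timedelta(days=1)
    return streak


def distinct_lessons(lesson_log):
    """Crude progress proxy (an assumption): how many different lessons were done."""
    return len({lesson for _, lesson in lesson_log})


start = date(2022, 5, 1)
# Hypothetical user who repeats the same basic lesson every day for 30 days.
repeater = [(start + timedelta(days=i), "basics-1") for i in range(30)]
# Hypothetical user who does a new lesson every other day.
progressor = [(start + timedelta(days=i), f"lesson-{i}") for i in range(0, 30, 2)]

for name, log in [("repeater", repeater), ("progressor", progressor)]:
    print(name, "streak:", current_streak(log), "distinct lessons:", distinct_lessons(log))
```

Measured by streak alone, the user who repeats one lesson looks far more successful than the user who is working through new material, even though only the second is moving forward.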

Measuring Success

The simple act of measuring a behaviour will change how and how often it is performed. If you incentivise a behaviour, such as killing cobras, people will do what they can to maximise that outcome. Encouraging or discouraging a behaviour, like using reusable bags or not using plastic ones, will cause changes to related behaviours. And designing a system that measures behaviours that do not align with the goals of its users will ultimately cause dissatisfaction and disengagement.

As the cases of cobras, plastic bags, and daily streaks show, when designing a system, product, or service it is important to carefully consider what you measure and how you measure it, and to be conscious of how the products and services you use measure your behaviour.

So how can you avoid measuring the wrong behaviours?

  1. Maximisation

Avoid measuring behaviours that can be maximised in unhealthy or unhelpful ways. This can be difficult to do from the outset, so checking in with your users to see how they have changed their behaviour in response to your incentives and interventions is important.

  2. Spillover

Keep a wider focus on the behavioural ecosystem as a whole to make sure that the behaviour you measure does not have any negative spillover effects. Before implementing an intervention, investigate how the product, service, or system you intend to change interacts with other products, services, and systems to gain a better understanding of how the intervention will affect related behaviours.

  3. Alignment

Align your measurements with your users’ goals to encourage long-term satisfaction. Ask your users or customers why they are using your product, and what they hope to achieve, and help them to measure those outcomes and behaviours.

References & Further readings

Homonoff, T. A. (2013). Can Small Incentives Have Large Effects? The Impact of Taxes versus Bonuses on Disposable Bag Use. Princeton University.

Huang, Y. K., & Woodward, R. T. (2022). Spillover Effects of Grocery Bag Legislation: Evidence of Bag Bans and Bag Fees. Environmental and Resource Economics, 1-31.

Mogavi, R. H., Guo, B., Zhang, Y., Haq, E. U., Hui, P., & Ma, X. (2022). When Gamification Spoils Your Learning: A Qualitative Case Study of Gamification Misuse in a Language-Learning App.