# Georgetown University Initiative on Innovation, Development, and Evaluation — Zusha! Road Safety Campaign

Georgetown University Initiative on Innovation, Development, and Evaluation (gui2de)'s Zusha! Road Safety Campaign does not meet all of our criteria to be a GiveWell top charity but is a standout charity. Although we don't recommend standout charities as strongly as we do our top charities, they stand out from the vast majority of organizations we have considered.

Published: June 2018

We spoke with James Habyarimana, Béatrice Leydier, Whitney Tate, and William Jack for general updates on Zusha! on December 4, 2018.

## Summary

What do they do? Road accidents are a leading cause of death and injury globally, and the problem is particularly pronounced in low- and middle-income countries. The Georgetown University Initiative on Innovation, Development, and Evaluation (gui2de)'s Zusha! Road Safety Campaign (from here on, "Zusha!") is a campaign intended to reduce road accidents. Stickers are distributed to public service vehicles encouraging passengers to speak up and urge drivers to drive more safely. (More)

Does it work? We believe there is reasonably strong evidence from two randomized controlled trials (RCTs) conducted in Kenya that Zusha! decreases road accidents. However, we believe that these results have limitations which lead our best guess of the program's effectiveness to be lower than the estimated effects from the RCTs. Evidence from two additional RCTs, in Tanzania and Uganda, is forthcoming. In its scaled up program in Kenya, Zusha! has recently implemented strong monitoring and has found estimated coverage rates of about 20-50% among targeted vehicles. Due to differences in measurement approach, these compliance rates are difficult to compare to the rates measured in the RCTs, but we would guess that they are roughly similar. (More)

What do you get for your dollar? Our analysis suggests that Zusha!'s cost-effectiveness is in a similar range to that of unconditional cash transfer programs. This analysis depends on a number of difficult judgment calls about moral weights and how to adjust for questions about internal and external validity. Our best guess estimates involve large discounts relative to taking the results of Zusha!'s RCTs at face value. (More)

Is there room for more funding? We believe that Zusha! could use at least an additional $800,000 per year to support implementation of its program in Kenya. We are not aware of any other potential funders for Zusha!, and our best guess is that Zusha! will need to scale down its field staff support for the program and substantially reduce its monitoring if it does not receive additional funds. Zusha! has told us that it would have varying levels of involvement in the program depending on how much additional funding it receives, but that its highest priority use of small amounts of additional funding would be to print and deliver stickers to its partners, which Zusha! would try to encourage to take a larger role in direct implementation and ownership of the program absent external funds. (More)

## What do they do?

### What is the problem?

Road accidents are a major cause of preventable death and disability globally. The World Health Organization (WHO) estimates that more than 1.25 million people are killed and an additional 20 to 50 million are injured by road accidents each year.1 If current trends continue, the WHO projects that road accidents will become the 7th leading cause of death globally by 2030.2 We have not vetted these estimates.

The problem is particularly pronounced in low- and middle-income countries, which the WHO estimates account for about 90% of the world's road traffic fatalities despite having only 54% of the world's vehicles.3 The road traffic death rate is higher in Africa than in any other region.4

### What is the Zusha! program?

Zusha! is a road safety campaign intended to reduce road accident deaths and injuries that targets drivers of public service vehicles. The campaign distributes stickers that are placed inside public service vehicles to encourage passengers to speak up and urge drivers to drive more safely.5 ("Zusha" means "protest" in Swahili.)6 For a description of how Zusha! distributes stickers, see the following footnote.7 The campaign also includes a lottery through which public service vehicle (PSV) drivers can win a cash prize if they correctly display the stickers, to encourage the drivers to place and retain the stickers in their vehicles.8

In addition to facilitating distribution of stickers and supporting a lottery, Zusha!:

• Monitors its program to determine whether stickers are successfully displayed in vehicles over time (more).
• Works with governments and insurance companies with the goal of passing ownership of the program to them in the long term.9

On this page, we primarily focus on Zusha!'s program in Kenya. Zusha! is in the process of conducting RCTs in Tanzania and Uganda and may be able to share results from those studies in 2018. (A note on terminology: Zusha! is the name of the road safety campaign in Kenya. gui2de, the organization which implements the campaign, is also running road safety campaigns in Tanzania and Uganda under other names that mean the equivalent of "Zusha!" in the local language. We use the name Zusha! throughout this report for simplicity.)

### Spending

We have not yet seen comprehensive past spending information from Zusha!’s program. However, we have seen a forward-looking budget for Zusha!’s program in Kenya that appears to include all costs. This budget implies a total cost of about $18 per targeted vehicle per year.

#### Past spending

We have not yet seen detailed past spending information from Zusha!’s program that includes all actors’ costs. Zusha! shared with us its high-level accounting of how it spent its $900,000 GiveWell Incubation Grant (see footnote).10 However, we have not used this information to estimate the cost per vehicle of Zusha!’s program in Kenya because it does not appear to include all relevant costs.11 Therefore, to estimate the cost per vehicle of Zusha!’s program in Kenya, we instead use the forward-looking budget information, discussed below.

#### Budget for an additional year of Zusha!’s program in Kenya

Zusha! shared with us a budget for the cost of operating its program for an additional year in Kenya, which appears to include all major costs of the program: sticker distribution, lottery, monitoring, and staff support (see footnote).12 The budget provides different cost scenarios that vary in terms of the overall staffing for the project, the frequency of sticker distribution, support for a lottery, and support for monitoring and additional staff to distribute stickers.13 In the analysis below, we focus on the scenario that includes two sticker distributions per vehicle per year, full lottery support, and full monitoring support, since this is the version of the program that we believe would be necessary to achieve the full effects described in our cost-effectiveness analysis.14

In summary, the budget implies a cost per vehicle per year of about $18.41. The major costs are:

| Category | Cost | Cost per vehicle | % of total cost |
| --- | --- | --- | --- |
| Staff costs | $158,211 | $3.16 | 17% |
| Sticker design, printing, and shipment | $117,200 | $2.34 | 13% |
| Lottery | $312,000 | $6.24 | 34% |
| Zusha! staff at NTSA centers and bus park monitoring | $174,000 | $3.48 | 19% |
| Staff travel, office rent, and indirect costs | $75,335 | $1.51 | 8% |
| Additional indirect costs | $83,675 | $1.67 | 9% |
| Total | $920,420 | $18.41 | 100% |
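
The per-vehicle and percentage columns can be reproduced from the cost column with a short calculation. The sketch below is illustrative only: the number of targeted vehicles is not stated in this section, so the figure of 50,000 is an assumption inferred from dividing the reported total by the reported cost per vehicle. The line items as listed sum to $920,421, one dollar more than the reported total, presumably because of rounding in the underlying budget.

```python
# Budget lines from the table above (USD per year).
budget = {
    "Staff costs": 158_211,
    "Sticker design, printing, and shipment": 117_200,
    "Lottery": 312_000,
    "Zusha! staff at NTSA centers and bus park monitoring": 174_000,
    "Staff travel, office rent, and indirect costs": 75_335,
    "Additional indirect costs": 83_675,
}

# ASSUMPTION: the number of targeted vehicles is not given in this section.
# 50,000 is inferred from the reported total ($920,420) divided by the
# reported cost per vehicle ($18.41).
TARGETED_VEHICLES = 50_000

total = sum(budget.values())


def per_vehicle(cost):
    """Annual cost per targeted vehicle, in USD."""
    return cost / TARGETED_VEHICLES


def share_of_total(cost):
    """Fraction of the summed budget accounted for by one line."""
    return cost / total


for category, cost in budget.items():
    print(f"{category}: ${per_vehicle(cost):.2f} per vehicle, "
          f"{share_of_total(cost):.0%} of total")
print(f"Total: ${total:,} -> ${per_vehicle(total):.2f} per vehicle")
```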

## Does it work?

We believe there is reasonably strong evidence from two randomized controlled trials (RCTs) conducted in Kenya that Zusha! decreases road accidents. However, we believe that there are limitations to these results which lead our best guess of the program's effectiveness to be lower than the estimated effects from the RCTs. Evidence from two additional RCTs, in Tanzania and Uganda, is forthcoming. In its scaled up program, Zusha! has recently implemented strong monitoring and has found estimated coverage rates of about 20-50% among targeted vehicles. Due to differences in measurement approach, these compliance rates are difficult to compare to the rates measured in the RCTs, but we would guess that they are roughly similar.

### Does Zusha!'s program reduce road accidents?

We believe there is reasonably strong evidence from two randomized controlled trials (RCTs) conducted in Kenya that Zusha! decreases road accidents. These RCTs estimated that the intervention reduces road accidents by 25 to 50 percent. However, we believe that there are limitations to these results which lead our best guess of the effectiveness of the program to be lower than these estimated effects.

• The first RCT is described in Habyarimana & Jack 2011. A bundle of five stickers, two with images and text, and three text-only,15 was given to each of 1,155 treatment group vehicles out of a sample of 2,276.16 The vehicles were all long-distance minibuses.17 Intent-to-treat estimates suggest that the stickers decreased accidents by 50 percent, an effect that is statistically significant at the 1% level.18
• The second RCT is described in Habyarimana & Jack 2015. This RCT is considerably larger than the first, with more than 12,000 vehicles recruited, of which more than 8,000 received stickers. Both long and short-distance minibuses were recruited.19 Intent-to-treat estimates suggest that the stickers decreased accidents by 25 percent on average, an effect that is statistically significant at the 5% level.20
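
The 25 percent figure for the second RCT follows from two numbers quoted from Habyarimana & Jack 2015 in the footnote to that bullet: an estimated coefficient of -0.017 and a counterfactual annualized claims rate of 6.86%. A minimal check of that arithmetic (the variable names are ours, not the paper's):

```python
# Figures quoted from Habyarimana & Jack 2015, pg. 4666.
counterfactual_rate = 0.0686  # expected annualized claims rate without treatment
coefficient = -0.017          # estimated change in the claims rate due to treatment

# Relative reduction = absolute reduction divided by the counterfactual rate.
relative_reduction = -coefficient / counterfactual_rate
print(f"Relative reduction in claims: {relative_reduction:.1%}")
```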

We are aware of two ongoing RCTs which are evaluating the extension of Zusha! to Tanzania and Uganda.21 We are particularly interested in the results of the Uganda RCT, since we expect it to have at least as much statistical power as Habyarimana & Jack 2015.22 In Tanzania, enumerators were not given direct access to police road accident records; the accident rate in the Tanzanian study is comparatively low, and it seems possible to us that station officers might have underreported minor accidents.23 Unfortunately, this means the Tanzania RCT has low statistical power. Our statistical power calculations suggest that the Uganda RCT has approximately three times the statistical power of the Tanzania RCT.24

A new data collection effort conducted in Tanzania in partnership with the police since the end of 2017 might help mitigate this statistical power issue, but we are still waiting on the results.

### What are the limitations of the evidence on Zusha!'s effectiveness?

Limitations of the available evidence on Zusha!'s effectiveness include:

• No evidence on long-term effects of the intervention. Zusha! estimates that stickers have now reached 51,276 unique vehicles in Kenya, and it is estimated that there are 50,000 to 60,000 public service vehicles in the country as a whole.25 We have not vetted these estimates, but it seems that many vehicles and passengers have already been exposed to the intervention in the last few years. Additional funding in Kenya to maintain the program (for example, to replace worn stickers) might be less effective than past distributions: passengers may get used to the stickers and stop complaining, and drivers may stop putting the stickers up, making the stickers less effective in the long term.26 We do not expect to see high-quality evidence on the effects of this intervention after multiple years of exposure. (Note: See Zusha!'s response on this point by clicking the "Charity response" tab at the top of this page or by seeing the following footnote.)27
• Initial imbalance in the accident rate in Habyarimana & Jack 2015. In the second RCT, the treatment group vehicles were found to have a statistically significantly higher accident rate (and were more likely to have had any previous insurance claim) before the treatment was implemented.28 It seems plausible to us that there are no systematic differences in characteristics between the treatment and control vehicles, but, by chance, the treatment vehicles happened to have more accidents in the time period before the treatment began, when the balance tests were taken. If the treatment and control vehicles are not systematically different, we can expect this imbalance to disappear over time; in other words, the accident rate in the treatment group will revert to the mean. This is problematic, because it means that the "treatment effect" estimated in Habyarimana & Jack 2015 might be confounded by the mean reversion of the accident rate in the treatment group.29 (Note: See Zusha!'s response on this point by clicking the "Charity response" tab at the top of this page or by seeing the following footnote.)30
• Spillovers. Passengers and drivers in the RCTs were not restricted to travel only in treatment or control vehicles, which creates possible spillovers31 (i.e., the treatment may have also reduced accidents to some extent in the control group, leading to an underestimate of the treatment effect). We attempt to adjust for this factor in our cost-effectiveness analysis.32
• Inconclusive evidence on mechanisms. We would be more convinced of the effectiveness of Zusha! if there were more evidence to show how the stickers have an effect. In Habyarimana & Jack 2015, the authors try to test mechanisms, but the evidence is mixed. There is no significant difference in reports of reckless driving or passenger complaints between the treatment and control vehicles.33 There is also no significant difference in maximum moving speeds.34 However, the average moving speed was slower for treatment vehicles than control vehicles, by about 1 km/h on average.35 Our intuition is that this difference is unlikely to be large enough to explain the observed treatment effect, though we have not conducted formal analysis. We also view the magnitude of the treatment effect as surprising since compliance with the intervention (as measured by lottery winning vehicles using all their stickers) was about 69% in Habyarimana & Jack 2011,36 and about 70-90% after one month and about 20% after six months in Habyarimana & Jack 2015.37 (Note: See Zusha!'s response on this point by clicking the "Charity response" tab at the top of this page or by seeing the following footnote.)38
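
The mean-reversion concern raised in the second bullet above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the paper's analysis: it uses a simplified difference-in-differences comparison rather than the authors' regression, and the group size, accident rate, and imbalance threshold are all hypothetical values chosen for illustration. Both groups share the same true accident rate in both periods, so the true treatment effect is zero. The simulation keeps only trials in which the treatment group happened to start with a higher observed rate and averages the resulting estimates.

```python
import random


def simulate_did(n_vehicles=300, rate=0.07, reps=1500, min_gap=0.02, seed=0):
    """Mean difference-in-differences estimate, with zero true effect, among
    simulated trials whose treatment group started with a higher accident rate.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    def observed_rate():
        # Share of n_vehicles that have an accident in one period.
        return sum(rng.random() < rate for _ in range(n_vehicles)) / n_vehicles

    estimates = []
    for _ in range(reps):
        treat_pre, control_pre = observed_rate(), observed_rate()
        treat_post, control_post = observed_rate(), observed_rate()
        # Keep only trials with a chance baseline imbalance against treatment.
        if treat_pre - control_pre >= min_gap:
            did = (treat_post - treat_pre) - (control_post - control_pre)
            estimates.append(did)
    return sum(estimates) / len(estimates), len(estimates)


mean_did, n_selected = simulate_did()
print(f"{n_selected} imbalanced trials, mean DiD estimate = {mean_did:.4f}")
```

Because the post-period rates revert toward the common true rate, the average estimate among the selected trials is negative even though the simulated treatment does nothing. This is the sense in which a chance baseline imbalance could inflate an estimated reduction in accidents.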

### How is the program monitored?

Zusha! has shared with us its monitoring methods and results to assess whether stickers are being distributed and whether vehicles continue to display stickers over time. Zusha!'s recent monitoring methods appear to be strong overall. We have not yet vetted underlying monitoring documentation. Our best guess is that Zusha! has achieved ongoing coverage of roughly 20-50% among its targeted vehicles. It is difficult to compare these estimates with the coverage rates achieved in the RCTs due to differing estimation methods, but we would guess that this represents similar coverage to what was achieved in the RCTs.

#### Types and representativeness of monitoring

Below, we focus on Zusha!’s monitoring of the coverage rate of stickers in vehicles in Kenya, since we see this as the most important mechanism for program success. Zusha! also has other monitoring processes, e.g. for sticker distribution and ensuring that lottery winners receive payments, that we do not cover here but that are described in Zusha! report for GiveWell, Kenya, September 2017.

In brief, Zusha! has conducted 3 major kinds of monitoring to assess the coverage rate of stickers:

1. Lottery compliance: Zusha! and its partners follow up with vehicles that have won the lottery associated with its program (described above) to confirm that the stickers are being used; the vehicle must have stickers displayed correctly in order for the driver (and other actors in the sticker distribution chain) to receive lottery winnings. Zusha! and its partners told us that they conducted this monitoring for three 2-3 month periods since 2015, inspecting 263 vehicles and finding a compliance rate of about 80-90% among inspected vehicles.39 However, we and Zusha! suspect that this is an overestimate of true compliance; Zusha! has some anecdotal evidence that some agents and/or drivers may keep stickers in an envelope until they learn that they have won the lottery and then display the stickers for the inspection.40
2. National Transport and Safety Authority (NTSA) inspections: Zusha! partners with the NTSA, which aims to inspect all PSVs in Kenya at least once per year, to distribute and assess coverage of stickers. Prior to mid-2017, Zusha! found it challenging to collect high-quality data from NTSA centers.41 To address this, in mid-2017 Zusha! hired its own staff members to collect data at NTSA centers regarding sticker compliance and sticker distribution.42 Zusha! shared with us a few months of data from inspections of about 10,000 vehicles which found that, prior to receiving any new stickers, about 30% of vehicles had at least one sticker and about 12% of vehicles had all stickers.43 However, we believe this may underestimate compliance because, in a small survey, Zusha! found that, prior to the sticker check, about 20% of vehicles were cleaned and had stickers removed as part of the cleaning process.44 Zusha! told us that data quality checks on NTSA inspections include a) checking photos of stickers taken by enumerators to ensure stickers are displayed properly, and b) asking enumerators to record the audio of surveys so that other Zusha! staff can randomly backcheck 10% of audio recordings to ensure protocols are being followed.45 Zusha! shared the error rate of some of these checks and noted that it discovered some non-compliance using these methods.46 We have not conducted any auditing of this monitoring or asked for underlying documentation.
3. Bus park checks: Zusha! told us it conducted two rounds of compliance checks over a period of several months in 2017 in PSV parks across central and southern Kenya.47 These checks aimed to reach a large convenience sample; they were not designed to be nationally representative and did not randomly survey vehicles within bus parks, though they did aim to cover a variety of bus parks to improve representativeness.48 The survey covered roughly 21,000 vehicles49 and found that about 22% of vehicles had all stickers and about 55% of vehicles had at least one sticker.50 Zusha! told us that data quality checks included picture and audio backchecks similar to those described for NTSA inspections above, and Zusha! shared error rates from these checks.51

We have not vetted any of the above monitoring by auditing examples of it or asking for underlying documentation.

We are unable to straightforwardly compare the more rigorous compliance estimates from bus park checks and NTSA inspections to the rates observed in the RCTs, which we understand relied mainly on lottery checks. However, compliance rates in the RCTs were low (see details above), and we do not have reason to believe that Zusha!'s distribution process is less effective now than it was in the RCTs, so we would guess that Zusha! is achieving similar coverage to what occurred in the RCTs.

## What do you get for your dollar?

Based on the calculations in GiveWell, Zusha! cost-effectiveness spreadsheet 2018, our best guess is that Zusha! is in a similar range of cost-effectiveness as unconditional cash transfer programs.52

As with all of our cost-effectiveness models, there are several parameters about which we are significantly uncertain.53 Our best guess estimates involve large discounts relative to taking the results of Zusha!'s RCTs at face value. Key parameters about which we are particularly uncertain include:

• Internal validity adjustments: We adjust our expectation of the intervention's effect to account for the methodological limitations of the RCTs discussed above. We are uncertain how much we should adjust for these factors, but more information is available in GiveWell, analysis of key CEA parameters.
• External validity adjustments: We also adjust the expected effect to account for a) our expectation of the longer term effect of the program as people become more familiar with the intervention, and b) how we expect to adjust our estimate of the effect in Kenya after seeing final results from the RCTs in Tanzania and Uganda. We have seen preliminary results from these RCTs that we have factored into this adjustment.54
• Moral weights: We are uncertain about the moral weight that we assign to a road accident injury or death avoided, and the average number of injuries that occur per death in road accidents.55

These cost-effectiveness estimates should not be taken literally, due to the significant uncertainty around them. We provide these estimates (a) for comparative purposes and (b) because working on them helps us ensure that we are thinking through as many of the relevant issues as possible.56

## Is there room for more funding?

We believe that Zusha! could use at least an additional $800,000 per year to support implementation of its program in Kenya.57 We are not aware of any other potential funders for Zusha! and our best guess is that Zusha! will need to scale down its field staff support for the program and substantially reduce its monitoring if it does not receive additional funds. Zusha! has told us that it would have varying levels of involvement in the program depending on how much additional funding it receives, but that its highest priority use of small amounts of additional funding would be to print and deliver stickers to its partners, which Zusha! would try to encourage to take a larger role in direct implementation and ownership of the program absent external funds.58 Zusha! provided more detail on how it would use different levels of additional funding in response to this review; see the "Charity response" tab at the top of this page or the following footnote.59

We would like to see evidence of the program's effectiveness in other contexts before considering additional funding for programs outside of Kenya. We expect the results of the RCTs in Tanzania and Uganda to help us address this question.

## Questions for further investigation

• What is the effect of the intervention in the Tanzania and Uganda RCTs?
• Are we appropriately adjusting for our concern about the baseline imbalance observed in Habyarimana & Jack 2015?
• Are there ways to improve our estimate of the longer-term effect of Zusha!'s intervention in Kenya?
• To what extent is the context of road safety in Tanzania and Uganda similar to or different from that in Kenya?
• Will Zusha! be able to successfully pass off its program to the government or other actors over time?
• If Zusha! does not receive any further funds, what will the monitoring of the program be and how would Zusha! use small amounts of funds received from donors in the future?
## Our investigation process

## Sources

| Document | Source |
| --- | --- |
| Conversation with Zusha! staff, February 21, 2018 | Unpublished |
| GiveWell Blog Post on Zusha! 2017 | Source |
| GiveWell Site Visit Notes 2017 | Source |
| GiveWell Zusha! Additional Calculations 2018 | Source |
| GiveWell, additional analysis relevant to Zusha! internal validity adjustment | Source |
| GiveWell, analysis of key CEA parameters | Source |
| GiveWell, Zusha! cost-effectiveness spreadsheet 2018 | Source |
| Habyarimana & Jack 2011 | Source (archive) |
| Habyarimana & Jack 2015 | Source (archive) |
| Habyarimana and Jack 2015 Supporting Information | Source (archive) |
| Photos from GiveWell's site visit to Zusha!, February 2017 | Source |
| Table for Zusha! response | Source |
| WHO Road Traffic Injuries Fact Sheet 2018 | Source (archive) |
| Zusha! private email, March 7, 2018 | Unpublished |
| Zusha! Accounting and Budgeting | Source |
| Zusha! Additional compliance information | Source |
| Zusha! Incubation Grant budget 2017 | Source (archive) |
| Zusha! March 2018 budget and spending information - with GiveWell edits | Source (archive) |
| Zusha! previous draft budget across multiple countries, 2018-2019 | Source (archive) |
| Zusha! report for GiveWell, Kenya, September 2017 | Source |

• 1.
• "More than 1.25 million people die each year as a result of road traffic crashes."
• "Between 20 and 50 million more people suffer non-fatal injuries, with many incurring a disability as a result of their injury."
• 2.

"Without sustained action, road traffic crashes are predicted to become the seventh leading cause of death by 2030." WHO Road Traffic Injuries Fact Sheet 2018

• 3.

"90% of the world's fatalities on the roads occur in low- and middle-income countries, even though these countries have approximately 54% of the world's vehicles." WHO Road Traffic Injuries Fact Sheet 2018

• 4.

"Road traffic injury death rates are highest in the African region." WHO Road Traffic Injuries Fact Sheet 2018

• 5.

For images of the stickers, see Zusha! report for GiveWell, Kenya, September 2017, pgs. 5-9, and Photos from GiveWell's site visit to Zusha!, February 2017

• 6.

"...road safety experiment...called Zusha! (Swahili for 'Protest!')" Habyarimana & Jack 2015, pg. 4661.

• 7.
• Zusha! provides a detailed description of its distribution process on pgs. 10-21 of Zusha! report for GiveWell, Kenya, September 2017.
• We provide a description of what we observed on our site visit in GiveWell Site Visit Notes 2017, pgs. 2-3: "The basic process as I understand it now is:
  • They've distributed the bulk of their stickers through the insurance company they're partnering with (called "DirectLine Assurance"), which insures ~60% of the relevant vehicles in Kenya and has 17 insurance centers across Kenya. Insurance is required for all public service vehicles (PSVs) in Kenya (i.e., for all matatus and buses). Typically, an "insurance agent" - someone who is responsible for purchasing insurance for 5-100 vehicles (we couldn't get an average number, but it seemed like it might be in the 10-50 range) - will come in to the DirectLine office when the insurance for a portion of the vehicles the agent covers is near expiration (insurance is purchased for ~1 week to ~1 year at a time). The agent is usually buying insurance for 2-10 vehicles at a time (ballpark), since insurance usually expires at different times for different vehicles. DirectLine's electronic system automatically checks whether the vehicles that the agent is purchasing insurance for have been issued stickers within the last 6 months. If they haven't, or if the insurance agent just requests replacement stickers, the insurance agent will give Zusha! stickers (4 stickers for matatus, 8 stickers for buses (larger vehicles)) and explain the program. The [stickers are packaged in envelopes] (which we got samples of) [that also] have an explanation of the lottery on them...
  • Zusha! has also more recently begun distributing stickers through the Kenya National Transport and Safety Authority (NTSA). The NTSA has 17 inspection centers all over Kenya. All PSVs are required to receive an inspection from NTSA at least once per year. NTSA inspects vehicles for road-readiness, safety, etc. Zusha! is now closely partnered with NTSA. When a vehicle shows up to an NTSA inspection center, a staff member at the entry point is supposed to ask the driver whether they have Zusha! stickers. If they don't, the staff member is supposed to issue stickers and explain the program. Then, when the vehicle is actually being reviewed by an NTSA inspector, the inspector is supposed to check whether the stickers have been displayed and, if not, is supposed to put up the stickers for the driver. So, theoretically all relevant vehicles in Kenya should receive Zusha! stickers through this process if they don't have them already. It's worth noting that Zusha! isn't currently running a lottery for vehicles that receive stickers from NTSA (since DirectLine pays lottery winners, and many of the vehicles reached by NTSA aren't insured by DirectLine)."
• 8.
• "Every week, DirectLine and Zusha! (under license and observation of the Kenya Betting Control and Licensing Board (BCLB)) jointly run a lottery to [select a random group of PSVs currently insured by DirectLine.] In order to get the lottery payout, the vehicle needs to actually have the stickers displayed in their vehicle. An insurance claims investigator from DirectLine checks whether the vehicle has stickers displayed as part of his or her general investigation work. If a vehicle wins and displayed the stickers, ~$50 is paid to the vehicle driver, ~$50 to the vehicle owner, and ~$50 to the insurance agent for that vehicle. DirectLine pays the cost of the lottery, arranges for payments to the winners via mobile money, handles the compliance aspect of those payments, and is a strong supporter of the program. The lottery system ensures that winners are geographically distributed across Kenya.
• BCLB’s issue of a license every three months is critical to Zusha! [being] able to continue to run its lottery. Difficulties in gaining a license (which process involves various requirements, including holding the prize money in a separate bank account) meant that the lotteries were not running when we visited Kenya." GiveWell Site Visit Notes 2017, pg. 3.
• [Added by Zusha!: In experimentation countries (Tanzania and Uganda), the Zusha! team directly runs and funds the lottery for the duration of the experiment, in a way that is comparable to what DirectLine and Zusha! do in Kenya.]
• 9.
• For more discussion of Zusha!'s partnerships with other actors, see extensive explanation on pgs. 10-26 of Zusha! report for GiveWell, Kenya, September 2017.
• "Prior to our grant, Zusha! thought it was likely that they would need to pass off the program to the government as soon as possible or risk letting the program end. Now that we’ve made a grant, they told us they expect to pass off the programs more slowly but that they think passing off to the government is more likely to succeed." GiveWell Site Visit Notes 2017, pg. 10
• 10.

See Zusha! March 2018 budget and spending information - with GiveWell edits, sheet "Accounting".

• 11.
• Zusha! noted that for several months during the grant period, it also had support from USAID that is not included in this summary. Zusha! private email, March 7, 2018
• The past spending information does not appear to include costs incurred by Zusha!’s partners to implement the program, such as costs of running the lottery.
• The past spending includes a substantial amount of spending on RCTs in Tanzania and Uganda, as expected.
• 12.

See Zusha! March 2018 budget and spending information - with GiveWell edits, sheets "Budget I-III".

• 13.

See Zusha! March 2018 budget and spending information - with GiveWell edits, sheets "Budget I-III".

• 14.

See Zusha! March 2018 budget and spending information - with GiveWell edits, sheet "Budget III".

• 15.

See Habyarimana & Jack 2011, pg. 1440, Figure 1.

• 16.

Habyarimana & Jack 2011, pg. 1442, Table 1

• 17.

"Independent insurance claims data were collected for more than 2,000 long-distance matatus before and after the intervention." Habyarimana & Jack 2011, pg. 1438

• 18.

The regression results are presented in Table 4 in Habyarimana & Jack 2011, pg. 1445.

• 19.

"...the current evaluation included both long distance and intracity buses." Habyarimana & Jack 2015, pg. 4669

• 20.
• "Under the parallel trends assumption, the counterfactual annualized rate of claims expected in the post period for vehicles assigned to any nonplacebo treatment was 6.86%, so the coefficient of -0.017 represents a reduction in claims of 25%." Habyarimana & Jack 2015, pg. 4666
• This RCT includes eight treatment arms, and so attempts to provide information on what type of sticker is most effective. Each of four types of sticker (one text only, three with images and text) were provided through two treatment arms, one including a collective action message and the other not. (See Habyarimana & Jack 2015, pg. 4662, Table 1.)
• The most effective stickers, which contained images of wrecked vehicles, decreased accidents by 34 percent. See Habyarimana & Jack 2015, pg. 4667.
• However, the authors do not test whether the differences in the effects between the different types of stickers are statistically significant.
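The 25% figure quoted in this note follows from dividing the estimated coefficient by the counterfactual claims rate. A minimal sketch of that arithmetic is below; the function name `relative_reduction` is a hypothetical helper introduced here for illustration, and only the two input values come from the quoted passage.

```python
def relative_reduction(coefficient: float, counterfactual_rate: float) -> float:
    """Express a (negative) treatment coefficient as a share of the counterfactual rate."""
    return -coefficient / counterfactual_rate


# Values quoted from Habyarimana & Jack 2015, pg. 4666.
reduction = relative_reduction(coefficient=-0.017, counterfactual_rate=0.0686)
print(f"Reduction in claims: {reduction:.1%}")
```

The result is slightly under one quarter, which the paper rounds to 25%.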
• 21.

gui2de has extended the program to Tanzania and Uganda, where it is branded locally as 'Funguka' and 'Speak Up' respectively.

• 22.

See GiveWell Zusha! Additional Calculations 2018, pg. 4, Table 1.

• 23.

"Collecting data on accidents in Tanzania has been a challenge. The police have been hesitant to let Zusha! view accident data directly. Instead, the police agreed to fill out a special form for Zusha!. A major question is whether the police are reporting all accidents; if they're not, it could reduce the power of the study." GiveWell Site Visit Notes 2017, pg. 8

• 24.

See GiveWell Zusha! Additional Calculations 2018, pg. 4, Table 1.

• 25.

"No longer run as a controlled experiment, the intervention aims to reliably and consistently reach every PSV in the country - estimated to be between 50,000 and 60,000 vehicles. Launched in May 2015, the Zusha! scale-up has now distributed 104,730 complete sets of stickers to 51,276 unique vehicles." Zusha! report for GiveWell, Kenya, September 2017, pg. 2

• 26.

Also, if the intervention has successfully shifted social norms about driving, passengers may complain even without continued exposure to the stickers, and drivers may no longer have an incentive to drive recklessly regardless of whether the stickers are present.

• 27.

Zusha! wrote: "Our analysis of data seven years after the first experiment failed to detect a long-term effect of the original intervention (which was orthogonal to the subsequent deployment of Zusha). We believe this is consistent with a model in which, at least over such a time-scale, passengers need to be sensitized and empowered to speak up on a continuing basis. In this context, it appears road accidents are more akin to a 'chronic condition' than a curable disease, and long-term prophylaxis is necessary for sustained improvement. On the other hand, it is also possible that in the very long run, the stickers could support the formation of new social norms and expectations about driver and passenger behavior, so that enforcement of higher standards of service could occur without stickers.

Nonetheless, 'drug resistance' is a potential issue, and we are aware of the possibility that passengers might become immune to the treatment. For this reason, we update the design of the stickers and the messaging every 6 months, and continue to provide incentives for adherence to the 'medication' through the lottery."

• 28.

See Habyarimana & Jack 2015, pg. 4665, Table 3.

• 29.

The authors conduct a "falsification test"; however, we believe the results of that falsification test are still consistent with the explanation that the mean reversion of the accident rate for the treatment group can explain the treatment effect. Habyarimana & Jack 2015's falsification test is to run a difference-in-difference regression using data from an earlier time period (2007/8) and the 'pre-intervention' time period (2009/10) when the balance tests were taken. If the jump in the accident rate in the pre-intervention time period was just by chance, then we should expect to see a positive and significant effect in this new difference-in-difference regression that is roughly equal in magnitude to the size of the treatment effect in the baseline regression, which is what the authors find. We therefore believe that the main results and the falsification test are both consistent with the explanation that the estimated treatment effect is confounded by the mean reversion of a by-chance higher accident rate in the treatment group prior to the implementation of the treatment. See the falsification test in Table S3, Habyarimana and Jack 2015 Supporting Information, pg. 3. See also further discussion in GiveWell, additional analysis relevant to Zusha! internal validity adjustment.

• 30.

Zusha! wrote: "We confirm that in the results reported in our PNAS paper, the treatment group had a statistically significantly higher accident rate in the period prior to the intervention than the control group. On the one hand, if that difference reflected different underlying vehicle and/or driver characteristics in an otherwise stationary environment, then our results would understate the true impact of the stickers.

However, it is conceivable that the data generation process that drives the observed pattern of accidents exhibits regression to the mean - having an accident might cause a driver to be more careful next period, reducing his chance of making a claim, while remaining accident free might induce more reckless behavior, if drivers update their beliefs about risks symmetrically. If this is the case, it is possible that our results would over-estimate the impact of the stickers, as the treatment group would have been on a downward trend in any case.

To test this possibility, we report results of a falsification test in which we conduct a diff-in-diff analysis in the two periods prior to the intervention. The results show that while the differential trend for treatment vehicles is positive, treatment vehicles also have higher accident rates in period t-2 (not included in the PNAS supplementary materials, but reproduced below) – that is, the rate for treatment vehicles is even higher in period t-1 – an observation that is not consistent with a simple regression to the mean process.

In the table below, we report a series of placebo tests in which we treat the two-year period directly before the intervention (2009-2011) as the “post” period, and years prior to this as the “pre” period. We report 6 different specifications (diff-in-diff coefficients in columns (5) and (6) are reported in the supplementary materials), in which we use all available data prior to 2009 (column 1), data from the five years prior to 2009 (column 2), and data from four, three, two, and one years prior to 2009 (columns 3-6), as the “pre” period.

The coefficient on “Any treatment sticker” is consistently positive and significant, supporting the interpretation that the treatment group was inherently and stably more risky than the control group. Similarly, the “Any treatment x post” coefficient is mostly significant but positive, suggesting that if anything treatment vehicles were becoming riskier over time.

Both of these observations give us confidence that our core results are not driven by reversion to the mean, and that if anything likely under-estimate the true impact of the stickers."

• 31.
• Firstly, drivers in control vehicles may have previously driven treatment vehicles. If they have internalized previous passenger complaints, they may drive more safely in control vehicles too. "Whereas most drivers drive the same vehicle on a regular basis, there is some driver rotation both within the day and across days, so drivers who have been exposed to the treatment can end up driving untreated vehicles." Habyarimana & Jack 2015, pg. 4662
• Secondly, passengers in control vehicles may have previously ridden in treatment vehicles. They may be aware of the stickers, and so complain in the control vehicles too. "...passengers nearly certainly will ride on both treated and untreated vehicles." Habyarimana & Jack 2015, pg. 4662
• Since both spillovers lead to a partial treatment effect in the control group, the effects estimated in the two RCTs may underestimate the true effect of the stickers.
• 32.
• 33.

"In Table 4, we are unable to detect differences in reports of reckless driving (fourth column) or passenger complaints (fifth column) across treatment groups." Habyarimana & Jack 2015, pg. 4667

• 34.

"Maximum speeds in nonplacebo treatment vehicles were on average about 1km/h less than those of the control group in the post period, although this difference is not statistically significant." Habyarimana & Jack 2015, pg. 4666

• 35.

"A broader measure of vehicle speeds is the average moving speed shown in the third column. Here we estimate a significant difference: vehicles with stickers are about 1 km/h slower than control vehicles." Habyarimana & Jack 2015, pg. 4666

• 36.

"Compliance to the randomized assignment was high but not perfect… with 68.5% taking all five [stickers]," Habyarimana & Jack 2011, pg. 1439

• 37.

See Habyarimana & Jack 2015, pg. 4666, Fig. 2, "Weekly lottery winners".

• 38.

Zusha! wrote: "Part of the decline in compliance in the Habyarimana & Jack 2015 study reflects declining effort in tracking down lottery winners. The study design addressed anticipated low compliance by arranging for all study vehicles to receive a new set of stickers starting in September 2011 (about 6 months after the initial issue). Overall compliance was not perfect but well above 20%.

Even so, we agree that the effect size is large, and even with 100% compliance throughout the study period, we would have been surprised by such a large reduction in accident rates. Indeed, this was a significant motivating factor in our efforts to replicate the analysis, as we were concerned that the results of the first experiment might have been a fluke.

However, we believe the data we used to measure the treatment effects in both studies was of sufficiently high quality that there is no reason to question the conclusions. In fact, our estimates in the presence of declining compliance rates suggest that more intensive treatment (and retreatment), with stronger retention incentives, could have even higher returns in terms of lives saved."

• 39.

See table on pg. 26 of Zusha! report for GiveWell, Kenya, September 2017.

• 40.
• “Compliance as measured by the three lottery periods (during which 299 vehicles were drawn) has averaged 76% among all eligible vehicles, and 86% among vehicles which actually completed inspections. Given that stickers issued by DLA, the largest distribution channel, are not placed directly inside vehicles, this number is suspected of being artificially high, and indicative of a moderate level of gaming of the lottery. We therefore conducted two rounds of independent, direct compliance checks in PSV parks across the country, ultimately surveying 20,770 vehicles. Of these vehicles, an average of 22.4% were found to be fully compliant, though more than half were partially compliant.” Zusha! report for GiveWell, Kenya, September 2017, pg. 2
• “Unlike in Uganda and Tanzania, where Zusha! is being implemented as a tightly-controlled RCT and enumerators are employed to directly administer the intervention, the process of distributing stickers through Directline in Kenya requires many more steps and coordination among several individuals.

The lottery therefore functions as a positive incentive for the three key actors in the process: the agents who need to deliver the stickers from the DLA office to the driver; the owners who need to allow Zusha! stickers in their PSV fleets; and the drivers who must place the stickers in their vehicles and keep them in. However, the lottery only allows the research team to check a very small percentage of vehicles which received stickers. DLA also does not insure the entire population of PSVs in Kenya, and covers approximately 60% of the market share at any given time. Therefore, because the lottery does not include vehicles that did not receive their stickers through DLA or are not covered by DLA at the time they are inspected, the lottery compliance rates are not necessarily representative of the universe of PSVs reached through all distribution channels.

In addition, because stickers are not placed directly in vehicles by enumerators in Kenya, it is far more likely that results from lottery reflect a moderate level of gaming. Based on anecdotal evidence and investigations, it is believed some agents and/or drivers simply retain their envelopes of Zusha! stickers until they are called and informed they have been drawn and are eligible for the lottery, at which time they place the stickers in their vehicles prior to arriving for inspection.

These assumptions were corroborated by independent direct checks conducted in matatu and bus parks across the country by the research team, which revealed a lower compliance rate than indicated by the lottery.” Zusha! report for GiveWell, Kenya, September 2017, pgs. 27-28

• 41.

“Stickers have been issued at each of the NTSA’s 17 regional inspection centers since the start of the scale up in May 2015. However, the protocol for distributing stickers and collecting data on vehicles has changed and improved over time.

Between June and September 2015, NTSA staff were trained by the Zusha! team to issue stickers to vehicles coming through the center for their annual inspection, and collect data on Mobenzi, an early generation mobile phone survey application. Approximately 5,900 unique vehicles were reached through this strategy, 1,200 of which already had a full set of stickers (received from Directline). Because a robust partnership with NTSA had not yet been established and workflows were still being piloted, NTSA staff from just seven centers were issuing stickers and collecting data during this period.

Between May 2016 and May 2017, NTSA staff from each of the 17 centers were authorized to work with Zusha! to issue stickers and collect data. However, neither the project nor the Authority had funding at the time to equip these staff members with mobile phones or tablets, so paper forms were used. An example of a completed form can be seen below in Figure 2.0.

Every month, each branch sent its collection of paper forms to NTSA’s central Headquarters in Nairobi to be digitized by the IT staff. Both original and transcribed records were eventually shared with the Zusha! team, though there was often a significant delay and the records are still incomplete across all centers. Although the data is still being compiled, cleaned, and analyzed, approximately 15,000 unique vehicles were surveyed and received stickers over the 12-month period through this workflow.

Requiring NTSA staff to fill out an additional paper form that recorded much of the same information they were required to also collect on the Motor Vehicle Inspection (MVI) form during the actual inspection was a burdensome and inefficient process, and resulted in incomplete and unreliable data collection. This was confirmed by comparing the information collected on the Zusha! forms with the MVI forms, which showed that many more vehicles came through for inspection than were recorded on Zusha! forms. One reason for this was that NTSA staff often did not fill out the form for vehicles that arrived at the center with a full set of stickers, thus providing an incomplete picture of monitoring and compliance.

In addition, it was discovered that NTSA staff often simply handed the drivers an envelope of stickers, as opposed to directly placing them inside the vehicle, thereby increasing the likelihood that the vehicle was not properly exposed to the intervention.” Zusha! report for GiveWell, Kenya, September 2017, pgs. 15-16

• 42.

“After extensive discussions with NTSA leadership about how to improve this system and ensure the collection of accurate and comprehensive data and reliable distribution of stickers, in May of 2017 the Authority agreed to allow Zusha! to place its own staff members at each center to collect data on vehicles coming through for inspection, and directly issue stickers to vehicles. These enumerators are equipped with tablets, on which they fill out a SurveyCTO form. They typically survey the vehicles as they first arrive at the center and queue at the weigh bridge to wait for inspection.” Zusha! report for GiveWell, Kenya, September 2017, pg. 16

• 43.

See table on pg. 20 of Zusha! report for GiveWell, Kenya, September 2017.

• 44.

For discussion and data on stickers being removed during cleaning, see table on Pg. 21, Zusha! report for GiveWell, Kenya, September 2017.

• 45.
• For full data quality check procedures, see Zusha! report for GiveWell, Kenya, September 2017, pgs. 18-19. Some quotes:
• “Enumerators were prompted to take a picture of the stickers after placing them inside the vehicle. Those pictures were checked daily to verify the correct placement of stickers, and that each vehicle was left with a full set; any errors were discussed with enumerators.” [Note: this is a check on sticker distribution, not compliance prior to distributing stickers.]
• “SurveyCTO was programmed to take an audio recording of each survey module following consent. Enumerators were made aware of the recording during training to deter cheating. This has proven to be a useful enforcement mechanism, and has provided corroborating evidence that has been used to fire several enumerators who were repeatedly non-compliant with the measurement protocols.
The audio recordings of a random 10% selection of surveys were analyzed to further verify that the interaction between the enumerator and the driver was legitimate, and the survey was administered completely and accurately. Although it was discovered that there are valid circumstances in which the enumerator does not need to administer the survey live in order to collect the data, a review of the recordings allowed the research team to confirm certain data points within the survey, such as name of SACCO, etc.
On an on-going basis, a random 10% sample of surveys is selected for identified data points to be backchecked using the audio recording. Audio recordings are taken of all surveys, however, so recordings can be used on a case-by-case basis to corroborate evidence of misconduct by enumerators.”
• 46.

For rates of errors in sticker placement and audit process, see Zusha! report for GiveWell, Kenya, September 2017, pg. 19.

• 47.

See map of bus parks surveyed and details of timing of surveys on pg. 29 of Zusha! report for GiveWell, Kenya, September 2017.

• 48.
• "Destinations were chosen to collect a geographically representative sample and maximize the number of vehicles able to be reached over the time period. Enumerators started with the bigger, busier parks before moving to smaller ones." Zusha! report for GiveWell, Kenya, September 2017, pg. 28
• "The research team made a decision not to randomize the compliance checks, both for logistical reasons and because it would have required a comprehensive listing of all active PSVs in the country, which is not accessible. The current survey being administered through the NTSA (since May 2017) will be a source for a reliable sampling frame in the future, as all active PSVs are expected to travel through the centers over a 12-month period." Zusha! report for GiveWell, Kenya, September 2017, pg. 29
• 49.

See tables on pg. 33 of Zusha! report for GiveWell, Kenya, September 2017.

• 50.

See tables on pg. 33 of Zusha! report for GiveWell, Kenya, September 2017.

• 51.

For a full description of data quality procedures and error rates, see Zusha! report for GiveWell, Kenya, September 2017, pgs. 30-32.

• 52.
• 53.

For examples of our sensitivity analysis of our other cost-effectiveness analyses, see this blog post.

• 54.

• 55.

See Sheet "Moral weights" in GiveWell, Zusha! cost-effectiveness spreadsheet 2018.

• 56.

See this blog post for a discussion of our approach to cost-effectiveness.

• 57.

See Sheet "Budget III," Zusha! March 2018 budget and spending information - with GiveWell edits. This is a full annual budget for Zusha!'s program in Kenya, with a total cost of about $920,000. If Zusha! receives about $100,000 per year due to GiveWell standout status, then it would have a remaining funding gap of about $820,000 to achieve its full budget.

• 58.

Conversation with Zusha! staff, February 21, 2018

• 59.

Zusha! wrote: "Zusha can be deployed at scale using a number of different modalities. First, with full funding of roughly $900,000 per year, gui2de-East Africa would directly execute all components of the intervention, including sticker design and printing, sticker delivery through insurance offices and NTSA inspection centers, lottery execution and funding, and monitoring and compliance activities. We would exercise full control of both daily functions on the one hand, as well as strategic decisions on the other.

Second, with intermediate funding of roughly $500,000 per year, our local partner organizations would continue to play a substantive role in the project: for example, Direct Line Assurance would continue to fund the delivery of stickers through its retail network, and would fund the lottery; and the NTSA would take on the costs of inspection center activities. But gui2de-East Africa would play a strong coordination and management role between the partners. And finally, with minimal funding of $100,000 per year, gui2de-East Africa will play a limited role, organizing the printing of stickers, ensuring they are delivered to DLA retail outlets and NTSA inspection centers, and monitoring the execution of the lottery. However, successful implementation will rely on the engagement and good will of our partners, who themselves could be subject to both market and political distractions."

As an initiative committed to generating rigorous evidence for programs improving people’s lives, gui2de appreciates the great opportunities that GiveWell has provided through the incubation grant and this very thorough review process. GiveWell’s commitment to rigor and integrity, as well as the ongoing and open dialogue that has been kept between our organizations throughout the duration of the incubation grant, has been very rewarding for our teams.

We welcome GiveWell’s reservations towards the evidence on Zusha’s effectiveness and would like to take this opportunity to provide elements of response, hopefully shedding some light on the context from the frontlines of operations, as well as providing more data analysis investigating some of GiveWell’s claims. Finally, we provide some more details around how different scenarios of funding would help Zusha operate, for full transparency towards interested funders.

## Long-term effects of the intervention

Our analysis of data seven years after the first experiment failed to detect a long-term effect of the original intervention (which was orthogonal to the subsequent deployment of Zusha). We believe this is consistent with a model in which, at least over such a time-scale, passengers need to be sensitized and empowered to speak up on a continuing basis. In this context, it appears road accidents are more akin to a “chronic condition” than a curable disease, and long-term prophylaxis is necessary for sustained improvement. On the other hand, it is also possible that in the very long run, the stickers could support the formation of new social norms and expectations about driver and passenger behavior, so that enforcement of higher standards of service could occur without stickers.

Nonetheless, “drug resistance” is a potential issue, and we are aware of the possibility that passengers might become immune to the treatment. For this reason, we update the design of the stickers and the messaging every 6 months, and continue to provide incentives for adherence to the “medication” through the lottery.

## Initial Imbalance in Habyarimana & Jack 2015

We confirm that in the results reported in our PNAS paper, the treatment group had a statistically significantly higher accident rate in the period prior to the intervention than the control group. On the one hand, if that difference reflected different underlying vehicle and/or driver characteristics in an otherwise stationary environment, then our results would understate the true impact of the stickers.

However, it is conceivable that the data generation process that drives the observed pattern of accidents exhibits regression to the mean - having an accident might cause a driver to be more careful next period, reducing his chance of making a claim, while remaining accident free might induce more reckless behavior, if drivers update their beliefs about risks symmetrically. If this is the case, it is possible that our results would over-estimate the impact of the stickers, as the treatment group would have been on a downward trend in any case.

To test this possibility, we report results of a falsification test in which we conduct a diff-in-diff analysis in the two periods prior to the intervention. The results show that while the differential trend for treatment vehicles is positive, treatment vehicles also have higher accident rates in period t-2 (not included in the PNAS supplementary materials, but reproduced below) – that is, the rate for treatment vehicles is even higher in period t-1 – an observation that is not consistent with a simple regression to the mean process.

In the table below, we report a series of placebo tests in which we treat the two-year period directly before the intervention (2009-2011) as the “post” period, and years prior to this as the “pre” period. We report 6 different specifications (diff-in-diff coefficients in columns (5) and (6) are reported in the supplementary materials), in which we use all available data prior to 2009 (column 1), data from the five years prior to 2009 (column 2), and data from four, three, two, and one years prior to 2009 (columns 3-6), as the “pre” period.

The coefficient on “Any treatment sticker” is consistently positive and significant, supporting the interpretation that the treatment group was inherently and stably more risky than the control group. Similarly, the “Any treatment x post” coefficient is mostly significant but positive, suggesting that if anything treatment vehicles were becoming riskier over time.

Both of these observations give us confidence that our core results are not driven by reversion to the mean, and that if anything likely under-estimate the true impact of the stickers.
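As a rough illustration of the placebo specification described in this section, the sketch below computes the two quantities discussed: the level gap between groups in the earlier period (which corresponds to the "Any treatment sticker" coefficient) and the placebo difference-in-differences (which corresponds to "Any treatment x post"). It works from simple group means rather than a regression, and all claim rates are hypothetical numbers invented for illustration, not values from the paper or its supplementary materials.

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group."""
    return (treat_post - treat_pre) - (control_post - control_pre)


# Hypothetical annualized claim rates, invented for illustration only.
# "pre" is an earlier window; "post" is the window just before the
# intervention, treated as the placebo "post" period.
treat_pre, treat_post = 0.070, 0.085
control_pre, control_post = 0.060, 0.065

# Gap between groups before any placebo "post" period begins.
level_gap = treat_pre - control_pre

# Placebo effect: no stickers existed yet, so a nonzero value reflects
# differential trends or chance rather than treatment.
placebo_effect = diff_in_diff(treat_pre, treat_post, control_pre, control_post)

print(f"level gap in earlier period: {level_gap:+.3f}")
print(f"placebo diff-in-diff:        {placebo_effect:+.3f}")
```

With these made-up inputs both quantities are positive, which mirrors the sign pattern reported above. A regression with controls and standard errors, as in the actual analysis, would not generally reproduce these simple differences of means.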

## Intervention mechanisms and compliance

Part of the decline in compliance in the Habyarimana & Jack 2015 study reflects declining effort in tracking down lottery winners. The study design addressed anticipated low compliance by arranging for all study vehicles to receive a new set of stickers starting in September 2011 (about 6 months after the initial issue). Overall compliance was not perfect but well above 20%.

Even so, we agree that the effect size is large, and even with 100% compliance throughout the study period, we would have been surprised by such a large reduction in accident rates. Indeed, this was a significant motivating factor in our efforts to replicate the analysis, as we were concerned that the results of the first experiment might have been a fluke.

However, we believe the data we used to measure the treatment effects in both studies was of sufficiently high quality that there is no reason to question the conclusions. In fact, our estimates in the presence of declining compliance rates suggest that more intensive treatment (and retreatment), with stronger retention incentives, could have even higher returns in terms of lives saved.

Zusha can be deployed at scale using a number of different modalities. First, with full funding of roughly $900,000 per year, gui2de-East Africa would directly execute all components of the intervention, including sticker design and printing, sticker delivery through insurance offices and NTSA inspection centers, lottery execution and funding, and monitoring and compliance activities. We would exercise full control of both daily functions on the one hand, as well as strategic decisions on the other. Second, with intermediate funding of roughly $500,000 per year, our local partner organizations would continue to play a substantive role in the project: for example, Direct Line Assurance would continue to fund the delivery of stickers through its retail network, and would fund the lottery; and the NTSA would take on the costs of inspection center activities. But gui2de-East Africa would play a strong coordination and management role between the partners.