Government policy and international development programs from NGOs ought to be built around the best available knowledge and evidence. But efforts to incorporate academic research into such programs have been halting at best. Why?

Yale SOM’s Rodrigo Canales, associate professor of organizational behavior, and Tony Sheldon, executive director of the Program on Social Enterprise and lecturer in the practice of management, with then-MBA students Brendan Lehan ’18 and Tory Grieves ’18 and a large team of research assistants, spent two years conducting hundreds of interviews and developing case studies that make clear the impediments that keep evidence from being effectively integrated into practice. But the research also allowed them to tease out the aspects of a successful approach, drawing on all-too-rare instances of evidence being successfully applied in policies and programs.

The output of their work, supported by the William and Flora Hewlett Foundation, is an integrated model for incorporating evidence into practice. The approach involves bringing together key stakeholders—including academics, policy makers, and practitioners—from design through implementation. This process, the research suggests, allows for dramatically better results, and can feed a virtuous cycle, generating iterative improvement to the integrated model itself. 


Q: What is the Evidence in Practice research project?

Rodrigo Canales: For a long time, we’ve been talking about designing public policies and NGO interventions based on the most rigorous and latest evidence. Everybody agrees that’s how it should be done, yet what you most often observe is that it isn’t. The William and Flora Hewlett Foundation has been puzzling over why.

Tony Sheldon: We teamed up with them to understand the constraints that prevent evidence from becoming embedded in policy and practice, and to tease out some of the variables that were present in situations where evidence-based policy has succeeded. We did over 200 interviews and wrote eight original case studies; there definitely were patterns.

In the unsuccessful cases, there was often dependence on the “waterfall model,” which hopes that evidence produced by experts, who tend to be academics, just naturally cascades to policy makers and from there to implementers. 

But researchers, policy makers, practitioners, and funders define evidence differently, making it hard to translate from one stakeholder group to another. On top of that, they each have different incentives, timelines, and organizational constraints, meaning the waterfall model frequently doesn’t work.

On the other hand, in cases where the best available evidence was incorporated, there was a real effort to align the different stakeholders upfront, so they could convene around addressing a shared problem, in a way that let each contribute while acknowledging the strengths and constraints faced by other stakeholders.

Q: Could you give some examples that illustrate the challenges of incorporating evidence?

Sheldon: One of our case studies is about a group in India that did very rigorous scientific evaluation of a water purification system, then developed a very high-quality product. But, when they moved to distribution, they didn’t look to the social science evidence for what would most likely work.

They sold it in retail shops in rural villages. Yet recently published studies had shown that rather than distributing in retail stores or directly in homes, the best venue is a village’s central pump where people collect their water. They were so careful to apply evidence in some areas but not in others.

Canales: We also looked at a very ambitious program in Mexico aimed at lowering unemployment rates for newly graduated young people. Primer Empleo, or “First Job,” became a central policy for the incoming administration of President Felipe Calderón. 

The goal was genuine and, if you look at the design of the policy, it makes logical sense; however, they never looked at recent research showing that generalized subsidies, though they appear promising, do not tend to work and that what does work is specific, highly focused incentives.

In their design process, they didn’t incorporate the companies that they hoped would be doing the hiring. They didn’t include researchers or NGOs that know the most about recent school graduates—who they are, how they behave, and what they are looking for in terms of employment. As a result, they ended up with a policy that made a lot of sense to policy makers but was effectively inoperable for the potential partners.

Beyond that, their drive to have an impact very rapidly led them to roll it out quickly as a major government program. The issue with government programs is they have to be woven into the official budget, which comes with strict rules for how public funds are used.

Once that happens, a program becomes rigid. With Primer Empleo, as they started getting information that the program, as designed, was not achieving the impact that they wanted, they had very little ability to change it. The law determined what the program could and could not do with the funds. They were not able to course-correct because of how they went about designing the intervention. 

Q: What prevents evidence and insight from flowing between one area of expertise and another?

Canales: Normally, when academics do a study, we try to control for as many factors as possible so the data generated will result in a publishable finding. For example, we want to control for context so that the results are as widely applicable as possible. For practitioners, context may be everything. The NGO implementing an intervention needs to understand how to deal with the specific factors on the ground where they work. The typical abstract academic study often cannot be easily applied to the conditions they are facing. The result is that evidence can’t move from one field to another without translation.

The integrated model helps navigate these differences and, because of what each group brings to the table, collectively they develop a much better answer. It’s a much more participatory and collaborative process. It’s also more complex and takes more time.

One of the critical hurdles is that without really being aware of it, the stakeholders come from very different kinds of organizations; they belong to different professions with different professional norms; they are evaluated on very different factors; they need to deliver very different types of results. 

The academic is worried about publishing and only publishing. The politician is worried about electoral cycles and only electoral cycles. The practitioner is focused on implementation and meeting funders’ demands. They each operate on different currencies of exchange.

Sheldon: We found that recognizing the different currencies of exchange was a necessary step to fostering collaboration. Recognizing the value of each perspective and the evidence that each can bring to bear is what is most likely to create successful collaborations.

The fact that each group has its own currency of exchange can be a barrier to successful integration of evidence, but it can also be an opportunity to convene in the service of addressing the underlying problem.

Canales: The currency-of-exchange model allows each party to clearly state what they have to offer and what they need to get out of the process.

Academics offer the ability to ask really good questions in very well-defined terms so that we learn effectively. They bring legitimacy. Policymakers can do things at scale that really address big, interesting problems. Practitioners offer experience and very clear information about what works and doesn’t at the ground level. That’s what each brings to the table.

From there, each party can state the minimum they need to remain engaged through the entire process. The academic will say, “I need to be able to publish a paper from our interactions. Otherwise, I can’t remain engaged.” The policy maker will say, “I have elections a year and a half from now, so I have to be able to communicate to the public that what we’re doing makes sense in a way that people will understand and will be politically acceptable to them.” And the implementer will say, “I need to be able to meet the budgetary cycles of my funders and the needs of my clients.” 

In order to be able to collaborate, we need to figure out how to get everyone what they need. We need to figure that out because if we work together, we can do really incredible things, but given our constraints it’s not natural or simple for us to work together. However, once we put currencies of exchange on the table, we can negotiate ways to allow the players to remain engaged throughout. 

Sheldon: It is important that we very intentionally say, “How can we make sure the irreducible needs of each constituency are actually met?”

Q: What does it look like when the integrated model is used?

Canales: South Africa developed an intervention to help reduce unemployment, especially among young males. They convened researchers, policy makers, and implementers from the beginning. Jointly, they agreed to target youth unemployment. They agreed that only government could deliver at the scale required to truly make an impact. Since the main venue that the South African government had for addressing this issue was its labor centers, they agreed to design interventions that fit within the constraints and work processes of the labor centers. They agreed to incorporate the latest published academic research as well as to develop new evidence specifically from the South African context.

It was a very different way of working for everyone involved. The researchers had to adapt how they thought about the problem in order to fit within the government’s constraints. The government had to agree to do a randomized controlled experiment. Governments never do randomized experiments; they have serious concerns about fairness, equity, and legal constraints. But the researchers persuaded them that the only way to truly learn about what works was to offer interventions in some centers that weren’t offered in others while measuring both. The labor center employees, who had to carry out the experiments, were incentivized by aligning the interventions with their existing structure of evaluation and accountability.

By doing all this, they very quickly generated very useful information about what worked to improve the situation for unemployed young people, what didn’t work, and why.

They used what they learned in a second, larger intervention. And then the findings of the second intervention were used to develop the next intervention. 

Sheldon: The integrated approach is designed with learning loops that inform the next iteration of the program rather than saying at the start, “We know that this works, so this is what we’re going to do,” which tends to be the pattern of programs that have not been flexible enough to integrate new evidence.

Beyond developing a successful intervention, there are ongoing benefits. The researchers in South Africa established relationships with the government, which have led to additional collaborations around other issues related to unemployment. 

Q: Is it simply a matter of acknowledging the currencies of exchange and negotiating from there or are there other hurdles?

Canales: One of the most difficult misalignments of incentives that we find in this ecosystem of evidence to practice is fear of failure.

Politicians and practitioners are used to trying to avoid failure at all costs. A failure to politicians might mean that government resources were wasted or that some people were actually negatively affected by government policy. For an NGO practitioner, failure might mean that because they weren’t able to show results they’re not going to be eligible for future grants. 

For researchers, we need variance in results to figure out what works. We need some things to work and some things not to work so that we can compare them. In a sense, the thing that doesn’t work is a “failure,” but without failure you cannot learn. 

One of the most important switches in successful cases was this reframing from a binary of success or failure to designing for learning where there will be failure—controlled, small-scale failure—that allows us to learn the specific things we need to learn. 

Additionally, government officials and practitioners tend to want to start as big as possible. When you’re designing for learning, you start as small as you can because you know that you’re likely not going to get it right. Smaller means that everything is more controlled. 

Government officials also need to show results quickly. They are in the public spotlight and the next election is always coming up. Learning takes time, so we have to find ways to build patience into the system. Small pilot projects are one option. For government officials, they are easy to explain; they get relatively quick results that can provide additional political coverage and time for the longer learning cycle to happen.

For practitioners, the biggest constraint is funding. These NGOs are bound to the funding cycles and the demands of donors. This new approach looks different. They have to explain, “We’re not just implementing, we’re learning. That doesn’t mean we’re not accountable; it just means our mechanisms of accountability are going to be different. Instead of being evaluated on outputs, which is what we’re normally evaluated on, we will negotiate milestones and the type of learning that we will show you at each milestone.”

Q: Where do funders fit in the integrated approach?

Sheldon: The underpinning of enlightened financing is crucial to this whole approach. Often financiers, whether they come from private foundations or multilateral agencies like the World Bank, can help convene the other constituents. But it really takes an enlightened funder to recognize the value of learning for the long term, really tackling the problem, rather than pushing for short- and medium-term outputs that match what we think we already know.

Canales: Sometimes you find financiers who are already enlightened, and they push this learning mindset and methodology into a program. 

With financiers and donors that don’t normally think about things in this way, I’ve found that when you explain the approach to them, they can be convinced with the right language, the right data, or the right examples. Not all of them, but many of them are interested in the idea of committing a smaller amount of funding directed at learning and then, only once you’re sure that the intervention has been proven in context, putting the bigger amounts toward scaling.

But, as Tony said, without financing that is structured for learning and for flexibility, this approach is not going to work. 

Q: Is it challenging to make the integrated model work in different contexts?

Canales: Our research was explicitly designed to take this complexity into account. Every city, every country, every political system is different. 

Sheldon: We did two case studies each in India, South Africa, Mexico, and Ghana. The integrated approach tends to work across cultural and geographic contexts. But it isn’t a cookie cutter model; the integrated approach is saying, “Come with a sense of humility and collaboration and desire to really understand and grapple with the particularities of the problems in this context.” This approach anticipates that ideas from other contexts will need to be adapted. 

Canales: You also need a convener who will help identify who needs to be at the table and then can translate the different languages that different professions bring. The convener can also be a trusted mediator who helps resolve conflicts. 

Sheldon: We often found that there was a champion. Sometimes it was a funder; sometimes it was a government official. The champion provided a safe space in which these different constituencies were willing and able to collaborate.

When that happens, the integrated approach can help change the structure of thinking within all these institutions.

Canales: One of the most-cited examples of well-incorporated evidence in practice is a government program designed during the Zedillo administration in Mexico, launched as Progresa and later renamed Prospera, or Prosperity. When President Zedillo started his term, he wanted to address the problem of intergenerational poverty. Rather than very quickly making big waves with a policy announcement, he took a disciplined approach, saying, “We’re going to take two or three years to design this carefully.”

One of the things that really allowed Prospera to work was that the president explicitly kept it out of the budgetary law and used more flexible funding from foundations and discretionary government funding while the priority was to learn. 

President Zedillo convened experts from universities and international agencies, like the World Bank and Inter-American Development Bank, along with federal government policymakers and partners from local government who really know how things work on the ground. Together, they started a process of trying to design, from evidence, a policy that would work.

They mapped out what was known and what was not known. They identified questions that they would face at the different levels of the problem and the evidence that they would need to gather at each level to answer each question.

The first question was, what do we need to provide families that will actually help them get out of poverty? Recognizing that generalized subsidies or transfers had been demonstrated not to work, they focused on a transfer of cash or other government resources to poor families conditioned on specified behaviors. 

To figure out what kinds of incentives and conditions to require, they designed small randomized controlled trials in different locations. Mexico is a very varied country, so they wanted to make sure that whatever they discovered could travel across different parts of the country.

Based on the initial results, they refined the next iteration and ran experiments to figure out how to track outcomes. Then they ran experiments at a slightly larger level on the kinds of rules and organizational structures that would be needed. It was a very ordered process. 

Some experiments were designed to produce rigorous quantitative data. Some developed qualitative data that helped understand why certain organizational structures were working better in some regions versus others. It takes planning and foresight to design experiments that collect all these very different types of evidence.

Periodically they regathered at small conferences to figure out, “What have we learned from the past six months? What are the next questions that we need to answer? How do we answer them?” Collectively they figured out the next steps. Then they would do the next round of experiments, reconvene, and ask, “What did we learn? Where are we now? What’s next?”

Through that process, they developed one of the most successful evidence-based programs in the world to date. It has been replicated and started a wave of targeted government subsidies that have come to be known as conditional cash transfers. 

Q: Where do you hope the work on Evidence in Practice goes in the future?

Canales: Our hope is that this is a two-way street. As these multi-actor partnerships of government officials, local implementers, academic researchers and institutions, and international foundations come together to integrate evidence into practice, we will get better solutions to problems and we will be able to evolve our understanding of the process of integrating evidence into practice. 


The research team for the Evidence in Practice project included Jillian Anderson, Shira Beery, Hitoishi Chakma, Erika Drazen, Jessica Gallegos, María del Mar Gutiérrez, Elizabeth Karron, Emilie Leforestier, Scott Lensing, Kiersten Abate Sweeney, Ewelina Swierad, Katherine Wong, and Lauren Wyman. Charles Cannon and Design Observer provided crucial design and editing contributions.