Happy Holidays to all.
Megan McArdle has an interesting post, somewhat melodramatically titled “Why pilot projects fail.”
Many of her points are well-taken. Indeed these comprise classic matters within the discipline of program evaluation. An excellent efficacy trial of a highly-resourced boutique intervention implemented under ideal conditions by uniquely motivated staff serving a highly compliant and homogeneous population won’t scale up with similar results within the messy world of actual practice. The people I know who design early-childhood, health insurance, and violence prevention interventions spend a lot of time considering the challenges McArdle discusses.
Any policymaker, advocate, or evaluator who hasn’t internalized this reality is making a huge mistake. It’s all too easy to oversell what a given intervention is likely to accomplish. Of course the typical preschool intervention won’t be as effective as (say) the famous Perry Preschool intervention or related best-practice examples. Head Start requires various improvements and adjustments. Evaluated by the proper effectiveness standard, though, Head Start is still a worthy and valuable program that has amply justified its costs.
I spend much of my own time designing and implementing various pilot projects. Some succeed. Some fail. Some really aren’t designed with that straightforward success-or-failure metric in mind from the get-go. Indeed it’s worth considering what one is really trying to do in this kind of policy and social science research. (An excellent NBER working paper by Jens Ludwig, Jeffrey Kling, and Sendhil Mullainathan provides the best recent discussion of these concerns. It rewards a close read.)
One reason to perform an ideal-case intervention is to provide an upper-bound estimate of what can realistically be accomplished. David Olds’ noted research on nurse home visitation for young disadvantaged pregnant women and new moms showed that a terrific program could help reduce some markers of child abuse and neglect, and could reduce the incidence of rapid subsequent unintended pregnancies. The same intervention did essentially nothing to reduce the incidence of low birth weight deliveries. That is useful information to understand the potential impacts and limitations of real-world programs that seek the same goals.
Other experiments are more traditional evaluations of plausible policies or clinical interventions. Still others might be described as process improvement. These seek to incrementally improve current practices within a school or a clinic setting.
Over the past few years, I have spent significant time helping to evaluate or implement the latter kinds of intervention—screening and brief interventions for problematic substance use, needle exchange, violence prevention efforts among high school students, mental health interventions for youth involved in the criminal justice system, summer employment and recreational opportunities for high-risk youth.
Many others have greater experience and expertise in these areas. Having paid my dues in a variety of public health settings, for the moment I will add a few points to what McArdle has said….
1. The value of causal understanding is (sometimes) overrated in designing effective interventions.
We are often tempted to think that the way to design better public policies is to understand the root causes of what we are trying to influence, and then to design interventions that specifically influence these causes. We’re bombarded (for example) with brain-scan studies of substance abuse and with early-childhood psychological studies that purport to explain the fundamental causal pathways that lead to adult violence.
Such studies are generally far less useful for designing practical interventions and policies than their proponents believe they are. Most really good interventions start with plausible (though often somewhat fuzzy) conceptual foundations, which are then rigorously evaluated and improved over time. Social problems such as teen smoking won’t be addressed through some sort of fundamental scientific breakthrough. They’ll be addressed through the methodical development, implementation, evaluation, and improvement of reasonable interventions.
I’ve cited before my favorite quote in these matters. In 1971, the cancer researcher Sidney Farber testified before Congress. At the time there was a bitter dispute between those who argued for rapid clinical trials of potential therapies, and those who wanted greater emphasis on basic research into fundamental cellular mechanisms. Farber was emphatically in the first camp:
We cannot wait for full understanding; the 325,000 patients with cancer who are going to die this year cannot wait; nor is it necessary, to make great progress in the cure of cancer, for us to have the full solution of all the problems of basic research. . . . The history of medicine is replete with examples of cures obtained years, decades, and even centuries before the mechanism of action was understood for these cures—from vaccination, to digitalis, to aspirin.
This is a fascinating comment. It basically expresses my gut instinct about public policies. If this is the reality facing cancer researchers—whose scientific apparatus forty years ago was vastly superior to what we will ever have in social science—Farber’s insights hold doubly for most everything else we do in public policy. Moreover, even if we had decent causal models, these models wouldn’t automatically translate into effective clinical or policy interventions.
Of course, Farber’s perspective was pretty limited, too. For a century, chemotherapy and radiation cancer treatments were basically hit-and-miss, blunderbuss exercises. To oversimplify things, but only slightly: oncologists deployed whatever powerful toxins would kill or contain cancers without killing the patient first. Clinical oncology had great scientific discipline and rigor, but this rigor resided in the ability to execute valid and informative randomized trials rather than in the ability to explain the biological mechanisms of action of the actual medications on human cancer cells.
Not surprisingly given this lack of mechanistic understanding, blunderbuss cancer therapies have often caused terrible side-effects. Moreover, effective treatments didn’t exist for many cancers, which for whatever unknown reason failed to respond to available therapies. Within the past fifteen years, researchers have used advanced biology to develop targeted drugs such as Gleevec that treat previously-untreatable cancers. These targeted therapies tend to have fewer side-effects because they attack a smaller range of human cells. One can tell similar stories about protease inhibitors in HIV/AIDS care. Such treatment advances would not have been possible without fundamental scientific advances.
I wouldn’t want to overstate things, but something analogous is sometimes possible in social policy. The academic literature on youth aggression now includes a variety of evidence-based methods to improve social-emotional and self-regulation skills, along with a growing understanding of characteristic cognitive traits and behaviors that are common among aggression-prone youth and can be addressed through cognitive-behavioral therapies and other approaches.
One might also design a boutique experimental intervention to understand specific causal mechanisms and the potential impact of influencing one particular pathway that might influence outcomes. There’s no particular reason these kinds of interventions will or should resemble policies that could be feasibly implemented on a huge scale. That’s not the intended purpose.
For example, the issue of food deserts attracts widespread attention in understanding high rates of obesity in racially segregated low-income communities. Ludwig and colleagues note that one could design a nice program to deliver fresh fruits and vegetables right to peoples’ doors. One probably wouldn’t replicate this intervention on a large scale, but it would be interesting to see if such an intervention yielded important improvements in nutrition intake and subsequent obesity. If it didn’t, that would suggest that we probably should not be very optimistic about less powerful interventions that seek to improve nutritious food access within the same populations.
The RAND Health Insurance Experiment was a little like that, too, though it was closer to real-world insurance practices of its time. It was designed to examine price elasticities of demand for different kinds of health services. Along the way, it helped illuminate important connections between health insurance, health care utilization, and actual health.
2. Trying to hit singles is more exciting (and more realistic) than trying to hit home runs.
Most of my own work has concerned policy evaluations or quality improvement efforts that seek to improve real-world interventions. This requires implementing and testing cost-effective interventions that realistically match the available human and financial resources one would require to really scale things up. So I personally find a useful and feasible HIV prevention intervention that costs $2,000 per injection drug user or per high-risk youth or per violent offender much more exciting than a truly excellent intervention that serves the same people much better at ten times the average cost. We’re not going to provide that excellent intervention to more than a relative handful of people.
Unfortunately we live in a media, political, and sometimes funding environment that puts too much of a premium on home runs, when useful singles are often more helpful and more realistic as a short-term goal. I was once in a meeting in which a foundation official asked: “Find me the next Perry.” If that is the standard, one is tempted to pin false hopes on promising but imperfect interventions. One is also tempted to dismiss the value of what one can actually do.
Feasible, well-evaluated, relatively modest interventions, done well, accumulate into genuine improvements in public policy. This is essentially the model in health care quality improvement. Simple checklists and process improvements accumulate to reduce hospital infections, prevent medical errors, and focus people’s attention on what really needs to be done. There is no particular home run intervention behind (say) the 100,000 Lives Campaign. There is merely a succession of more modest interventions implemented methodically and well.
3. We need to nurture the people and organizations who do the actual work of social policy.
This is obvious, but still deserves mention. Many promising innovations fail, or never really go anywhere, because they don’t match the ecosystem in which the work must be done. An innovative school- or hospital-based program won’t work unless the people who work in these busy and challenging environments really embrace it, and apply their own expertise and effort to make it a success.
As a public health researcher, I might be jazzed about an innovative experiment to improve the quality of school food, to increase physical activity, or to provide counseling designed to reduce violent behavior. Each of these efforts requires the active cooperation of teachers, administrators, and others to make the experiment work. These men and women generally wish innovative measures well. Yet these staff members have many other things to worry about and to do. They will be jazzed about my new thing to the extent that it helps them do their jobs better, or effectively addresses issues to which they assign some genuine urgency.
When we fail to attend to such matters, when we fail to see the world from our colleagues’ perspective, we can pretty much guarantee that we will fail.