In “Assessing the case for social experiments,” James Heckman and Jeffrey Smith warn against mistaking the apparent simplicity of randomized controlled trials for actual simplicity. Trials are not so simple when the assumptions on which they rely are violated. (How often those assumptions are violated, and how much that threatens the validity of findings, is not clear, but it’s plausible that they are violated to some extent in a nontrivial proportion of cases.)
In an experiment, the counterfactual is represented by the outcomes of a control group generated through the random denial of services to persons who would ordinarily be participants. [… T]wo assumptions must hold. The first assumption requires that randomization not alter the process of selection into the program, so that those who participate during an experiment do not differ from those who would have participated in the absence of an experiment. Put simply, there must be no “randomization bias.” Under the alternative assumption that the impact of the program is the same for everyone (the conventional common-effect model), the assumption of no randomization bias becomes unnecessary, because the mean impact of treatment on participants is then the same for persons participating in the presence and in the absence of an experiment.
The second assumption is that members of the experimental control group cannot obtain close substitutes for the treatment elsewhere. That is, there is no “substitution bias.” […]
It has been argued that experimental evidence on program effectiveness is easier for politicians and policymakers to understand. This argument mistakes apparent for real simplicity. In the presence of randomization bias or substitution bias, the meaning of an experimental impact estimate would be just as difficult to interpret honestly in front of a congressional committee as any nonexperimental study. The hard fact is that some evaluation problems have intrinsic levels of difficulty that render them incapable of expression in sound bites. Delegated expertise must therefore play a role in the formation of public policy in these areas, just as it already does in many other fields. It would be foolish to argue for readily understood but incompetent studies, whether they are experimental or not.
Moreover, if the preferences and mental capacities of politicians are to guide the selection of an evaluation methodology, then analysts should probably rely on easily understood and still widely used before-after comparisons of the outcomes of program participants. Such comparisons are simpler to explain than experiments, because they require no discussions of selection bias and the rationale for a control group. Furthermore, before-after comparisons are cheaper than experiments. They also have the advantage, or disadvantage, depending on one’s political perspective, that they are more likely to yield positive impact estimates (at least in the case of employment and training programs) due to the well-known preprogram dip in mean earnings for participants in these programs.
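Substitution bias, in particular, is easy to see in a toy simulation. The sketch below is my own illustration, not the authors’, and every number in it is invented. It compares the experimental treatment-control contrast when controls receive nothing with the contrast when many controls obtain a close substitute on their own.

```python
# Toy illustration of substitution bias (all numbers invented).
# Assume the program raises the outcome by 5.0 and a close substitute
# available outside the experiment raises it by 4.0. If many controls
# obtain the substitute, the treatment-control contrast no longer
# measures the effect of the program relative to no services at all.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_effect = 5.0        # assumed effect of the program
substitute_effect = 4.0  # assumed effect of the outside substitute

baseline = rng.normal(50.0, 10.0, size=n)  # outcomes absent any services
assign = rng.random(n) < 0.5               # random assignment to treatment

y_treated = baseline[assign] + true_effect

# Case 1: controls receive nothing (no substitution bias).
y_controls = baseline[~assign]
print("no substitution:  ", y_treated.mean() - y_controls.mean())      # ~5.0

# Case 2: 60% of controls find a close substitute elsewhere.
gets_substitute = rng.random((~assign).sum()) < 0.6
y_controls_sub = baseline[~assign] + substitute_effect * gets_substitute
print("with substitution:", y_treated.mean() - y_controls_sub.mean())  # ~2.6
```

The second contrast is a perfectly good estimate of something, namely the program relative to whatever services the controls went out and found, but it is not the effect of the program relative to no services. Explaining that distinction to a congressional committee is exactly the interpretive problem Heckman and Smith have in mind.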
In fact, I frequently see policy arguments made with just the sort of before-after evidence Heckman and Smith describe. A familiar theme these days is that anything that’s happened in health care since March 2010 is due to Obamacare. Nothing could be more preposterous,* yet this is all a politician needs for a talking point.
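And to see why the preprogram dip makes before-after comparisons so flattering, here is another sketch of my own, again with invented numbers: participants enroll after a transitory drop in earnings that would have faded on its own, so a before-after comparison credits the program with the rebound, while a randomized control group does not.

```python
# Toy illustration of the preprogram dip (all numbers invented).
# Participants enroll after a transitory earnings shock. Earnings would
# have rebounded even without the program, so a before-after comparison
# attributes the rebound to the program and overstates its impact.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_effect = 1_000.0  # assumed true impact on annual earnings

permanent = rng.normal(20_000.0, 4_000.0, size=n)  # long-run earnings level
dip = rng.uniform(2_000.0, 6_000.0, size=n)        # transitory preprogram dip

pre = permanent - dip        # earnings in the year before enrollment
post_untreated = permanent   # the dip fades even without the program
post_treated = permanent + true_effect

# Before-after comparison: participants' own change in earnings.
print("before-after estimate:", (post_treated - pre).mean())         # ~5,000

# Experimental comparison: randomize would-be participants.
assign = rng.random(n) < 0.5
print("experimental estimate:",
      post_treated[assign].mean() - post_untreated[~assign].mean())  # ~1,000
```

In this made-up example the before-after number is several times the true impact, and nothing about the program itself had to change to produce it.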
* Well, that’s not true. It’d be more preposterous to say that anything that’s happened in health care since March 2010 is due to the 2020 presidential election. That would not fly as a talking point. Not yet, anyway.