I basically agree with the vast majority of what you wrote, and will resist the urge to write 8,000 words on the incredible nuances of every detail of my thinking.
The one overarching comment is that a huge theme of the book wasn’t how great experiments are, but rather the depth of our ignorance about the effects of our non-coercive interventions into human society. The plea of the book was to recognize this when making decisions.
My advocacy of ITT (intent-to-treat) as the default position for evaluating a trial is not because this answers the question we most want answered (it very often does not, for the reasons you describe), but because at least we can have some confidence about internal validity. That is, we can know the answer to some question, as opposed to potentially fooling ourselves about the answer to an even more important question.
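The internal-validity point can be made concrete with a toy simulation. This is a sketch of my own construction, not anything from the book: the numbers and the compliance rule (only healthier subjects take the treatment when offered) are illustrative assumptions. Comparing groups as randomized (ITT) recovers the true effect of *offering* the treatment, while dropping non-compliers ("per-protocol") produces an inflated estimate, because compliance is confounded with baseline health.

```python
# Toy simulation: why intent-to-treat (ITT) preserves internal validity
# when compliance is non-random. All parameters are illustrative assumptions.
import random
from statistics import mean

random.seed(0)
n = 200_000
TRUE_EFFECT = 2.0  # effect of actually receiving the treatment

rows = []
for _ in range(n):
    health = random.gauss(0, 1)          # unobserved baseline health
    assigned = random.random() < 0.5     # randomized assignment
    # Non-random compliance: only healthier assigned subjects take the treatment.
    treated = assigned and health > 0
    outcome = health + (TRUE_EFFECT if treated else 0.0) + random.gauss(0, 1)
    rows.append((assigned, treated, outcome))

# ITT: compare groups exactly as randomized, ignoring compliance.
itt = mean(y for a, t, y in rows if a) - mean(y for a, t, y in rows if not a)

# "Per-protocol": drop assigned subjects who did not comply.
pp = mean(y for a, t, y in rows if t) - mean(y for a, t, y in rows if not a)

print(f"ITT estimate:          {itt:.2f}")  # ~1.0: true effect of assignment (2.0 x 50% compliance)
print(f"Per-protocol estimate: {pp:.2f}")   # ~2.8: inflated by selection on health
```

The ITT estimate answers a narrower question (what does assigning the treatment do?), but it answers it correctly; the per-protocol estimate looks like it answers the question we care about and is simply wrong.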
Similarly, when it comes to non-experimental methods, as you note, I advocate using them. But I tried to make the point that even a collection of such analyses doesn’t only compete with other analytical methods or “make the decision blindly,” but also with the alternative of giving local operational staff wide discretion to make decisions. Over time, and viewed from a high level, this is a process of unstructured trial and error, which I see as the base method for making progress in knowledge (or at least, as I put it, implicit knowledge) of human society.
Finally, one narrow point: I think (and tried to describe at length in the book) that the causal mechanism linking smoking to lung cancer is qualitatively different from that of social interventions, and therefore that the Hill approach [relying on many non-experimental studies when RCTs are impossible] does not generalize well from medicine to sociology.
I think it’s difficult to say that because the Levitt abortion-crime regression isn’t robust, we can therefore conclude that abortion didn’t cause some material reduction in crime. A key argument in the book is that the regression method (or, more generally, pattern-finding) is insufficient to tease out the causal effects of interventions. As I said in the book, I think the rational conclusion, based only on the various analyses published on the subject, isn’t “no material effect” but rather “don’t know.”
This goes back to the applicability of the Hill method to social phenomena. It’s why I think the research that compares non-experimental estimates of intervention effects with what is subsequently measured in RCTs, and shows that these methods don’t reliably predict the true effect, is so important. And the kinds of interventions that can be subjected to RCTs are generally simpler than things like “legalize abortion in several American states” that cannot be. So, if anything, the very interventions that are analyzed non-experimentally should be harder to evaluate than the kinds of interventions that can be tested.
What I think would be very practically useful is a large enough sample of paired non-experimental and RCT analyses of the same interventions, so that we could develop rules of thumb for where non-experimental approaches provide reasonable estimates of causal effect. I researched this for the book, and while a number of such studies have been done (I footnoted them), as far as I can see there is nothing like the breadth of coverage needed to support such an analysis.
Thanks Jim, you get the final word!