• What’s Convincing Evidence?

    Robin Hanson’s response to my post is interesting. He writes,

    More likely than not, medicine is on average useless or harmful on the margin; that is my best reading of the evidence. When I try to persuade folks, I start with our single best data point, the old RAND experiment, and people complain it was too small, short, and long ago (and it let folks leave too easily). …

    Without a single very big very clear study there is simply no way to convince most folks that marginal medicine is on average useless, and we should cut way back.

    That old RAND health insurance experiment (HIE) was a single, big study, and Hanson is finding it hard to convince people of its merit. But he thinks an even bigger one will be more convincing. I’m very skeptical on that point. Given that, I think the opportunity cost of a half-billion-dollar, ten-year repeat of the RAND HIE is too high.

    Notice I’m not questioning the results of the RAND HIE. Let’s just assume for the sake of this debate that I agree with Hanson on its interpretation. In fact, what we could be debating is how he and I, together and in agreement, could convince others of our jointly held point of view.* He wants to spend a half-billion and ten years on another RAND HIE. I want to fund about 1,000 roughly two-year observational studies.

    Anybody want to put up a billion dollars? Hanson and I will split it, fund our studies, and then see how many people we each can convince. I would hope to be able to say things like, “In a review of 1,000 studies conducted over the last few years by different scholars using different data and methods, all addressing the marginal value of medical care, the vast majority found X.” He’d be able to say things like, “The largest randomized study ever conducted on the topic found X.” Then critics would begin to attack the studies. He’s got one massive target. I’ve got 1,000. I like my odds.

    * Let me be clear. I’m not committing to sharing Hanson’s view on medical care or the RAND HIE. To make that commitment we’d have to have a much longer discussion. But that’s not important for this debate. For this one, let us presume I agree with him except on the question of what study or studies to do to convince people of what he believes.

    • We *already* have thousands of med studies, and yet you can’t clearly say what most of them found about the aggregate health value of med on the margin, because most were not directly on that. Looking backward, the old RAND study is a much clearer source; why wouldn’t that also be true for a new similar study looking forward?

    • @Robin Hanson – If the old study is clear, why is it not convincing? Or if it isn’t convincing because of flaws and elapsed time, how are those to be avoided in a new study? You’re expecting perfection where none can exist.

      If observational studies to date haven’t focused on the issue at hand then that’s a reason to fund some. Why not do those first? They’re fast and cheap.

      We won’t agree. Nor is there need to.

    • I said “clearer” not completely clear. Flaws can be avoided with more funding, elapsed time fixed by new study. You never said what the thousand new studies would be on, so I didn’t presume they were on what I want, just on whatever.

      • @Robin Hanson – “Flaws can be avoided with more funding, elapsed time fixed by new study.” — As is true for observational studies as well.

        “You never said what the thousand new studies would be on…” — If they could be on anything then it isn’t a fair comparison. But the paragraph that begins with “Anybody want to put up a billion dollars?” should make it clear. As I said, I’d rather enter a debate with the results of 1,000 observational studies than the result of one big randomized one.

    • Time after time, observational studies purport to find one thing while randomized studies show something else (observational studies never get a good handle on selection effects, unobservables, compliance effects, etc.). So this is sort of like saying that 1,000 counterfeit pennies beat one genuine hundred-dollar bill, because a thousand is more than one.

      • @Stuart Buck – Agreed. Observational studies are not trivial to do well. But techniques exist to do so. If only everyone knew how to follow them.

        I think the fatal flaw in the study Robin Hanson proposes is its duration, as I wrote in my original post. He thinks the fatal flaw in observational studies is bias; you raise issues of selection and endogeneity.

        I’m convinced selection and endogeneity can be addressed, and have been in many good studies (on other topics). Bias exists everywhere and in many forms, which is why I trust a body of work more than one single study.

        There really is no end to this argument. But it might help if I said that I do generally prefer randomized experiments to observational ones, just not in all cases. We have other modes of inquiry and they are valuable, useful, and necessary.

    • Focusing on the number of *studies* (1000 vs. 1) does not seem like the most relevant frame. Instead, the merit of each approach will depend on the total sample size with comparable treatments, and on how clearly the causal effect of the extra med can be shown.

      W/r/t length quandaries, I say we think long now.

      • @Andy McKenzie – Quite right. But the number is relevant in two respects: (a) total cost and (b) the more studies that are done, the more likely it is that some will be done well. With one big study, critics’ job is easy: find a flaw, any flaw. With many studies, all done somewhat differently, a consistent pattern of results is much harder to refute. The theories of refutation have to satisfy many more constraints.

        There are really two issues going on here. One is what do we take as the best scientific evidence? The other is, what is most convincing? Those are different things. Anyone who has debated or gotten a paper or proposal through a scientific peer-review process knows that supporting one’s argument with multiple non-overlapping sets of evidence is more persuasive than doing so with just one piece of evidence.

        Hanson wants the one big one, at great expense, because he thinks it’ll be most convincing. I disagree on that point. Perhaps this is only a difference about what he and I find convincing. Since that’s subjective, there’s no possible resolution.

    • “Observational studies are not trivial to do well. But techniques exist to do so.”

      Supposedly, yes, but I’m not convinced that IVs and propensity score matching are as good as randomized studies. A professor I know says that the only instrument he really trusts is a randomized lottery, which can be used in an ITT analysis. So that’s back to randomization.

      • @Stuart Buck – Observational studies are generally not as good as randomized ones in terms of controlling for all factors relevant to selection and causal inference. I never claimed otherwise. But when the cost of the randomized study is so high and results from it will take so long to emerge, I think it is reasonable to consider alternatives. There are also things one cannot learn through randomized trials, due either to ethical considerations or to difficulties of recruitment. If observational studies can be good enough to learn things in one domain, there is no reason to reject them in others.

    • If you look at Table 3 here: http://www.csicop.org/si/show/science_and_pseudoscience_in_adult_nutrition_research_and_practice , there are 14 examples of medical claims made on the basis of observational studies that turned out to be false (or at least not proven) when a clinical trial was done. Based on findings like this, I have a nearly insurmountable prior of skepticism that any accumulation of observational medical studies would outweigh a good randomized study.

      • @Stuart Buck – So for things for which no randomized trials can be done (due to ethical considerations or lack of ability to recruit anyone) what do you do? Are there not things you believe about which no randomized trials have been done? There must be. Very little of what we “know” is due to evidence from a random-design experiment.

        In any case, let us agree to disagree. I believe a lot can be learned from observational studies AND they are hard to do well.