• The history of the politics and abuse of methodology

    About my post on RCTs’ gold-standard reputation, below is the text of an email from a reader who wishes to remain anonymous. I’m posting it not because of the compliments (I do like them, though) but because I am grateful for the final two paragraphs on the history of the politics and abuse of methodology, about which I know very little.

    The comments are open for one week. Chime in if you know any relevant history. Bring the dirt! References welcome.

    I think when thinking about the role RCTs have in medicine you’re dead-on when saying the conceptual simplicity is really, really important. The people who read economics journals almost all have major quant training. Doctors are supposed to understand medical journals and most of them have very little.

    I’d bring up 3 related points.

    The first is FDA. B/c medications must be approved with 2 pivotal trials, we’re all used to seeing RCTs regularly and seeing them as literally the government’s official imprimatur of success.

    The second is marketing. In the ’90s pharma figured out how to use RCTs to its advantage. Design massive trials in highly selected populations, don’t look hard for side effects, and don’t publish the negative trials. If p<0.05, market to everyone. Gold standard blockbuster! For example, there’s reason to worry about whether SSRIs help much of anyone. http://www.nejm.org/doi/full/10.1056/NEJMsa065779

    The third, supporting your contention, is history. In the ’90s there were really 3 schools fighting over how clinical data should be used in clinical practice. At Yale, Alvan Feinstein wanted a very detail-oriented, methods-based clinical epidemiology. David Eddy instead envisioned systems of care involving RCTs, decision analyses, and decision support. Finally, at McMaster, Sackett, Guyatt, and others developed a very simple view of the evidence hierarchy. McMaster won. Their success in marketing a simple approach toward clinicians with books, curricula, and doctor-focused series in JAMA was central to that.


    • My understanding is that in economics the original goal of RCTs was to estimate important, presumably fixed parameters like the elasticity of labor supply, price elasticity of health care demand, etc.

      Then at some point it became clear to people that these parameters varied a lot across contexts and even theoretically were poorly defined. (Frisch or Hicksian? Is the price of healthcare $1/$1 at the margin, or $0 since you know you will hit the out-of-pocket (OOP) max?)

      My guess is that at some point this old focus will become the goal of RCTs in medicine, because when you are doing “personalized medicine” no one is going to care about the average treatment effect (ATE); they want to know how the medicine will affect them. And in theory we can make good predictions based on your genome, methylation patterns, etc. and some computer model, once you calibrate the parameters using experiments.

    • I only know of statistical anecdotes from bicycle safety studies, where selection effects and conventional wisdom rule.

      There’s at least one researcher who collected data showing that helmets reduce the incidence of leg injuries (though that’s not what he claimed in his paper, of course). Years later, he and co-authors compiled data showing that cities that introduced bike share saw injuries of all forms fall in *absolute numbers*, yet presented this as “adding bike share increases the proportion of bicycle head injuries” (because head injuries fell less than other injuries did).

      The first error was a failure by selection effect (children whose parents manage to get them to wear helmets are different from those whose parents don’t); the second was a failure by fitting data to the desired result. An RCT would have fixed the first problem, but not the second.

      The takedown is here: http://www.cyclelicio.us/2014/bike-share-head-injury-helmets-safety/

      (Something that needs to be studied is the apparent safety of bike share; it is disproportionately urban which makes it safer than cycling in general, but it has turned out to have far fewer fatalities — zero — than anyone expected, and having biked home in the company of bike-share cyclists, I can tell you that they are not all slow.)

      In another RCT-but-not-blinded bicycle safety study, the authors had to (attempt to) correct for “test group enthusiasm” — daytime running lights were observed to reduce the (reported) rate of single-party daytime crashes, which is implausible. ( http://dx.doi.org/10.1016/j.aap.2012.07.006 , paywalled. )

      Presumably this would be a problem for any study where the test group can figure out that they are the test group — I recall a classmate’s rat study years ago of a drug that tasted so awful (Dilantin) that both control and test groups had to be fed sugar water in such quantities that they became obese. (No, I don’t have a citation for that one.)
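
The bike-share comment above turns on a piece of arithmetic: the proportion of injuries that are head injuries can rise even while the absolute number of head injuries falls, as long as other injuries fall faster. Here is a minimal sketch of that point in Python. The counts are hypothetical, invented purely for illustration, and are not taken from the study or the takedown linked above.

```python
from fractions import Fraction


def head_share(head, other):
    """Return the fraction of all injuries that are head injuries."""
    return Fraction(head, head + other)


# Hypothetical injury counts, made up for illustration only.
before = {"head": 100, "other": 200}  # before bike share
after = {"head": 90, "other": 120}    # after bike share

share_before = head_share(before["head"], before["other"])
share_after = head_share(after["head"], after["other"])

# Absolute head injuries fall, yet their share of all injuries rises,
# because the other injuries fell by more.
print(f"head injuries: {before['head']} -> {after['head']}")
print(f"head share:    {float(share_before):.1%} -> {float(share_after):.1%}")
```

With these made-up numbers, both injury categories improve in absolute terms, but a headline built on the share alone would still read as "head injuries up".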