• How not to compare observational studies to a randomized trial

    Initially, I was excited about this review from The Cochrane Collaboration, which concluded, “[O]n average, there is little evidence for significant effect estimate differences between observational studies and RCTs.”

    Then I read it and, unless I’m confused about something, I think they asked the wrong question and got a useless answer.

    In the literature from 1990-2013, the authors found 14 studies that compared results of observational studies to those of RCTs. A subset of these 14 studies examines one specific condition or treatment. That’s a worthwhile exercise, provided the observational studies used sound methods. (Some reasonable criteria must be applied.) I’d very much like to know if a collection of observational studies using good methods can be meta-analyzed to yield something close to the result of an RCT. This is basically a variant of the question of whether there’s wisdom in crowds.

    Another subset of the 14 studies identified consists of RCT vs observational comparisons that lump together a swath of conditions/treatments. This is dumb. I don’t care at all if a collection of observational studies looking at different conditions and treatments, on average, is close to corresponding RCTs. In fact, I’d expect, on average, that they would be close since they’d be randomly biased in different directions (effects both smaller and larger than found in RCTs).

    Worse, the Cochrane study took all of these 14 studies and meta-analyzed them. Examining everything in one glop like this, they found exactly what you’d expect. With stuff randomly biased positive and negative, they got close to zero apparent overall bias. This is not useful information. It’s a meta-analysis of things that are heterogeneous in a way that guarantees a result one could predict in advance.
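
    To make the cancellation argument concrete, here is a minimal simulation sketch. It is not the Cochrane authors' method or data. It simply assumes that each topic's observational-vs-RCT discrepancy is drawn from a zero-centered Gaussian; the function name, the number of topics, and the spread of 0.3 are all hypothetical choices for illustration.

```python
import random
import statistics


def simulate_pooled_bias(n_topics=200, bias_sd=0.3, seed=0):
    """Simulate per-topic bias (observational estimate minus RCT estimate).

    Assumption (illustrative only): biases are Gaussian, centered on zero,
    with standard deviation `bias_sd`. Returns the mean signed bias and the
    mean absolute bias across topics.
    """
    rng = random.Random(seed)
    biases = [rng.gauss(0.0, bias_sd) for _ in range(n_topics)]
    mean_signed = statistics.fmean(biases)
    mean_abs = statistics.fmean([abs(b) for b in biases])
    return mean_signed, mean_abs


mean_signed, mean_abs = simulate_pooled_bias()
# Pooled across topics, positive and negative biases largely cancel...
print(f"mean signed bias:   {mean_signed:+.3f}")
# ...even though the typical individual discrepancy is not small.
print(f"mean absolute bias: {mean_abs:.3f}")
```

    Under these assumptions, the pooled signed bias lands near zero while the typical per-topic discrepancy stays much larger, which is why an average across heterogeneous topics says little about whether any single observational result can be trusted.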

    By the way, it’s not a foregone conclusion that (unbiased) observational studies should match RCT results. It’s reasonable to expect that an RCT focused on a carefully selected subpopulation might produce different results than an observational study (or a collection of them) focused on a broader population. RCTs are awesome at internal validity but not external validity. So, though it’s worth asking whether a collection of observational studies matches an RCT on the same subject, that’s only true if you believe they should match, i.e., that the RCT’s sample is representative of the populations examined by the observational studies.

    Ultimately, there’s just no shortcut to looking carefully at study designs and samples.


