RCTs and observational data (1): The problem with RCTs

To have truly evidence-based medical care, we need lots of evidence. How do we get it? Many of us want to build new learning health systems that continuously acquire new evidence and thereby improve care. In this post and the next, I’ll discuss the prospects for using large observational data sets (“big data”, but don’t tell him I said that) to advance evidence-based care.

According to David Sackett,

Evidence based medicine is the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.

But what is the “best evidence”? The best evidence will tell us whether our clinical interventions cause benefits for patients and how large that benefit is.

How do we get this causal knowledge? One view is that we need to generate evidence by running lots of randomised controlled trials (RCTs), because

Randomised controlled trials are the most rigorous way of determining whether a cause-effect relation exists between treatment and outcome and for assessing the cost effectiveness of a treatment.

Which is great. Unfortunately, I doubt we can achieve truly evidence-based medicine through RCTs alone. RCTs are too expensive and too slow. As a result, according to Christopher Longhurst and his colleagues,

Even in the well-studied field of cardiology, only 19 percent of published guidelines are based on randomized controlled trials.

RCTs often use a narrowly selected subset of patients, which increases their efficiency at the cost of limiting how well their results generalize to the wider patient population. Most importantly, RCTs typically control treatment delivery very closely, which is not how routine medicine works. As a result, the RCT’s estimate of the size of the treatment effect is often greater than what we see in everyday care.

There are two directions we can go at this point. One is to try to make RCTs less expensive and more similar to routine care. The other, discussed in the next post, is to estimate the causal effect of treatment from observational data.


For more thoughts about methods and causality, see Austin’s recent posts here, here, here, and here.
