The following is a guest post from Dr. Stephen Soumerai and Dr. Ross Koppel about a study of Pioneer ACOs recently published in the New England Journal of Medicine, which Austin summarized here. Dr. Soumerai is a professor at Harvard Medical School and the Harvard Pilgrim Health Care Institute. Dr. Koppel is a professor of sociology at the University of Pennsylvania and conducts health-care research at Penn and Harvard. The authors of the paper about which Drs. Soumerai and Koppel comment will respond on TIE shortly [link will work when the post goes live].
The April 16 New England Journal of Medicine article by McWilliams et al., close and respected colleagues, concludes that 32 nationally recognized medical care organizations, selected by CMS as “Pioneer Accountable Care Organizations (ACOs),” saved 1.2% in health care spending compared with a sample of other medical organizations that are not comparable (as we shall explain). Our goal is to advance the dialogue within a learning health system, because we have an obligation to consider methodological issues to inform ourselves, our students, and health care policy.
The comparison group was composed of medical organizations that were not selected as “Pioneer Accountable Care Organizations” and that operated under a different reward system. This is an important distinction when the hypothesized reason for the differences is that the Pioneer group received financial incentives for lowering costs.
The small difference between the groups (1.2%) appears uncertain in comparison to what might have happened anyway, given the irreconcilable differences between the winners (the study group) and the losers (the controls). This is the very definition of non-trivial selection bias, which goes unmentioned in the limitations of the report. In fact, there are two forms of selection bias here: both volunteer selection and “cream of the crop” selection. That is: (1) the ACOs had to apply for Pioneer ACO status via a formal proposal; and (2) CMS then selected the most desirable ACOs from among the “winning” candidates. In other words, CMS, the leader of the initiative, selected these ACOs from a large field of applicants because they were judged to “have experience offering coordinated, patient centered care, … and offering… quality care to their patients, along with other criteria listed in the Request for Applications (RFA) document available at www.innovations.cms.gov.” (This is not mentioned in the study.)
Despite Herculean statistical attempts to level the playing field in this observational study, dozens of unmeasured factors may have been responsible for the less-than-overwhelming 1.2% “savings,” such as the understandable inclination of the ACOs to delay cost-saving measures until they are paid for.
Perhaps this national experiment worked? Or maybe it didn’t (consistent with international systematic reviews of related studies of pay-for-performance). But shall we base policy on tiny and possibly spurious results? (The rapid dropout of over a third of the ACOs from the program raises further questions about efficacy and failure.)
The finding that the most expensive ACOs before the program reduced costs the most after the ACO contract began sounds very much like a statistical bias called regression to the mean. This is akin to saying that groups of students who score poorly on an initial test score better the next time. But, alas, it is a well-known statistical artifact, demonstrated countless times: “what goes up, comes down; what starts low, looks better on the re-test.” While it’s true that the authors sought to address this bias in a statistical appendix, one could argue that even a very small and undetectable bias could derail such small reported savings.
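To see why this artifact matters, here is a minimal, purely illustrative simulation (our own hypothetical numbers, not data from the study): organizations have stable underlying spending levels, observed each year with random noise, and no intervention whatsoever. Selecting the highest spenders at baseline still produces an apparent “savings” the following year.

```python
import random

random.seed(0)

N = 10_000          # hypothetical organizations (illustrative only)
TRUE_MEAN = 100.0   # stable underlying spending level
NOISE_SD = 15.0     # year-to-year random fluctuation

# Observed spending = stable level + independent noise each year.
true_levels = [random.gauss(TRUE_MEAN, 10.0) for _ in range(N)]
year1 = [t + random.gauss(0, NOISE_SD) for t in true_levels]
year2 = [t + random.gauss(0, NOISE_SD) for t in true_levels]

# Select the top decile of spenders in year 1 -- analogous to
# "the most expensive ACOs before the program."
cutoff = sorted(year1, reverse=True)[N // 10]
selected = [i for i in range(N) if year1[i] > cutoff]

mean_y1 = sum(year1[i] for i in selected) / len(selected)
mean_y2 = sum(year2[i] for i in selected) / len(selected)

print(f"Selected group, year 1 mean: {mean_y1:.1f}")
print(f"Selected group, year 2 mean: {mean_y2:.1f}")
# With no intervention at all, the group's year-2 mean falls back
# toward the overall mean, mimicking an apparent "savings."
```

The selected group’s second-year mean is reliably lower than its first-year mean, even though nothing changed, which is exactly the pattern that regression to the mean predicts.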
Furthermore, even assuming that the 1.2% savings are real, what was the unmeasured investment required for this national program? (This limitation is fully acknowledged by the authors.) Should we be better informed before we support the additional incentive payments suggested by the authors? Should we also include the costs of implementation to government, medical organizations, and patients? These expenses may exceed any small savings, which is especially relevant in what is the most costly health care system in the world.
It’s very possible that ACOs will save money, but the short observation time of the study may not allow the authors to demonstrate this. The most influential medical journals should emphasize the most rigorous randomized and quasi-experimental study designs when possible (Shadish, Cook, and Campbell, Wadsworth Cengage Learning; 2002). We must study both the intended and perverse effects of monetary incentives. After all, we often learn more from our failures than from our successes.