Pay-for-performance has been called “the biggest bandwagon in healthcare.” Proposals are everywhere, from the latest “doc fix” plans to the Medicaid expansion in Arkansas. But we sure could use a well-designed study—dare I say a randomized controlled trial?—to finally settle the question of whether it works. A recent paper has the rigorous design and positive results many may have been waiting for, but its implications leave something to be desired.
In an RCT published in JAMA, Laura Petersen and colleagues tested the effect of financial incentives for blood pressure management in 12 VA primary care clinics:
Participants earned incentives for achieving JNC 7 guideline–recommended blood pressure thresholds or appropriately responding to uncontrolled blood pressure (e.g., lifestyle recommendation for stage 1 hypertension or guideline-recommended medication adjustment), prescribing guideline-recommended antihypertensive medications, or both. […] Results from simulations using pilot data and accounting for estimated improvement rates showed a maximum per-record reward of $18.20, $9.10 for each successful measure.
The incentives combined process measures (providing appropriate treatment) with outcome measures (achieving blood pressure control), using a solid, cluster-randomized design—some clinics got incentives, some got their usual salaries. And the incentives worked:
We found that physicians […] were more likely than controls to improve their treatment of hypertension as measured by achievement of blood pressure control or appropriate response to uncontrolled blood pressure. Thus, a typical study physician […] with 1000 patients with hypertension would be expected to have about 84 additional patients achieving blood pressure control or receiving an appropriate response after 1 year of exposure to the intervention.
In other words, the authors got what they were looking for. They increased treatment rates of a potentially fatal condition using carefully designed incentives. How carefully? The authors tell us themselves:
What aspects of the design and implementation of our study may have contributed to our findings? First, our measures are meaningful process measures to clinicians. Second, we measured and rewarded actions mostly under the control of physicians and their practice teams. Because blood pressure is not completely under a clinician’s control, we rewarded a combined measure of blood pressure control or an appropriate response to an uncontrolled blood pressure (a so-called “tightly linked” measure). Third, responding to an abnormal blood pressure is a discrete task, as opposed to complex problem-solving, such as diagnosing the etiology of abdominal pain. Fourth, we rewarded participants for their absolute rather than relative performance, avoiding a tournament or competition.
The strength of their intervention is that they made it incredibly simple for a doctor to get a payout: 1) Measure blood pressure, 2) Treat accordingly. But that strength morphs into a weakness when it comes to what we’re supposed to do with this study. As the authors note, it didn’t require complex diagnosis—really any diagnosis, for that matter, since blood pressure is measured at every physician visit. And it didn’t require treating the blood pressure successfully; everyone who tried got an A for effort. So in this particular condition (hypertension), with this particular incentive structure, pay for performance works beautifully. The RCT was a success.
My question is what this means for the messy world of real patient care. A day in any clinic involves much more than diagnosing and treating hypertension. Many diseases are harder to diagnose; many are harder to treat. And the authors acknowledge that responding to hypertension is nothing like figuring out what’s causing abdominal pain. Are we ready to use P4P for every condition that presents to the everyday clinic? Or the few that are most common? Are we willing to rely on measures as “tightly linked” as attempted treatment, or are we going to make it trickier by focusing on outcomes?
Models that incentivize a wider array of processes and outcomes are out there, of course. Medicare’s plan for value-based physician pay uses specialty-specific process measures as well as risk-adjusted healthcare utilization. But every metric added dilutes the connection between one’s own work and the incentive. There are gobs of research, covered here at TIE and elsewhere, echoing the premise that getting P4P right in healthcare is possible but maddeningly tricky. What this study tells us is that, at the micro level, P4P does indeed “work.” What I want to know is, how do we use this evidence to create incentives that work in everyday medical practice?
Other relevant TIE posts here: