Money can’t buy everything (though it can, apparently, buy blood pressure management)

    This is a guest post by Karan Chhabra, a medical student at Rutgers Robert Wood Johnson Medical School and blogger at Project Millennial. Find him on Twitter at @krchhabra.

    Pay-for-performance has been called “the biggest bandwagon in healthcare.” Proposals are everywhere, from the latest “doc fix” plans to the Medicaid expansion in Arkansas. But we sure could use a well-designed study—dare I say a randomized controlled trial?—to finally settle the question of whether it works. A recent paper has the rigorous design and positive results many may have been waiting for, but its implications leave something to be desired.

    In an RCT published in JAMA, Laura Petersen and colleagues tested the effect of financial incentives for blood pressure management in 12 VA primary care clinics:

    Participants earned incentives for achieving JNC 7 guideline–recommended blood pressure thresholds or appropriately responding to uncontrolled blood pressure (e.g., lifestyle recommendation for stage 1 hypertension or guideline-recommended medication adjustment), prescribing guideline-recommended antihypertensive medications, or both. […] Results from simulations using pilot data and accounting for estimated improvement rates showed a maximum per-record reward of $18.20, $9.10 for each successful measure.
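    To make the payout arithmetic concrete, here is a minimal sketch of that rule in Python. The record fields and function name are my own invention, not the study’s; only the two measures themselves and the $9.10-per-measure and $18.20-per-record figures come from the quoted description.

```python
# Minimal sketch (not from the paper) of the per-record payout rule quoted above.
# The record fields and function name are hypothetical; the $9.10-per-measure
# reward and $18.20 per-record maximum come from the study's simulation figures.

REWARD_PER_MEASURE = 9.10  # dollars per successful measure


def per_record_payout(bp_controlled: bool,
                      appropriate_response: bool,
                      guideline_medication: bool) -> float:
    """Payout for one patient record under the study's two measures.

    Measure 1 is the combined ("tightly linked") measure: blood pressure
    control was achieved OR the physician responded appropriately to an
    uncontrolled reading. Measure 2 is prescribing guideline-recommended
    antihypertensive medication.
    """
    payout = 0.0
    if bp_controlled or appropriate_response:
        payout += REWARD_PER_MEASURE  # combined outcome/process measure
    if guideline_medication:
        payout += REWARD_PER_MEASURE  # process measure
    return payout  # at most 2 x $9.10 = $18.20 per record


# A patient whose pressure stays uncontrolled but whose physician adjusts
# medication per guidelines still earns both rewards: an "A for effort."
print(per_record_payout(bp_controlled=False,
                        appropriate_response=True,
                        guideline_medication=True))  # 18.2
```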

    The incentives combined process measures (providing appropriate treatment) with outcome measures (achieving blood pressure control), using a solid, cluster-randomized design—some clinics got incentives, some got their usual salaries. And the incentives worked:

    We found that physicians […] were more likely than controls to improve their treatment of hypertension as measured by achievement of blood pressure control or appropriate response to uncontrolled blood pressure. Thus, a typical study physician […] with 1000 patients with hypertension would be expected to have about 84 additional patients achieving blood pressure control or receiving an appropriate response after 1 year of exposure to the intervention.

    In other words, the authors got what they were looking for. They increased treatment rates of a potentially fatal condition using carefully designed incentives. How carefully? The authors tell us themselves:

    What aspects of the design and implementation of our study may have contributed to our findings? First, our measures are meaningful process measures to clinicians. Second, we measured and rewarded actions mostly under the control of physicians and their practice teams. Because blood pressure is not completely under a clinician’s control, we rewarded a combined measure of blood pressure control or an appropriate response to an uncontrolled blood pressure (a so-called “tightly linked” measure). Third, responding to an abnormal blood pressure is a discrete task, as opposed to complex problem-solving, such as diagnosing the etiology of abdominal pain. Fourth, we rewarded participants for their absolute rather than relative performance, avoiding a tournament or competition.

    The strength of their intervention is that they made it incredibly simple for a doctor to get a payout: 1) Measure blood pressure, 2) Treat accordingly. But that strength morphs into a weakness when it comes to what we’re supposed to do with this study. As the authors note, it didn’t require complex diagnosis—really any diagnosis, for that matter, since blood pressure is measured at every physician visit. And it didn’t require treating the blood pressure successfully; everyone who tried got an A for effort. So in this particular condition (hypertension), with this particular incentive structure, pay for performance works beautifully. The RCT was a success.

    My question is what this means for the messy world of real patient care. A day in any clinic involves much more than diagnosing and treating hypertension. Many diseases are harder to diagnose; many are harder to treat. And the authors acknowledge that responding to hypertension is nothing like figuring out what’s causing abdominal pain. Are we ready to use P4P for every condition that presents to the everyday clinic? Or the few that are most common? Are we willing to rely on measures as “tightly linked” as attempted treatment, or are we going to make it trickier by focusing on outcomes?

    Models that incentivize a wider array of processes and outcomes are out there, of course. Medicare’s plan for value-based physician pay uses specialty-specific process measures as well as risk-adjusted healthcare utilization. But every metric added dilutes the connection between one’s own work and the incentive. There are gobs of research, covered here at TIE and elsewhere, echoing the premise that getting P4P right in healthcare is possible but maddeningly tricky. What this study tells us is that, at the micro level, P4P does indeed “work.” What I want to know is, how do we use this evidence to create incentives that work in everyday medical practice?

    • Population health management, P4P, and incentives for both physician and patient via engagement are at the core of the article. Both Optum and Humana have substantial data from various pilots, which also include patient engagement with patient incentives.

      Interestingly, both companies are presenting together on a joint panel at this week’s MGMA annual conference, addressing value health, reimbursement models, and linking clinical performance to reimbursement.

    • Interesting stuff; and yes, I agree that creating incentive systems for health care providers is “maddeningly tricky.”

      But I’m not sure this study (and many others like it) proves that monetary incentives were key. During my years at Kaiser Permanente we carried out a similar effort over a couple of years — measuring in several clinics how often blood pressures were (a) taken; (b) rechecked if high; (c) acted on by the clinician; and (d) actually controlled. It took a lot of effort to set up those measurements, but we learned that simply measuring the activities, and then reporting on them in a monthly staff meeting, improved performance — without the application of additional incentives.

      There’s an old lesson, of course — you can improve what you measure. The “maddening” part comes from wondering why skilled well-meaning providers weren’t already performing well without such measurements. But in fact they weren’t, because clinicians are under huge time constraints; they have thousands of clinical priorities competing for each minute of their time, and blood pressure control is rarely Urgent, while other tasks are. (By the way, Kaiser Permanente and other large medical groups are hardly unique in having to deal with worsening time constraints.)

      For my Department, I once made a list of specific clinical tasks that the doc was “supposed” to do, in order to make that visit a “high quality” visit. In 15 minutes, the clinician was supposed to do at least 150 specific things (from greeting to charting), in addition to actually taking care of the patient’s stated problem.

      So a key goal for health managers and physicians will be to identify those clinical goals which in the long run matter a lot, but which we busy clinicians will not tend to prioritize because our day-to-day incentives go the other way.

      In my view, measuring and reporting on such important metrics will elevate them to the level of “Urgent” — among all the competing priorities of a busy clinical day.

      • Thanks — your observations are fascinating, and I think they square with other examples in the literature. I still worry that itemizing the encounter reduces the attention paid to things that aren’t explicitly incentivized, but your suggestion (not incentivizing things that already automatically get done) is a helpful refinement.

    • You are looking at process vs. outcomes. It is usually easier to monitor process, as outcomes have many more factors. One of the things we have consistently found is that physicians don’t really want to be outliers. If you can reliably measure something (preferably w/o a lot of data entry), you can usually improve it.

      Steve