• I remain skeptical about paying for performance

    I’ve written about this before. A new study in the NEJM does little to change my mind:


    In October 2008, the Centers for Medicare and Medicaid Services (CMS) discontinued additional payments for certain hospital-acquired conditions that were deemed preventable. The effect of this policy on rates of health care–associated infections is unknown.


    Using a quasi-experimental design with interrupted time series with comparison series, we examined changes in trends of two health care–associated infections that were targeted by the CMS policy (central catheter–associated bloodstream infections and catheter-associated urinary tract infections) as compared with an outcome that was not targeted by the policy (ventilator-associated pneumonia). Hospitals participating in the National Healthcare Safety Network and reporting data on at least one health care–associated infection before the onset of the policy were eligible to participate. Data from January 2006 through March 2011 were included. We used regression models to measure the effect of the policy on changes in infection rates, adjusting for baseline trends.

    The government is tired of paying for stuff that’s preventable and, perhaps, the fault of the hospitals caring for patients. So they set up a new policy that said they would no longer pay for extra care that resulted from a preventable hospital-acquired condition. Researchers set out to see how the incidence of those types of conditions was affected by this policy, compared to conditions not related to the policy. And what did they find?


    A total of 398 hospitals or health systems contributed 14,817 to 28,339 hospital unit–months, depending on the type of infection. We observed decreasing secular trends for both targeted and nontargeted infections long before the policy was implemented. There were no significant changes in quarterly rates of central catheter–associated bloodstream infections (incidence-rate ratio in the postimplementation vs. preimplementation period, 1.00; P=0.97), catheter-associated urinary tract infections (incidence-rate ratio, 1.03; P=0.08), or ventilator-associated pneumonia (incidence-rate ratio, 0.99; P=0.52) after the policy implementation. Our findings did not differ for hospitals in states without mandatory reporting, nor did it differ according to the quartile of percentage of Medicare admissions or hospital size, type of ownership, or teaching status.


    We found no evidence that the 2008 CMS policy to reduce payments for central catheter–associated bloodstream infections and catheter-associated urinary tract infections had any measurable effect on infection rates in U.S. hospitals. (Funded by the Agency for Healthcare Research and Quality.)

    Look, the good news is that even before the policy went into place, these hospitals were working to reduce the rates of infections. They were succeeding! Starting in the fourth quarter of 2008, though, when the policy went into effect, no more significant changes were seen in the quarterly rates of these infections. Here’s the money shot:

    It’s not that I think hospitals can’t improve, or that I think this stuff isn’t important. I just think that the ways we often go about trying to pay for performance are ineffective.

    By the way, this research was funded by the Agency for Healthcare Research and Quality. I’m just saying.


    • This result actually really surprised me. There has been an increasing trend towards agglomeration of medical practices into large hospital systems – and a major driver of that is economies of scale with administrative overhead. Big hospitals can have one big billing (“fighting with insurance/Medicaid to get paid for their services”) department, and one big compliance (“jumping through bureaucratic hoops”) department, rather than having a bunch of doctors doing an amateurish job of both while trying to keep their practices running.

      I would have predicted more of the same with regard to “Pay for Performance,” which – to put it in the starkest terms – gives hospitals money if they manage to check certain checkboxes – in this case reducing central catheter–related bloodstream infections. This is what agglomeration is supposed to be good at! Even if the catheter-infection problem is largely intractable, large institutions should be better able to come up with clever ways to make their statistics look better. I would have predicted that large hospital systems would be quick to “comply” and smaller ones would lag behind. In fact, neither was able to make the numbers move – a pretty unexpected result.

    • I am wondering if any hospitals actually had a reduction in payment for preventable infections. The study does not address this directly, but it does say in the Conclusions that hospitals could have gamed the coding by coding infections as “present on admission,” thus avoiding the penalty.

      Also, I wonder how the failure of pay for performance fits with the “iron triangle”. Theoretically, pay for performance is paying more for better quality but as you point out, these efforts have not had the desired result. I am thinking that efforts to pay-less to perform-fewer (unnecessary tests and treatments) have a better track record at improving quality.

      • I agree with your last statement.

        My iron-triangle view of this is that there is this odd assumption that improving quality will lower costs. Often, we have to spend money to improve quality. On the other hand, it seems like a no-brainer to stop unnecessary tests and treatments, which will lower spending and not negatively impact quality. Of course, there are some who will view that as a reduction in access to those unnecessary tests. But I’m ok with that trade-off.

    • I agree that this paper rightfully adds to skepticism about P4P, but I think it’s worth pointing out one limitation of the study that seems to have gone unmentioned in the discussion.

      The main limitation, in my view, is that this research design seems underpowered to detect even meaningful improvements following the policy. Check out the confidence interval in Table 2 for “a change in [infection] rate at the time of intervention”. The point estimate is 0.95 (a 5% reduction from baseline), and the confidence interval is from 0.68 to 1.33. So, the authors wouldn’t have been able to detect even a 30% drop in infection rates following the policy. And note that, for this parameter, the nontarget infection had a point estimate of a 16% rise following the policy (CI 0.82-1.63). That’s roughly a 20% spread between treatment and control, though not statistically significant.
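      To put rough numbers on the power concern, here is a minimal back-of-the-envelope sketch. It assumes the interval quoted above (0.68 to 1.33) is a 95% Wald interval that is symmetric on the log scale, which is my assumption and not something the paper states here. The function names, the 80% power target, and the two-sided 5% alpha are conventional choices made for illustration.

```python
from math import exp, log
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def se_from_ci(low, high, level=0.95):
    """Back out the standard error of log(IRR) from a ratio-scale CI.

    Assumes a Wald interval symmetric on the log scale (an assumption).
    """
    z = Z.inv_cdf(0.5 + level / 2)
    return (log(high) - log(low)) / (2 * z)

def min_detectable_irr(se, alpha=0.05, power=0.80):
    """Smallest reduction (as an IRR below 1) detectable with the given
    power in a two-sided test, using the normal approximation."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return exp(-(z_alpha + z_power) * se)

# Interval quoted in the comment for the targeted infection.
CI_LOW, CI_HIGH = 0.68, 1.33

se = se_from_ci(CI_LOW, CI_HIGH)
mde = min_detectable_irr(se)

# A 30% drop corresponds to an IRR of 0.70; if it sits inside the
# interval, the data cannot rule it out.
drop_30_not_excluded = CI_LOW <= 0.70 <= CI_HIGH

print(f"SE of log(IRR): {se:.3f}")
print(f"IRR detectable with 80% power: {mde:.2f}")
print(f"30% drop still inside the CI: {drop_30_not_excluded}")
```

      Under these assumptions, the immediate drop would need to be somewhere in the range of 35 to 40% before a study with this much noise would reliably flag it, which is the sense in which this is a “wide zero.”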

      When one is studying events that happen one in every thousand patient-days, the results are going to be noisy! If one can’t find an effect, it might be because of loud noise rather than the lack of a signal.

      It really is disappointing that medical journals often make zero effort to distinguish between a “tight zero” result and a “wide zero” result. A wide zero is not nearly as informative as a tight one. The difference needs to be acknowledged!

    • While I am inclined to believe the conclusions of this study based on theory and other evidence, I think this research design is seriously limited in what it can tell us.

      Specifically, this research design assumes that the full effect of the policy will appear “overnight” when the policy takes effect. But this need not be the case; indeed, it may not even be the most plausible case.

      Consider, for example, a hospital that takes the new policy seriously and begins a process of quality improvement when the policy is announced (but before it takes effect). Suppose that the hospital rolls out the improvements they identify as they are “ready,” rather than waiting for the actual effective date. In this case, the effect of the policy will show up as a gradual downward trend in the pre-period, not a sharp break on the policy’s effective date.

      Similarly, consider a case where a hospital only notices the policy after it has taken effect and begins an improvement process then. If identifying process improvements takes time, then the improvements will show up as a gradual downward trend in the post period. Again, there need not be a sharp break at the effective date.
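      The gradual-rollout scenario can be sketched with a toy segmented regression. Everything numeric here is made up for illustration: the trend sizes are hypothetical, the data are noise-free, and the model (intercept, trend, step at the policy date, change in slope after it) is a generic interrupted time series specification, not necessarily the one the authors fit. Only the 21 quarters (January 2006 through March 2011) and the policy start in the fourth quarter of 2008 come from the study description.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

QUARTERS = 21        # 2006 Q1 through 2011 Q1
POLICY_Q = 11        # 2008 Q4, counting 2006 Q1 as quarter 0
BASE_TREND = -0.02   # hypothetical pre-existing decline in log rate
EXTRA_TREND = -0.03  # hypothetical gradual policy effect per quarter

# Simulated log infection rate: no jump at the policy date, only a
# steeper decline afterwards.
log_rate = [BASE_TREND * t + EXTRA_TREND * max(0, t - POLICY_Q)
            for t in range(QUARTERS)]

# Columns: intercept, time, post-policy indicator, quarters since policy.
X = [[1.0, float(t), float(t >= POLICY_Q), float(max(0, t - POLICY_Q))]
     for t in range(QUARTERS)]

intercept, trend, step, slope_change = ols(X, log_rate)
print(f"step at policy date: {step:.4f}")
print(f"change in slope after policy: {slope_change:.4f}")
```

      In this toy setup the step coefficient comes out at essentially zero while the slope-change coefficient carries the whole effect, so an analysis that looks mainly for a break at the effective date would miss it.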

      The bottom line is that the longer-term downward trends evident in the data could well be caused in part by the policy, and this study sheds no light on that possibility.