• The Oregon Medicaid study and cholesterol

    Perhaps you recall that I analyzed whether the recent Oregon Health Insurance Experiment (OHIE) Medicaid study had the power to detect plausibly expected changes in glycated hemoglobin and blood pressure levels. I was not cherry-picking. Those are just the two measures for which I had the information needed for the analysis at the time of that post. Since then I have obtained the information needed for an analysis of cholesterol.

    For what follows, Jaskaran Bains gets credit for the lit review legwork. He’s an Amherst economics/pre-med student who is assisting behind the scenes at TIE for a short time. As for the power calculations, I’m using the same methods linked to from my prior post. I won’t go over them again here. This will be pretty dry. Skip to the final paragraph (before the PS) if you want the punchline.

    From the literature on clinical interventions to reduce cholesterol, which includes Cochrane Library literature reviews and other studies, we can expect a reduction of at most about 40 mm/dl in total cholesterol and an increase of at most about 5 mm/dl in HDL (the “good” cholesterol). Here’s a PDF of Bains’ annotated bibliography of the studies he examined that support these estimates. Look it over to see if you agree that 40 mm/dl and 5 mm/dl are plausible values. One might also expect more modest changes, but I’m trying to give the OHIE every possible advantage, within reason.

    These cholesterol change values are only to be expected for subjects with problematic levels of total or HDL cholesterol. Anyone with normal cholesterol wouldn’t (or shouldn’t) receive medical treatment. Only 14.1% and 28% of the OHIE control group had high total and low HDL cholesterol, respectively. Therefore, scaling the above plausible changes under medical treatment by 14.1% and 28%, we might have expected Medicaid to result in a change of -5.64 mm/dl (=14.1% x -40 mm/dl) and 1.4 mm/dl (=28% x 5 mm/dl) in total and HDL cholesterol, respectively. This presumes every Medicaid enrollee with a cholesterol problem was treated as were those in the clinical trials in the literature review. This is probably a generous assumption. Real-world treatment compliance is probably below that of an RCT.
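
    The scaling step above is simple enough to check by hand, but here is a minimal sketch of it. The function name is my own for illustration; the shares and treatment effects are the ones quoted in the paragraph above, in the same units.

```python
def expected_population_change(share_affected, change_if_treated):
    """Average change across the whole group when only `share_affected`
    of its members have the condition and receive the treatment effect."""
    return share_affected * change_if_treated


# Figures quoted in the post: share of the control group with the
# condition, and the most optimistic treatment effect from the literature.
total_change = expected_population_change(0.141, -40.0)  # high total cholesterol
hdl_change = expected_population_change(0.28, 5.0)       # low HDL cholesterol

print(f"Expected change in total cholesterol: {total_change:.2f}")
print(f"Expected change in HDL cholesterol: {hdl_change:.2f}")
```

    The point of the scaling is that the study measured everyone, not just those with a cholesterol problem, so the treatment effect is diluted across the full sample.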

    Turning the crank, given these numbers, the study had power of 60% to detect the expected 5.64 mm/dl reduction in total cholesterol. A way to think about this: if Medicaid really did reduce total cholesterol by the expected amount, there was a 40% chance the study would fail to detect it (a false negative). To attain the customary threshold of 80% power (a 20% false negative probability), the study would have needed 60% more participants. (By the way, for this analysis, I’m ignoring the effect of the OHIE’s survey weights, which would tend to depress power. So, once again, I’m being generous. The study was more underpowered than I’m suggesting.)

    Turning to HDL, the study had power of 30% to detect the expected 1.4 mm/dl increase, or a 70% false negative probability. This is very underpowered. The study would have had to have been almost four times larger to achieve 80% power for this measure.
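
    For readers who want to see how a power figure translates into a required sample size, here is a rough sketch. It is not the method linked from the prior post. It assumes a two-sided test at the 5% level, a normal approximation (ignoring the negligible chance of rejecting in the wrong direction), and a standard error that shrinks with the square root of the sample size. The function name is hypothetical.

```python
from statistics import NormalDist


def sample_size_multiplier(current_power, target_power=0.80, alpha=0.05):
    """Factor by which the sample must grow to move from current_power to
    target_power for the same true effect.

    Assumes a two-sided z-test whose standard error is proportional to
    1/sqrt(n), so the required n scales with the square of the
    effect-to-standard-error ratio.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    # Effect size in standard-error units implied by each power level.
    current_ratio = z_alpha + z(current_power)
    target_ratio = z_alpha + z(target_power)
    return (target_ratio / current_ratio) ** 2


for label, power in [("total cholesterol", 0.60), ("HDL", 0.30)]:
    multiplier = sample_size_multiplier(power)
    print(f"{label}: power {power:.0%} -> sample {multiplier:.2f}x as large for 80% power")
```

    Under these assumptions the multipliers come out to roughly 1.6 and 3.8, in line with the “60% more participants” and “almost four times larger” figures above.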

    Appendix Table S14c includes a subanalysis for subjects with a detected pre-randomization diagnosis of high total or low HDL cholesterol. However, 21.9% of the control group had high cholesterol at two years of follow-up and 28.6% had low HDL. We can conduct the same exercise as above, figuring that the expected effect of Medicaid should have been 21.9% x 40 mm/dl = 8.76 mm/dl for total cholesterol and 28.6% x 5 mm/dl = 1.43 mm/dl for HDL. These are both well within the 95% confidence intervals provided in table S14c. Therefore, the study is underpowered for this analysis too.

    The bottom line: The OHIE had insufficient sample to reliably detect expected changes in total and HDL cholesterol. I have already blogged the same for glycated hemoglobin and blood pressure. The Framingham risk score is the only physical health measure reported in the latest OHIE paper that I have not yet examined. Time and help permitting, I’ll investigate whether there was power to detect expected changes in it. Any guesses?

    PS: For those of you who think that measuring more variables means the study should have detected something, you’re wrong. For all physical health measures I’ve now looked at — Framingham risk score excluded for now — the study was underpowered. Measuring more things doesn’t change that.


      Approximately 2 years after the lottery, we obtained data from 6387 adults who were randomly selected to be able to apply for Medicaid coverage and 5842 adults who were not selected. Measures included blood-pressure, cholesterol, and glycated hemoglobin levels; screening for depression; medication inventories; and self-reported diagnoses, health status, health care utilization, and out-of-pocket spending for such services. We used the random assignment in the lottery to calculate the effect of Medicaid coverage.

      What are the empirical QC inter-aliquot “precision” indices for these parameters? Both intra-lab and inter-lab? (i.e., blind replicates, both spiked and production samples) Did they even consider such inherent production process variability? Do we have CLIA QC data on such estimates?

    • “PS: For those of you who think that measuring more variables means the study should have detected something, you’re wrong. For all physical health measures I’ve now looked at — Framingham risk score excluded for now — the study was underpowered. Measuring more things doesn’t change that.”

      Only if you’re all hung up on detecting things that are probably real! Otherwise doing lots o’ tests is a great way to detect *something* 🙂

    • Point of information: Cholesterol is measured in mg (milligrams), a measure of mass, not mm (millimeters), a measure of length.

    • A newbie question on stats: how do you increase the “power” of the study? More people, longer time?
      Thanks for your reply. jl

    • It is not the fact that the study should have detected more things because there were more variables. The problem is that the study detected nothing even when broken down into particular variables. One might think that one variable could be diluted by 14 others, but by looking at each variable one can see that even that is not true.

        • The question was whether or not the Oregon Experiment “showed that Medicaid coverage generated no significant improvements in measured physical health outcomes in the first 2 years.” You disagree with the study based upon your power calculations and others agree with you. On the other hand the well respected authors believe the findings to be significant as do many others. No proof has demonstrated significant improvement, but major studies have sided with the authors.

          The weight of the evidence seems to stand with “no significant improvements”.

          I am looking at this from a nonstatistical angle, recognizing that sometimes we might find average quality hiding superior quality due to it being averaged with poor quality. No such blips occurred in many different analyses. The line was flat. Consider that more of an anecdotal type of analysis, but in medicine simple anecdote is often proven correct.

          Why am I going to such silly lengths to look for the slightest thread of evidence in favor of positive improvements? I recognize how a substantial part of the ACA is based upon such evidence yet we spend enormous amounts of money without any proof to support this position and substantial proof opposing it. I actually believed some health benefits would be proven in Medicaid though I felt the real issue was cost/benefit. Thus I am looking for threads and find none, not even one parameter that would show excellence in just one of many areas to make one take note.

          Many on the left were eager to embrace the study because after one year it favored their beliefs, but now they wish to reject it because it doesn’t which shows how political many of their opinions were and how political the ACA is. Your argument adds another dimension to the debate and needs to be considered as it is another method of analyzing studies. I hope that process is consistently done, but until proof of the alternative conclusion is presented in a strong enough fashion to equal or exceed the present proofs I think, even though my mind thought differently, the idea of “no significant improvements” stands.

          • I’m aware of no one who has challenged the validity, appropriateness, or reasonableness of my quantitative analysis. I’m aware of no one who disagrees with what that analysis implies about *this* study. The rest of the discussion is about the value of Medicaid and/or insurance in general. I’m not addressing that here.

    • I didn’t challenge the validity or any of those things. I don’t know enough about your analysis to do so. I added it to the ‘yes’ group side. I also looked at my own beliefs before the study was completed which was in the ‘yes’ group as well though it was a qualified yes because of the cost/benefit ratio.

      However, many, despite your analysis and others including the well respected authors, believe the study is valid and of extreme importance. Many of them have not changed their views and they believe this study props up all the prior studies that also proved their position.

      Prior to the study most were lauding the study’s design and how this study could prove, unlike others, the ‘yes’ group’s beliefs that persisted despite the negative proof that existed. After the results all too many individuals seemed to try too hard to invalidate the study. This makes their protestations appear political and agenda driven though you didn’t do that. You provided additional evidence.

      In the end it is all about the value of Medicaid, for that is what is at stake, and we shouldn’t lose sight of that issue, for many believe that is affecting the ‘yes’ group’s actions and is the reason they are attacking this study.