Perhaps you recall that I analyzed whether the recent Oregon Health Insurance Experiment (OHIE) Medicaid study had the power to detect plausibly expected changes in glycated hemoglobin and blood pressure levels. I was not cherry-picking. Those are just the two measures for which I had the information needed for the analysis at the time of that post. Since then I have obtained the information needed for an analysis of cholesterol.
For what follows, Jaskaran Bains gets credit for the lit review legwork. He’s an Amherst economics/pre-med student who is assisting behind the scenes at TIE for a short time. As for the power calculations, I’m using the same methods linked to from my prior post. I won’t go over them again here. This will be pretty dry. Skip to the final paragraph (before the PS) if you want the punchline.
From the literature on clinical interventions to reduce cholesterol, which includes Cochrane Library literature reviews and other studies, we can expect a reduction of at most about 40 mg/dl in total cholesterol and an increase of at most about 5 mg/dl in HDL (the “good” cholesterol). Here’s a PDF of Bains’ annotated bibliography of the studies he examined that support these estimates. Look it over to see if you agree that 40 mg/dl and 5 mg/dl are plausible values. One might also expect more modest changes, but I’m trying to give the OHIE every possible advantage, within reason.
These cholesterol change values are only to be expected for subjects with problematic levels of total or HDL cholesterol. Anyone with normal cholesterol wouldn’t (or shouldn’t) receive medical treatment. Only 14.1% and 28% of the OHIE control group had high total and low HDL cholesterol, respectively. Therefore, scaling the above plausible changes under medical treatment by 14.1% and 28%, we might have expected Medicaid to result in a change of -5.64 mg/dl (= 14.1% x -40 mg/dl) in total cholesterol and +1.4 mg/dl (= 28% x 5 mg/dl) in HDL cholesterol. This presumes every Medicaid enrollee with a cholesterol problem was treated as were those in the clinical trials in the literature review. This is probably a generous assumption. Real-world treatment compliance is probably below that of an RCT.
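The scaling step is simple arithmetic: the population-average change is the share of people who would be treated times the change among the treated. A minimal Python sketch of it follows. The function name `expected_population_change` is hypothetical, and the code assumes, as the text does, that everyone with a problematic level gets the full clinical-trial effect and everyone else gets none.

```python
def expected_population_change(share_affected: float, effect_if_treated: float) -> float:
    """Average change across the whole group, in mg/dl.

    Assumes only the affected share is treated and that each treated
    person experiences the full clinical-trial effect (a generous assumption).
    """
    return share_affected * effect_if_treated


# Shares from the OHIE control group; effects from the literature review.
total_change = expected_population_change(0.141, -40.0)  # total cholesterol
hdl_change = expected_population_change(0.28, 5.0)       # HDL cholesterol

print(f"Expected change in total cholesterol: {total_change:.2f} mg/dl")
print(f"Expected change in HDL cholesterol:   {hdl_change:+.2f} mg/dl")
```

If real-world compliance were lower than in the trials, the treated share passed in would be smaller and the expected changes would shrink proportionally.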
Turning the crank, given these numbers, the study had power of 60% to detect the expected 5.64 mg/dl reduction in total cholesterol. A way to think about this is that, if Medicaid truly reduced total cholesterol by that amount, there was a 40% chance the study would fail to detect it (a false negative). To attain the customary threshold of 80% power (a 20% false negative probability), the study would have needed 60% more participants. (By the way, for this analysis, I’m ignoring the effect of the OHIE’s survey weights, which would tend to depress power. So, once again, I’m being generous. The study was more underpowered than I’m suggesting.)
Turning to HDL, the study had power of 30% to detect the expected 1.4 mg/dl increase, or a 70% false negative probability. This is very underpowered. The study would have had to be almost four times larger to achieve 80% power for this measure.
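The "60% more participants" and "almost four times larger" figures can be roughly reproduced from the stated power levels alone. What follows is a minimal sketch, not necessarily the exact method linked from the prior post. It assumes a two-sided z-test at the 5% significance level, a fixed true effect, and a standard error that shrinks with the square root of the sample size. The function name `required_sample_multiplier` is hypothetical.

```python
from statistics import NormalDist


def required_sample_multiplier(current_power: float,
                               target_power: float = 0.80,
                               alpha: float = 0.05) -> float:
    """Factor by which the sample must grow to reach target_power.

    Assumptions (not taken from the OHIE paper): two-sided z-test,
    power approximated as Phi(effect/SE - z_crit), and SE proportional
    to 1/sqrt(n). The negligible far rejection tail is ignored.
    """
    z = NormalDist().inv_cdf
    z_crit = z(1 - alpha / 2)
    # effect/SE ratio implied by each power level
    current_ratio = z_crit + z(current_power)
    target_ratio = z_crit + z(target_power)
    # SE scales with 1/sqrt(n), so n scales with the squared ratio
    return (target_ratio / current_ratio) ** 2


total_chol_multiplier = required_sample_multiplier(0.60)
hdl_multiplier = required_sample_multiplier(0.30)

print(f"Total cholesterol: sample {total_chol_multiplier:.2f}x as large")
print(f"HDL cholesterol:   sample {hdl_multiplier:.2f}x as large")
```

Under these assumptions the multipliers come out to about 1.6 for total cholesterol and about 3.8 for HDL, in line with the figures in the text.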
Appendix Table S14c includes a subanalysis for subjects with a detected pre-randomization diagnosis of high total or low HDL cholesterol. However, 21.9% of the control group had high cholesterol at two years of follow-up and 28.6% had low HDL. We can conduct the same exercise as above, figuring that the expected effect of Medicaid should have been a reduction of 21.9% x 40 mg/dl = 8.76 mg/dl for total cholesterol and an increase of 28.6% x 5 mg/dl = 1.43 mg/dl for HDL. These are both well within the 95% confidence intervals provided in Table S14c, so the results cannot rule out effects of the expected size. Therefore, the study is underpowered for this analysis too.
The bottom line: The OHIE had too small a sample to reliably detect expected changes in total and HDL cholesterol. I have already blogged the same for glycated hemoglobin and blood pressure. The Framingham risk score is the only physical health measure reported in the latest OHIE paper that I have not yet examined. Time and help permitting, I’ll investigate whether there was power to detect expected changes in it. Any guesses?
PS: For those of you who think that measuring more variables means the study should have detected something, you’re wrong. For all physical health measures I’ve now looked at — Framingham risk score excluded for now — the study was underpowered. Measuring more things doesn’t change that.