Avik Roy has read and posted about the papers I reviewed as part of my Medicaid-IV series. If you’ve forgotten, the purpose of that series of posts was to examine studies that use proven, sound methods to infer the causal effect of Medicaid enrollment on health outcomes (as opposed to a mere correlation between the two). From that series, I concluded that there is no credible evidence that Medicaid is worse for health than being uninsured. Considering only studies that show correlations (not causation), Avik disagrees.
Avik’s post is long, but you can save yourself some trouble by skipping the gratuitous attack on economists in general, and Jon Gruber in particular, as well as the troubled description of instrumental variables (IV).* About halfway down is his actual review of the papers; look for the bold text.
The point I want to drive home in this post is why an IV approach is necessary in studying Medicaid outcomes. People who enroll in Medicaid differ from those who don’t. They differ for reasons we can observe and for reasons we can’t. An ideal study would be a randomized controlled trial (RCT) that randomizes people into Medicaid and uninsured status. That’s neither practical nor ethical. So we’re stuck, unless we can be more clever.
The next best thing we can do is look for natural experiments. That’s what IV exploits. In this case, the studies I examined use the state-level variation in Medicaid eligibility (and related programs). That variation obviously affects enrollment into Medicaid (you can’t enroll unless you’re eligible), though it is not determinative. Importantly, state-level variation in Medicaid eligibility rules does not itself affect individual-level health. Other than figuratively, do you suddenly take ill when a law is passed or a regulation is changed? Do you see how Medicaid eligibility rules are somewhat like the randomization that governs an RCT, affecting “treatment” (Medicaid enrollment) but not outcomes directly? (If this is unclear, go here.)
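For readers who want to see the mechanics, here is a minimal simulation sketch in Python. Everything in it is invented for illustration (it is not any of the reviewed papers’ models): a coin-flip “eligibility rule” shifts enrollment but has no direct effect on health, while unobserved health drives both enrollment and outcomes. The naive enrolled-vs.-not comparison is badly biased; the simple IV (Wald) estimate, which divides the effect of eligibility on outcomes by its effect on enrollment, recovers the true effect.

```python
# Illustrative simulation only; all names and numbers are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

health = rng.normal(size=n)              # unobserved baseline health (higher = healthier)
eligible = rng.binomial(1, 0.5, size=n)  # stand-in for state eligibility rules

# Sicker people and eligible people are more likely to enroll (selection + eligibility).
p_enroll = 1 / (1 + np.exp(-(-1.0 + 1.5 * eligible - 1.0 * health)))
enrolled = rng.binomial(1, p_enroll)

# Assume the true causal effect of enrollment on the health outcome is +0.5.
outcome = 0.5 * enrolled + 1.0 * health + rng.normal(size=n)

# Naive comparison, with no way to adjust for unobserved health, is biased downward.
naive = outcome[enrolled == 1].mean() - outcome[enrolled == 0].mean()

# Wald/IV estimate: effect of eligibility on the outcome divided by its effect on enrollment.
itt = outcome[eligible == 1].mean() - outcome[eligible == 0].mean()
first_stage = enrolled[eligible == 1].mean() - enrolled[eligible == 0].mean()
iv = itt / first_stage

print(f"naive difference: {naive:.2f}")   # negative: enrollees look sicker despite the benefit
print(f"IV (Wald) estimate: {iv:.2f}")    # close to the true +0.5
```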
Note that IV studies can, and in some cases should, control for observable factors. (The studies I reviewed use quite sophisticated controls, including fixed effects and interactions, that greatly reduce the strength of the assumptions required to obtain causal estimates. In contrast, the assumptions required to infer causality from the studies Avik prefers are far stronger.) But controlling for observable factors alone is insufficient. That brings me to a study that Avik has cited many times as evidence that Medicaid produces worse health than no insurance at all. Tyler Cowen referenced the same study in his book, about which I wrote earlier. It’s the UVa surgical outcomes study, formally titled Primary Payer Status Affects Mortality for Major Surgical Operations, by LaPar and colleagues.
Avik has summarized this study, so I’ll skip that. It examines 11 surgical outcomes by insurance status, adjusting for many observable factors, but, crucially, with no controls for unobservable factors that affect selection. All adjusted outcomes for Medicaid enrollees are worse than for the uninsured. With only one exception, adjusted outcomes for Medicare beneficiaries are worse than for the uninsured too. Got that? Not just Medicaid enrollees, but Medicare beneficiaries too, fare worse than the uninsured. Any theory to explain what’s going on in Medicaid had better explain Medicare too. It cannot be just that Medicaid enrollees see lower-quality providers.
You know what theory is consistent with these results? A pretty famous one, which I just described above: selection (or omitted variable) bias. It is well known that studies that do not exploit purposeful randomness (i.e., an RCT) or natural randomness (i.e., a natural experiment or instrumental variables) can suffer from selection bias. Even controlling for observable characteristics is not enough in the field of health care. I’ve explained this before, even in a diagram.
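Here’s the same point as another invented sketch: even when the true effect of Medicaid on mortality is exactly zero, adjusting only for an observed comorbidity still yields a positive “effect” whenever an unobserved severity measure drives both enrollment and mortality. (The variable names and coefficients below are mine, purely for illustration.)

```python
# Omitted-variable (selection) bias sketch; all numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

observed_comorbidity = rng.binomial(1, 0.3, size=n)
unobserved_severity = rng.normal(size=n)   # not recorded in the data, so it cannot be controlled for

# Sicker patients (on both dimensions) are more likely to be on Medicaid.
p_medicaid = 1 / (1 + np.exp(-(-1.0 + 0.8 * observed_comorbidity + 0.8 * unobserved_severity)))
medicaid = rng.binomial(1, p_medicaid)

# Suppose the true causal effect of Medicaid on mortality risk is exactly zero.
mortality_risk = (0.05 + 0.03 * observed_comorbidity + 0.03 * unobserved_severity
                  + 0.01 * rng.normal(size=n))

# "Adjusted" comparison: regress mortality on Medicaid and the observed comorbidity only.
X = np.column_stack([np.ones(n), medicaid, observed_comorbidity])
coef, *_ = np.linalg.lstsq(X, mortality_risk, rcond=None)

print(f"estimated 'effect' of Medicaid: {coef[1]:.3f}")  # positive: Medicaid looks harmful, spuriously
```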
The authors of the UVa surgical outcomes study acknowledge the possible presence of selection bias in trying to explain their results. They say as much in many places in the text of their paper, writing,
Another possible explanation for the differences we observed among payer groups is the possibility of incomplete risk adjustment due to the presence of comorbidities that are either partially or unaccounted for in our analyses [sic]. […]
Several explanations for inherent differences in payer populations have been suggested. Factors including decreased access to health care, language barriers, level of education, poor nutrition, and compromised health maintenance have all been suggested. […]
There are several noteworthy limitations to this study. First, inherent selection bias is associated with any retrospective study. […]
For example, the proportion of Medicaid patients may be artificially inflated due to the fact that normally Uninsured patients may garner Medicaid coverage during a given hospital admission. […] [I]n our data analyses and statistical adjustments there exists a potential for an unmeasured confounder. Due to the constraints of NIS data points, we are unable to include adjustments for other well-established surgical risk factors such as low preoperative albumin levels or poor nutrition status.
Kudos to the authors for acknowledging the limitations of their study. That the results have been repeated elsewhere without such disclaimers is a disservice to science.
Moving on, to Avik’s great credit, he unearthed a Medicaid-IV study I had overlooked: The Link Between Public and Private Insurance and HIV-Related Mortality, by Bhattacharya, Goldman, and Sood (ungated PDF available). It examines mortality outcomes in an HIV population using IV methods to control for selection into insurance category (uninsured, public, and private). Table 5 is the key table. It confused me at first, as it has Avik. Just reading the table, it looks as if the “best” model produces the results in the bottom row, which suggest private insurance decreases mortality by 50% and public insurance increases it by 8%, relative to no insurance.
But, reading the text, it is clear that the results in that bottom row are based on a faulty model, which the authors explain. (I will too, below.) The model based on sound methodology produces results in the second to last row of Table 5, a 79% and 66% reduction in mortality for the privately and publicly insured, respectively, relative to the uninsured. Table 6 also reports the results of the preferred model, though there is a typographical error on the mortality results: they’re missing minus signs in the first two rows. (I confirmed this with the authors.)
The authors state the results of this study very clearly: “both private and public insurance decrease the likelihood of death.”
Now, what’s wrong with the model that shows Medicaid killing people, the one Avik thinks is best? It includes AZT and HAART** treatment indicators on the right-hand side. That’s a problem because AZT and HAART treatment are more likely for those with insurance, and HAART receipt is indicative of poor health. Essentially, these indicators are “caused” by insurance and highly predictive of the outcome of interest, mortality. This is an example of “bad control,” i.e., controlling for an outcome. It should be clear that having the outcome (or something very close to it) on both the left- and right-hand sides is a problem. It soaks up too much of the effect of insurance but, being an outcome, it isn’t a proper control. About this, the authors write, “Of course, there is concern that HAART itself may be endogenous, since receipt of therapy almost certainly reflects disease severity and ability to adhere to the complicated regimen.” (Why was this faulty model even included in the paper? My guess has been confirmed by the authors via email: reviewers requested that the authors include it. It was not included in their NBER working paper, which predates the peer-reviewed one. It really is too bad this model was inserted into the paper, because it seems to have tricked some readers. However, the authors very clearly indicate which model is most sound. Anyone appreciating the essence of good research design will understand it, as I explained above.)
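To see the “bad control” problem concretely, here is a stylized sketch with made-up numbers (it is not the Bhattacharya, Goldman, and Sood model). Suppose insurance were as good as randomly assigned (think of the post-IV comparison) and that it reduces mortality mostly by getting people onto treatment, while treatment receipt also reflects unobserved severity. Putting the treatment indicator on the right-hand side soaks up the channel through which insurance helps, so the estimated insurance coefficient shrinks toward zero even though the true total effect is strongly protective.

```python
# "Bad control" sketch; all names and numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

severity = rng.normal(size=n)             # unobserved disease severity
insured = rng.binomial(1, 0.5, size=n)    # assume insurance is as good as random here

# Treatment receipt depends on insurance and (weakly, here) on being sicker.
p_treat = 1 / (1 + np.exp(-(-0.5 + 2.0 * insured + 0.3 * severity)))
treated = rng.binomial(1, p_treat)

# Insurance lowers mortality risk a little directly and a lot by making treatment more likely.
mortality_risk = (0.30 - 0.02 * insured - 0.15 * treated + 0.10 * severity
                  + 0.05 * rng.normal(size=n))

def ols(X, y):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

total = ols(np.column_stack([np.ones(n), insured]), mortality_risk)
bad = ols(np.column_stack([np.ones(n), insured, treated]), mortality_risk)

print(f"insurance coefficient, no bad control: {total[1]:.3f}")  # ~ the total effect: strongly protective
print(f"insurance coefficient, treated on RHS: {bad[1]:.3f}")    # much smaller: the mediator absorbed the benefit
```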
Bottom line: once again, we find that Medicaid is shown not to be bad for health, but only if proper econometric techniques are employed. Sadly, it is easier to ignore the need for such techniques, and to misunderstand them, than to do the work of learning to use them. The real tragedy is that this leads to an unwarranted conclusion that Medicaid is harming people. We can certainly craft a better Medicaid program, and we should. But we should always use proper science in considering any program. If we don’t, we may mistake improvements to Medicaid for harm. I’m sure advocates for change, myself included, would not welcome such an outcome.
* Avik dismisses IV as a “fudge factor,” casually and erroneously discrediting a vast amount of mainstream work by economists and several entire sub-disciplines. Since IV is a generalization of the concepts that underlie randomized controlled trials (differing in degree, but not in spirit, from purposeful randomization), and can be used to rehabilitate a trial with contaminated groups — a not infrequent occurrence — it is unwise to trivialize IV and what it can do.
** HAART = highly active antiretroviral therapy.
UPDATE: I fixed my explanation of the “bad control” problem in the Bhattacharya, Goldman, and Sood study.
UPDATE 2: The authors of the Bhattacharya, Goldman, and Sood study confirmed the typos in Table 6.
UPDATE 3: Those authors also confirmed that the faulty model was requested by reviewers, as I suspected.