• Bias and the Oregon Medicaid study

    There’s been some chatter about how the Oregon Medicaid study is or might be biased. That’s worth a post!

    There’s a precise way in which the study is not biased. By design it estimated the effect of Medicaid on those who won the lottery and enrolled, relative to those who lost the lottery and did not. This estimate is unbiased for the contrast between precisely these two groups, but not necessarily for others. In econometric jargon, this is known as the “local average treatment effect” (LATE). The “treatment effect” part of “LATE” is clear, but what’s this “local average” business?

    Sigh. I hate this terminology. It’s supposed to evoke the idea that the instrument (the lottery in this case) doesn’t have a “global” effect on study participants, causing all randomized to Medicaid (lottery winners) to be on and all those randomized to control (lottery losers) to not be. It has a more modest, “localized” effect. The other jargon used for this is that the LATE estimate is an estimate of the effect of treatment on “compliers.” That’s a more meaningful term to me. The compliers are those who do what randomization “tells” them to do: they enroll in Medicaid if randomized to do so and they don’t if not.

    Of course, you can’t expect full compliance in this study (or many other RCTs) because some lottery winners turned out to be ineligible for Medicaid by the time they were permitted to enroll. Some had too high an income. Some moved out of state. Some may have found other sources of coverage. (You had to have income below 100% FPL, live in state, and be uninsured for 6 months to be permitted to enroll.) Also, enrollment wasn’t mandatory. So, if you just decided it wasn’t worth the trouble or didn’t receive or notice the letter inviting enrollment, you might have missed the window (45 days is all they gave you).

    On the flip side, nobody was preventing lottery losers from enrolling in Medicaid if they became eligible in another way. The study pertained only to the expansion of Medicaid beyond the statutory requirements. If people ended up in one of the eligible categories (aged, blind, disabled, pregnant) they could get on Medicaid.

    So, there was considerable “crossover” (lottery losers enrolling in Medicaid, lottery winners not) or “contamination” or “noncompliance,” all jargon for the same thing. This was not a perfect RCT. Few are.

    What to do? The investigators did two things. First, they considered an “intent-to-treat” (ITT) approach, comparing lottery winners to losers no matter whether they enrolled in Medicaid or not. These results are in their first year paper. I’ve forgotten what they say specifically, though in general they’re much smaller effects than the LATE results. The concern with ITT is that all this crossover biases the results toward zero. There isn’t as much contrast between study arms due to noncompliance.

    Next, the investigators provided LATE estimates, about which I wrote above. These are unbiased for the contrast among compliers. In this study, they’re about four times the size of the ITT estimates by virtue of the mathematics (“instrumental variables”) of LATE. But they need not be the same as one would find in the absence of noncompliance. There may be bias in that sense. Why?
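
    To make the “about four times” arithmetic concrete, here is a minimal sketch. With a yes/no lottery, the IV estimate reduces to a simple ratio (often called the Wald estimator): the ITT effect divided by the difference in enrollment rates between winners and losers. The function name and all input numbers are made up for illustration; they are not the study’s figures, and the enrollment rates were picked only so that the gap is 0.25.

```python
def wald_late(itt_effect, enroll_rate_winners, enroll_rate_losers):
    """Wald (IV) estimate with a binary instrument: ITT / first stage."""
    # First stage: how much winning the lottery raises the enrollment rate.
    first_stage = enroll_rate_winners - enroll_rate_losers
    if first_stage == 0:
        raise ValueError("the lottery must change enrollment for IV to work")
    return itt_effect / first_stage


# Hypothetical inputs, chosen only for illustration (not the study's numbers).
itt = 0.02            # winners-minus-losers difference in some outcome
late = wald_late(itt, enroll_rate_winners=0.45, enroll_rate_losers=0.20)
scaling = late / itt  # equals 1 / first stage, about 4 with these inputs
print(f"ITT = {itt:.3f}, LATE = {late:.3f}, scaling = {scaling:.1f}x")
```

    The smaller the enrollment gap the lottery creates, the larger the scaling from ITT to LATE, which is why heavy crossover makes the two sets of estimates look so different.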

    • Hypothesis 1: Those who took the trouble to enroll in Medicaid were sicker than those who didn’t. After all, why enroll if you don’t need it? Remember, even some lottery losers (18.5% of them) enrolled in Medicaid. The LATE estimate removes the effect of them since they are noncompliers. Also, some lottery winners didn’t enroll (most of them didn’t) and the LATE estimate removes their effect too. What’s left under this hypothesis is a comparison of relatively sicker people who did enroll in Medicaid with relatively healthier people who didn’t. The investigators actually found some evidence to suggest that Medicaid enrollees are sicker. Many other studies find that Medicaid enrollees are sicker to the point that some studies find an association of Medicaid with increased mortality. Under hypothesis 1, results are biased downward relative to what they would be under full compliance. Medicaid looks less effective than it might otherwise be. 
    • Hypothesis 2: Those who are more organized, better planners, with higher cognitive function and literacy (including health) skills enroll. It takes some awareness and planning to enroll, so there is some face validity to this argument. I’m aware of no evidence to support it though. (Got any?) Under this hypothesis Medicaid enrollees would do a better job of getting and staying healthy even apart from whatever Medicaid does for them. This would bias results toward showing a larger Medicaid effect than would be true in general (under full compliance).

    There may be other hypothetical sources of bias. The point I’d make about all of them is that we don’t know whether any of these biases actually exist and, if they do, how big an effect they have. It’s all speculation. Still, LATE is an unbiased (and causal) estimate of the effect of Medicaid on compliers. It does filter out some who want to be on Medicaid and can’t enroll (lost lottery, no other route) and filters out some who enroll but weren’t invited (lost lottery but became eligible another way). Some of these noncompliers could be unusually sick. Some noncompliers could be unusually organized and aware. LATE filters some of them out.

    Some might wonder about another type of estimate one could do, the effect of “treatment on the treated.” Here one just compares Medicaid enrollees to non-enrollees, ignoring the lottery draw. Unfortunately, this just exacerbates whatever bias might exist. There is no random assignment at play here. There’s no filtering for selection at all. You get an association, not a causal estimate. This is the problem with many studies of Medicaid and insurance. Randomness is key. The lottery should be exploited in some fashion (either ITT or LATE).
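
    A toy simulation can illustrate why ignoring the lottery is risky. Everything in it is an assumption made for illustration: the shares of compliers, always-enrollers, and never-enrollers, the idea that always-enrollers start out sicker (in the spirit of hypothesis 1), and a true Medicaid effect of 1.0 that is the same for everyone. None of it is drawn from the Oregon data.

```python
import random


def simulate(n=100_000, true_effect=1.0, seed=0):
    """Simulate a lottery with noncompliance and self-selection (toy model)."""
    rng = random.Random(seed)
    won, enrolled, outcome = [], [], []
    for _ in range(n):
        z = rng.random() < 0.5             # lottery draw
        u = rng.random()
        if u < 0.25:                       # complier: enrolls only if they win
            d, baseline = (1.0 if z else 0.0), 0.0
        elif u < 0.40:                     # always enrolls; assumed sicker
            d, baseline = 1.0, -1.0
        else:                              # never enrolls; assumed healthier
            d, baseline = 0.0, 0.5
        y = baseline + true_effect * d + rng.gauss(0.0, 1.0)
        won.append(z)
        enrolled.append(d)
        outcome.append(y)
    return won, enrolled, outcome


def diff_in_means(values, group):
    """Mean of values where group is truthy minus mean where it is falsy."""
    yes = [v for v, g in zip(values, group) if g]
    no = [v for v, g in zip(values, group) if not g]
    return sum(yes) / len(yes) - sum(no) / len(no)


won, enrolled, outcome = simulate()
naive = diff_in_means(outcome, enrolled)    # enrollees vs. non-enrollees
itt = diff_in_means(outcome, won)           # winners vs. losers
first_stage = diff_in_means(enrolled, won)  # enrollment gap from the lottery
late = itt / first_stage                    # IV (Wald) estimate
print(f"naive = {naive:.2f}, ITT = {itt:.2f}, LATE = {late:.2f}")
```

    Under these assumptions the enrollee-versus-non-enrollee comparison lands far from the true effect because it mixes in who chooses to enroll, the ITT is diluted by noncompliance, and the lottery-based IV estimate comes out close to 1.0.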

    Lastly, notice how complicated RCT interpretation is? Yes, it’s the gold standard, but it still has issues. Using an IV approach for a LATE estimate is, in my view, about the best you can do. But there may be bias when considering generalizing the findings outside the “local” effect of the instrument (lottery or random assignment). These concerns arise with any IV study. In this sense, IV and RCT are much closer cousins than one tends to think. Disparage one and you disparage the other.

    Not all that’s gold glitters, but it is still valuable.


    • This is a really good post, thanks.

    • This discussion of LATEs is valuable, but I think your use of the term “bias” here is problematic (and non-standard relative to the modern IV literature).

      Both the LATE and average treatment effect are well-defined causal estimands that provide the answer to different well-posed causal questions; LATE is not just a “biased version” of the ATE. Indeed, the LATE will, in many cases, be the more policy-relevant estimand. If, for example, your question is about the effects of a real-world insurance expansion in which take-up is incomplete, the LATE, and not the ATE, provides the best answer. Holding up the ATE as the gold standard we’re always trying to achieve does not make sense.

      Super-technical note: Your post says that IV is unbiased for the LATE. In general, this is false. It’s consistent (and therefore asymptotically unbiased), but is, in general, biased in small samples.

      • Thanks. Several points/questions:

        * In this case, is LATE potentially biased relative to an estimate from a perfect (no crossover) RCT? I’d say we can’t tell, but it could be.

        * In this case, is LATE potentially biased relative to the ACA Medicaid expansion? I’d say likely yes since the Oregon experiment had constraints the ACA expansion doesn’t (lottery vs. take all comers, 45-day window of enrollment vs. whenever you want, must be uninsured 6 months vs. no such limitation, etc.).

        * I don’t doubt you are right on the asymptotic unbiasedness of LATE, but I fear that is a detail that would not aid the level of reader I was targeting. I didn’t notice that nuance in the investigators’ publications, but I might have missed it.

        • In response to bullets #1 and #2, I think we’re on the same page on the substance of what one needs to consider when applying the Oregon estimates in other settings, like the ACA. My complaint is mainly semantic, but I think the semantics here are important because they facilitate understanding.

          Namely, in my experience, the statistics/econometrics literature tends to reserve the technical term “bias” for questions of internal validity: is my estimator estimating what I say it estimates?

          By contrast, the questions being tackled here are about external validity: how applicable is this estimate in other settings? It makes little sense to think of the estimates from Oregon (which wasn’t designed to estimate the effect of the ACA) as being “biased” for the effect of the ACA. Rather, Oregon is answering a different question (because that question happened to be the answerable one) and now we need to figure out how good a guide the answer to that question is to the other questions.

          I think talking about the questions in this post in terms of “external validity” (or, perhaps better for a general audience, “generalizability”) is better than talking about this in terms of “bias”; doing so keeps the conceptually distinct questions of external and internal validity fully separate. An added benefit is that these terms avoid the baggage that “bias” brings with it from its colloquial usage.

          On the technical asymptotics point: I completely understand your concern. The work-around I often use in this situation to avoid saying something technically false is to use a phrase like “is a valid estimator of,” which elides the issue. But a minor point regardless.

          • I appreciate your feedback. I think you’re exactly right about terms and how they should be used. I could have clarified that in this post, but the gears in my head didn’t fully mesh. I’ll go back and see what I can do. And going forward, I will be better about this. Indeed, it is exactly this kind of feedback that helps keep me honest and accurate. So, thanks!

          • Ack. I went back to look at the post and a lot of it isn’t about external validity, but about how the LATE, ITT, and treatment on treated estimates differ from each other and how they differ from a full-compliance RCT. I don’t know which term, bias, internal validity, or external validity applies to these notions? I’m sticking with the original post for now, but let’s keep discussing!

            • Long day — I’m just returning to this.

              To answer your question of what terminology I’d use, I can’t exactly figure that out in the abstract, so FWIW (probably little), here’s a “show, don’t tell” answer. Basically, I’d organize and talk about things as follows:

              (1) What internally valid estimates can we obtain from the Oregon Study?

              We can obtain the effect of winning the lottery (ITT) and the effect on the population that gained insurance due to winning the lottery (LATE).

              We cannot obtain the ATE or the TOT; the seemingly natural estimators of these quantities are biased since the populations we are comparing differ due to self-selection. The ITT and LATE avoid this problem because they scrupulously _solely_ compare the full group of lottery winners to the full group of lottery losers.

              (2) What internally valid estimates can we obtain from alternative study designs? How do they differ?

              From a perfect compliance RCT, we can estimate the ATE for the study population. Relative to the group covered by the LATE, the group covered by the ATE also includes: (A) the types of people who still enroll if they lose; and (B) the people who will not enroll even if they win.

              From an Oregon-like study in which we forbid enrollment by lottery losers, we can obtain the TOT. Relative to the group covered by the LATE, the group covered by the TOT adds (A) from above but not (B).

              The LATE/ATE/TOT difference is not about bias. Each average treatment effect is perfectly valid for the population it pertains to; those populations are just different.

              (3) Which estimates have the greatest external validity for the policy questions of current interest?

              This is a hard question. The answer depends on whether the group affected by our proposed policy looks more like the group included in the LATE or the full population. How do we think about that?

              Insert your existing discussion of this point.

    • I think a lot is being lost in the details. The idea behind Medicaid is to provide reasonable care to those that otherwise would have difficulty obtaining it. If Medicaid is not doing the intended job then a lot of money is being wasted on a program that simply increases utilization. The Oregon Study is not the only one to cause one to rethink the problem. The original Rand Study discussed this problem and many other studies have done so as well.

      This is not a matter of which way the evidence tilts. To spend the amounts we are spending we should be seeing very clear results demonstrating benefits, but that is not what has been shown.

      I am not saying that Medicaid is not needed and I have no problem assisting those in need, but one has to recognize that there are tradeoffs. Maybe the poor would be in a better position if the money spent was spent in a different manner. Trying to justify Medicaid by noting that not much was proven one way or the other doesn’t, I believe, help the problem, since it wasn’t proven that Medicaid provides the benefits it is supposed to provide.

      • Please specify the minimum change in some health measures you would expect Medicaid to produce in 2 years time in order for it to be worth it. Then tell us how big a sample size you’d need to detect those changes. (Hint: We’ve given all you need to do this in prior posts.)
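
        For readers who want to try this, here is a rough sketch of the kind of calculation involved. It uses a generic textbook approximation for comparing two means at 5% significance and 80% power, not necessarily the method used in the prior posts. The effect size (0.2 standard deviations) and the enrollment gap (0.25) are placeholders, and the function name is made up.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_sd, first_stage=1.0, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect a difference in means.

    effect_sd   -- effect of coverage itself, in standard-deviation units
    first_stage -- gap in enrollment rates between arms (1.0 = full compliance)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # With noncompliance, the winners-vs-losers difference shrinks to
    # effect_sd * first_stage, so that is the difference to be detected.
    diluted = effect_sd * first_stage
    return ceil(2 * (z_alpha + z_power) ** 2 / diluted ** 2)


# Placeholder inputs, for illustration only.
n_full = n_per_arm(0.2)                       # perfect compliance
n_diluted = n_per_arm(0.2, first_stage=0.25)  # only a quarter "comply"
print(n_full, n_diluted)
```

        The required sample grows with the inverse square of the enrollment gap, so a gap of one quarter means roughly sixteen times as many people per arm.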

        • The burden of proof is upon those that wish others to spend their hard earned money on a program. So far the proof is not there and based upon the amount of money spent the proof should be obvious to almost all. There have been many good arguments made on all sides, but so far no study has clearly demonstrated that we are getting our money’s worth with Medicaid as it is presently designed.

          Understand I am not against money being spent for those in need. I just want to make sure that the money we do spend is spent in a way that gives maximum benefit. So far I have seen no proof that the results parallel the amount of money spent. The Oregon study might have been underpowered but in 15 different criteria there seemed to be no benefit. Similar words were said by the Rand Study. The ball is in your court if you believe that Medicaid money is being spent wisely so that leaves you to prove that case. If you did I would support your presumed position, but so far you are just proving that there was a lack of significance and just a lack of significance is a win for the other side.

          In an earlier comment I mentioned the time frame which to me is disturbing in regard to this study, but not for Rand or other studies. I repeat, to spend the amount of money we are spending we should be seeing very definite results.

          • Seems like a dodge to me. I only asked for what criteria you would apply. You don’t seem to have any. Neither of us is against spending money for those in need. If you think I’m against variations or reforms to Medicaid you haven’t followed this blog very closely (forgiven, but just FYI).

            So, what variation would you like? How would you test that it works?

            You complained about this blog not being open to other views. I’m inviting you to offer some. But please say more than “I want to help poor people but not your way” or the equivalent. What other way?

            Meanwhile, I agree with you that we spend too much on health care for what we get. I’d like to see all of us receive care at the low price of Medicaid. I’d also like it to be of high quality. Given the myriad political and technical constraints, I am under no illusion that I know exactly how to do that. Are you?

            • Austin, I don’t create such studies so it is not a dodge. I leave that to those that are experts at doing so. I am merely saying that the experts supporting the position of Medicaid have not proven their case considering the billions spent on Medicaid. If one wants others to pay for a program then those people should be obligated to provide proof that the program is working without generalizing and saying that there is agreement: “we spend too much on health care for what we get. I’d like to see all of us receive care at the low price of Medicaid.” There isn’t, for Medicaid is an entitlement. Insurance has not been an entitlement.

              I don’t accuse you of being for or against changes in Medicaid nor do I accuse you of supporting or not supporting it. I am only stating what appears to be fact. Those in support of the present condition of Medicaid have not proven their case that Medicaid is worth the dollars spent. That exists whether or not the Oregon Medicaid study is underpowered.

              I prefer to more carefully target the program and attempt to mainstream as many people as possible. I would prefer more skin in the game. I would consider mass screenings rather than personal physicians for those that are healthy. It costs next to nothing to get blood pressures, BS etc. I want to leave money to appropriately treat difficult problems in those that are needy.

            • Fair enough. I am an expert. The study discussed showed enormous financial and mental health benefits of Medicaid. It was too small to detect reasonable sized physical health benefits. RAND showed physical health benefits for a subset of the sickest and poorest patients. Other studies have as well. For the wealthy and healthy, I would not expect any detectable physical health benefits of insurance over 2 years.

              But these study subjects were not wealthy. How much skin in the game should someone living below the poverty line have?

            • Austin, at present if I interpret your statement correctly the only proven benefits are: 1) financial and 2) mental health.

              1) Financial: Compared to the money spent the financial benefit doesn’t appear that great considering the fact that it is estimated that the taxpayer will be paying over $7 trillion in the next decade. We have to remember that the money is designated for health care and we have other programs to provide financial aid. Maybe we should do a study to see where those financial benefits went since it appears a good deal went to behaviors not considered acceptable to the taxpayer. Smoking for one, which adversely affects health and raises the costs for everyone.

              2) Mental health: As I stated in an earlier posting this is a tricky issue to evaluate. But even here I wonder about the signal to noise, the criteria, and the significance. If they had stated that the improvement in mental health led to a better ability to care for oneself or permitted one to get a job then I would be able to better appreciate this result, but it didn’t. Just the fact that one won the lottery can affect self reported health benefits including mental health. Looking deeper one notes that 2/3rds of the self reported health took place about one month after Medicaid coverage was approved which was prior to actual care. That spells out a Placebo effect with a capital P unless more proof is offered.

              The last criterion, which remains unproven, is the physical health of the recipients. Even in the two year period, which I originally offered as a potential problem for those not in love with Medicaid, the study should have demonstrated a bit more of a result than nothing. I say that recognizing that Oregon might have a better system than many of the other states. In those other states the same study might have demonstrated “less than nothing” if that is possible.

              When one actually looks at the costs of medications that actually rapidly and successfully improve health and mental well being one sees a big gap of money that to my mind is being wasted even though the intent is good.

            • But it seems like you’re questioning the effects of a study for which you will not even state what the effects would have to be in order for you to be satisfied. And then, once you define the effect size you would look for, you have to ask whether this or any study could find it given the likely sample it could obtain. I think you’re smart enough to do this work. Why not give it a try? You might learn something!

            • You bring up two more points 1) Rand “showed physical health benefits for a subset of the sickest and poorest patients.” and 2) How much skin is required.

              Rand: Like you I am very concerned with the sickest patients that fall through the cracks and that is one reason I would revamp the entire Medicaid system. The more money spent on those not needing care, the less money available for those with true need. Additionally since the money eventually comes from the taxpayer, poor as well as rich, the poor suffer when the government wastes money.

              Let us understand that what you claim with regards to the Rand study occurred only in 4 of the 30 conditions measured; hypertension, vision, dental care and serious conditions the last of which we both agree must be adequately managed. Hypertension is easy to treat and costs very little requiring little skill for most patients that already carry the diagnosis. Vision I believe has to do more with glasses than with anything else. You can correct me on that if I am wrong. That too is low tech. Dental care: A big problem for the poor and even the lower middle class. Do we need a lower level provider to provide additional care? I am not sure, but that type of idea is managed more through licensing requirements than money.

              How much skin: That depends upon the choices society makes. Except for those that cannot manage their own affairs, just for a person’s own feeling of well being, everyone should feel that they are paying something. That also can improve mental and physical health.

              [From rand.org: “Cost sharing also had some beneficial effects. Participants in cost sharing plans worried less about their health and had fewer restricted-activity days (including time spent in seeking medical care).”]

              These people are getting money from the state in one form or another so perhaps some of these monies should be joined together so that the individual has more discretion in how the money is spent. Perhaps the states should have more say in how the money is spent.

              Of course we have to be careful because many people on public assistance earn more than permitted and are gaming the system. As an example it is not uncommon for one spouse to choose a job with a lot of benefits such as health insurance etc., but a low income. The other spouse works under the table. In that fashion they pay little or no taxes and the declared income makes them available for government programs.

            • So until states have the options that you prefer they should offer poor people nothing? And this is in the context of people with employer plans and Medicare beneficiaries receiving many thousands of dollars of benefits from taxpayers? I just want to be clear that is your position.

              I do not hold the view that the poorest among us should get no support for health care expenses until other options are available. I don’t believe I can convince you of my view (if you do not already hold it). So, forgive me if I do not respond to your next comment, should you care to make one. I acknowledge that this is a difference in values.

              Still, I would like to know what set of facts, if any, would change your mind. In my EconTalk interview, I suggested some that would change mine.

              (Nobody in the RAND study was uninsured.)

            • “But it seems like you’re questioning the effects of a study for which you will not even state what the effects would have to be in order for you to be satisfied.”

              By your own admission the study proved nothing and for $7 trillion plus it should have proved something more than dubious mental health benefits and financial benefits that I discussed earlier. Though you didn’t touch on my points I think you recognize them to be valid and that financial concerns belong in a different program.

              All I am looking for in studies are effects that justify the expenditures. Over the many past decades I think substantial proof of efficacy should be available and not be questionable.

              As far as my creating a study, I don’t think the ideological proponents of Medicaid (I am not including you) would accept any study that proved Medicaid has not lived up to the dollars spent any more than OJ’s jury would be convinced by a video of his guilt.

              But, I don’t even go that far because empirically I believe certain groups require additional help. I just believe the way we are giving help is distorted due to the ideologues and those that seek rent off of the program.

            • “So until states have the options that you prefer they should offer poor people nothing?”

              Where in what I have written did you come up with that idea?

              I don’t think you are advocating carelessly spending money without adequate proof because some people think it is a good idea, yet I don’t know how else to interpret your question.

              In fact I am not advocating that the program to help the needy be discontinued even without adequate proof. I do advocate changes in the program such as block grants to the states that are closer to the people. I want to make sure those most needy actually get what they need rather than what Medicaid presently provides which frequently doesn’t provide for them adequately. At the same time I wish to protect the taxpayer.

              You certainly cannot be against better care for the needy and watching out for the taxpayer.

    • I feel like you could clarify a bit by making the distinction between “bias” and “generalizability.” The instrumental variables LATE approach gives an estimate that is unbiased, but not necessarily generalizable to the whole population. When people talk about the results being biased, they usually mean that they think the effect is due to pre-existing differences in the type of people who enroll in Medicaid versus the type of people who don’t, rather than the effect of having Medicaid itself–this endogeneity bias is theoretically eliminated by the instrumental variables procedure. But, the study made no attempt to measure the effect of Medicaid on classes of people who did not participate–the results aren’t necessarily generalizable to everyone.

    • Austin, one of the things I worry about in the mental health field is diagnostic inflation that is very hard to sort out of the pack. I just this minute received a news alert from Medscape. Based upon the mental health findings in the Oregon Study which were self reported you might find this article of interest. It helps explain why I am so dubious about the reported mental health benefits. The original article is at the Annals of Internal Medicine.

      “On the eve of the official launch of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), Allen Frances, MD, chair of the DSM-IV Task Force and one of the new manual’s staunchest critics, is advising physicians to use the DSM-5 “cautiously, if at all.”

      “Psychiatric diagnosis is facing a renewed crisis of confidence caused by diagnostic inflation,” Dr. Frances, Duke University, Durham, North Carolina, writes in a new commentary published online May 17 in the Annals of Internal Medicine.”

      Continued at: http://www.medscape.com/viewarticle/804378?src=wnl_edit_newsal&uac=12683MR