• Overidentification tests


    Last week, in Inquiry, my latest paper with Steve Pizer and Roger Feldman was published. An ungated, working paper version is also available. Note also that I wrote a bit about a portion of it in a prior post, though even that does not describe what the paper is about. I’ll write more about the results in the paper in another post. If you can’t wait, click through for the abstract. For now, I want to focus on another technical detail, which is likely to interest all of five readers. You know who you are from the title of the post.

    Until fairly recently, my colleagues and I thought overidentification tests of instruments were worth doing. We no longer feel that way. To get published, we have little choice but to run them when a reviewer demands them, but we don’t think they’re very valuable.

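    Though these are typically discussed as tests of excludability, they are, in fact, joint tests of excludability and homogeneity of treatment effects (Angrist 2010). When treatment effects are heterogeneous, each instrument identifies a local average treatment effect (LATE) for the subpopulation it moves, and different instruments can yield different estimates. Consequently, the test may reject instruments that are all excludable simply because they identify different LATEs.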
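
    To make the point concrete, here is a minimal simulation sketch. It is not from the paper; the data-generating process, the function names, and the use of a Sargan-type statistic are all assumptions chosen for illustration. Both instruments are randomly assigned, so both are excludable by construction. The only thing that changes between the two runs is whether the two groups of compliers share the same treatment effect.

```python
import random

def center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ols2(y, x1, x2):
    """OLS slopes of centered y on centered x1 and x2 (2x2 normal equations)."""
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def simulate(effect_a, effect_b, n=20000, seed=0):
    """Two randomly assigned (hence excludable) binary instruments.
    Even-indexed people take treatment only if z1 = 1 and have effect_a;
    odd-indexed people take it only if z2 = 1 and have effect_b."""
    rng = random.Random(seed)
    z1, z2, d, y = [], [], [], []
    for i in range(n):
        a, b = rng.random() < 0.5, rng.random() < 0.5
        treated, effect = (a, effect_a) if i % 2 == 0 else (b, effect_b)
        z1.append(float(a))
        z2.append(float(b))
        d.append(float(treated))
        y.append(effect * treated + rng.gauss(0, 1))
    return center(z1), center(z2), center(d), center(y)

def sargan(z1, z2, d, y):
    """2SLS estimate and Sargan statistic (n * R^2 of residuals on instruments)."""
    a1, a2 = ols2(d, z1, z2)                       # first stage
    d_hat = [a1 * p + a2 * q for p, q in zip(z1, z2)]
    beta = dot(d_hat, y) / dot(d_hat, d)           # 2SLS slope
    u = [yi - beta * di for yi, di in zip(y, d)]   # structural residuals
    g1, g2 = ols2(u, z1, z2)
    fitted = [g1 * p + g2 * q for p, q in zip(z1, z2)]
    r2 = dot(fitted, fitted) / dot(u, u)
    return beta, len(y) * r2

if __name__ == "__main__":
    for label, effects in [("homogeneous", (2.0, 2.0)),
                           ("heterogeneous", (1.0, 3.0))]:
        beta, stat = sargan(*simulate(*effects))
        print(f"{label}: 2SLS = {beta:.2f}, Sargan = {stat:.1f} "
              "(5% critical value with 1 df: 3.84)")
```

    With one overidentifying restriction the statistic is compared to a chi-squared distribution with one degree of freedom. In the heterogeneous run the statistic should be far above the critical value even though neither instrument violates the exclusion restriction, which is the sense in which the test is a joint one.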

    Passing overid tests may convince some reviewers that one’s instruments are excludable from the second stage model, but it shouldn’t. Failing to pass doesn’t prove they are not. This is a rather weak case for their scientific value. Many papers in top economics journals using IV methods do not include overid tests. That’s just fine.

    “Angrist 2010” is a personal communication with Josh Angrist.

    @afrakt

     

     
  • Practice patterns


    From Aspirin, Angioplasty, and Proton Beam Therapy: The Economics of Smarter Health Care Spending by Baicker and Chandra:

    There is also a substantial literature at the provider level showing that practice pattern norms drive similar care for all of the patients that a provider sees, regardless of individual insurance status – so that changes in the incentives applying to a large share of patients (say, Medicare beneficiaries) can drive changes in the care received by all patients (Glied and Zivin 2002; Baker and Corts 1996; Baker 1999; Frank and Zeckhauser 2007).

    This is of interest because to the extent that practice patterns are good predictors of the type of treatment one might receive but not correlated with unobservable factors that drive outcomes, they make good instruments for observational comparative effectiveness studies. This likely sounds like mumbo-jumbo to some readers, but a lot of money could ride on this type of thing.

    References

    Baker, L, and K Corts. 1996. HMO Penetration and the Cost of Health Care: Market Discipline or Market Segmentation? American Economic Review 86 (2):389-394.

    Baker, Laurence C. 1999. Association of Managed Care Market Share and Health Expenditures for Fee-for-Service Medicare Patients. Journal of the American Medical Association 281 (5):432-437.

    Glied, S., and J. Zivin. 2002. How Do Doctors Behave When Some (But Not All) of Their Patients Are In Managed Care? Journal of Health Economics 21:337-353.

    Frank, Richard G., and R.P. Zeckhauser. 2007. Custom Made Versus Ready to Wear Treatments: Behavioral Propensities in Physicians Choices. Journal of Health Economics 26:1101-1127.

    @afrakt

     
  • In defense of reduced form models


    This post is for economists in the sense that I freely use jargon and technical concepts of the discipline. If you can’t get past the first sentence, don’t worry about it, and just skip this post. (If you need something good to read, here’s a recommended post from the archives. It’s about Medicare Advantage, circa 2009. Though it is slightly dated, the main concepts remain relevant.)

    A slightly different version of the following appears in a forthcoming paper by me, Roger Feldman, and Steve Pizer. If you need to defend reduced form models using HHIs as a measure of market structure, cite this as:

    Frakt AB, Pizer SD, and Feldman R, “The Effects of Market Structure and Payment Rates on Private Medicare Health Plan Entry,” forthcoming in Inquiry.

    A working paper version is available on SSRN.

    ***

    There is an ongoing debate about the strengths and limitations of both structural and reduced form models in empirical industrial organization (IO) (Angrist and Pischke 2010, Nevo and Whinston 2010). I’m not going to take a side in that debate. However, I raise the existence of it to point out that it is by no means settled in the broad economics community that either paradigm is preferred for all applications. In this post I explore the strengths and limitations of both types of models, though largely defend the reduced form approach while acknowledging it is not necessarily ideal in all cases.

    Recent work on health care markets has used both structural and reduced form approaches: Maruyama (2011), Starc (2010), and Lustig (2010) using structural models and Dafny et al. (2009), Dafny (2010), Schneider et al. (2008), Shen et al. (2010), Moriya et al. (2010), and Bates and Santerre (2008) using reduced form models. Structural models of entry have been applied in health care (most recently by Maruyama (2011)) and, for decades, to problems in non-health industries as well (e.g., Berry 1992, Seim 2006). Many of the reduced form models employ the Herfindahl-Hirschman Index (HHI) as an independent variable (Dafny et al. 2009, Bates and Santerre 2008, Schneider et al. 2008, Shen et al. 2010, Moriya et al. 2010). Although these are ad hoc, Gaynor and Town (2011) write that “one can think of them as attempting to capture the impacts of relative bargaining power on price, using buyer and seller HHIs as proxies for bargaining power.”
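
    For readers who have not computed one, the HHI is the sum of squared market shares. A minimal sketch follows, assuming shares are expressed in percentage points so the index runs from near 0 to 10,000 (a monopoly). The function name and the choice to normalize raw inputs such as enrollment counts are my own illustrative assumptions, not something taken from the papers cited above.

```python
def hhi(firm_sizes):
    """Herfindahl-Hirschman Index on the 0-10,000 scale.

    firm_sizes may be market shares or raw sizes (for example, enrollment
    counts); they are normalized to percentage shares before squaring.
    """
    total = sum(firm_sizes)
    if total <= 0:
        raise ValueError("firm sizes must sum to a positive number")
    shares_pct = [100.0 * size / total for size in firm_sizes]
    return sum(share ** 2 for share in shares_pct)


if __name__ == "__main__":
    # Hypothetical market with three plans holding 50%, 30%, and 20%.
    print(hhi([50, 30, 20]))
```

    Four equal-sized firms give 4 x 25^2 = 2,500, and a single firm gives 100^2 = 10,000.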

    Though there are strengths of structural models, which I’ll get to, one limitation is tractability. For some applications, they cannot be applied for this reason. For instance, Mazzeo (2002) examined entry into motel markets by firms endogenously choosing high, medium, or low quality. This approach requires firms to choose only one product type in each market and it becomes intractable with more than three types. If one considers, for example, applying such an approach to the Medicare market of health plans, this intractability becomes a barrier. That market has only three main types of products a firm could offer: CCPs (coordinated care plans, like HMOs or PPOs), PDPs (stand-alone prescription drug plans), and PFFS (private fee-for-service) plans. Even so, Mazzeo’s approach cannot be applied because a firm may offer any combination of the three, which gives seven possible entry configurations (CCP only, PDP only, PFFS only, CCP-PDP, CCP-PFFS, PFFS-PDP, or CCP-PDP-PFFS).
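
    The count of seven is just the number of non-empty subsets of three product types, 2^3 - 1. A small sketch (the function name is mine, for illustration only) enumerates them and shows how quickly the choice set grows if more product types are added:

```python
from itertools import combinations

PRODUCT_TYPES = ("CCP", "PDP", "PFFS")

def entry_configurations(types):
    """All non-empty combinations of product types a firm could offer."""
    return [combo
            for k in range(1, len(types) + 1)
            for combo in combinations(types, k)]


if __name__ == "__main__":
    for config in entry_configurations(PRODUCT_TYPES):
        print("-".join(config))
    # With k product types there are 2**k - 1 configurations per firm.
    for k in range(1, 7):
        print(k, 2 ** k - 1)
```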

    A fundamental distinction between the structural models espoused by practitioners of new empirical IO and the reduced form models coming from the structure-conduct-performance (SCP) paradigm is in the type of assumptions required. Structural modelers correctly point out that market structure is endogenous. In entry models, for instance, concurrent market structure is, in essence, the dependent variable. As an alternative to including market structure (even if lagged and/or instrumented) as an independent variable, structural models instead impose assumptions about the nature of competition between firms. Evaluating the validity of those assumptions requires a substantial research program, sometimes including access to exogenous data on markups (Nevo 2001).

    The reduced form SCP approach, on the other hand, permits one to be agnostic about the underlying game and, thereby, to avoid any game-theoretic assumptions (Gaynor and Town 2011). Naturally, the trade off is that one is not estimating fundamental parameters associated with a game. In addition, one must justify the use of HHI on the right-hand side. In particular, one must defend one’s instruments for HHI. However, this is not qualitatively different from a problem faced in structural models that include (instrumented) price as an independent variable. In both contexts, instruments must be defended as reflective of exogenous factors and excludable from the second stage.

    When valid instruments for HHI can be found, they may reflect unobserved factors affecting market structure. For example, in the case of health plan entry such factors might include local marketing organizations or longstanding provider networks. When those factors shift, they affect HHI and entry. By putting (instrumented) HHI on the right-hand side, one gains some insight into the aggregate effects of those factors. This is a less precise insight than might be gained from a more detailed structural model, but with the advantages of weaker assumptions and simpler interpretability and econometric methodology. In addition, despite its shortcomings, HHI remains an important market measure for policy. The antitrust agencies still use it to inform their analysis of markets for anticompetitive mergers (DoJ and FTC 2010). Consequently, there are both practical empirical and policy relevance rationales for use of HHIs in a reduced form framework.


    References

    Angrist J and Pischke S. The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics. The Journal of Economic Perspectives 2010;24(2).

    Berry S. Estimation of a Model of Entry in the Airline Industry. Econometrica 1992;60(4); 889-97.

    Bates L, Santerre R. Do health insurers possess monopsony power? International Journal of Health Care Finance and Economics 2008;8; 1-11.

    Dafny L, Duggan M, Ramanarayanan S. Paying a Premium on Your Premium? Consolidation in the U.S. Health Insurance Industry. National Bureau of Economic Research Working Paper No. 15434; 2009; October.

    Dafny L. Are health insurance markets competitive? American Economic Review 2010;100;1399-1431.

    Gaynor M and Town R. Competition in Health Care Markets. Chapter for the Handbook of Health Economics, Volume 2. T. McGuire, M.V. Pauly, and P. Pita Barros, Editors 2011.

    Lustig J. Measuring welfare losses from adverse selection and imperfect competition in privatized Medicare. Unpublished manuscript, Boston University 2010.

    Maruyama S. Socially Optimal Subsidies for Entry: The Case of Medicare Payments to HMOs. International Economic Review 2011;52(1).

    Mazzeo M. Product Choice and Oligopoly Market Structure. RAND Journal of Economics 2002;33(2); 221–42.

    Moriya A, Vogt W, and Gaynor M. Hospital prices and market structure in the hospital and insurance industries. Health Economics, Policy and Law 2010;5; 459-479.

    Nevo A. Measuring Market Power in the Ready-to-Eat Cereal Industry. Econometrica 2001;69(2); 307-342.

    Nevo A and Whinston M. Taking the dogma out of econometrics: Structural modeling and credible inference. Journal of Economic Perspectives 2010;24(2).

    Schneider J, Li P, Klepser D, Peterson N, Brown T, and Scheer R. The effect of physician and health plan market concentration on prices in commercial health insurance markets. International Journal of Health Care Finance and Economics 2008;8:13-26.

    Seim K. An Empirical Model of Firm Entry with Endogenous Product-Type Choices. RAND Journal of Economics 2006;37(3); 619-40.

    Shen Y, Wu V, and Melnick G. Trends in hospital cost and revenue, 1994-2005: How are they related to HMO penetration, concentration, and for-profit ownership? Health Services Research 2010;45(1); 42-61.

    Starc A. Insurer pricing and consumer welfare: Evidence from Medigap. Unpublished manuscript, Harvard University 2010.

    U.S. Department of Justice (DoJ) and the Federal Trade Commission (FTC). Horizontal Merger Guidelines. August 19, 2010.

     
  • Causality in health services research (a must read)


    A new paper in Health Services Research by Bryan Dowd, Separated at Birth: Statisticians, Social Scientists, and Causality in Health Services Research, is a must read for anyone wishing a deeper understanding of estimation and experimental methods for causal inference. Here’s the abstract:

    Health services research is a field of study that brings together experts from a wide variety of academic disciplines. It also is a field that places a high priority on empirical analysis. Many of the questions posed by health services researchers involve the effects of treatments, patient and provider characteristics, and policy interventions on outcomes of interest. These are causal questions. Yet many health services researchers have been trained in disciplines that are reluctant to use the language of causality, and the approaches to causal questions are discipline specific, often with little overlap. How did this situation arise? This paper traces the roots of the division and some recent attempts to remedy the situation.

    The paper is followed by commentaries by Judea Pearl and A. James O’Malley.

    There are too many excellent paragraphs and points to quote, and I really want you to read the paper. (I’m looking into whether an ungated version can be made available. It’s unlikely, but I’ll try.) Here are just a few of my favorite passages from the introduction and conclusion, strung together:

    Determining whether changing the value of one variable actually has a causal effect on another variable certainly is one of the most important questions faced by the human race. In addition to our common, everyday experiences, all of the work done in the natural and social sciences relies on our ability to learn about causal relationships. [...]

    It is not unusual for a health services research study section (the group of experts who review research proposals and make funding recommendations) to include analysts who maintain that only randomized control trials (RCTs) yield valid causal inference, sitting beside analysts who have never randomized anything to anything. Two analysts debating the virtues of instrumental variables (IV) versus parametric sample selection models might be sitting next to analysts who never have heard of two-stage least squares.

    Academic disciplines routinely take different approaches to the same question, but it is troubling when approaches to the same problem are heterogeneous across departments and homogeneous within departments and remain so for decades, suggesting an unhealthy degree of intellectual balkanization within the modern research university. It is one thing to disagree with your colleagues on topics of common interest. It is another thing to have no idea what they are talking about. [...]

    The challenge for health services research and the health care system in general is to contemplate the physician’s decision problem as she sits across the table from her patient. On what evidence will her treatment decisions be based? A similar case she treated 5 years ago? Results from an RCT only? What if there are not any RCT results or the RCT involved a substantially different form of the treatment applied to patients substantially different from the one sitting across the table? What if the results came from an observational study, but the conditions required for the estimation approach were not fully satisfied?

    Between the introduction and conclusion is the history of methods for causal inference and how they relate and diverged. Many points are ones I’ve made on this blog. But Dowd is far more expert than I in many respects and illuminates nuances I’ll probably never approach in a post.

     
  • Debating Medicaid


    If you want to keep up, here’s Avik Roy responding to Harold Pollack. And here’s Harold Pollack’s reply. I don’t have much to add that is evidence-based beyond what I’ve already written and what Pollack contributed.

    What’s ironic is that Avik and I actually share a fair bit of common ground in terms of reform. Things I’ve written about Medicare I think should be applied to the entire system (competitive bidding that includes private and public options, income-based and risk-adjusted subsidies, etc.). Moreover, within that framework I do not have a problem with high-deductible health plan options, provided they’re a choice not a requirement.

    My main concern in this debate over Medicaid is how policy outcomes are evaluated and how the results of such evaluations are interpreted. To me, this is principally about research design and methodology, not policy. If one thinks that multivariate controls using observable factors are sufficient in health care, one is ignoring an overwhelming body of work that convincingly shows there is selection based on unobservable factors into all types of insurance coverage. Instrumental variables (IV) can address that.

    Now, one can always attack IV, though it takes a good story to do so convincingly. (I’ve not seen one yet that deals a substantial blow to the instruments in the studies I reviewed.) However, even if one were to convincingly dismiss an IV approach to evaluation of Medicaid outcomes, that does not mean an observables-based study is any good. Recall that the UVa surgical outcomes study, which includes quite a large set of controls, found that not only Medicaid but also Medicare is associated with worse health outcomes than no insurance at all.

    Why is that? One can claim that Medicaid leads to “family breakdown and social disrepair” (though one had better point to quite a pile of scientifically credible literature before I believe that’s the source of the problem with the IV approaches). But where does that leave Medicare? What’s the story there? Why is the UVa study telling us the right causal story in that case? It just doesn’t hang together.

    Ultimately, I don’t see why we need to reject the studies that do reveal a credible causal link between Medicaid and improvements in health. They do not, and cannot, tell us that Medicaid is great in all possible ways. It is a program in need of reform. We can agree on that without needing to reject the good work that shows it is not bad for health. As I wrote before, I would worry about claiming that a study like the UVa one is sufficient for causal inference. My concern would be that any reform to Medicaid, even the one advocated by Avik Roy, would yield similar results based on a similar study, and, therefore, one would have to conclude that there is no program for that population that beats no insurance. (The results for Medicare show us that is likely since it does not have an association with the same social dynamics or provider restrictions as Medicaid.)

    What will those who interpret such a study’s results causally say then? Actually, under a causal interpretation, the policy implication would be clear. Revoke Medicaid. Revoke Medicare. Replace them with nothing. Save a fortune, and produce better outcomes at the same time. The only problem is, that’s totally wrong because the study is one of associations, not causation, and the findings suffer from some selection bias. Even the authors of the UVa study admit as much. On what grounds could any reader of their paper steadfastly claim otherwise?

     
  • Medicaid and health outcomes (again)


    Avik Roy has read and posted about the papers I reviewed as part of my Medicaid-IV series. If you’ve forgotten, the purpose of that series of posts was to examine studies that use proven, sound methods to infer the causal effect of (as opposed to a correlation between) Medicaid enrollment on health outcomes. From that series, I concluded that there is no credible evidence that Medicaid is worse for health than being uninsured. Considering only studies that show correlations (not causation), Avik disagrees.

    Avik’s post is long, but you can save yourself some trouble by skipping the gratuitous attack on economists in general, and Jon Gruber in particular, as well as the troubled description of instrumental variables (IV).* About halfway down is his actual review of the papers; look for the bold text.

    The point I want to drive home in this post is why an IV approach is necessary in studying Medicaid outcomes. People enrolling in Medicaid differ from those who don’t. They differ for reasons we can observe and for those we can’t. An ideal study would be a randomized controlled trial (RCT) that randomizes people into Medicaid and uninsured status. That’s neither practical nor ethical. So we’re stuck, unless we can be more clever.

    The next best thing we can do is look for natural experiments. That’s what IV exploits. In this case, the studies I examined use the state-level variation in Medicaid eligibility (and related programs). That variation obviously affects enrollment into Medicaid (you can’t enroll unless you’re eligible), though it is not determinative. Importantly, state-level variation in Medicaid eligibility rules does not itself affect individual-level health. Other than figuratively, do you suddenly take ill when a law is passed or a regulation is changed? Do you see how Medicaid eligibility rules are somewhat like the randomization that governs an RCT, affecting “treatment” (Medicaid enrollment) but not outcomes directly? (If this is unclear, go here.)
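
    Here is a stylized simulation sketch of that logic. Everything in it is an assumption made up for illustration: the numbers, the variable names, and the simplification that every enrollee gets the same health benefit. It is not the design of any study I reviewed. People who are unobservably sicker are more likely to enroll, so a naive comparison makes coverage look harmful, while the simplest IV estimator (the Wald ratio) uses only the variation in enrollment that comes from the eligibility rules.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def simulate(n=50000, true_effect=0.5, seed=1):
    """Return (generous_rules, enrolled, health) tuples for n hypothetical people."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        generous = rng.random() < 0.5      # state rule, unrelated to own health
        sick = rng.gauss(0, 1)             # unobserved by the researcher
        # Enrollment is likelier under generous rules and for sicker people.
        p_enroll = (0.60 if generous else 0.05) + (0.30 if sick > 0 else 0.0)
        enrolled = rng.random() < p_enroll
        health = true_effect * enrolled - 2.0 * sick + rng.gauss(0, 1)
        rows.append((generous, enrolled, health))
    return rows

def naive_difference(rows):
    """Mean health of enrollees minus mean health of non-enrollees."""
    ins = [h for _, e, h in rows if e]
    unins = [h for _, e, h in rows if not e]
    return mean(ins) - mean(unins)

def wald_estimate(rows):
    """Effect of the rules on health divided by their effect on enrollment."""
    gen = [(e, h) for g, e, h in rows if g]
    strict = [(e, h) for g, e, h in rows if not g]
    reduced_form = mean([h for _, h in gen]) - mean([h for _, h in strict])
    first_stage = (mean([float(e) for e, _ in gen])
                   - mean([float(e) for e, _ in strict]))
    return reduced_form / first_stage


if __name__ == "__main__":
    rows = simulate()
    print("naive difference:", round(naive_difference(rows), 2))
    print("Wald (IV) estimate:", round(wald_estimate(rows), 2))
```

    In this setup the naive difference should come out negative even though the true effect built into the simulation is positive, and the Wald estimate should land near the true effect, up to sampling noise.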

    Note that IV studies can, and should in some cases, control for observable factors. (The studies I reviewed use quite sophisticated controls, including fixed effects and interactions, that greatly reduce the ambitiousness of the assumptions required to obtain causal estimates. In contrast, assumptions for inference of causality in the studies Avik prefers are far greater.) But controlling for observable factors alone is insufficient. That brings me to a study that Avik has cited many times as evidence that Medicaid produces worse health than no insurance at all. Tyler Cowen referenced the same study in his book, about which I wrote earlier. It’s the UVa surgical outcomes study, formally known as: Primary Payer Status Affects Mortality for Major Surgical Operations, by LaPar and colleagues.

    Avik has summarized this study, so I’ll skip that. It examines 11 surgical outcomes by insurance status, adjusting for many observable factors, but, crucially, with no controls for unobservable factors that affect selection. All adjusted outcomes for Medicaid enrollees are worse than for the uninsured. With only one exception, adjusted outcomes for Medicare beneficiaries are worse than for the uninsured too. Got that? Not just Medicaid enrollees, but Medicare beneficiaries too, fare worse than the uninsured. Any theory to explain what’s going on in Medicaid had better explain Medicare too. It cannot be just that Medicaid enrollees see lower quality providers.

    You know what theory is consistent with these results? It’s a pretty famous one. I just described it above: selection (or omitted variable) bias. It is well known that studies that do not exploit purposeful (i.e., an RCT) or natural (i.e., natural experiment or instrumental variables) randomness can suffer from selection bias. Even controlling for observable characteristics is not enough in the field of health care. This is well known. I’ve explained it before, even in a diagram.

    The authors of the UVa surgical outcomes study acknowledge the possible presence of selection bias in trying to explain their results. They say as much in many places in the text of their paper, writing,

    Another possible explanation for the differences we observed among payer groups is the possibility of incomplete risk adjustment due to the presence of comorbidities that are either partially or unaccounted for in our analyses [sic]. [...]

    Several explanations for inherent differences in payer populations have been suggested. Factors including decreased access to health care, language barriers, level of education, poor nutrition, and compromised health maintenance have all been suggested. [...]

    There are several noteworthy limitations to this study. First, inherent selection bias is associated with any retrospective study. [...]

    For example, the proportion of Medicaid patients may be artificially inflated due to the fact that normally Uninsured patients may garner Medicaid coverage during a given hospital admission. [...] [I]n our data analyses and statistical adjustments there exists a potential for an unmeasured confounder. Due to the constraints of NIS data points, we are unable to include adjustments for other well-established surgical risk factors such as low preoperative albumin levels or poor nutrition status.

    Kudos to the authors for acknowledging the limitations of their study. That the results have been repeated elsewhere without such disclaimers is a disservice to science.

    Moving on, to Avik’s great credit, he unearthed a Medicaid-IV study I had overlooked: The Link Between Public and Private Insurance and HIV-Related Mortality, by Bhattacharya, Goldman, and Sood (ungated PDF available). It examines mortality outcomes in an HIV population using IV methods to control for selection into insurance category (uninsured, public, and private). Table 5 is the key table. It confused me at first, as it has Avik. Just reading the table, it looks as if the “best” model produces the results in the bottom row, which suggest private insurance decreases mortality by 50% and public insurance increases it by 8%, relative to no insurance.

    But, reading the text, it is clear that the results in that bottom row are based on a faulty model, which the authors explain. (I will too, below.) The model based on sound methodology produces results in the second to last row of Table 5, a 79% and 66% reduction in mortality for the privately and publicly insured, respectively, relative to the uninsured. Table 6 also reports the results of the preferred model, though there is a typographical error on the mortality results: they’re missing minus signs in the first two rows. (I confirmed this with the authors.)

    The results of this study are stated very clearly by the authors, “both private and public insurance decrease the likelihood of death.”

    Now, what’s wrong with the model that shows Medicaid killing people, the one Avik thinks is best? It includes AZT and HAART** treatment indicators on the right hand side. That’s a problem because AZT and HAART treatment are more likely for those with insurance and HAART is indicative of poor health. Essentially, they’re “caused” by insurance and highly predictive of the outcome of interest, mortality. This is an example of “bad control,” i.e. controlling for an outcome. It should be clear that having the outcome — or something very close to it — on both the left and right hand sides is a problem. It soaks up too much of the effect of insurance but, being an outcome, it isn’t a proper control. About this, the authors write, “Of course, there is concern that HAART itself may be endogenous, since receipt of therapy almost certainly reflects disease severity and ability to adhere to the complicated regimen.” (Why was this faulty model even included in the paper? My guess has been confirmed by the authors via email: reviewers requested the authors include it. It was not included in their NBER working paper that predates the peer reviewed one. It really is too bad this model was inserted into the paper because it seems to have tricked some readers. However the authors very clearly indicate which model is most sound. Anyone appreciating the essence of good research design will understand it, as I explained above.)
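
    A stylized sketch of the “soaking up” problem may help. This is not the authors’ model. The numbers are invented, insurance is randomly assigned purely to isolate the issue, and the only way insurance lowers the mortality index here is by raising access to therapy. Regressing mortality on insurance alone recovers the total effect. Adding the therapy indicator as a control drives the insurance coefficient toward zero, because the control is itself a consequence of insurance.

```python
import random

def center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ols2(y, x1, x2):
    """OLS slopes of centered y on centered x1 and x2 (2x2 normal equations)."""
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def simulate(n=20000, seed=2):
    """Hypothetical data: insurance -> therapy -> lower mortality index."""
    rng = random.Random(seed)
    insured, therapy, mortality = [], [], []
    for _ in range(n):
        d = rng.random() < 0.5                    # randomly assigned for illustration
        t = rng.random() < (0.7 if d else 0.2)    # insurance raises access to therapy
        y = -1.0 * t + rng.gauss(0, 1)            # therapy lowers the mortality index
        insured.append(float(d))
        therapy.append(float(t))
        mortality.append(y)
    return center(insured), center(therapy), center(mortality)

def estimates(insured, therapy, mortality):
    """Insurance coefficient without and with the therapy control."""
    total = dot(insured, mortality) / dot(insured, insured)
    controlled, therapy_coef = ols2(mortality, insured, therapy)
    return total, controlled, therapy_coef


if __name__ == "__main__":
    total, controlled, therapy_coef = estimates(*simulate())
    print("insurance coefficient, no therapy control:", round(total, 2))
    print("insurance coefficient, therapy controlled:", round(controlled, 2))
    print("therapy coefficient:", round(therapy_coef, 2))
```

    The total effect built into this toy example is -1.0 x (0.7 - 0.2) = -0.5. With the therapy control included, the estimated insurance coefficient should be close to zero, which would wrongly suggest insurance does nothing.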

    Bottom line: once again, we find that Medicaid is shown not to be bad for health, but only if proper econometric techniques are employed. Sadly, it is easier to ignore the need for such techniques and to misunderstand them than to do the work to educate oneself in their use. The real tragedy is that it leads to an unwarranted conclusion that Medicaid is harming people. We can certainly craft a better Medicaid program, and we should. But we should always use proper science in considering any program. If we don’t, we may mistake improvements to Medicaid as harmful. I’m sure advocates for change, myself included, would not welcome such an outcome.

    * Avik dismisses IV as a “fudge factor,” casually and erroneously discrediting a vast amount of mainstream work by economists and several entire sub-disciplines. Since IV is a generalization of the concepts that underlie randomized controlled trials (differing in degree, but not in spirit, from purposeful randomization), and can be used to rehabilitate a trial with contaminated groups (a not infrequent occurrence), it is unwise to trivialize IV and what it can do.

    ** HAART = highly active anti-retro-viral therapy.

    UPDATE: I fixed my explanation of the “bad control” problem in the Bhattacharya, Goldman, and Sood study.

    UPDATE 2: The authors of the Bhattacharya, Goldman, and Sood study confirmed the typos in Table 6.

    UPDATE 3: Those authors also confirmed that the faulty model was requested by reviewers, as I suspected.

     
  • Go ahead, make my day


    Never mind what it is about, one of Paul Krugman’s recent posts includes this paragraph on instrumental variables (IV):

    Instrumental variables is a statistical technique that you use to avoid having your results contaminated by reverse causation — say, if stimulus funds were directed to states with especially severe unemployment problems, you might find a spurious negative correlation between stimulus and unemployment. What you need to get around this is some variable that is correlated with stimulus but not affected by the job changes; in effect, you use this other variable to create a predicted stimulus level, then look at how employment is affected by the predicted level, not the actual level. If I’ve just lost you, never mind.
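
    For anyone who was lost, the two steps in that description can be written out directly. The sketch below is a toy simulation with invented numbers and hypothetical variable names, not an analysis of actual stimulus data. I model the problem as funds flowing to places with unobservably weak labor markets, which is one way to produce the spurious negative correlation the quote describes. The instrument is assumed to shift stimulus without being related to job conditions.

```python
import random

def center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def slope(x, y):
    """OLS slope of centered y on centered x."""
    return dot(x, y) / dot(x, x)

def simulate(n=10000, true_effect=0.5, seed=3):
    """Hypothetical regions: weak labor markets attract more stimulus."""
    rng = random.Random(seed)
    z, stim, jobs = [], [], []
    for _ in range(n):
        instrument = rng.gauss(0, 1)   # shifts stimulus, unrelated to job conditions
        weak = rng.gauss(0, 1)         # unobserved labor-market weakness
        s = instrument + 0.8 * weak + rng.gauss(0, 0.5)
        j = true_effect * s - 2.0 * weak + rng.gauss(0, 1)
        z.append(instrument)
        stim.append(s)
        jobs.append(j)
    return center(z), center(stim), center(jobs)

def two_stage(z, stim, jobs):
    """Stage 1: predict stimulus from the instrument.
    Stage 2: regress jobs on predicted (not actual) stimulus."""
    first_stage = slope(z, stim)
    predicted = [first_stage * v for v in z]
    return slope(predicted, jobs)


if __name__ == "__main__":
    z, stim, jobs = simulate()
    print("OLS on actual stimulus:", round(slope(stim, jobs), 2))
    print("two-stage estimate:", round(two_stage(z, stim, jobs), 2))
```

    The naive regression on actual stimulus should show a negative slope even though the true effect in the simulation is positive, and the two-stage estimate should be close to the true effect. Doing the second stage by hand like this gives the right point estimate but not the right standard errors, which is why one uses a proper 2SLS routine in practice.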

    If, by virtue of my blogging on IV, a single reader isn’t lost who might otherwise be, I’ll be very happy. If that reader is you, maybe you’d be so kind as to tell me in the comments. It’d make my day.

    While I’m asking, if any readers use this blog as part of college/university course curriculum or otherwise wish to express their appreciation, that will help. No, I’m not having a bad day. But you can make it a very good one.

     

     

     
  • Selection bias and Medicaid


    The figure below, taken from a paper by Steve Pizer, is a conceptual model for a generic observational study of some health-related treatment or intervention. The arrows are to be interpreted as causal influences. (I’ll explain why some lines are dashed later.)

    Suppose “treatment” is “Medicaid enrollment” and “comparison” is “uninsured.” Since we do not have the luxury of randomly assigning individuals to Medicaid (treatment) or uninsured (control), we must base our studies on populations that self-select into these two groups.

    That self selection process is in the “sorting” box in the diagram. Individuals are eligible for and motivated to enroll in Medicaid (or remain uninsured) for many reasons, related to individual factors (like health status, income, etc.) and institutional factors (like Medicaid eligibility rules). Provider characteristics might also play a role if providers are more or less motivated to assist patients in enrolling in Medicaid.

    If this were a randomized trial, the sorting into treatment and control would be independent of all patient, provider, and institutional factors. I wouldn’t even put them in the diagram. They’d be irrelevant. But, like I said, Medicaid enrollment or being uninsured is not random. Worse, some of the factors that affect self-selection into Medicaid or uninsured status also affect (health and non-health) outcomes. This is a source of selection bias. For example, if sicker individuals are more motivated to enroll in Medicaid we will find that Medicaid is correlated with worse health outcomes. But that’s a selection effect, not an effect of Medicaid.

    What’s a researcher to do? Well, if one can observe (measure) the things that affect sorting and outcomes, one can control for them. There are well-established statistical ways of doing this (I won’t go into it, but Steve has). But here’s the kicker: there are some factors that affect sorting and outcomes that we cannot even observe in data. There are unmeasured aspects of health, skills, attitudes, and culture, among other things, that relate to both sorting and outcomes. Unobservably sicker individuals may enroll in Medicaid. Or unobservable characteristics (quality) of providers that patients visit may be related to both Medicaid enrollment and health outcomes. This effect of unobservable factors on the outcome of interest is emphasized by the dashed lines in the figure.

    Thus, even a study with quite sophisticated statistical controls for observed factors can reveal correlations that should not be causally interpreted due to unobservable selection bias. However, there is a remedy to this problem too. It is based on the fact that some institutional factors that affect sorting do not affect outcomes (no line from the institutional factors box to the outcome box in the figure). By teasing out the effect of such institutional factors on the sorting mechanism and using just that aspect of sorting to infer the relationship between treatment and outcomes, one obtains an estimate free of the confounding effects of observed and unobserved characteristics. This is the instrumental variables (IV) approach.
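
    To make the idea concrete, here is a minimal simulation sketch in Python (NumPy). Everything in it is an assumption for illustration and none of it comes from Steve’s paper: the variable names, the coefficients, the binary “eligibility rule” standing in for an institutional factor, and the constant true effect of +1 on a health score are all made up. Unobserved sickness pushes people into Medicaid and also lowers their health, while the eligibility rule affects enrollment only.

```python
import numpy as np

def simulate_medicaid(n=200_000, true_effect=1.0, seed=0):
    """Simulated data with selection on an unobserved factor (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    sickness = rng.normal(size=n)        # unobserved by the researcher
    eligible = rng.integers(0, 2, n)     # institutional factor: affects sorting only
    # Sorting: eligible and sicker individuals are more likely to enroll.
    enrolled = ((1.5 * eligible + sickness + rng.normal(size=n)) > 0.75).astype(float)
    # Outcome: Medicaid helps (true_effect > 0), sickness hurts.
    health = true_effect * enrolled - 2.0 * sickness + rng.normal(size=n)
    return eligible, enrolled, health

def naive_slope(x, y):
    """Simple regression slope of y on x: cov(x, y) / var(x)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

def iv_slope(z, x, y):
    """IV estimate with a single instrument: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

eligible, enrolled, health = simulate_medicaid()
naive = naive_slope(enrolled, health)            # contaminated by selection on sickness
iv = iv_slope(eligible, enrolled, health)        # uses only eligibility-driven sorting
print(f"naive estimate: {naive:.2f}, IV estimate: {iv:.2f}, truth: 1.00")
```

    In this made-up world the naive comparison makes Medicaid look harmful even though it helps, while the IV estimate lands near the true effect. That only works because the simulated eligibility rule has no path to health except through enrollment, which is exactly the assumption one has to defend in a real study.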

    I’m obviously glossing over the technical nitty-gritty of how one actually implements an IV estimation strategy. One can find that in the literature. Such approaches have been applied to the very question used as an example above: what’s the effect of Medicaid on health? I’ve already posted about them and explained a little bit about why and how they work. After that review, I concluded that studies that do not address selection into Medicaid on unobservables (i.e. do not employ an IV or other quasi-randomized design) are likely biased. If such studies show that Medicaid is associated with worse health outcomes than being uninsured, it would be a mistake to interpret that as a causal effect of Medicaid on health.

  • Treatment/control randomization is IV

      2 comments

    Just a quick follow-up to my prior post on IV. I used a (flawed) treatment/control randomization as an example. My guess is that many people are not aware that the randomization (the coin flip) is an instrumental variable. If this is not clear to you, I highly suggest reading Steve Pizer’s paper on this exact point. His (and my) hope is that it makes IV much more accessible to health services researchers.
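
    For readers who like to see it in numbers, here is a small hedged sketch (Python with NumPy, not taken from Steve’s paper). It uses the Wald form of the IV estimator for a binary instrument: the difference in mean outcomes by coin flip divided by the difference in treatment rates by coin flip. The sample size, the true effect of 2.0, the “severity” variable, and the crossover rule are all hypothetical choices of mine.

```python
import numpy as np

def wald_estimate(z, x, y):
    """IV (Wald) estimate: outcome gap by coin flip / treatment-rate gap by coin flip."""
    z = np.asarray(z, dtype=bool)
    outcome_gap = y[z].mean() - y[~z].mean()
    uptake_gap = x[z].mean() - x[~z].mean()
    return outcome_gap / uptake_gap

rng = np.random.default_rng(1)
n = 100_000
coin = rng.integers(0, 2, n)       # the randomization, i.e. the instrument
severity = rng.normal(size=n)      # unobserved; lowers the outcome

# Case 1: perfect compliance, so treatment received equals the coin flip.
x_perfect = coin.astype(float)
y_perfect = 2.0 * x_perfect - severity + rng.normal(size=n)
diff_means = y_perfect[x_perfect == 1].mean() - y_perfect[x_perfect == 0].mean()
wald_perfect = wald_estimate(coin, x_perfect, y_perfect)

# Case 2: sicker patients assigned to control get the treatment anyway.
crossover = (coin == 0) & (severity > 0.5)
x_leaky = np.where(crossover, 1.0, x_perfect)
y_leaky = 2.0 * x_leaky - severity + rng.normal(size=n)
naive_leaky = y_leaky[x_leaky == 1].mean() - y_leaky[x_leaky == 0].mean()
wald_leaky = wald_estimate(coin, x_leaky, y_leaky)

print(f"perfect compliance: diff in means {diff_means:.2f}, Wald {wald_perfect:.2f}")
print(f"with crossover:     as-treated {naive_leaky:.2f}, Wald {wald_leaky:.2f}")
```

    With perfect compliance the denominator is one, so the IV estimate is just the familiar treatment-minus-control difference in means. Once some patients cross over for reasons related to their health, the as-treated comparison drifts away from the truth, but the coin flip still works as an instrument (here under the simplifying assumption that the treatment effect is the same for everyone).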

  • Does repeated use of instruments reveal problems with IV?

      0 comments

    The NBER working paper by Randall Morck and Bernard Yeung, “Economics, History and Causation,” has received some attention in the blogosphere and in the comments on this blog. I’ve now read the paper. It’s a good one, and one I think even non-economists can (mostly) understand. I recommend you try if you’ve got a little interest in how economists think (or don’t) and work (or don’t).

    The key issue raised in the paper about instrumental variables (IV) is this, from the abstract: “[E]ach successful use of an instrument potentially creates an additional latent variable bias problem for all other uses of that instrument – past and future. [...] Useful instrumental variables are, we fear, going the way of the Atlantic cod.” I’ve made bold the key word, potentially.

    OK, what’s the problem? In the next three paragraphs I’ll say it in technical terms, though with a silly example. That still may not help some people, but I urge you not to throw up your hands and conclude IV is doomed. If you like, skip my unpacking of the problem and go right to the paragraph that begins, “About this, I agree with Alex Tabarrok.”

    Suppose one study exploits a seemingly good instrument Z to estimate a causal effect of X on Y1. That is, Z is an instrument for X and Y1 is the dependent variable. That means, among other things, that Z is correlated with Y1. For example, if the coin flip (Z) turns out heads, all the patients live, otherwise they die (treatment X is very, very important to the life/death outcome, Y1).

    Now imagine a future study that does the same thing: it uses the instrument Z to estimate a causal effect of something on a new dependent variable, Y2. That means Z is correlated with Y2. For example, it turns out that everyone who did not get the treatment (because the coin flip Z was tails) was also in the ICU, but nobody who did get treatment was (it turned out to be a very biased coin; who knew?). Imagine Y2 is an ICU stay indicator.

    If Y1 and Y2 are also correlated (which is not certain in general, but obviously is in the made up example), then the estimates of the first study are called into question if Y2 (ICU stay) was not included among the control variables. If Y2 was not a control variable in the first study, then it is an unobserved factor correlated with both the instrument and the dependent variable. That makes the instrument endogenous (omitted variable bias or incomplete risk adjustment), which exactly violates the assumptions of a good instrument. It invalidates the instrument. The study is flawed. That coin flip was really not as random as we thought!
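
    Here is one way to see that in a toy simulation (Python with NumPy). This is a sketch under assumptions I am making up, not anything from the Morck and Yeung paper: Z is continuous, Y2 is partly driven by Z and directly affects Y1, Y2 is not itself affected by X, and the true effect of X on Y1 is 1.0. The helper two_stage_least_squares is a bare-bones, hypothetical implementation that returns only the coefficient on X.

```python
import numpy as np

def two_stage_least_squares(y, x, z, controls=None):
    """Bare-bones 2SLS with one endogenous regressor x and one instrument z.

    Returns only the second-stage coefficient on x (no standard errors).
    """
    n = len(y)
    exog = np.ones((n, 1))                       # intercept
    if controls is not None:
        exog = np.column_stack([exog, controls])
    # First stage: regress x on the exogenous variables plus the instrument.
    first_design = np.column_stack([exog, z])
    first_coef, *_ = np.linalg.lstsq(first_design, x, rcond=None)
    x_hat = first_design @ first_coef
    # Second stage: regress y on the exogenous variables plus fitted x.
    second_design = np.column_stack([exog, x_hat])
    second_coef, *_ = np.linalg.lstsq(second_design, y, rcond=None)
    return second_coef[-1]

rng = np.random.default_rng(2)
n = 200_000
z = rng.normal(size=n)                           # the shared instrument
u = rng.normal(size=n)                           # unobserved confounder of x and y1
y2 = 0.8 * z + rng.normal(size=n)                # second study's outcome, tied to z
x = 1.0 * z + u + rng.normal(size=n)
y1 = 1.0 * x + 1.5 * y2 - 2.0 * u + rng.normal(size=n)

without_y2 = two_stage_least_squares(y1, x, z)               # y2 left out
with_y2 = two_stage_least_squares(y1, x, z, controls=y2)     # y2 added as a control
print(f"IV omitting Y2: {without_y2:.2f}, IV controlling for Y2: {with_y2:.2f}")
```

    When Y2 is left out, the instrument picks up Y2’s influence on Y1 and the estimate is well off the true value of 1.0. Adding Y2 as a control repairs it in this particular setup. That repair is not guaranteed in general; it depends on Y2 being a legitimate control, which is something I simply assumed here.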

    About this, I agree with Alex Tabarrok, who wrote, “I don’t see this as a fundamentally new problem or one specific to IVs.” I view this as the natural and welcome march of scientific progress. We discover all kinds of things that invalidate prior theory or call into question prior observation. One can always say, “Maybe the instrument is flawed!” (This applies to physics as well as economics, only more so in the latter.)

    Yet, if one is going to make a claim of instrument failure, one had better explain how. Well, a second study that shows the prior use of the instrument was flawed is even better. It’s not just a story of how the instrument could be flawed (in the first study), but a demonstration of exactly how it is. Still, it doesn’t prove that the results are wrong, at least qualitatively.

    What one should then do, if one wants to repair the reputation of the first study’s results, is redo it, addressing head on the issue raised by the second by including the omitted variable it suggests. That’s not a fundamental problem. Maybe the conclusions of the first study turn out to be qualitatively right anyway, and the omitted variable wasn’t a big deal.

    There is always doubt in science, and more so with observational studies than randomized experiments. But we need observational studies. Our life is based on them. We can’t randomize everything, nor should we. So, march on we must. When one result begins to look shaky we should revisit that area and try to do better. For some time an instrument may look potentially bad. It may turn out to be bad. Or maybe not. In time, the fisheries can be replenished.
