• In Defense of IO?

    In a comment on my review of Angrist’s and Pischke’s The Credibility Revolution in Empirical Economics, squidy wrote, “You may also want to check out the response to this paper by Liran Einav and Jon Levin. http://www.stanford.edu/~jdlevin/Papers/IO.pdf”. That paper is titled Empirical Industrial Organization: A Progress Report.

    In fact, a paper by the same authors with the same title is one I noted on this blog that I would print and read. I did so, though it was the NBER version, which seems to have disappeared from the web. I can’t be certain that the two versions of the paper are identical. However, both include the following passage, which actually seems to support Angrist’s and Pischke’s view of new empirical IO.

    In part because it is hard to find independent movement of many product prices, some of the most popular identification strategies rely on restrictions across equations in the demand system. One such approach is to use a product’s price in other markets as an instrumental variable, under the theory that cross-market correlation in the price of a given product, conditional on observed demand characteristics, will be due to common cost factors rather than unobserved features of demand. An alternative is to instrument for prices using the non-price characteristics of competing products, which proxy for the degree of competition.

    Neither is a perfect solution, so the source of price variation and its power has to be evaluated in each application.

    … Our own view is that many applications of these methods — while they are often very careful in clarifying the statistical conditions under which their identification strategy is valid — tend to be rather thin in explaining the precise source of identifying variation, in arguing why the required statistical condition is likely to hold, or in providing first stage regressions and other diagnostics.

    So, there seems to be rather broad agreement that IO may have gone a bit too far out on the limb of creativity in its acceptance of arguments of identification.
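
    To make the identification concern concrete, here is a minimal simulation sketch in Python. Everything in it is my own assumption rather than anything taken from Einav and Levin or from Angrist and Pischke: the linear demand curve, the true slope of -2, the shock variances, and names such as simulate and demand_shock_overlap are hypothetical. It uses the price in a second market as an instrument for the price in the first, once when demand shocks are independent across markets and once when they share a common component.

```python
import random

TRUE_SLOPE = -2.0  # hypothetical price coefficient in the demand curve


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def simulate(demand_shock_overlap, n=100_000, seed=0):
    """Return (OLS slope, IV slope, first-stage slope) for market 1."""
    rng = random.Random(seed)
    p1, p2, q1 = [], [], []
    for _ in range(n):
        cost = rng.gauss(0.0, 1.0)    # cost shock common to both markets
        shared = rng.gauss(0.0, 1.0)  # demand shock common to both markets
        u1 = demand_shock_overlap * shared + rng.gauss(0.0, 1.0)
        u2 = demand_shock_overlap * shared + rng.gauss(0.0, 1.0)
        price1 = cost + 0.5 * u1      # price responds to local demand
        price2 = cost + 0.5 * u2
        p1.append(price1)
        p2.append(price2)
        q1.append(TRUE_SLOPE * price1 + u1)
    ols = cov(p1, q1) / cov(p1, p1)
    iv = cov(p2, q1) / cov(p2, p1)    # one-instrument IV (Wald) estimator
    first_stage = cov(p2, p1) / cov(p2, p2)
    return ols, iv, first_stage


# Case 1: cross-market price correlation comes only from costs.
ols_valid, iv_valid, fs_valid = simulate(demand_shock_overlap=0.0)
# Case 2: demand shocks are also shared across markets.
ols_invalid, iv_invalid, fs_invalid = simulate(demand_shock_overlap=1.0)

print("independent demand shocks:", ols_valid, iv_valid, fs_valid)
print("shared demand shocks:     ", ols_invalid, iv_invalid, fs_invalid)
```

    Under these assumptions the instrument should land near the true slope only in the first case. In the second it drifts toward the biased OLS answer, which is exactly the "required statistical condition" that the quoted passage says authors too rarely defend. The first-stage slope is the kind of diagnostic they ask to see reported.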

    Having said that, Einav and Levin do offer what I think is a worthwhile critique of Angrist and Pischke, and it gets to the heart of the real debate while abstracting from IO. There is a fundamental disagreement about the role of theoretical versus empirical evidence in applied economics. Where data are thin, one can make progress by leveraging theory. Where data are abundant, theory is still important but one requires somewhat fewer assumptions to obtain a result. How much credibility does work relying on a greater number of assumptions deserve? Well, this is of course debatable, and it matters a great deal what those assumptions are and the extent to which they comport with prior beliefs and methodological conventions.

    The bottom line, however, is that the use of economic theory and the search for compelling sources of identifying variation are not enemies. Indeed, we hope to have conveyed that the applied work we often find most exciting relies on careful measurement based on data with good underlying variation, but then continues by framing the empirical exercise in terms of a coherent economic model. The model can then provide a way to think about the operation of the industry and potentially to draw conclusions about policy or general principles.

    To the extent that we have a concern about the current state of industrial organization research, it is that there is not sufficient emphasis on this kind of applications, relative to, say, expanding the set of econometric methods. Of course, better methods are valuable, provided they eventually get used in compelling ways and do not become an end in themselves. If we return again to the demand estimation literature, it is possible that one reason researchers have been willing to tolerate less than ideal price variation is that in some cases the main contribution is not the estimated price elasticities per se but the econometric method, which can be applied more broadly. While this is not terribly objectionable, it is important that the field at large strikes a balance between building tools and using them convincingly. Whether the field has tipped too far is debatable, but the fact that one might engage in a serious debate suggests some grounds for concern.

    With that last sentence, one senses that Einav and Levin really do not wish to argue too strongly against Angrist and Pischke. Both sets of authors have valid perspectives. Research design (a focus on sound reasoning for claims of identification of causal effects) and the exploitation of theoretical models are not mutually exclusive. In fact, both are necessary and reinforcing. In many cases they amount to the same thing. While there’s room for debate over whether the reasoning and modeling are convincing in one application or another (or even in a whole body of work), I think few applied economists would reject the importance of design or theory.

    What is most important for the practitioner and student is the understanding of the role of both. By raising this debate and writing clearly and accessibly about it, Angrist, Pischke, Einav, and Levin all deserve credit. And their papers ought to be widely read and discussed.

    Note: Now both papers seem to have been removed from the NBER website. I suspect that site itself is having difficulties. Perhaps they’ll be back up later.

  • What Took the “Con” out of “Econometrics”?

    This is an uncharacteristically but justifiably long post on some fairly technical aspects of applied economics. If it isn’t for you, skip it and try the next one.

    A March 2010 NBER paper by Joshua Angrist and Jörn-Steffen Pischke may be the best economics paper I’ve ever read. The Credibility Revolution in Empirical Economics reviews the extent to which greater emphasis on research design in microeconomics studies has dramatically increased the credibility of empirical work. That credibility was called into question by Edward Leamer in a 1983 American Economic Review paper titled Let’s Take the Con Out of Econometrics. In it Leamer lamented, “[H]ardly anyone takes anyone else’s data analysis seriously.” (Leamer’s paper is itself a fun and worthwhile read.)

    Leamer and others in the early 1980s were distressed by the lack of testing of implications of assumptions in specification and functional form of econometric models. His proposed solution was to analyze the changes in results based on model variations (sensitivity analysis). Angrist and Pischke make a strong case that Leamer was correct in his diagnosis but not necessarily in his prescription. They argue that the “credibility revolution” experienced in empirical microeconomics since Leamer’s critique is due principally to a greater focus on research design, not on sensitivity analysis.

    A “research design” is a characterization of the logic that connects the data to the causal inferences the researcher asserts they support. It is essentially an argument as to why someone ought to believe the results. It addresses all reasonable concerns pertaining to such issues as selection bias, reverse causation, and omitted variables bias. In the case of a randomized controlled trial with no significant contamination of or attrition from the treatment or control group, there is little room for doubt about the causal effects of treatment, so hardly any argument is necessary. But in the case of a natural experiment or an observational study, causal inferences must be supported with substantial justification of how they are identified. Essentially one must explain how a random experiment effectively exists where no one explicitly created one.

    I view their paper in three parts, one on improvements in research design in microeconomics (which I understand well), one on the degree to which industrial organization (IO) has harnessed those improvements (an area about which I’m learning), and one on macroeconomics (which I am not qualified to judge). Taking them in reverse order, let’s begin with macro. Angrist and Pischke essentially characterize it as too little data and design chasing far too much theory. Whether that is fair or not I will leave to others to evaluate (paging macro bloggers). Nevertheless, they point to “[s]ome rays of sunlight pok[ing] through the grey clouds” of macro and continue by summarizing a few design-based macro studies (quotations © 2010 by Joshua Angrist and Jörn-Steffen Pischke).

    If their critique of macro is strong, their attack on IO is withering.

    The dominant paradigm for merger analysis in modern academic studies, sometimes called the “new empirical industrial organization,” is an elaborate exercise consisting of three steps: The first estimates a demand system for the product in question … Next, researchers postulate a model of market conduct …  Finally, industry behavior is simulated with and without the merger of interest….

    [T]his elaborate superstructure should be of concern. The postulated demand system implicitly imposes restrictions on substitution patterns and other aspects of consumer behavior about which we have little reason to feel strongly. The validity of the instrumental variables used to identify demand equations—prices in other markets—turns on independence assumptions across markets that seem arbitrary. The simulation step typically focuses on a single channel by which mergers affect prices—the reduction in the number of competitors—when at least in theory a merger can lead to other effects like cost reductions that make competition tougher between remaining producers. In this framework, it’s hard to see precisely which features of the data drive the ultimate results.

    Angrist and Pischke ask whether characteristics of simulated mergers based on this new empirical IO framework match those from other credible design-based merger studies. Their answer based on a survey of comparisons to date is that the evidence is mixed, which in their view diminishes the credibility of the new empirical IO approach.

    Finally, turning to domains of empirical microeconomics in which a focus on design has been most prominent, Angrist and Pischke make some superb points. Among them is the notion that the gold standard of the randomized experiment is not without deficiencies. Such experiments are “time consuming, expensive, and may not always be practical.” To this I would add that they are also not always decisive. Even the RAND health insurance experiment (HIE) has been critiqued (and defended). That is not to suggest that it is certainly flawed (or certainly perfect); it is merely to say that variations in interpretation exist for results of randomized experiments just as they do for non-experimental studies.

    Indeed, Angrist and Pischke (and I) agree with Leamer that “randomized experiments differ only in degree from nonexperimental evaluations of causal effects.” The authors add that “a well-done observational study can be more credible and persuasive than a poorly executed randomized trial.” It is for this and the other foregoing features of randomized experiments that I believe the half-billion dollars or so that some advocate spending on another RAND HIE would arguably be better spent funding well-conceived observational or natural experiment-based studies. (A half-billion dollars could fund on the order of 1,000 observational studies.)

    In perhaps the clearest possible example of why Leamer’s suggested remedy for empirical economics–sensitivity analysis–was not how it regained its credibility, Angrist and Pischke summarize a 1997 American Economic Review paper by Sala-i-Martin that reported results of two million variations of regression analysis. (The paper is titled I Just Ran Two Million Regressions.) The author chose three fixed control variables and selected three others at random from a set of nearly 60. He obtained some “wonderfully robust” predictors, but Angrist and Pischke are not impressed.

    Are these the right controls? Are six controls enough? How are we to understand sources of variation in one variable when the effects of three others, arbitrarily chosen, are partialed out? Wide-net searches of this kind offer little basis for a causal interpretation.

    For all that, sensitivity analysis does have a place in the canon of empirical technique. Angrist and Pischke may be correct that it is a focus on design and not more sensitivity analysis that deserves the lion’s share of credit for distinguishing econometrics from whimsical alchemy. However, once one is working within a framework of sound design, sensitivity analysis is an important check on the robustness of results. Therefore, Leamer’s advice is valid as an enhancement to, not a substitute for, good design. And that may, in fact, be the sense in which he meant it. That is certainly the sense in which it ought to be interpreted today.

    In conclusion, Angrist’s and Pischke’s paper is an excellent review of issues pertaining to causal inference. It cites and summarizes a substantial body of high-quality work in numerous applied economics domains. And it makes a compelling case for how attention to elements of research design has taken the “con” out of “econometrics.” If you’re a student or practitioner of applied economics, consider reading the whole thing. As long as this post is, it hardly does it justice.

  • Best of xkcd: Correlation

    (Terms of use.)

    Regular readers will recall my many posts on correlation and causation.

  • Causal Speculation: A Meditation on Theory

    In a post late last year on correlation and causality I touched on the role theory plays in making causal inferences. Specifically, I indicated that a causal inference cannot be made from the analysis of data from an observational study without a theoretical model. If conclusions of causality rely on theory, where does the theory come from? Fundamentally, how do we know when something causes another?

    Before addressing those questions, let’s clarify the issues with a hypothetical example. Suppose physicians begin to notice that males who eat a particular exotic beetle of Zimbabwe also have no hair loss, independent of age. There are precisely four possibilities:

    1. the observations are coincidental,
    2. consumption of such beetles (causally) prevents hair loss,
    3. lack of hair loss (causally) leads to consumption of the beetles,
    4. beetle consumption and lack of hair loss are both jointly caused by something else.

    Each of these possibilities is essentially a theory about the world, and each implies something about the causal relationship (or lack thereof) between beetle consumption and hair loss. Theories 1-3 are relatively simple since the causality implied only runs at most one way. Theory 4 is the source of many problems. Even when there is strong evidence in support of theories 2 or 3, one can never fully rule out theory 4. “Something else” could be anything.

    That’s essentially why causal inferences cannot be made from correlations (or statistical analysis) alone. One needs to put a fence around the problem and assert that only the factors one has considered, measured, and included in the analysis are relevant, that there is no “something else” left out. With that assertion one can make causal inferences from statistical models and correlations (assuming the correct application of appropriate technique).

    Where does this assertion that all relevant factors have been considered come from? Its origin is outside the data, outside the analysis. It is extra-empirical. Put simply, it is theory, a hypothesis about the nature of causality in the world that can be rejected, but never fully confirmed by the data. Without it no causal inference can be made no matter the quality of the data or what is done with it.

    Where do causal theories–the fences around problems–come from? Why do we believe that x causes y and not vice versa or that some other factor z causes both? These questions are puzzling because all our experience is empirical yet theory stands outside the data.

    Perhaps theory comes from extrapolation from the subset of our experience that is exactly like or darn near a randomized trial (either explicitly so or due to a natural experiment that makes it close enough)? If the “cause” seems random we’re comfortable inferring that it is responsible for much of what seems to “result.” No doubt this is hard-wired into our brains, a consequence of evolution. For example, “See lion eat chief. See lion again, run!”

    But such causal inferences are formed quickly and can easily be wrong. Moreover, we frequently hold mutually exclusive causal ideas in our brains at the same time. The role of theory is to force us to organize our causal ideas, to be explicit, and to iron out logical inconsistencies. Then we go to the data to test the theory.

    It is tempting to believe that the world can be understood from data alone, that if x causes y we should not need theory to tell us so. Evidently that is not the case, at least insofar as observational studies are concerned. Observational methods comprise a great deal of science and, in far less rigorous form, most of our experience. This leads to a version of the anthropic principle: we can’t exist without theory (nor, I assert, could many animals). A world in which humans don’t rely on theory would be one in which humans, as we know them, do not exist.

  • Causation and Evolution

    James Bronzan sent me a follow-up e-mail to my post based on his earlier comment. He agreed to let me quote it:

    But again, it still seems to me that there’s likely a reason that we look for instances of causation to be associated with instances of correlation precisely because they are.  It turned out to be a good evolutionary strategy to do this — we adapted to the way the world is.  (Grossly coarse example: it was useful to go beyond “man, every time someone happens to eat that berry, he or she dies,” to “perhaps eating that berry causes death.”)…

    Of course it doesn’t require that they be this way, or indeed that there aren’t other effects of causation that we don’t detect whatsoever.  But I’d posit that those effects didn’t turn out to be useful in decision making linked to survival, which suggests (to me, anyway) that the causation-correlation link is more prevalent or stronger.

    I agree with James’ perspective to a point. I might even be caught making the same or similar arguments some day. But today I’ve decided to differ, if only slightly, for entertainment value. Plus I’d like to think I add an important nuance or two in what follows.

    We actually can’t say that we are reacting to correlation when we intuit causation. Correlation is such a precise measure and, therefore, is only meaningful when talking about statistical analysis. I don’t believe our berry-eating ancestors were running regressions in their heads. Few of us do that today.

    When it comes to statistical analysis and quantitative studies, one can speak precisely about correlation and causation. Linear models and their Gaussian assumptions reduce everything to correlations. There’s nothing left. The tools are blind to all else, though they’re very helpful when we have a theory upon which we choose to rely. Even in cases where models are not linear and assumptions of Gaussianity are relaxed, our intuition (by which I mean that of researchers) is based on the linear/Gaussian/causal world, so some of its distortions remain.

    But back to the non-research, intuitive world we inhabit. On what do we base our causal inferences? It is correlations (maybe) but could be more or less than that. It is some vague interpretation of sensory data, I know not what. Yes it is evolved and therefore is (or was, rather) of great utility for reproduction.

    Is it of great utility today and for other purposes? Yes, but it also leads to errors, a subset of which we notice. But very few cases in which it is applied casually are in areas relevant to reproduction. So there is this muscle we use in domains beyond that in which it was strengthened by evolution.

    What can we say about the degree to which correlation (or whatever we do) is associated with causation in such domains? I think strictly speaking not much. We have only our bias and intuition. Beyond that I’m willing to say, “I don’t know.” Not everyone is comfortable with giving the unknown that much scope. It isn’t necessary that we do. But very often it is important if we do. Noticing this causality bias really can be an eye opener. Though if one goes too far it becomes hard to know anything.

  • Reader Response: Causation Bias

    In a thought-provoking comment on my post Causation without Correlation is Possible, James Bronzan wrote:

    To my uneducated eye, I’d say that even though correlation does not mean causation and causation does not mean correlation, they nonetheless travel very closely together. In other words: the instances in which causation results in correlation are far more frequent in the world than those in which it does not. (Maybe that’s why the examples had so much mathyness?)

    Surely if X causes Y instances of X are more likely to hang around with instances of Y than not.

    There’s a lot going on here. Let me try to unpack it. First of all, causation itself and any claim of it are theories or mental models, if you prefer that terminology. (A subsequent post will go more into this point.) However, causation and causal thinking have proven to be incredibly useful. No doubt causal thinking is an evolved modality of thought. Even if the world is not objectively causal in any sense, I’m not ready to abandon causality. Let us say, if only as shorthand, causality exists and is ubiquitous. (As an exercise try for a moment to imagine any time and place in the universe where causality ceases. Good luck.)

    I assert that there are far more things that are related by correlation than by causality. To build on an example from the prior post, among children, reading comprehension is correlated with, among other things, shoe size, mathematical ability, height, weight, and age. Yet only one of those plays a causal role (or so a reasonable person can believe). If causation exists and is ubiquitous, what do we say about correlation? It is hyper-ubiquitous. It is over-abundant. There is far too much of it to be useful. If we based inferences on correlation the universe would be over-determined. There’s just far too much of it.

    That’s related to the fact that correlation is not as useful as we’d like to think. Or, rather, it is a very blunt tool, especially for causal inference. Using it is like trying to catch water with bare hands. It leaks out all over the place unless one carefully plugs all the holes. That’s not so easy to do, but it isn’t impossible to do a credible job. Sometimes we can hold just enough water to get a drink.

    Very often we think correlation is carrying water when it is not. The conclusions of many studies on many subjects and much of what is believed in general (I can’t say how much) about many things are based on very casual causal inferences from correlations. I’d say we have a bias to think this way. It’s a sub-type of confirmation bias. We so adore our causal theories that we search for and believe correlations that support them. Sometimes we learn later how wrong we were. The history of science is full of such stories (flat Earth, the Ptolemaic model, bloodletting, and more recently, though less profound, arthroscopic knee surgery for arthritis, among many others).

    But James’ point isn’t that correlation very often implies causation. Rather, his point is that if X really does cause Y, it is far more likely that X and Y are correlated than not. It certainly seems that way. But an honest look at this issue has to account for the bias in our minds and in our tools. We have very good tools for discovering correlation and certain other measures of relatedness. There could very well be (in fact must be) a class of causal phenomena that escape the detection of those tools. That is to say, our minds and tools are biased in favor of contemplation, detection, and study of evidence of correlation in support of causal inference. It is tempting to conclude that causation and correlation are frequently or tightly associated. But I don’t know how one could substantiate such a claim.

    Nevertheless, correlations are useful in the study of causal phenomena. Though they do not by themselves confirm or reject causal assertions they do measure degree of relatedness. That is to say, if we take as given X causes Y, the next question is “to what extent?” Correlation provides an answer, though an incomplete one (being only one statistic).

    So, to address James’ point directly, I think we do find instances of causation to be associated with correlation. But that’s because that’s what we look for and that’s what we can see. The brightness of the street lamp tells us nothing about the extent of the universe it illuminates.

  • Causation without Correlation is Possible

    It is well known that correlation does not prove causation. What is less well known is that causation can exist when correlation is zero. The upshot of these two facts is that, in general and without additional information, correlation reveals literally nothing about causation. It is neither necessary nor sufficient for it.

    Correlation without causation. My favorite hypothetical example of this is a study of thousands of middle and high school kids. The poorly informed investigators measure shoe size and reading comprehension scores. They find that the two are positively correlated. Their manuscript claiming that larger feet cause better reading skills is rejected, of course. Foot size does not cause better reading skills despite the correlation of the two.

    Two elements are missing from this study. One is the measurement of age, which is related to both foot size and reading comprehension. The other missing element is a conceptual or theoretical model that provides a basis for causal interpretations of the relationships between age and foot size and between age and reading comprehension. Getting older is correlated with both and we say it is the cause of both because we have a plausible conceptual model of human development that is consistent with such an interpretation.
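
    A minimal simulation sketch of this hypothetical study follows, in Python. All of the numbers are invented for illustration (integer ages 11 to 18, the slopes, and the noise levels are my assumptions, not data). Age drives both shoe size and reading score, and neither affects the other.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
ages, shoes, reading = [], [], []
for _ in range(40_000):
    age = rng.randint(11, 18)                          # hypothetical age range
    ages.append(age)
    shoes.append(0.5 * age + rng.gauss(0.0, 1.0))      # depends on age only
    reading.append(5.0 * age + rng.gauss(0.0, 10.0))   # depends on age only

# Pooled over all ages, the two outcomes move together.
corr_pooled = pearson(shoes, reading)

# Among students of a single age, the relationship should vanish.
same_age = [(s, r) for a, s, r in zip(ages, shoes, reading) if a == 14]
corr_age_14 = pearson([s for s, _ in same_age], [r for _, r in same_age])

print("pooled correlation:", corr_pooled)
print("correlation among 14-year-olds:", corr_age_14)
```

    Holding age fixed is the statistical half of the fix. The other half, deciding that age is the thing to hold fixed, comes from the conceptual model and not from the data.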

    Causation without correlation. It is a common misconception that correlation is required for causation. Let’s start with a simple example that reveals this to be a fallacy. Suppose the value of y is known to be caused by x. The true relationship between x and y is mediated by another factor, call it A, that takes values of +1 or -1 with equal probability. The true process relating x to y is y = Ax.

    It is a simple matter to show that the correlation between x and y is zero. Perhaps the most intuitive way is to imagine many samples (observations) of (x, y) pairs. Over the sub-sample for which the pairs have the same sign (i.e. for which A happened to be +1) y=x and the correlation is 1. Over the sub-sample for which the pairs have opposite signs (i.e. for which A happened to be -1) y=-x and the correlation is -1. Since A is +1 or -1 with equal probability, the contributions to the total correlation from the two sub-samples cancel, giving a total correlation of zero.

    Since x really does have a causal role in determining the value of y we see that causation can exist without correlation. This result hinges on the precise definition of correlation. It is a specific statistic and reveals only a little bit about how x and y relate. Specifically, if x and y have zero mean and unit variance (which we can assume without loss of generality), correlation is the expected value of their product. That single number can’t possibly tell us everything about how x might relate to y. If we didn’t know the true process y=Ax and the statistics of A in advance we might be tempted to say that x cannot cause y due to a lack of correlation. That would be an incorrect conclusion. Correlation and our lack of understanding of it would be misleading us.
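
    For readers who prefer the algebra, the zero correlation can be written out in a few lines. This is a sketch under one assumption the example leaves implicit, namely that A is drawn independently of x. It also takes x to have zero mean and unit variance, as above.

```latex
\begin{align*}
E[A] &= \tfrac{1}{2}(+1) + \tfrac{1}{2}(-1) = 0, \qquad A^2 = 1, \\
E[y] &= E[Ax] = E[A]\,E[x] = 0, \\
\operatorname{Var}(y) &= E[A^2 x^2] = E[x^2] = 1, \\
\operatorname{corr}(x, y) &= E[xy] = E[A x^2] = E[A]\,E[x^2] = 0 .
\end{align*}
```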

    But there are other statistics to consider. In the example above x and y are uncorrelated but their magnitudes are not. That is, there are functions of x and functions of y that are correlated. This must be so because the two relate to each other (causally) somehow. In general, evidence consistent with the causal relationship is found in the probability density of y conditioned on x. If x causes y then that conditional probability, p(y|x), must be a function of (vary with) x. It is possible for p(y|x) to depend on x yet for the correlation of x and y to be zero. But causation cannot exist if p(y|x) is independent of x. Or, put even more simply, though x and y can be both uncorrelated and causally related, they cannot be statistically independent and causally related.
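
    The y = Ax example is easy to check numerically. Here is a short Python sketch. The standard normal distribution for x, the sample size, and the seed are my assumptions, and A is drawn independently of x. It compares the correlation of x and y with the correlation of their magnitudes.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
n = 100_000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]          # assumed distribution of x
a = [rng.choice((-1.0, 1.0)) for _ in range(n)]      # A is +1 or -1, equally likely
y = [a_i * x_i for a_i, x_i in zip(a, x)]            # the true process y = A x

corr_xy = pearson(x, y)                              # should be close to zero
corr_magnitudes = pearson([abs(v) for v in x],       # |y| equals |x| exactly,
                          [abs(v) for v in y])       # so this should be one

print("corr(x, y)     =", corr_xy)
print("corr(|x|, |y|) =", corr_magnitudes)
```

    The magnitudes are perfectly correlated because |y| = |x| for every draw, even though the signs are scrambled. That is one concrete instance of p(y|x) depending on x while the correlation of x and y is zero.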

    Advanced example. (This is a bit more advanced so some readers may wish to skip it.) I’ll close with a nice real-world-like example offered by my colleague Steve Pizer. Suppose we have good theoretical reasons to believe that illness causes death. Let

    y = death (1 if dead, 0 if alive),
    x = illness (1 if sick, 0 if not),
    t = administration of treatment (1 if treated, 0 if not),
    e = other unobservable factors (could be anything).

    The true (hypothetical!) model of death is y = (1-t)x + e. That is, if an individual is ill (x=1) and doesn’t get treatment (t=0) they would surely die apart from the effects of other factors denoted by e. On the other hand, sick individuals who do get treated live, again ignoring e. Assume the correlation of t and x is very high (like 0.99). That is, nearly everyone who is ill gets treatment and almost nobody who is not ill does. Therefore, hardly anyone who contracts the illness actually dies from it.

    If we estimate this model without observing t, we would find that illness and death are nearly uncorrelated. Such a finding might tempt us to question our theory that illness causes death. This would be a mistake because we’ve omitted an important factor, treatment t, from the analysis. However, if we can observe t, then the high but imperfect correlation between t and x might make it possible to estimate the true effect of illness on death, using appropriate econometric techniques. We might therefore learn the degree to which illness (untreated) causes death, consistent with our theory.
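
    Here is a minimal Python sketch of this example, under assumptions of my own: half the population is ill, treatment status disagrees with illness 0.5% of the time (which gives a correlation of about 0.99), and e is Gaussian noise, so y is a continuous stand-in for death and not a strict 0/1 outcome. A simple comparison of group means within treatment status stands in for the "appropriate econometric techniques"; it is not Steve's method.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1)
x, t, y = [], [], []
for _ in range(200_000):
    ill = 1 if rng.random() < 0.5 else 0                   # assumed prevalence
    treated = ill if rng.random() >= 0.005 else 1 - ill    # rare mismatches
    noise = rng.gauss(0.0, 0.5)                            # the unobserved e
    x.append(ill)
    t.append(treated)
    y.append((1 - treated) * ill + noise)                  # y = (1 - t) x + e

corr_tx = pearson(t, x)   # about 0.99 by construction
corr_xy = pearson(x, y)   # what we see when t is ignored


def mean_y(ill, treated):
    """Average outcome in one (illness, treatment) group."""
    group = [yi for xi, ti, yi in zip(x, t, y) if xi == ill and ti == treated]
    return sum(group) / len(group)


# Once t is observed, compare ill and healthy people within each treatment status.
effect_untreated = mean_y(1, 0) - mean_y(0, 0)   # should be near 1
effect_treated = mean_y(1, 1) - mean_y(0, 1)     # should be near 0

print("corr(t, x) =", corr_tx)
print("corr(x, y) =", corr_xy)
print("effect of illness, untreated:", effect_untreated)
print("effect of illness, treated:  ", effect_treated)
```

    The untreated comparison rests on the few ill people who went without treatment, which is why the correlation between t and x has to be imperfect for this to work at all.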

    The foregoing is an illustration of the type of incorrect conclusions that can result from improper analysis of observational study data (as opposed to a randomized trial). Steve has written a very handy tutorial paper [pdf] on this topic, which I recommend highly to anyone working on observational studies or wishing to better understand them. Additional exploration of the econometric issues is provided in the Background (Section 2) and Set-up (Section 3.1) of a recent NBER paper by Millimet and Tchernis.

    Later: For more on this topic, see my follow-up posts.
