Fixed vs. random effects. (Or, how to have fun at a party.)

Here’s a surefire way to have fun at a party: invite an econometrician and a biostatistician. After a few drinks have been served, casually drop, “I could never decide whether fixed or random effects are better. Thoughts?” Make sure you have plenty of popcorn.

OK, listening to the econometrician rant about the bias of random effects (RE) and the biostatistician fret about the inefficiency of fixed effects (FE) might not be your kind of fun. I mean, it’s not “death panels” debate fun.

Nevertheless, it’s a real debate, one that has been repeated many, many times, probably with about the same lack-of-consensus outcome driven by very similar considerations. (The within-party variation in the debate is a lot higher than the between-party variation. Har har.)

Two recent papers offer some insight into how to choose. Though the papers address other considerations (e.g., how one might want to use the model for prediction or whether one needs to assess the effects of covariates that only vary between units), I will only consider the bias-precision trade-off here. Both papers use similar simulation techniques to evaluate the bias and mean squared error (MSE) of FE and RE, and both come to the conclusion that a Hausman test, sadly, is frequently not helpful. But all is not lost.

In a May 2015 article in Political Science Research and Methods [ungated here], Clark and Linzer conclude with the following rules of thumb:

[CL1] When variation in the independent variable is primarily within units—that is, the units are relatively similar to one another on average—the choice of random versus fixed effects only matters at extremely high levels of correlation between the independent variable and the unit effects.

This is intuitive because FE only exploits within-unit variation and RE relies on both within- and between-unit variation. When variation is largely within units, they’re both largely driven by the same thing. This takes RE’s greater efficiency off the table as a first-order concern. But one might still worry about bias when the independent variable is highly correlated with unit effects, which are effectively “unobserved” by the RE estimator.


[CL2] [w]hen the independent variable exhibits only minimal within-unit variation, or is sluggish, there is a more nuanced set of considerations. In any particular dataset, the random-effects model will tend to produce superior estimates of β when there are few units or observations per unit, and when the correlation between the independent variable and unit effects is relatively low. Otherwise, the fixed-effects model may be preferable, as the random-effects model does not induce sufficiently high variance reduction to offset its increase in bias.

The intuition here is that when within-unit variation is low and there are few observations, an estimator that is driven only by it (FE) will exhibit high imprecision. RE supplements within-unit variation with between-unit variation, increasing precision. Yay! That’s all fine, until endogeneity concerns start to dominate (high enough correlation between the independent variable and unit effects), at which point FE may start to look worthwhile again.
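To make the trade-off in CL1 and CL2 concrete, here is a minimal simulation sketch. It is not the code from either paper. It assumes a balanced panel with one regressor, y_it = β·x_it + α_i + ε_it, with unit-variance normal draws throughout. The function names and the parameters rho (the population correlation between the unit-level part of x and α_i) and within_share (the fraction of x's variance that is within units) are my own hypothetical choices. FE is the usual within estimator. RE is a feasible GLS estimator with variance components backed out of the within and between regressions, which is one common recipe, not the only one.

```python
import numpy as np


def simulate_panel(n_units, n_periods, rho, within_share, beta=1.0, rng=None):
    """Draw a balanced panel y_it = beta * x_it + alpha_i + eps_it (illustrative setup)."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.standard_normal(n_units)  # unit effects
    # Unit-level part of x, correlated with alpha at (population) level rho.
    x_unit = rho * alpha + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_units)
    x_time = rng.standard_normal((n_units, n_periods))  # within-unit part of x
    x = np.sqrt(1.0 - within_share) * x_unit[:, None] + np.sqrt(within_share) * x_time
    y = beta * x + alpha[:, None] + rng.standard_normal((n_units, n_periods))
    return x, y


def fe_slope(x, y):
    """Within (fixed-effects) estimator: OLS on unit-demeaned data."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd**2).sum()


def re_slope(x, y):
    """Feasible GLS random-effects estimator via quasi-demeaning (balanced panel only)."""
    n, t = x.shape
    xbar = x.mean(axis=1, keepdims=True)
    ybar = y.mean(axis=1, keepdims=True)
    xd, yd = x - xbar, y - ybar
    b_fe = (xd * yd).sum() / (xd**2).sum()
    # Idiosyncratic variance from the within-regression residuals.
    sigma_e2 = ((yd - b_fe * xd) ** 2).sum() / (n * (t - 1) - 1)
    # Between regression (unit means on unit means) gives sigma_a2 + sigma_e2 / t.
    xb = xbar.ravel() - xbar.mean()
    yb = ybar.ravel() - ybar.mean()
    b_between = (xb * yb).sum() / (xb**2).sum()
    s_between2 = ((yb - b_between * xb) ** 2).sum() / (n - 2)
    sigma_a2 = max(s_between2 - sigma_e2 / t, 0.0)  # clip negative estimates at zero
    # theta = 0 reproduces pooled OLS; theta = 1 reproduces FE.
    theta = 1.0 - np.sqrt(sigma_e2 / (sigma_e2 + t * sigma_a2))
    xs = x - theta * xbar
    ys = y - theta * ybar
    xs = xs - xs.mean()  # removing the grand mean absorbs the intercept
    ys = ys - ys.mean()
    return (xs * ys).sum() / (xs**2).sum()


def compare(n_units, n_periods, rho, within_share, n_sims=500, beta=1.0, seed=0):
    """Monte Carlo bias and RMSE of the FE and RE slope estimates."""
    rng = np.random.default_rng(seed)
    errors = np.empty((n_sims, 2))
    for s in range(n_sims):
        x, y = simulate_panel(n_units, n_periods, rho, within_share, beta, rng)
        errors[s] = (fe_slope(x, y) - beta, re_slope(x, y) - beta)
    bias = errors.mean(axis=0)
    rmse = np.sqrt((errors**2).mean(axis=0))
    return {
        "bias": {"FE": bias[0], "RE": bias[1]},
        "rmse": {"FE": rmse[0], "RE": rmse[1]},
    }


if __name__ == "__main__":
    # Sluggish x, few observations, no correlation with the unit effects.
    print(compare(20, 5, rho=0.0, within_share=0.1))
    # Sluggish x, many observations, strong correlation with the unit effects.
    print(compare(200, 10, rho=0.8, within_share=0.1))
```

Varying rho, within_share, and the panel dimensions should trace out the pattern described above: with a sluggish regressor, a small panel, and little correlation, RE tends to have the lower RMSE, and once the correlation is high and the sample is large, FE tends to win.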

Those rules of thumb aren’t very specific. What does “low” or “high” variation or correlation mean? The specific simulation results, as expressed in handy charts in the paper, help answer that question. I’ll leave it to the interested reader to take a look.

In PLOS One last October, Dieleman and Templin published a similar paper using similar methods. In addition to FE and RE, they also included simulation analysis of a “within-between” (WB) estimator, which I will leave to you to read about. Here are their rules of thumb, as they apply to FE vs. RE:

[DT1] Another unique scenario when the RE estimator is consistently MSE-preferred and should be considered is for small samples that have relatively small within-group variation for the variable of interest. Again, in these cases, the imprecision of the FE and WB estimators might be more caustic than the RE estimator’s bias. In simulation cases with less than 500 observations and within-group variation less than 20% of the total variation, RE estimation leads to a smaller absolute error 53% of the time.

This is the same advice as CL2, above. Sticking to consideration of small samples, the authors write that they

[DT2] mark the circumstances under which a practitioner might consistently choose precision over bias. […] One scenario is when the estimated model explains a very small portion of the variation in the outcome measurement. When small sample size is combined with a poorly-fit model, the imprecision of FE and WB estimation tends to mislead the researcher more than the bias of RE estimation, even at large ρ [correlation between the independent variable and unit effects]. The goodness-of-fit [] can be explored by examining the R2 statistic associated with [FE] estimation. Considering only simulations with R2<0.5 and less than 500 observations, the traditional RE estimator had a smaller absolute error than the FE estimator 57% of the time.

This seems to offer a reason to prefer RE, even when endogeneity concerns might seem to be high (large ρ). I haven’t figured out the intuition on this one. Why would failure to explain a considerable amount of variation (R2<0.5) make bias (endogeneity) a lesser concern?

So much for small sample sizes. What about larger ones?

[DT3] [A]s a general rule, the larger the sample size, the more a practitioner should avoid traditional RE estimation. Applying FE estimation on all simulated samples with greater than 500 observations led to a median absolute error of 4% of the true marginal effect. RE estimation led to a median absolute error of 8% of the true marginal effect. In simulations with more than 1,000 observations, RE estimation was only MSE-preferred beyond a trivial threshold (0.005) in a very few cases where 90% of variation of y could not be explained by the model.

This may be a safe general rule but CL1 indicates circumstances when RE estimation is just fine, even at large sample sizes. Again, turn to the paper for specifics.

Putting all this together, the statistics one should examine to make an FE vs. RE decision include:

  • Sample size in general and number of units and number of observations within units in particular
  • Correlation coefficient between independent variable and unit effects. (Clark and Linzer are explicit that it’s the correlation between unit means of the independent variable and unit effects one should consider, but I think they’re only considering balanced panel data, for which this would be the same thing as the correlation between the independent variable and unit effects.)
  • The proportion of the variance in the independent variable that is within units as opposed to between units.
  • The FE R-squared.
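Here is a rough sketch of how one might compute the statistics in the list above for a single-regressor panel. It rests on several assumptions of mine. The function name panel_diagnostics and its output keys are hypothetical. "FE R-squared" is taken to mean the within R-squared (fit after removing unit means), which may not be exactly what Dieleman and Templin report. And since the true unit effects are never observed, the correlation uses the estimated unit intercepts from the FE fit, which is a noisy proxy at best.

```python
import numpy as np


def panel_diagnostics(unit, x, y):
    """Summary statistics for an FE vs. RE decision (one regressor, panel may be unbalanced)."""
    unit = np.asarray(unit).ravel()
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    # idx maps each observation to its unit; counts holds observations per unit.
    _, idx, counts = np.unique(unit, return_inverse=True, return_counts=True)
    xbar = np.bincount(idx, weights=x) / counts  # unit means of x
    ybar = np.bincount(idx, weights=y) / counts  # unit means of y
    xd = x - xbar[idx]  # within-unit deviations
    yd = y - ybar[idx]
    # Share of the total sum of squares of x that is within units.
    within_share = (xd**2).sum() / ((x - x.mean()) ** 2).sum()
    # FE slope and the implied unit intercepts (estimated unit effects).
    b_fe = (xd * yd).sum() / (xd**2).sum()
    alpha_hat = ybar - b_fe * xbar
    # Within R-squared: assumed here to be the "FE R-squared".
    r2_within = 1.0 - ((yd - b_fe * xd) ** 2).sum() / (yd**2).sum()
    # Proxy for the correlation between unit means of x and the unit effects.
    corr = np.corrcoef(xbar, alpha_hat)[0, 1]
    return {
        "n_obs": int(x.size),
        "n_units": int(counts.size),
        "obs_per_unit_min": int(counts.min()),
        "obs_per_unit_max": int(counts.max()),
        "within_share_x": within_share,
        "corr_xbar_alpha_hat": corr,
        "fe_slope": b_fe,
        "fe_r2_within": r2_within,
    }
```

Calling panel_diagnostics(unit_ids, x, y) on long-format arrays returns a dictionary to hold up against the charts in the two papers. It needs at least two units and some within-unit variation in both x and y, otherwise the ratios are undefined.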

With these, one can stare at the charts provided by Clark/Linzer and Dieleman/Templin and ponder one’s choices. Or, one could throw a party, invite some economists and biostats geeks and have at it.

(What did I get wrong in this post? Comments open for one week.)

