In a post today Nate Silver reminds us just how prevalent cellphone use is and how and why it is skewing polls. He brings the data:
And he brings this geek-o-licious ‘graph that is co-linear with one of this blog’s current themes:
The pollsters’ usual defense mechanism against this [under-sampling of cellphone-only households] is to weight their polls by demographics, something which they need to do anyway, since polls are subject to many forms of non-response bias (for instance, it’s harder to get men on the phone than women). But this is potentially an inadequate response for several reasons. First, some characteristics that correlate with both cellphone usage and political preferences may not correspond to those that are most commonly used to weight polls. It is somewhat rare, for instance, for pollsters to weight their polls by characteristics like urban/rural location or marital status, which are predictive of both cellphone usage and political beliefs. Being cellphone-dependent also appears to be significantly correlated with media consumption habits (in particular, getting more of one’s news from the Internet and less from television), which also seems to be increasingly important in determining one’s political views. And there are some characteristics that may be even more subtle. For instance, there are some hints in the CDC data (such as the higher prevalence of binge drinking) that cellphone-only adults are less “domestic” and more “bohemian”. I suspect that, in young adults, this is correlated with more liberal political views. (Bold mine.)
By now readers of this blog can sum all this up much more succinctly. Polls are suffering from cellphone-use selection bias. Silver is telling us that attempts to correct it by weighting on observable characteristics are inadequate. Membership in the cellphone-only group (“treatment”) and political preferences (“outcome”) are both related to hard-to-measure factors (“unobservables”). Exclusive cellphone use is therefore endogenous, so poll results remain biased even after controlling for observable factors.
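To make the argument concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the function name `simulate_poll`, the two traits (`young` as the observable, `bohemian` as the unobservable), and every probability. None of it comes from Silver's post or the CDC data. It simulates a landline-only poll and then reweights it on the observable trait alone.

```python
import numpy as np


def simulate_poll(n=200_000, seed=0):
    """Toy landline-only poll. All parameters are invented for illustration."""
    rng = np.random.default_rng(seed)

    # Observable trait: pollsters can measure it and weight on it.
    young = rng.random(n) < 0.30
    # Unobservable trait: pollsters never see it.
    bohemian = rng.random(n) < 0.40

    # Both traits raise the chance of being cellphone-only ("treatment").
    p_cell_only = 0.05 + 0.25 * young + 0.40 * bohemian
    cell_only = rng.random(n) < p_cell_only

    # Both traits also raise support for the candidate ("outcome").
    p_support = 0.40 + 0.10 * young + 0.20 * bohemian
    supports = rng.random(n) < p_support

    # True support in the whole population.
    truth = supports.mean()

    # A landline poll only reaches people who are not cellphone-only.
    reached = ~cell_only
    raw = supports[reached].mean()

    # Post-stratify on the observable: weight each age group's poll
    # result by that group's share of the full population.
    weighted = 0.0
    for group in (False, True):
        population_share = (young == group).mean()
        in_poll = reached & (young == group)
        weighted += population_share * supports[in_poll].mean()

    return truth, raw, weighted


if __name__ == "__main__":
    truth, raw, weighted = simulate_poll()
    print(f"true support:        {truth:.3f}")
    print(f"raw landline poll:   {raw:.3f}")
    print(f"weighted on 'young': {weighted:.3f}")
```

Under these made-up numbers, weighting on age nudges the estimate toward the truth but still leaves it roughly three points short, because within each age group the poll continues to under-sample the "bohemian" respondents it cannot see.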
I wish I could say there is an obvious exogenous factor that could be exploited via instrumental variables techniques to address this issue. But I can’t think of anything that affects cellphone use and is not also related to political preferences. It’s awfully hard because the instrument needs to work for local polls (statewide at a minimum), so large-scale geographic variation can’t be exploited. I don’t think this is approachable with IV, do you?