Don’t trust the polls (too much)

It’s natural in an election with high stakes to follow closely any signs of who is going to win. The polls are by far the most noticeable of those signs, but everyone knows that the polls aren’t perfect predictors. There are two obvious reasons for this.

First, polls measure voters’ opinions at the moment, and those opinions can easily change before election day (and of course, some voters are undecided). Second, polls survey a random sample of voters, and therefore suffer from random errors based on who happens to get polled. I think everyone is basically aware of these problems.

There are, however, other problems with polls that I think people don’t consider as much. These errors don’t just make the polls less useful by introducing noise; they bias the results consistently in one direction or the other. This is the kind of error that no amount of polling or averaging of multiple polls can eliminate. Here are a few of the issues:

  • Organization: The Obama campaign, which has a huge amount of money, has been spending a lot more on organizing (as opposed to ads) than is traditional. The campaign’s employees and volunteers are working incredibly hard to register new voters, and as soon as the registration deadlines pass, they’ll turn to getting out the vote (and in states that allow early voting, they’ll start getting out the early vote right away). McCain has his own forces, but overall Obama’s outnumber his substantially. This varies a lot from state to state: in some states Obama has a huge advantage, while in others he has none. No one knows exactly what kind of advantage Obama might get in each state, but what we do know is that the polls don’t pick it up.
  • Turnout: This overlaps with organization, but is different in some ways. Polls don’t just call X random people and ask them who they’re voting for. They call a bunch, then try to adjust their sample to match “likely voters”. This involves asking a series of questions to try to determine whether each respondent is likely to come out and actually vote. It also involves weighting the sample so that various demographic and ideological groups make up the same portion of the sample as they will of the voters in November. This is always tricky, but it’s trickier this time. There are issues of race and gender in play, as well as the old question of whether young voters will actually show up. The more likely this election is to violate patterns from previous elections, the more these models of who will vote become unreliable guesswork.
  • Lying to pollsters: People sometimes tell pollsters they’ll vote for X and then vote for Y, or say that they’re undecided when they’re not. It’s not just about people changing their minds after the poll. Sometimes they just don’t tell the truth. Why? Well, there is a long history of people telling pollsters what they think the pollsters want to hear, or hiding things they find embarrassing. Polls routinely show much higher levels of exercise, for example, or church attendance, than actually occur. You could imagine several ways this would happen in this election. One is the so-called “Bradley effect,” where voters say they are voting for a black candidate and then don’t vote for him. This seems to me most likely in cases where the perception is that the main reason to vote against the candidate is racism. If it’s widely accepted that non-racists can vote against the candidate, I wouldn’t expect it so much. I could also imagine something of this sort based on the media message. If the current media narrative is that Bush has bungled his presidency and the Republicans are hopeless, voters might feel as if the pollster will look down on them for voting Republican. This could lead to an artificially high number for any Democrat right now. (Incidentally, my guess is that this effect existed and was largely deflated by the Republican convention, which is where McCain’s bounce came from.) You could also imagine that this effect in general makes the polls more extreme in states with a clear favorite, because voters feel like they’re the odd ones out if they vote for the candidate less favored in their area.
  • Cell phones: Most pollsters don’t call cell phones. If cell phone use is correlated with particular political preferences, this could matter a lot. Younger voters are more frequently cell-only, but this can be compensated for by overweighting the other young voters who are contacted. The real question is whether, within a given demographic group, those with only cell phones are likely to have different political preferences than those with land lines. Pollsters can’t control for everything with weighting, and I would assume cell phone ownership (to the exclusion of land lines) correlates not just with age and race, but also with education level, income level, urban/rural location, etc. This could mean a big difference is hidden here.
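
To make the cell phone point concrete, here is a minimal sketch in Python. Every number in it is invented for illustration (the group shares, the support levels and the group names are my assumptions, not real polling data). It compares the true support for a hypothetical candidate A with what a landline-only poll would report after reweighting by age.

```python
# All numbers are made up for illustration; none come from real polls.
# Key: (age group, phone status)
# Value: (share of electorate, support for candidate A, reachable by landline poll)
groups = {
    ("young", "cell_only"): (0.10, 0.65, False),
    ("young", "landline"):  (0.10, 0.55, True),
    ("older", "cell_only"): (0.05, 0.50, False),
    ("older", "landline"):  (0.75, 0.45, True),
}

def true_support(groups):
    """Support for candidate A across the whole electorate."""
    return sum(share * support for share, support, _ in groups.values())

def weighted_landline_estimate(groups):
    """Estimate from reachable respondents only, with each age group
    reweighted to match its true share of the electorate."""
    estimate = 0.0
    for age in sorted({a for a, _ in groups}):
        # True share of the electorate in this age group.
        age_share = sum(s for (a, _), (s, _, _) in groups.items() if a == age)
        # Only the reachable (landline) part of the age group gets polled.
        reached = [(s, p) for (a, _), (s, p, ok) in groups.items() if a == age and ok]
        reached_share = sum(s for s, _ in reached)
        support_among_reached = sum(s * p for s, p in reached) / reached_share
        estimate += age_share * support_among_reached
    return estimate

if __name__ == "__main__":
    print("true support:     ", round(true_support(groups), 4))
    print("weighted estimate:", round(weighted_landline_estimate(groups), 4))
```

With these invented numbers, the age weighting restores young voters to their proper share of the sample, but the estimate still comes out lower than the true figure, because within each age group the people reached differ from the people missed. Whether the real gap is large, small, or even in this direction is exactly what we don’t know.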

So to what extent do these effects exist, and if they do, whom do they favor? Really, no one has the slightest clue. The best we can do is look at previous elections (including the primaries) to see if they existed there. Of course, there are multiple, possibly contradictory effects, and teasing out what’s going on is nearly impossible. My guess is that all of the effects above do exist, if only in small amounts. I also would be willing to bet that all except the lying favor Obama, meaning he is better off than the polls imply. The best analysis I’ve seen of this stuff is at FiveThirtyEight, but the analysis there of the Bradley effect is based on the Democratic primary, with a very different universe of voters and a lot of other complicating factors. The same goes for the cell phone analysis, which has similar problems (plus some others).

The bottom line is that there isn’t enough information out there for us to know anything that precisely, regardless of how much polling we do. Don’t think of a state as guaranteed unless the polling margins are pretty big. This is an unusual election, so don’t be surprised by unusual results.
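
For anyone who wants to see why more polling doesn’t help with this kind of error, here is a rough simulation in Python. It assumes a hypothetical race where true support is 50% and every poll shares the same 2-point bias; both numbers, and the function names, are my own inventions for illustration. Averaging many polls shrinks the random noise, but the average settles around the biased value rather than the true one.

```python
import random
import statistics

def simulate_poll(true_support, bias, sample_size, rng):
    """One poll: each respondent reports support for the candidate
    with probability true_support + bias (the bias is shared by all polls)."""
    p = true_support + bias
    hits = sum(rng.random() < p for _ in range(sample_size))
    return hits / sample_size

def average_of_polls(n_polls, true_support=0.50, bias=-0.02,
                     sample_size=1000, seed=0):
    """Average the results of n_polls independent polls with the same bias."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = [simulate_poll(true_support, bias, sample_size, rng)
               for _ in range(n_polls)]
    return statistics.mean(results)

if __name__ == "__main__":
    for n in (1, 10, 200):
        print(n, "polls, average:", round(average_of_polls(n), 4))
```

As the number of polls grows, the average stops bouncing around, but it converges on roughly 48% (the true 50% plus the assumed bias), not on 50%. The averaging removes the sampling error and leaves the systematic error untouched.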

Comments

One Response to “Don’t trust the polls (too much)”

  1. Justin on October 8th, 2008 6:38 pm

    This is great! This is the best analysis I have heard in a long time. I agree with everything you said. Especially when you look at past elections, you will see that even John Kerry four years ago was always ahead in the polls and you see how that one turned out!
