You may have seen the famous 1948 picture where a jubilant Harry S Truman holds up the Chicago Tribune whose front page headline was written based on faulty poll data. I don’t do much political work, but in the survey world, this “oops” moment is often cited as a classic case of coverage error: Pollsters had sampling frames (basically a fancy term for a list) with names and phone numbers. So they did phone surveys and asked people who they planned to vote for. All the rich folks with phones answered “Dewey!” But people without phones (that is, those not covered by the pollsters’ lists) voted too, and they voted for…Truman.
So is this month’s election an “oops, we did it again” of coverage error? All the polls said we were on our way to having a female president and then…well, we all know what happened. I think it’s a case of total survey error. I know this sounds like an action movie, but let me explain.
We can consider three sources of non-sampling error in surveys: coverage error, non-response error, and measurement error.
It’s true that reputable political polls are phone polls, and phone surveying, especially over landlines, has come under fire in the past 10 years or so as more and more people go cell-only. Call centers that conduct phone surveys will usually make some proportion of their calls to cell phones, but those numbers can’t be auto-dialed and so are more expensive to call, so call centers have to make a combined survey-methodology/business decision on what proportion of numbers will be landline vs. cell. So of course, we do end up with potential for coverage error.
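A quick toy simulation makes the coverage problem concrete. All the numbers below (the share of cell-only voters, the candidate preferences in each group) are made up purely for illustration, not real polling figures:

```python
import random

random.seed(0)

# Hypothetical population: 30% are cell-only, and the two groups
# differ in candidate preference (all proportions are invented).
population = []
for _ in range(100_000):
    cell_only = random.random() < 0.3
    # Cell-only voters favor candidate A at 60%; landline-reachable
    # voters favor A at only 45%.
    prefers_a = random.random() < (0.60 if cell_only else 0.45)
    population.append((cell_only, prefers_a))

true_support = sum(p for _, p in population) / len(population)

# A landline-only sampling frame never covers the cell-only group.
frame = [p for cell, p in population if not cell]
poll_estimate = sum(frame) / len(frame)

print(f"true support for A:      {true_support:.3f}")
print(f"landline-frame estimate: {poll_estimate:.3f}")
```

The frame-based estimate lands near the landline group's 45% even though true support is closer to 50%, and no amount of extra calling within the frame fixes it, because the missing voters were never on the list to begin with.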
But coverage error is not the only culprit here. Most survey experts will be quick to tell you that non-response error played a role as big as or bigger than coverage error in predicting this year’s election. Just like coverage error is the difference between people covered by your sampling frame (aka list) and people who are not covered, non-response error is the difference between people who respond and those who don’t. And if the people who respond to your survey are civic- or community-minded people who are happy to answer a few questions and tell you who they’re voting for, while the people who refuse to respond are disgruntled and distrusting of random pollsters calling them to ask questions… well, there you go. Non-response error.
It also doesn’t seem so far-fetched to me that, given the tension, strife, and ugliness surrounding the election, measurement error also played a part. We define measurement error as the difference between the true value and the reported value in a survey. So I might plan to vote for one candidate, but for whatever reason (it doesn’t fit with my image of myself, I don’t want to admit it to a random person on the phone, my spouse is listening to my answers…) I tell the interviewer I’m going to vote for the other candidate.
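One last toy sketch, this time for measurement error: everyone responds, the frame is perfect, but some respondents don't report their true intention. The 52% true support and the 10% misreport rate are, as before, numbers invented purely for illustration:

```python
import random

random.seed(2)

n = 100_000
# Hypothetical electorate: 52% truly plan to vote for candidate A.
true_votes = [random.random() < 0.52 for _ in range(n)]

reported = []
for prefers_a in true_votes:
    # Invented "shy voter" effect: 10% of A supporters tell the
    # interviewer the opposite of their true plan; B supporters
    # all report truthfully.
    if prefers_a and random.random() < 0.10:
        reported.append(False)
    else:
        reported.append(prefers_a)

print(f"true support:     {sum(true_votes) / n:.3f}")
print(f"reported support: {sum(reported) / n:.3f}")
```

A narrow true lead of 52–48 turns into an apparent deficit of about 47–53 in the reported answers, which is exactly the gap between what people plan to do and what they tell the interviewer.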
So, there you have it, readers. All three sources of non-sampling survey error applied to this election. As soon as I can stomach it, I promise I’ll write about why you should even bother with this error-prone survey thing.