
Episode: 1421
Title: HPR1421: Statistics and Polling
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1421/hpr1421.mp3
Transcribed: 2025-10-18 02:04:55
---
Hello, this is Ahuka and welcome to Hacker Public Radio for another exciting episode.
And this one is a little bit of a change of pace, but it was something that I got some
inquiries about and figured what the heck, let's do this.
We have a fellow Charles in New Jersey who's been doing a series on mathematics.
This is sort of kind of related, but I'm going to do it without actually doing very much
in the way of math at all.
I'm going to talk about polling, particularly political polling and the statistical background,
just to understand what's going on.
Because I've noticed that a lot of people really don't have a very good handle on how to interpret
this stuff.
You see poll results thrown around all the time, but, you know, are they meaningful?
What should I be looking for?
So I'm going to try and address this.
Now you might wonder, gee, what are your qualifications for doing that?
Well, first, I was at one point a professor who taught classes in statistics at the university
level.
So I've got a pretty good handle on the mathematics involved in all of this.
Again, I'm not really going to get into that, but I have done the math.
Also I have worked for a political consulting company and the company that I worked for
did do polling for clients.
So I have some exposure to what it's actually like to do political polling.
So on that basis, now you understand what makes me think I have some valid grounds for offering
an opinion.
You can decide whether or not you want to listen to it.
So to get started, the basic question is one of epistemology; everything comes back to that.
And that is, how do we know those things that we say we know?
Always a very good question.
Now in the case of statistics, how do we know things about statistics?
Well, the mathematics of this started to be worked out as a way of analyzing gambling.
When you play poker and you're told a hand with three of a kind beats a hand with
two pair, why is that?
Well, that's because two pair is something that shows up 4.75% of the time.
And that's a lot more likely than three of a kind, which shows up 2.11% of the time.
It's more than twice as common, in other words.
So that's why the less common the hand is, the higher its value, and it beats the hands
that are more common.
So everything starts with that.
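If you want to convince yourself of those numbers, here is a minimal sketch in Python, just my own illustration and nothing from the episode itself, that deals a few hundred thousand random five-card hands and counts how often each of those two hands shows up.

import random
from collections import Counter

def classify(hand):
    # Count how many cards share each rank, largest group first.
    counts = sorted(Counter(rank for rank, suit in hand).values(), reverse=True)
    if counts[:2] == [2, 2]:
        return "two pair"
    if counts[0] == 3 and counts[1] == 1:
        return "three of a kind"
    return "other"

deck = [(rank, suit) for rank in range(13) for suit in range(4)]
trials = 200000
tally = Counter(classify(random.sample(deck, 5)) for _ in range(trials))
for name in ("two pair", "three of a kind"):
    print(name, round(100 * tally[name] / trials, 2), "%")
# Should print roughly 4.75% and 2.11%.
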
But then there was another big jump in the development of statistics during the Napoleonic Wars.
For the first time, large armies were involved, and the casualties were pretty substantial.
And some doctors involved started to realize, oh, maybe we should gather evidence about
these wounds and investigate which treatments actually work.
And so they started to develop biostatistics, a medical branch of this, that expanded
the universe a little bit more.
And the thing that you need to bear in mind about all this is that it's based on probability.
This is one of those things that a lot of people find hard to wrap their minds around.
Because we tend to like things that are black and white, is this true or not?
Some questions can be answered that way.
And that's one of the reasons I have argued that statistics and mathematics
are, in fact, not very closely related.
Because in mathematics, generally speaking, you do get real definitive answers.
In statistics, you don't.
You get probabilities, and that can drive people nuts.
Albert Einstein, who was a fairly smart guy, according to everything I've been able
to read about him, had problems with this.
He was one of the people who developed quantum mechanics and discovered that everything
is based on probabilities, and that bothered him so much that he started looking for any
kind of way to get rid of the statistical probabilities involved.
And he famously said, God does not play dice with the universe.
And the physicist from Denmark, Niels Bohr, said, Albert, stop telling God what to do.
And it turns out that God does play dice with the universe.
Or, to put it another way, if you don't like to put it in terms of theology:
the universe is based on probabilities.
It just is.
It's one of those facts.
Now, how do we think about probability?
I think a good way to do that is what would happen if you did the same thing over and over?
You would get a range of outcomes, but some outcomes would show up more often.
And that's the essence of how we understand probability.
What are the outcomes that show up most often?
Now here's one of the things that throws a lot of people, because they're not used to thinking
this way: what if something is very unlikely?
It has a very low probability.
Does that mean you'll never see it?
No.
You will see it.
Or someone will see it, a certain percentage of the time.
Unlikely things do happen.
They just don't happen as often.
One of the things that I like to say to people to illustrate this is kind of a joke:
if you are one in a million, then there are 1,500 people in China exactly like you.
Do the math.
It works out that way.
Within probably another couple of decades, we'll be able to say that there are 1,500 people
in India exactly like you.
I think India is scheduled at this point to overtake China as the country with the largest
population somewhere around 2040, but that's just a projection.
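If you want to check the arithmetic behind the joke, it is only a couple of lines; the population figure here is approximate, for around 2013.

china_population = 1_350_000_000          # rough figure, circa 2013
print(china_population / 1_000_000)       # about 1,350 people, close to the 1,500 in the joke
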
So that's how probabilities work.
That leads to one of the techniques we use to develop an idea of how these things work.
It's called a Monte Carlo simulation.
Monte Carlo is, of course, a casino in Europe, a very famous one where wealthy people in
tuxedos go to wager.
And in statistics, a Monte Carlo simulation is like an experiment that you run over and
over and over, generally with a computer algorithm that's going to generate random data that
you can use to test your theories.
Now a very famous mathematician named John von Neumann understood this very well and programmed
one of the very first computers, the ENIAC, to carry out Monte Carlo simulations.
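Here is a tiny Monte Carlo simulation of my own in the same spirit: roll a pair of simulated dice a hundred thousand times and see which totals show up most often.

import random
from collections import Counter

trials = 100000
rolls = Counter(random.randint(1, 6) + random.randint(1, 6) for _ in range(trials))
for total in sorted(rolls):
    print(total, round(100 * rolls[total] / trials, 1), "%")
# Seven should top the list at roughly 16.7%, while two and twelve are rare, around 2.8% each.
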
Now another concept I want to bring in is called the Law of Large Numbers, which in
layman's terms says that if you repeat the experiment many times, the average result
should be equal to the expected result.
It's an average we're talking about.
Any particular experiment could give weird results that are nothing like the expected result,
and that is to be expected in a distribution.
But when you average it out over a whole range of experiments, the occasional high ones
are offset by the occasional low ones, and the average result is pretty good.
But to get this, you may need to do it many, many times, and the more times you repeat
the experiment, the closer your results on average should be when you average them out.
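A quick sketch, again just my own illustration, shows the law of large numbers at work: the average of a batch of die rolls drifts toward the expected value of 3.5 as the batch gets bigger.

import random

for n in (10, 100, 1000, 10000, 100000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(n, "rolls, average =", round(sum(rolls) / n, 3))
# Individual runs wobble, but the averages settle in toward 3.5 as n grows.
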
Our third key concept, random sampling.
Random sampling says that every member of a population has an equal chance of being selected
for a sample.
Now, population, what does that mean?
The population is whatever group you want to make a claim about.
If you are investigating the properties of a particular group of things, or people,
or what have you, that particular group is your population, and you're going to make
a claim about that.
So if you want to make a claim about left-handed Mormons, your sample should exclude anyone
who's right-handed, or anyone who's a Lutheran, but it should afford an equal chance of selection
for all left-handed Mormons.
Now, this is where a lot of problems can arise.
For example, many medical studies in the 20th century included all, or mostly all, men.
But the results were applied to all adults.
Now, can you tell who got left out?
Yes, women. Are women and men identical medically? Not necessarily.
I have it on good authority that there are some differences in the endocrine system and
hormones and things like that that could have an effect on the results.
Now, fortunately, at some point, they started to realize that, and a lot of the studies
now are done in a better manner.
But you need to be careful: something that works in adults isn't necessarily going to work
with children, or vice versa.
If that's not part of your sample, you cannot make that claim.
Now, if you do something like select a sample that doesn't really represent the population
you're talking about, that's called sampling bias.
That can be a big problem.
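Here is a toy example of my own showing what sampling bias does to your numbers; the population, the subgroups, and their opinions are entirely made up just to show the effect.

import random

# Made-up population of 2,000 people: 60% overall say "yes", but the two subgroups differ.
group_a = ["yes"] * 700 + ["no"] * 300     # the easy-to-reach subgroup, 70% yes
group_b = ["yes"] * 500 + ["no"] * 500     # the harder-to-reach subgroup, 50% yes
population = group_a + group_b

biased_sample = random.sample(group_a, 200)      # only ever sampled the easy group
fair_sample = random.sample(population, 200)     # every member had an equal chance

print("true share saying yes:", population.count("yes") / len(population))        # 0.60
print("biased sample says:   ", biased_sample.count("yes") / len(biased_sample))  # near 0.70
print("fair sample says:     ", fair_sample.count("yes") / len(fair_sample))      # near 0.60
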
So we've got some basic concepts.
Notice I haven't had to do any real math at this point.
And so we can start looking at polling and just how good it is, or isn't, as the case
may be.
And it is often very good, but history does show some big blunders along the way.
But to understand how this stuff works, the first thing that you need to get out of the
way is that sampling, if it is done properly, does work.
This is a mathematical fact and has been proven many times over.
Now, you may have trouble believing that a thousand people are an accurate measure of
what a million people or even a hundred million people will do, but in fact, it does work.
And when there are problems, it is usually because someone made a mistake, such as drawing a sample
that is not truly an unbiased sample from the population in question.
This does happen, and you need to be careful about this in examining polling results.
In the earlier part of the 20th century, there were some polls that were done via telephone
surveys, but because telephones were not universally available at that time, these polls overstated
the number of people who were more wealthy and affluent, and they may have tended to vote
for a particular candidate or a particular party more so than the population at large.
And so there were some fairly notable examples of polls that went awry that way.
Now that was the early part of the 20th century.
By the latter part of the 20th century, those telephone surveys were considered perfectly
valid because a point was reached where just about everyone had a phone.
And frankly, the very few who didn't have a phone were very unlikely to be voters anyway.
So that was considered perfectly valid, and in fact, polls done that way worked fine
until recently.
Now what happened recently was it turned out that the way they were doing it was they
were calling landlines only.
And I can tell you how this is done in a lot of cases: you can draw a random sample by going
through the telephone book, taking every fourth page, then looking up a random number in a
table of random numbers and counting down that many entries on the page.
Whoever that is, that's one of the people you're going to call.
That's a perfectly valid way of getting a random sample.
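Just to make that mechanical procedure concrete, here is a rough sketch of it in Python; the listings, the page size, and the every-fourth-page interval are all stand-ins for illustration.

import random

phone_book = [f"person_{i}" for i in range(40000)]      # stand-in for the listings
entries_per_page = 100
pages = [phone_book[i:i + entries_per_page]
         for i in range(0, len(phone_book), entries_per_page)]

# Every fourth page, pick one entry at a random position on that page.
sample = [random.choice(pages[p]) for p in range(0, len(pages), 4)]
print(len(sample), "people selected, for example", sample[:3])
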
But the telephone book only has people with landlines.
Well, what has happened in the last 10 years or so is that a lot of people, and I am one
of them, have gone to using mobile phones exclusively, and that means then that the polls
stopped being valid if they were only done with landlines.
So the polling companies, they started to realize there was a problem, and they started
to make adjustments.
So if you see a poll done now, chances are they will have made an effort to get a representative
number of people from cell phones in there.
Now why would that be a problem?
Well, I think if you did an analysis of this, you would find that the cell-phone-only
group on average is younger, and younger people may have different political views than
older people; in some respects they do, that's pretty much a known phenomenon.
So it's important that you take that into account.
Other things you need to watch out for: do pollsters limit the sample in a given way?
A big issue: should you include all registered voters?
Now, in the United States you need to be registered before you can vote, and I'm just going
to say I'm not familiar with how other countries handle this, but you could go for all registered
voters, or you could limit it to what are called likely voters.
And this is where it gets very dicey, because deciding who is a likely voter is pretty
much a judgment call by the pollster, and bias can creep in.
One of the things you notice, if you study political polls, is that certain companies have
what we refer to as a house bias or house effect.
There are some companies that tend to report results that are more favorable to Republicans
and others that report results that are more favorable to Democrats (those are the two
major political parties in the United States), and so that's one of those things you
need to take into account.
Some of that is going to come from their likely voter screen, as it's referred to;
that's a place where the numbers can get biased a little bit.
So how do we know that samples actually work, now that I've explained everything that's
involved?
Well, we have two strong pieces of evidence.
First, we know from Monte Carlo simulations how well samples compare to the underlying
populations in controlled experiments.
You create a population with known parameters, pull a bunch of samples, and see how well
they match up to the known population.
And so we've got some really pretty good results on all of this.
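Here is what such a controlled experiment looks like in miniature, as a sketch of my own: build a population where you already know the answer, pull a thousand samples of 1,000 people each, and see how close the estimates land.

import random

population = [1] * 520000 + [0] * 480000     # known by construction: exactly 52% "support"
estimates = []
for _ in range(1000):
    sample = random.sample(population, 1000)
    estimates.append(sum(sample) / 1000)

estimates.sort()
print("true value: 0.52")
print("middle 95% of the sample estimates:", estimates[25], "to", estimates[974])
# Typically something like 0.49 to 0.55, i.e. within about three points of the truth.
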
Secondly, we have the results of many surveys, and in political polls, there's always
the acid test, and that is what happens when the election is held, all right?
And then you're going to get the definitive result, and either your surveys match up with
what actually happened, or they don't.
And generally speaking, polls done by reputable pollsters usually do match up pretty well
with what happens in the election, occasionally you'll find someone who's consistently biased
a certain way, but you can take that into account.
So if this particular place always gives Republicans numbers that are three points higher than
actually happens in the election, well, if you know that, you just subtract three points
off of the result, and at least in that case, it's still a fairly accurate guide once
you adjust for that.
Now, I'm going to introduce another concept.
The confidence interval.
Now the confidence interval comes from the fact that even an unbiased sample will not match
the population exactly.
To see what I mean, consider what happens if you toss a fair or unbiased coin.
If it is a truly fair coin, you should get heads 50% of the time on average, and tails
50% of the time, again, on average, but the key here is on average.
If you toss this coin a hundred times, would you always get exactly 50 heads and exactly
50 tails?
Of course not.
You might get 48 heads and 52 tails the first time, 53 heads and 47 tails the second time,
and so on.
You know, each time you get slightly different results.
But if you did this a whole bunch of times, and averaged your results, you would get ever
closer to that 50-50 split when you averaged things, but probably not hit it exactly.
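A few lines of Python, just as my own illustration, make the point: toss a simulated fair coin 100 times, repeat that a handful of times, and the head counts hover around 50 without hitting it every run.

import random

for run in range(1, 6):
    heads = sum(random.randint(0, 1) for _ in range(100))
    print("run", run, ":", heads, "heads,", 100 - heads, "tails")
# Something like 48, 53, 51, 47, 50: close to a 50-50 split each time, but rarely exact.
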
And what this means is that your results will be close to what is in the population most
of the time, but terms like close and most of the time are very imprecise.
How close and how often really should be specified more precisely, and we can do that
with the confidence interval.
Now this starts with the how often question, and the standard is usually 95% of the time.
This is called a 95% confidence interval.
Sometimes the complement of 95 is used, and so you'll see it referred to as accurate
at the .05 level.
This is essentially the same thing for our purposes.
And if you're a real statistician, and you think I'm doing violence to these concepts,
please remember this is not a graduate level statistics course, it's just a podcast
for the intelligent layperson who wants to understand polling.
So this 95% level of confidence is kind of arbitrary, and in some scientific applications
this can be raised or lowered, but in polling you can think of this as the best practice
industry standard.
So what does that mean?
If I did this poll this way, and it's a 95% confidence interval, then 95 times out of 100
my result should be pretty close to the actual figure in the population.
That's also like 19 times out of 20, all right?
So we're aiming to do something that is going to be correct 19 times out of 20.
Now that's the most of the time part of this.
Now the other part, how close?
Now this is not at all arbitrary, this is called the margin of error.
And once you've chosen the level of confidence, it's a pretty straightforward function of
the sample size.
In other words, if you toss a coin 10 times, getting six heads and four tails is very likely.
But if you toss it 100 times, getting 60 heads and 40 tails is less likely.
In other words, the bigger the sample size, the closer it should match the population.
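If you want to see that relationship, here is a small sketch using the standard worst-case formula for the margin of error of a proportion at 95% confidence; the 1.96 comes from the normal distribution, and this assumes a simple random sample.

import math

def margin_of_error(n, p=0.5, z=1.96):
    # Worst case is p = 0.5; real polls may adjust this for weighting.
    return z * math.sqrt(p * (1 - p) / n)

for n in (10, 100, 1000):
    print(n, "tosses or interviews: +/-", round(100 * margin_of_error(n), 1), "points")
# With 10 tosses the margin is about 31 points, so 6 heads out of 10 is unremarkable;
# with 1,000 it is about 3.1 points, so 600 heads out of 1,000 would be a real surprise.
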
Now, you might think, therefore, pollsters should just use very large sample sizes to get
better accuracy, but you run into a problem.
Sampling costs money, all right?
If you do a poll, and as I say, I was in this business at one point, you have to hire
people to do the interviews.
You have to get telephone lines, and those cost money.
So you can work it out that, on average, every interview you do, when you take into account
the pay for the person doing the interview, the telephone lines, overhead, and what have
you, is going to cost you $10 or whatever per interview.
If you double the number of interviews, you double the cost of the survey.
Well, if you got double the accuracy that might be worth it, but in fact, you don't, because
the increase in accuracy tails off very, very quickly.
So doubling the sample size might get you 10% more accuracy in your results.
If you double it again, it might get you 5% more accuracy.
And is that worth spending two or four or eight times the money? Generally not.
What you're looking for is a sweet spot where the cost of the survey is not too much, but
the accuracy is acceptable.
That's why you tend to see numbers anywhere from 1,000 to 3,000 for a survey of a large population
because the sweet spot is going to be somewhere in that range.
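Here is a rough illustration of that trade-off, using the same simple formula and the $10-per-interview figure from above; all of these numbers are just for illustration.

import math

cost_per_interview = 10
for n in (500, 1000, 2000, 4000, 8000):
    moe = 1.96 * math.sqrt(0.25 / n) * 100
    print(f"n={n:5d}  cost=${n * cost_per_interview:6,d}  margin of error={moe:.1f} points")
# Each doubling doubles the cost but shaves less and less off the margin of error,
# which is why samples of roughly 1,000 to 3,000 end up being the sweet spot.
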
Now any reputable poll should make available some basic information.
So here's some of the facts that should be reported.
First of all, when the poll was taken, timing can mean a lot.
Now, there's a joke that the only thing that would sink a candidate in certain places
is being caught having sex with a live man or a dead woman.
Well, suppose a candidate did have something terrible revealed.
I'd like to know, was it revealed before the poll was taken or after the poll was taken?
That can make a big difference to the results.
How big was the sample?
That's something that should get reported.
What kinds of people were sampled?
Was there an attempt to limit it to likely voters?
What is the margin of error?
There's going to be one in there somewhere.
So I want to know what that is.
What is the confidence interval?
OK.
Now, generally speaking, a reputable pollster will make that available, however, that doesn't
mean that a television or newspaper or magazine report is going to give you all that information.
Usually they don't because they think, yeah, no one cares about that stuff.
Or if they do give any of it, it might get into a footnote somewhere.
Television in particular does a terrible job with this.
But that's because they do a terrible job with most things.
But all of these are factors that would affect how you interpret what you see.
So I did a quick look up and I will put the link to this story into my show notes.
This was a story from a poll site called Politico.
They tend to lean somewhat conservative.
And they report two polls on something called Obamacare, which is a major health care
initiative here in the United States.
And as I'm recording this in the first half of December, we're going to see that, in fact,
these polls were just done.
One of them finished December 8th and the other finished December 9th, of 2013.
So very, very current kind of stuff.
So what does Politico say?
The Pew survey of 2,001 adults was conducted December 3rd to December 8th and has a margin
of error of plus or minus 2.6 percentage points.
That gets at a lot of the stuff we were talking about.
When was the poll taken?
Well, the interviews were done from December 3rd to December 8th.
So I could look at it and say, was there any big news thing that happened before or after
that that I would want to take into account?
How big a sample?
Well, it says it was 2,001.
What kinds of people were sampled?
Well, it says it was adults.
Is there an attempt to limit it to likely voters?
No, I don't think so.
What is the margin of error?
Well, it says it was plus or minus 2.6 percentage points.
What is the confidence interval?
Now, that I do not see here.
But I could probably go back to the website and find that.
So what was the other poll?
It says the Quinnipiac survey of 2,692 voters was conducted from December 3rd to December 9th
and has a margin of error of plus or minus 1.9 percentage points.
Very similar information.
Now, this is Politico deciding what to report on each of these.
So they reported them equivalently.
Good for them.
What are the differences?
Well, the first poll, the Pew survey, says it was a poll of adults.
The Quinnipiac survey says it was a survey of voters.
You know, that could make a big difference.
And in fact, the polls did have somewhat different results.
They were sampling different populations.
So the results are not really comparable.
Now, at this point, you'd have to say,
well, what was the purpose of the survey?
And if the purpose of the survey is to look at how people in general feel about this,
survey of adults probably makes pretty good sense.
If the purpose was to forecast how this will affect candidates in the 2014 elections,
that second poll that was a survey of voters might be more relevant.
You need to pay attention to these things to interpret what's going on.
Now, notice then that the second one had a slightly larger sample size,
2,692 versus 2,001.
And it had a smaller margin of error, plus or minus 1.9 points,
compared to plus or minus 2.6 points.
That's exactly what we should expect to see.
Remember that the whole thing about margin of error,
the larger the sample size, the smaller the margin of error should be.
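As a sanity check of my own, not something Politico or the pollsters published, you can plug those two sample sizes into the simple random-sampling formula from earlier; real polls often report a slightly larger figure because of weighting adjustments.

import math

for name, n in (("Pew", 2001), ("Quinnipiac", 2692)):
    moe = 1.96 * math.sqrt(0.25 / n) * 100
    print(name, n, "respondents: +/-", round(moe, 1), "points")
# Quinnipiac's 2,692 gives about 1.9, matching the reported figure;
# Pew's 2,001 gives about 2.2, and weighting can push the reported number up toward 2.6.
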
Third, I note that the second poll was in the field, as pollsters put it, one day longer
than the first poll.
They both started on December 3rd.
But the Pew survey did their last interview on December 8th, and Quinnipiac did theirs
on December 9th. That's one day, and it may not matter.
But again, if I'm doing political polling, I'd say did anything happen
December 9th that would affect this.
If there was a very significant news event on December 9th,
that could have affected the results.
Now, I don't see anything in this about how the people were contacted
or that kind of stuff.
But for instance, I went to the Quinnipiac website and got their analysis.
And I'll just give a brief quote from that.
From December 3rd to 9th, Quinnipiac University surveyed 2,692 registered voters nationwide.
So that's the first thing.
I now know that it was registered voters, as opposed to likely voters.
That could be significant.
With a margin of error of plus or minus 1.9 percentage points,
live interviewers call landlines and cell phones.
Now, to me, that's very significant.
It's significant in two ways.
First of all, live interviewers.
There are some polls that are what we call Robo polls.
And that is a completely automated system that just starts calling numbers
and asks people to punch things into their phone
in response to pre-recorded questions.
We know that those have different results from having live interviewers.
Part of that is what we call self-selection bias.
Some people, if they hear a robot thing, they just hang up the phone.
They just don't want to be bothered.
And when people do self-selection, that is a form of sampling bias.
You're getting a survey that is representative of people who are willing to put up with your poll.
But does that mean they are representative of the population in general? Perhaps not.
So live interviewers is considered the gold standard on this, and generally much superior.
Now, it also costs more.
So there are places that like to do daily polling
on particular races that are of great significance.
And in order to do that, they use Robo calling.
And that can be valuable.
It keeps down the cost so they can be polling much more frequently.
But you may need to make some adjustment to the results you get.
Now, the second thing I see there is it said they called landlines and cell phones.
So I know that there was not an age-related bias due to only calling landlines.
And that's worth knowing.
So moral of the story is, if you dig a little, you can get all of this stuff.
All right?
You may need to go to the website, but you can do it.
Now, one last thing I want to get into here.
When I said 95% confidence level, I didn't see that in the report, but I'm going to assume
it, because that really is pretty much the industry standard for all of this stuff.
That means that, on average, one out of every 20 polls will be, to use the
technical term, bat crap crazy.
That's why you should never assign too much significance to any one poll, particularly
if it gives you results different from all other polls.
You may well be looking at that one out of 20 that is just totally crazy.
Now, you know, there's a human tendency to seize on it if it tells you what you want
to hear, but that is usually a mistake.
It's when a number of pollsters do a number of polls and get roughly the same result
that you should start to believe it.
That does not mean they will agree exactly.
There is still the usual margin of error.
That's why, if you see a poll that says candidate A at 51% and her opponent at 49%,
and then they say, well, it's a dead heat.
You say, wow, isn't one of them ahead?
Yeah, margin of error.
If you've got a 2% margin of error, candidate A could be getting 53% on one end or 49% on
the other, assuming the poll is accurate and unbiased.
So you need to get outside of the margin of error before you start believing it at all,
but as I said, if every other poll that's out there is showing something very different
from what your poll shows, you may have that one out of 20.
And the thing I want to emphasize here is that this happens not because the pollster made
a mistake; it's because the nature of random sampling is such that every once in a while
a random sample will just randomly come up with a very unrepresentative group.
That's the nature of randomness.
You know, the mathematics of how we construct all of this says we can at least put boundaries
around this and say, well, how often is it going to be like that and how different can
it be?
We can put numbers on that, but it's still probabilities in the final analysis.
So, one example: in the United States, we had our presidential election last year.
And there was a lot of discussion about all of this.
The polls were basically showing Obama leading, and, you know, not by a huge margin,
but generally he was up by, let's say, five points on average in most of the polls.
And a lot of people said, no, the polls were skewed, and what they were actually saying
was that the sampling was biased.
So we had all these people saying, ah, they're not getting as many Republicans in the survey
as they should have, and if they correct for that, the numbers would look different.
That likely voter screen was a big part of it.
So they'd say, well, you know, how many people in the last election voted Republican, and
were there as many Republicans in this sample as voted in the last election, and stuff like
that?
You know, all the things we've talked about were part of this discussion.
Now, the thing you need to bear in mind about all of that is that the polls, you know, they
were pretty much all saying the same thing.
It turns out we've had reports since then that say the internal pollster for the Republican
campaign was telling them the same thing as we were seeing from all the other polls.
They were just making up stuff because it made them feel better.
You know, that can happen.
But in general, if a number of reliable pollsters are telling you the same story, you
probably want to believe that story. All right, occasionally a pollster will just have
a really bad year.
And usually what happens as a result of that is they're going to go back and say, okay,
where did we go wrong? Because our numbers were not matching.
And of course, you always do get the actual result of the election and the actual result
of the election was almost bang on what the polls said it was going to be.
So they really were accurate, particularly if you averaged out all of these polls.
Well, you know, maybe this gives you a little bit of an understanding of how this stuff
works and how to interpret it.
And so this is Ahuka for Hacker Public Radio, reminding everyone to support free software.
Thank you.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever consider recording a podcast, then visit our website to find out how easy
it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer
Club.
HPR is funded by the Binary Revolution at binrev.com.
All binrev projects are proudly sponsored by Lunar Pages.
From shared hosting to custom private clouds, go to LunarPages.com for all your hosting needs.
Unless otherwise stated, today's show is released under a Creative Commons
Attribution-ShareAlike 3.0 license.