Music
Hello, this is Ahuka and welcome to Hacker Public Radio for another exciting episode.
And this one is a little bit of a change of pace, but it was something that I got some inquiries about and figured, what the heck, let's do this.
We have a fellow, Charles in New Jersey, who's been doing a series on mathematics.
This is sort of kind of related, but I'm going to do it without actually doing very
much in the way of math at all.
I'm going to talk about polling, particularly political polling, and the statistical background, just to understand what's going on, because I've noticed that a lot of people really don't have a very good handle on how to interpret this stuff.
You see poll results thrown around all the time, but are they meaningful?
What should I be looking for?
So I'm going to try and address this.
Now you might wonder, gee, what are your qualifications for doing that?
Well, first, I was at one point a professor who taught classes in statistics at the university level.
So I've got a pretty good handle on the mathematics involved in all of this.
Again, I'm not really going to get into that, but I have done the math.
Also I have worked for a political consulting company, and the company that I worked for
did do polling for clients, so I have some exposure to what it's actually like to do
political polling.
So on that basis, now you understand what makes me think I have some valid grounds for offering
an opinion.
You can decide whether or not you want to listen to it.
So to get started, there's the basic question of epistemology; everything comes back to that.
And that is: how do we know those things that we say we know?
Always a very good question.
Now in the case of statistics, how do we know things about statistics?
Well, this all began when the mathematics started to be worked out as a way of analyzing gambling.
When you play poker and you're told a hand with three of a kind beats a hand with two pair, why is that?
Well, that's because two pair shows up about 4.75 percent of the time, and that's a lot more likely than three of a kind, which shows up about 2.11 percent of the time.
So two pair is more than twice as common, in other words.
And that's why the less common the hand is, the higher its value; it beats the hands that are more common.
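To make this concrete, here is a quick Monte Carlo sketch (my own illustration, not something from the episode) that deals random five-card hands and counts how often each hand type turns up:

```python
import random
from collections import Counter

def hand_type(hand):
    # Count how many cards share each rank, e.g. two pair -> [2, 2, 1]
    counts = sorted(Counter(rank for rank, _suit in hand).values(), reverse=True)
    if counts[0] == 3 and counts[1] == 1:
        return "three of a kind"
    if counts[0] == 2 and counts[1] == 2:
        return "two pair"
    return "other"

# A standard 52-card deck as (rank, suit) pairs
deck = [(rank, suit) for rank in range(13) for suit in range(4)]

random.seed(1)
trials = 200_000
tally = Counter(hand_type(random.sample(deck, 5)) for _ in range(trials))

two_pair_rate = tally["two pair"] / trials        # theory says about 0.0475
trips_rate = tally["three of a kind"] / trials    # theory says about 0.0211
print(f"two pair: {two_pair_rate:.4f}  three of a kind: {trips_rate:.4f}")
```

With 200,000 deals the estimates land close to the 4.75% and 2.11% figures quoted above, with two pair coming out roughly twice as common as three of a kind.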
So everything starts with that. But then came another big jump in the development of statistics: during the Napoleonic Wars, for the first time, large armies were involved, and the casualties were pretty substantial.
And some of the doctors involved started to realize, oh, you know, maybe we should gather evidence about these wounds and investigate which treatments actually work. And so they started to develop biostatistics, a medical branch of the field, which expanded the universe a little bit more.
And the thing that you need to bear in mind about all of this: it's based on probability.
This is one of those things that, for a lot of people, is hard to wrap their minds around, because we tend to like things that are black and white: is this true or not?
All right, well, some questions can be answered that way.
And in fact, that's one of the reasons I have argued that statistics and mathematics are not very closely related, because in mathematics, generally speaking, you do get real definitive answers.
In statistics you don't; you get probabilities, and that can drive people nuts, all right?
Albert Einstein, who was a fairly smart guy according to everything I've been able to read about him, had problems with this.
He was one of the people who developed quantum mechanics and discovered that everything is based on probabilities, and that bothered him so much that he started looking for any kind of way to get rid of the probabilities involved.
And he famously said, God does not play dice with the universe.
And the physicist from Denmark, Niels Bohr, said, "Albert, stop telling God what to do."
And it turns out that God does play dice with the universe. Or, to put it another way if you don't like to put it in terms of theology: the universe is based on probabilities, and it just is.
It's one of those facts. Now, how do we think about probability?
I think a good way to do that is what would happen if you did the same thing over and over?
You would get a range of outcomes, but some outcomes would show up more often.
And you know, that's the essence of how we understand probability.
What are the outcomes that show up most often?
Now, one of the things that throws a lot of people, because they're not used to thinking this way: what if something is very unlikely?
It has a very low probability.
Does that mean you'll never see it?
No, you will see it, or someone will see it, a certain percentage of the time.
Unlikely things do happen, they just don't happen as often.
One of the things that I like to say to people to illustrate this, and it's kind of a joke: if you are one in a million, then there are 1,500 people in China exactly like you.
Do the math, it works out that way.
Within probably another couple of decades, we'll be able to say that there are 1,500 people
in India exactly like you.
I think India is scheduled at this point to overtake China as the country with the largest
population somewhere around 2040, but that's just a projection.
So that's how probabilities work, and that leads to one of the techniques we use to develop
an idea of how these things work.
It's called a Monte Carlo simulation. Monte Carlo is, of course, a very famous casino in Europe where wealthy people in tuxedos go to wager.
And in statistics, a Monte Carlo simulation is like an experiment that you run over and over and over, generally with a computer algorithm that generates random data you can use to test your theories.
Now, a very famous mathematician named John von Neumann understood this very well and programmed one of the very first computers to carry out Monte Carlo simulations.
Another concept I want to bring in is called the Law of Large Numbers, which in layman's
terms says that if you repeat the experiment many times, the average result should be equal
to the expected result.
Now, it's an average we're talking about, any particular experiment could give weird
results that are nothing like the expected result, and that is to be expected in a distribution.
But when you average it out over a whole range of experiments, the occasional high ones
are offset by the occasional low ones, and the average result is pretty good.
But to get this, you may need to do it many, many times, and the more times you repeat
the experiment, the closer your results on average should be when you average them out.
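As a sketch of that idea (my own example, not from the episode), you can simulate the repeated coin-toss experiment and average the results over more and more repetitions:

```python
import random

random.seed(42)

def average_heads(num_experiments, tosses=100):
    # Repeat the 100-toss experiment and average the heads count across runs
    total = sum(
        sum(random.random() < 0.5 for _ in range(tosses))
        for _ in range(num_experiments)
    )
    return total / num_experiments

for n in (10, 1_000, 10_000):
    print(f"{n:6d} experiments: average {average_heads(n):.2f} heads per 100 tosses")
```

The expected result is 50 heads, and the average settles near 50 as the number of repetitions grows, even though any single run can stray well away from it.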
Our third key concept, random sampling.
Random sampling says that every member of a population has an equal chance of being selected
for a sample.
Now, population: what does that mean?
The population is whatever group you want to make a claim about.
If you are investigating the properties of a particular group of things or people or what have you, that particular group is your population, and you're going to make a claim about that group.
But if you want to make a claim about left-handed Mormons, your sample should exclude anyone who's right-handed, or anyone who's a Lutheran, but it should afford an equal chance of selection to all left-handed Mormons.
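In code, equal-chance selection is exactly what a function like Python's `random.sample` provides. A toy sketch (the population and the 30% figure here are made up for illustration):

```python
import random

random.seed(7)

# Hypothetical population: 100,000 people, 30% of whom hold some opinion (1 = yes)
population = [1] * 30_000 + [0] * 70_000

# random.sample gives every member an equal chance of being selected
sample = random.sample(population, 1_000)
sample_share = sum(sample) / len(sample)
print(f"sample estimate: {sample_share:.3f}  (true population value: 0.300)")
```

A sample of only 1,000 out of 100,000 already lands close to the true 30%; bias only creeps in if some members are more likely to be drawn than others.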
Now, this is where a lot of problems can arise.
For example, many medical studies in the 20th century included all, or mostly, men.
But the results were applied to all adults.
Now, can you tell who got left out?
Yes, women. Are women and men identical medically? Not necessarily.
I have it on good authority that there are some differences in the endocrine system and hormones and things like that, which could have an effect on the results.
Now, fortunately, at some point they started to realize that, and a lot of the studies now are done in a better manner.
But you need to be careful about whether something that works in adults is going to work with children, or vice versa.
If that's not part of your sample, you cannot make that claim.
Now, if you do something like select a sample that doesn't really represent the population you're talking about, that's called sampling bias.
And that can be a big problem.
So we've got some basic concepts.
Notice I haven't had to do any real math at this point.
And so we can start looking at polling and just how good it is, or isn't, as the case may be.
It is often very good, but history does show some big blunders along the way.
But to understand how this stuff works, the first thing that we need to get out of the way is this: sampling, if it is done properly, does work.
This is a mathematical fact and has been proven many times over.
Now, you may have trouble believing that a thousand people are an accurate measure of what
a million people or even a hundred million people will do, but in fact it does work.
When there are problems, it is usually because someone made a mistake, such as drawing a sample
that is not truly an unbiased sample from the population in question.
This does happen and you need to be careful about this in examining polling results.
In the earlier part of the 20th century, there were some polls that were done via telephone
surveys.
But because telephones were not universally available at that time, these polls overstated the number of people who were more wealthy and affluent, and those people may have tended to vote for a particular candidate or a particular party more so than the population at large.
And so there were some fairly notable examples of polls that went awry that way.
Now, that was the early part of the 20th century. By the latter part of the 20th century, those telephone surveys were considered perfectly valid, because a point was reached where just about everyone had a phone, and frankly, the very few who didn't have a phone were very unlikely to be voters anyway.
So that was considered perfectly valid and, in fact, polls done that way were fine until recently.
Now, what happened recently was it turned out that the way they were doing it was they were calling landlines only.
And I can tell you how this is done in a lot of cases: you can draw a random sample by going through the telephone book, say every fourth page, then looking up a random number in a table of random numbers, counting down that many entries on the page, and whoever that is, that's one of the people you're going to call.
A perfectly valid way of getting a random sample, but the telephone book only has people with landlines.
Well, what has happened in the last 10 years or so is that a lot of people, and I am one of them, have gone to using mobile phones exclusively, and that means the polls stopped being valid if they were only done with landlines.
So the polling companies, you know, they started to realize there was a problem and they
started to make adjustments.
So if you see a poll done now, chances are they will have made an effort to get a representative
number of people from cell phones in there.
Now, why would that be a problem?
Well, you know, I think if you did an analysis of this, you would find that the cell-phone-only group on average is younger.
And younger people may have different political views than older people and in some respects
they do.
That's pretty much a known phenomenon.
So it's important that you take that into account.
Other things you need to watch out for: pollsters will sometimes limit the sample in a given way.
A big issue: should you include all registered voters? Now, in the United States you need to be registered before you can vote, and I'm just going to say I'm not familiar with how other countries handle this.
But you could go for all registered voters, or you could limit it to what are called likely voters.
And this is where it gets very dicey, because deciding who is a likely voter is pretty much a judgment call by the pollster.
And bias can creep in.
And one of the things, if you study political polls, is that certain companies have what we refer to as a house bias, or house effect.
There are some companies that tend to report results that are more favorable to Republicans, and others that report results more favorable to Democrats; those are the two major political parties in the United States.
And so that's one of those things you need to take into account.
And some of that is going to come from their likely voter screen, as it's referred to.
It's a place where you can bias the numbers a little bit.
So how do we know that samples actually work, now that I've explained everything that's involved?
Well, we have two strong pieces of evidence.
First, we know from Monte Carlo simulations how well samples compare to the underlying populations in controlled experiments.
You create a population with known parameters, pull a bunch of samples, and see how well they match up to the known population.
And so we've got some really pretty good results on all of this.
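That controlled experiment is easy to sketch (the numbers here are invented for illustration): build a population with a known parameter, pull many samples, and see how the estimates cluster around the truth:

```python
import random
import statistics

random.seed(11)

true_share = 0.52   # known parameter: 52% of the population supports candidate A
sample_size = 1_000

# Draw 2,000 independent samples of 1,000 "voters" each and record each estimate
estimates = [
    sum(random.random() < true_share for _ in range(sample_size)) / sample_size
    for _ in range(2_000)
]

mean_est = statistics.mean(estimates)
spread = statistics.stdev(estimates)
print(f"mean of estimates: {mean_est:.4f}  spread: {spread:.4f}")
```

The estimates average out very close to the known 52%, with a typical spread of about a point and a half, which is why a 1,000-person sample can stand in for a population of millions.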
Secondly, we have the results of many surveys, and in political polls there's always the acid test, and that is what happens when the election is held.
Then you're going to get the definitive result, and either your surveys match up with what actually happened or they don't.
And generally speaking, polls done by reputable pollsters usually do match up pretty well with what happens in the election.
Occasionally you'll find someone who's consistently biased a certain way.
But you can take that into account.
So if a particular shop always gives Republicans numbers that are three points higher than what actually happens in the election, and you know that, you just subtract three points off the result, and at least in that case it's still a fairly accurate guide once you adjust for that.
Now I'm going to introduce another concept.
The confidence interval comes from the fact that even an unbiased sample will not match
the population exactly.
To see what I mean, consider what happens if you toss a fair, or unbiased, coin.
If it is a truly fair coin, you should get heads 50% of the time on average, and tails 50% of the time, again on average.
But the key here is "on average."
If you tossed this coin a hundred times, would you always get exactly 50 heads and exactly 50 tails? Of course not.
You might get 48 heads and 52 tails the first time, 53 heads and 47 tails the second time, and so on.
Each time you get slightly different results.
But if you did this a whole bunch of times and averaged your results, you would get ever closer to that 50-50 split.
But you would probably not hit it exactly.
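A few simulated runs (my own illustration) show exactly this behavior:

```python
import random

random.seed(3)

# Toss a fair coin 100 times, repeated for five separate experiments
results = []
for trial in range(1, 6):
    heads = sum(random.random() < 0.5 for _ in range(100))
    results.append(heads)
    print(f"trial {trial}: {heads} heads, {100 - heads} tails")
```

Each run lands somewhere near the 50-50 split, but the exact counts bounce around from trial to trial.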
What this means is that your results will be close to what is in the population most of the time, but terms like "close" and "most of the time" are very imprecise.
How close, and how often, really should be specified more precisely, and we can do that with the confidence interval.
Now this starts with the how often question and the standard is usually 95% of the time.
This is called a 95% confidence interval.
Sometimes the complement of 95 is used, and so you'll see it referred to as accurate at the .05 level.
This is essentially the same thing for our purposes, and if you're a real statistician and you think I'm doing violence to these concepts, please remember this is not a graduate-level statistics course; it's just a podcast for the intelligent layperson who wants to understand polling.
But this 95% level of confidence is kind of arbitrary, and in some scientific applications it can be raised or lowered, but in polling you can think of it as the best-practice industry standard.
So what does that mean?
If I did this poll this way, with a 95% confidence interval, my results should be pretty close to the actual figure in the population 95 times out of a hundred.
95 times out of a hundred is the same as 19 times out of 20, so we're aiming to do something that is going to be correct 19 times out of 20.
Now, that's the "most of the time" part of this.
Now the other part: how close? This is not at all arbitrary.
This is called the margin of error, and once you've chosen the level of confidence, it's a pretty straightforward function of the sample size.
In other words, if you toss a coin 10 times, getting six heads and four tails is very likely.
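The relationship between sample size and accuracy follows a standard textbook formula (not derived in the episode): for a proportion near 50%, the 95% margin of error is roughly 1.96 × √(p(1−p)/n). A sketch:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    # Half-width of a 95% confidence interval for a sample proportion
    return z * math.sqrt(p * (1 - p) / n)

for n in (250, 1_000, 4_000):
    print(f"n = {n:5d}: +/- {margin_of_error(n) * 100:.1f} points")
```

Quadrupling the sample from 250 to 1,000 only cuts the margin of error in half (about ±6.2 to ±3.1 points), and quadrupling again to 4,000 only halves it again, which is the rapidly diminishing return on survey money described next. Plugging in n = 2,692 gives about ±1.9 points, matching the Quinnipiac figure quoted later in the episode.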
But if you toss it a hundred times, getting 60 heads and 40 tails is less likely.
In other words, the bigger the sample size, the closer it should match the population.
Now, you might think pollsters should therefore just use very large sample sizes to get better accuracy, but you run into a problem: sampling costs money.
All right, if you do a poll, and as I say, I was in this business at one point, you have to hire people to do the interviews, you have to get telephone lines, and these cost money.
So you can work it out that, on average, every interview you do, when you take into account the pay for the person doing the interview, the telephone lines, overhead, and what have you, is going to cost you $10 or whatever per interview.
If you double the number of interviews, you double the cost of the survey.
Well, if you got double the accuracy, that might be worth it, but in fact you don't, because the increase in accuracy tails off very, very quickly.
So doubling the sample size might get you 10% more accuracy in your results; if you double it again, it might get you 5% more accuracy.
And is that worth spending two or four or eight times the money? Generally not.
What you're looking for is a sweet spot where the cost of the survey is not too much but the accuracy is acceptable, and that's why you tend to see sample sizes anywhere from 1,000 to 3,000 for a survey of a large population, because the sweet spot is going to be somewhere in that range.
Now, any reputable poll should make available some basic information, so here are some of the facts that should be reported.
First of all, when was the poll taken? Timing can mean a lot.
Now, there's a joke that the only thing that would sink a candidate in certain places is being caught having sex with a live man or a dead woman.
Well, suppose a candidate did have something terrible revealed. I'd like to know: was it revealed before the poll was taken, or after? It could mean a big difference to the results.
How big was the sample? That's something that should get reported.
What kinds of people were sampled? Was there an attempt to limit it to likely voters?
What is the margin of error? There's going to be one in there somewhere, so I want to know what that is.
What is the confidence interval?
Now, generally speaking, a reputable pollster will make all of that available. However, that doesn't mean that a television or newspaper or magazine report is going to give you all that information. Usually they don't, because they think no one cares about that stuff, or if they do give any of it, it might get buried in a footnote somewhere. I can tell you television in particular does a terrible job with this, but that's because they do a terrible job with most things. But all of these are factors that would affect how you interpret what you see.
So I did a quick lookup, and I will put the link to this story in my show notes. This was a story from a site called Politico; they tend to lean somewhat conservative. They report two polls on something called Obamacare, which is a major healthcare initiative here in the United States. As I'm recording this in the first half of December, we're going to see that these polls were just done; one of them finished December 8th and the other finished December 9th of 2013, so very current stuff.
So what does Politico say? "The Pew survey of 2,001 adults was conducted December 3rd to December 8th and has a margin of error of plus or minus 2.6 percentage points." That gets at a lot of the stuff we were talking about. When was the poll taken? The interviews were done from December 3rd to December 8th, so I could look at that and ask: was there any big news item that happened before or after that that I would want to take into account? How big a sample? It says 2,001. What kinds of people were sampled? It says adults. Was there an attempt to limit it to likely voters? No, I don't think so. What is the margin of error? It says plus or minus 2.6 percentage points. What is the confidence interval? Now, that I do not see here, but I could probably go back to the website and find it.
So what about the other poll? It says, "The Quinnipiac survey of 2,692 voters was conducted from December 3rd to December 9th and has a margin of error of plus or minus 1.9 percentage points." Very similar information. Now, this is Politico deciding what to report on each of these, and they reported them equivalently; good for them.
What are the differences? Well, the first poll, the Pew survey, says it was a poll of adults; the Quinnipiac survey says it was a survey of voters. That could make a big difference, and in fact the polls did have somewhat different results. They were sampling different populations, so the results are not really comparable.
Now, at this point you have to ask: what was the purpose of the survey? If the purpose is to look at how people in general feel about this, a survey of adults probably makes pretty good sense. If the purpose was to forecast how this will affect candidates in the 2014 elections, that second poll, the survey of voters, might be more relevant. You see, you need to pay attention to these things to interpret what's going on.
Notice also that the second one had a slightly larger sample size, 2,692 versus 2,001, and it had a smaller margin of error, plus or minus 1.9 points compared to plus or minus 2.6 points. That's exactly what we should expect to see; remember, the whole thing about the margin of error is that the larger the sample size, the smaller the margin of error should be.
Third, I note that the second poll was "in the field" (pollsters use that term) one day longer than the first poll. They both started on December 3rd, but Pew did their last interview on December 8th and Quinnipiac did theirs on December 9th. So, one day; it may not matter, but again, if I'm doing political polling, I'd ask: did anything happen on December 9th that would affect this? If there was a very significant news event on December 9th, that could have affected the results.
Now, I don't see anything in this about how the people were contacted or that kind of thing. But I went to the Quinnipiac website and got their analysis, and I'll give a brief quote from that: "From December 3rd to 9th, Quinnipiac University surveyed 2,692 registered voters nationwide." So that's the first thing I now know: it was registered voters, as opposed to likely voters, and that could be significant. The quote continues: "with a margin of error of plus or minus 1.9 percentage points. Live interviewers call land lines and cell phones." Now, to me that's very significant, and it's significant in two ways.
First of all, live interviewers. There are some polls that are what we call robo-polls, which is a completely automated system that just starts calling numbers and asks people to punch things into their phone in response to pre-recorded questions. We know that those have different results from having live interviewers. Part of that is what we call self-selection bias: some people, if they hear a robot thing, just hang up the phone; they don't want to be bothered. And when people do self-selection, that is a form of sampling bias. You're getting a survey that is representative of people who are willing to put up with your poll, but does that mean they are representative of the population in general? Perhaps not. So live interviewers are considered the gold standard on this, and generally much superior.
Now, it also costs more, so there are places that like to do daily polling on particular races of great significance, and in order to do that they use robo-calling. And that can be valuable; it keeps down the cost so they can poll much more frequently, but you may need to make some adjustment to the results you get.
The second thing I see there is that they called landlines and cell phones, so I know that there was not an age-related bias due to only calling landlines, and that's worth knowing.
So the moral of the story is, if you dig a little, you can get all of this stuff. You may need to go to the website, but you can do it.
Now, one last thing I want to get into here. I mentioned the 95% confidence level; I didn't see that in the report, but I'm going to assume it, because that really is pretty much the industry standard for all of this stuff. That means that, on average, one out of every 20 polls will be, to use the technical term, bat-crap crazy. That's why you should never assign too much significance to any one poll, particularly if it gives you results different from all other polls; you may well be looking at that one out of 20 that is just totally crazy.
Now, there's a human tendency to seize on such a poll if it tells you what you want to hear, but that is usually a mistake. It's when a number of pollsters do a number of polls and get roughly the same result that you should start to believe it.
That does not mean they will agree exactly; there is still the usual margin of error. That's why, if you see a poll that says candidate A 51% and her opponent 49%, and then they say it's a dead heat, you might ask: well, isn't one of them ahead? Well, margin of error. If you've got a 2% margin of error, candidate A could be getting 53% on one end or 49% on the other, assuming the poll is accurate and unbiased. So you need to get outside of the margin of error before you start believing a lead at all.
But as I said, if every other poll that's out there is showing something very different from what your poll shows, you may have that one out of 20. And the thing I want to emphasize here is that this happens not because the pollster made a mistake; it's because the nature of random sampling is such that every once in a while a random sample will just randomly come up with a very unrepresentative group. That's the nature of randomness. The mathematics of how we construct all of this says we can at least put boundaries around it and say how often it is going to be like that, and how different it can be; we can put numbers on that, but it's still probabilities in the final analysis.
So, one example: in the United States we had our presidential election last year, and there was a lot of discussion about all of this. The polls were basically showing Obama leading, not by a huge margin, but he was up by, let's say, five points on average in most of the polls. And a lot of people said, no, the polls were skewed, and what they were actually saying was that the sampling was biased. So you had all these people saying, ah, they're not getting as many Republicans in the survey as they should have, and if they correct for that, then... And what were they looking at? That likely voter screen was a big part of it. They'd say, well, how many people in the last election voted Republican, and were there as many Republicans in this sample as voted in the last election, and stuff like that. So all the things we've talked about were part of this discussion.
Now, the thing you need to bear in mind about all of that is that the polls were pretty much all saying the same thing. It turns out we've had reports since then that say the internal pollster for the Republican campaign was telling them the same thing we were seeing from all the other polls; they were just making up stuff because it made them feel better. That can happen, but in general, if a number of reliable pollsters are telling you the same story, you probably want to believe that story.
All right, occasionally a pollster will just have a really bad year, and usually what happens as a result is they're going to go back and say, okay, where did we go wrong, because our numbers were not matching. And of course, you always do get the actual result of the election, and the actual result of the election was almost bang on what the polls said it was going to be. So they really were accurate, particularly if you averaged out all of these polls.
So maybe this gives you a little bit of understanding of how this stuff works and how to interpret it. And so this is Ahuka, for Hacker Public Radio, reminding everyone: support free software. Thank you.
You have been listening to Hacker Public Radio. We are a community podcast network that releases shows every weekday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever considered recording a podcast, then visit our website to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the Binary Revolution at binrev.com. All BinRev projects are sponsored by Lunar Pages: from shared hosting to custom private clouds, go to lunarpages.com for all your hosting needs. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike license.