[Music] Hello, this is Ahuka, and welcome to Hacker Public Radio for another exciting episode. This one is a little bit of a change of pace, but it's something I got some inquiries about, and I figured, what the heck, let's do this. We have a fellow, Charles in New Jersey, who's been doing a series on mathematics. This is sort of related, but I'm going to do it without actually doing very much in the way of math at all. I'm going to talk about polling, particularly political polling, and the statistical background you need just to understand what's going on, because I've noticed that a lot of people really don't have a very good handle on how to interpret this stuff. You see poll results thrown around all the time, but are they meaningful? What should I be looking for? So I'm going to try and address this.

Now, you might wonder, gee, what are your qualifications for doing that? Well, first, I was at one point a professor who taught classes in statistics at the university level, so I've got a pretty good handle on the mathematics involved in all of this. Again, I'm not really going to get into that, but I have done the math. Also, I have worked for a political consulting company, and the company I worked for did do polling for clients, so I have some exposure to what it's actually like to do political polling. So on that basis, now you understand what makes me think I have some valid grounds for offering an opinion. You can decide whether or not you want to listen to it.

To get started, the basic question is one of epistemology; everything comes back to that. How do we know the things that we say we know? Always a very good question. In the case of statistics, the mathematics started to be worked out as a way of analyzing gambling. When you play poker, you're told a hand with three of a kind beats a hand with two pair. Why is that? Well, two pair shows up 4.75% of the time, which makes it more than twice as common as three of a kind, which shows up 2.11% of the time. The less common the hand, the higher its value, so it beats the hands that are more common.

Everything starts with that, but another big jump in the development of statistics came during the Napoleonic Wars. For the first time, large armies were involved, and the casualties were substantial. Some of the doctors involved started to realize that maybe they should gather evidence about these wounds and investigate which treatments actually worked, and so they started to develop biostatistics, the medical branch of the field, which expanded the universe a little bit more.

The thing you need to bear in mind about all of this is that it is based on probability. For a lot of people, that's hard to wrap their minds around, because we tend to like things that are black and white: is this true or not? Some questions can be answered that way, and in fact that's one of the reasons I have argued that statistics and mathematics are not very closely related. In mathematics, generally speaking, you get real, definitive answers. In statistics you don't; you get probabilities, and that can drive people nuts.
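Those poker percentages are easy to check for yourself. Here is a minimal Python sketch (my own illustration, not anything from the episode) that counts the hands using the standard combinatoric formulas:

```python
from math import comb

total_hands = comb(52, 5)  # all possible 5-card hands from a 52-card deck

# Two pair: pick 2 ranks for the pairs, 2 of 4 suits for each pair,
# then a kicker from one of the 11 remaining ranks in any of 4 suits.
two_pair = comb(13, 2) * comb(4, 2) ** 2 * 11 * 4

# Three of a kind: one rank contributes 3 of its 4 suits, plus two
# kickers of distinct other ranks, each in any of 4 suits.
three_of_a_kind = 13 * comb(4, 3) * comb(12, 2) * 4 * 4

print(f"two pair:        {two_pair / total_hands:.2%}")
print(f"three of a kind: {three_of_a_kind / total_hands:.2%}")
```

Run it and you get 4.75% and 2.11%, exactly the figures quoted above, which is why three of a kind outranks two pair.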
Even Albert Einstein, who was a fairly smart guy according to everything I've been able to read about him, had problems with this. He was one of the people who developed quantum mechanics and discovered that everything is based on probabilities, and that bothered him so much that he started looking for any kind of way to get rid of the statistical probabilities involved. He famously said, "God does not play dice with the universe," and the physicist from Denmark, Niels Bohr, replied, "Albert, stop telling God what to do." It turns out that God does play dice with the universe, or, to put it another way if you don't like theology: the universe is based on probabilities. It just is; it's one of those facts.

Now, how do we think about probability? I think a good way is to ask: what would happen if you did the same thing over and over? You would get a range of outcomes, but some outcomes would show up more often, and that is the essence of how we understand probability: which outcomes show up most often? One thing that throws a lot of people, because they're not used to thinking this way, is: what if something is very unlikely, with a very low probability? Does that mean you'll never see it? No. You will see it, or someone will, a certain percentage of the time. Unlikely things do happen; they just don't happen as often. One of the things I like to say to illustrate this is a kind of joke: if you are one in a million, then there are 1,500 people in China exactly like you. Do the math; it works out that way. Within probably another couple of decades we'll be able to say there are 1,500 people in India exactly like you; I think India is projected at this point to overtake China as the country with the largest population somewhere around 2040, but that's just a projection.

So that's how probabilities work, and it leads to one of the techniques we use to develop an idea of how these things behave. It's called a Monte Carlo simulation. Monte Carlo is, of course, a very famous casino in Europe where wealthy people in tuxedos go to wager, and in statistics a Monte Carlo simulation is an experiment that you run over and over and over, generally with a computer algorithm that generates random data you can use to test your theories. The famous mathematician John von Neumann understood this very well and programmed one of the very first computers to carry out Monte Carlo simulations.

Another concept I want to bring in is the Law of Large Numbers, which in layman's terms says that if you repeat an experiment many times, the average result should approach the expected result. It's an average we're talking about; any particular experiment could give weird results that are nothing like the expected result, and that is to be expected in a distribution. But when you average over a whole range of experiments, the occasional high ones are offset by the occasional low ones, and the average result is pretty good. To get this you may need to repeat the experiment many, many times, and the more times you repeat it, the closer your averaged results should be to the expected value.

Our third key concept is random sampling. Random sampling means that every member of a population has an equal chance of being selected for the sample.
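Before moving on, here is what the Monte Carlo idea and the Law of Large Numbers look like together in a few lines of Python (a sketch of my own, not from the episode): roll a fair die repeatedly and watch the average close in on the expected value of 3.5.

```python
import random

random.seed(42)  # fix the seed so the run is reproducible

EXPECTED = 3.5  # expected value of one fair die roll: (1+2+...+6)/6

for n in (10, 100, 10_000, 1_000_000):
    rolls = (random.randint(1, 6) for _ in range(n))
    average = sum(rolls) / n
    print(f"{n:>9,} rolls: average {average:.4f} "
          f"(off by {abs(average - EXPECTED):.4f})")
```

Any single small run can wander well away from 3.5, but the error shrinks steadily as the number of rolls grows. The repeated computer experiment is the Monte Carlo simulation; the shrinking error is the Law of Large Numbers.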
Now, population: what does that mean? The population is whatever group you want to make a claim about. If you are investigating the properties of a particular group of things or people or what have you, that particular group is your population, and your claim is about that group. If you want to make a claim about left-handed Mormons, your sample should exclude anyone who is right-handed or anyone who is a Lutheran, but it should give an equal chance of selection to every left-handed Mormon.

This is where a lot of problems can arise. For example, many medical studies in the 20th century included all, or mostly all, men, but the results were applied to all adults. Can you tell who got left out? Yes: women. Are women and men identical medically? Not necessarily. I have it on good authority that there are some differences in the endocrine system, hormones, and things like that which could have an effect on the results. Fortunately, at some point researchers started to realize that, and a lot of studies are now done in a better manner. Similarly, you need to be careful assuming that something that works in adults is going to work in children, or vice versa; if a group is not part of your sample, you cannot make claims about it. If you select a sample that doesn't really represent the population you're talking about, that's called sampling bias, and it can be a big problem.

So we've got some basic concepts, and notice I haven't had to do any real math at this point. Now we can start looking at polling and just how good it is, or isn't, as the case may be. It is often very good, but history does show some big blunders along the way. To understand how this works, the first thing we need to get out of the way is that sampling, if it is done properly, does work. This is a mathematical fact and has been demonstrated many times over. You may have trouble believing that a thousand people are an accurate measure of what a million people, or even a hundred million people, will do, but in fact it does work. When there are problems, it is usually because someone made a mistake, such as drawing a sample that is not truly an unbiased sample from the population in question. This does happen, and you need to be careful about it when examining polling results.

In the earlier part of the 20th century, some polls were done via telephone surveys, but because telephones were not universally available at that time, these polls oversampled people who were wealthier, and those people may have tended to vote for a particular candidate or party more than the population at large. There were some fairly notable examples of polls that went awry that way. By the latter part of the 20th century, telephone surveys were considered perfectly valid, because a point had been reached where just about everyone had a phone, and frankly the very few who didn't were very unlikely to be voters anyway. Polls done that way were fine until recently. What happened recently is that it turned out the way they were doing it was to call landlines only.
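Here is a toy simulation of exactly that kind of frame problem (all numbers invented purely for illustration): a population split between two groups with different opinions, sampled once from the full population and once from a frame that misses one group, the way a landline-only poll misses cell-phone-only voters.

```python
import random

random.seed(1)

# Invented population: 500,000 landline households where support runs 60%,
# and 500,000 cell-only households where support runs 40%.
landline = [1] * 300_000 + [0] * 200_000
cell_only = [1] * 200_000 + [0] * 300_000
population = landline + cell_only

fair_sample = random.sample(population, 1000)  # every member reachable
biased_sample = random.sample(landline, 1000)  # frame misses cell-only

print(f"true support:         {sum(population) / len(population):.1%}")
print(f"fair random sample:   {sum(fair_sample) / 1000:.1%}")   # near 50%
print(f"landline-only sample: {sum(biased_sample) / 1000:.1%}") # near 60%
```

The landline-only estimate isn't wrong because the math failed; it's wrong because the frame never gave half the population any chance of being selected.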
I can tell you how this is done in a lot of cases. You can draw a random sample by going through the telephone book: take, say, every fourth page, look up a random number in a table of random numbers, count down that many entries on the page, and whoever that is becomes one of the people you're going to call. That is a perfectly valid way of getting a random sample, but the telephone book only has people with landlines. What has happened in the last ten years or so is that a lot of people, and I am one of them, have gone to using mobile phones exclusively, and that means polls stopped being valid if they were done with landlines only. The polling companies started to realize there was a problem and began making adjustments, so if you see a poll done now, chances are they will have made an effort to get a representative number of cell-phone respondents in there. Why would it be a problem otherwise? Well, if you did an analysis of this, I think you would find that the cell-phone-only group is on average younger, and younger people may have different political views than older people; in some respects they do. That's pretty much a known phenomenon, so it's important to take it into account.

Other things to watch out for: will pollsters limit the sample in a given way? A big issue is whether you should include all registered voters. Now, in the United States you need to be registered before you can vote, and I'm just going to say I'm not familiar with how other countries handle this. You could go with all registered voters, or you could limit it to what are called likely voters. This is where it gets very dicey, because deciding who is a likely voter is pretty much a judgment call by the pollster, and bias can creep in. One of the things you find if you study political polls is that certain companies have what we refer to as a house bias, or house effect. Some companies tend to report results that are more favorable to Republicans, and others report results more favorable to Democrats (those being the two major political parties in the United States). That's one of the things you need to take into account, and some of it is going to come from their likely-voter screen, as it's referred to; that's a place where the numbers can get biased a little.

So how do we know that samples actually work, now that I've explained everything that's involved? We have two strong pieces of evidence. First, we know from Monte Carlo simulations how well samples compare to the underlying populations in controlled experiments: you create a population with known parameters, pull a bunch of samples, and see how well they match up with the known population. We have some really pretty good results on all of that. Second, we have the results of many surveys, and in political polling there is always the acid test: what happens when the election is held? You get the definitive result, and either your surveys match up with what actually happened or they don't. Generally speaking, polls done by reputable pollsters usually do match up pretty well with the election. Occasionally you'll find someone who is consistently biased a certain way, but you can take that into account: if a particular pollster always gives Republicans numbers that are three points higher than what actually happens in the election, and you know that, you just subtract three points from their result, and at least in that case it's still a fairly accurate guide once you adjust for it.
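That first piece of evidence, the controlled Monte Carlo experiment, fits in a few lines of Python (my own minimal version, assuming a made-up true support rate of 52%): create a population where the answer is known, poll it repeatedly with samples of 1,000, and see how the estimates cluster.

```python
import random

random.seed(7)

TRUE_P = 0.52   # the known population parameter (invented for the demo)
N = 1000        # interviews per simulated poll
POLLS = 2000    # how many polls we simulate

estimates = []
for _ in range(POLLS):
    hits = sum(random.random() < TRUE_P for _ in range(N))
    estimates.append(hits / N)

mean_estimate = sum(estimates) / POLLS
close = sum(abs(e - TRUE_P) <= 0.03 for e in estimates) / POLLS
print(f"average over {POLLS} polls: {mean_estimate:.3f} (truth: {TRUE_P})")
print(f"polls within 3 points of the truth: {close:.1%}")
```

On a typical run the average estimate sits within a fraction of a point of 52%, and roughly 19 polls out of 20 land within three points of the truth, which is exactly the behavior the confidence-interval machinery below is built to describe.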
Now I'm going to introduce another concept. The confidence interval comes from the fact that even an unbiased sample will not match the population exactly. To see what I mean, consider what happens when you toss a fair, unbiased coin. If it is a truly fair coin, you should get heads 50% of the time on average, and tails 50% of the time, again on average. The key here is "on average." If you tossed this coin a hundred times, would you always get exactly 50 heads and exactly 50 tails? Of course not. You might get 48 heads and 52 tails the first time, 53 heads and 47 tails the second time, and so on; each time you get slightly different results. But if you did this a whole bunch of times and averaged your results, you would get ever closer to that 50/50 split, though you would probably never hit it exactly. What this means is that your results will be close to what is in the population most of the time, but terms like "close" and "most of the time" are very imprecise. How close, and how often, should really be specified more precisely, and we can do that with the confidence interval.

This starts with the "how often" question, and the standard is usually 95% of the time; this is called a 95% confidence interval. Sometimes the complement of 95 is used, and you'll see it referred to as accurate at the .05 level. For our purposes these are essentially the same thing, and if you're a real statistician and you think I'm doing violence to these concepts, please remember this is not a graduate-level statistics course; it's just a podcast for the intelligent layperson who wants to understand polling. This 95% level of confidence is somewhat arbitrary, and in some scientific applications it is raised or lowered, but in polling you can think of it as the best-practice industry standard. So what does it mean? If I did this poll this way, then 95 times out of a hundred my results should be pretty close to the actual figure in the population. That's also 19 times out of 20, so we're aiming for something that will be correct 19 times out of 20. That's the "most of the time" part. Now the other part: how close? This is not at all arbitrary. It's called the margin of error, and once you've chosen the level of confidence, it's a pretty straightforward function of the sample size.
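The coin-toss thought experiment is easy to run for real. This sketch (again my own, with an arbitrary seed) tosses a fair coin 100 times, repeats that whole experiment 10,000 times, and tallies how the head-counts spread out:

```python
import random

random.seed(3)

TOSSES = 100
REPEATS = 10_000

head_counts = [sum(random.random() < 0.5 for _ in range(TOSSES))
               for _ in range(REPEATS)]

exactly_50 = sum(h == 50 for h in head_counts) / REPEATS
within_10 = sum(40 <= h <= 60 for h in head_counts) / REPEATS
print(f"runs with exactly 50 heads: {exactly_50:.1%}")  # only about 8%
print(f"runs with 40-60 heads:      {within_10:.1%}")   # about 95% or more
```

Exactly 50/50 shows up in only about 8% of runs, even though it is the single most likely outcome. What is stable is the band: roughly 95% of runs land within about ten heads of the expected 50, and that band is what a confidence interval describes.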
To see the sample-size effect: if you toss a coin 10 times, getting six heads and four tails is very likely, but if you toss it a hundred times, getting 60 heads and 40 tails is much less likely. In other words, the bigger the sample, the closer it should match the population. You might think, therefore, that pollsters should just use very large sample sizes to get better accuracy, but you run into a problem: sampling costs money. If you do a poll (and as I say, I was in this business at one point), you have to hire people to do the interviews, you have to get telephone lines, and those cost money. You can work it out so that, on average, every interview, once you take into account the pay for the person doing the interview, the telephone lines, overhead, and what have you, costs you $10 or whatever. Double the number of interviews and you double the cost of the survey. If that bought you double the accuracy, it might be worth it, but in fact it doesn't, because the increase in accuracy tails off very quickly. Doubling the sample size might get you 10% more accuracy in your results; doubling it again might get you 5% more. Is that worth spending two or four or eight times the money? Generally not. What you're looking for is a sweet spot where the cost of the survey is not too much but the accuracy is acceptable, and that's why, for a survey of a large population, you tend to see sample sizes anywhere from 1,000 to 3,000; the sweet spot is going to be somewhere in that range.

Now, any reputable poll should make available some basic information. Here are some of the facts that should be reported. First, when the poll was taken; timing can mean a lot. There's a joke that the only thing that would sink a candidate in certain places is being caught having sex with a live man or a dead woman. Suppose a candidate did have something terrible revealed: I'd like to know whether it was revealed before the poll was taken or after, because that could make a big difference in the results. How big was the sample? That should get reported. What kinds of people were sampled, and was there an attempt to limit it to likely voters? What is the margin of error? There's going to be one in there somewhere, so I want to know what it is. What is the confidence interval? Generally speaking, a reputable pollster will make all of that available. However, that doesn't mean a television, newspaper, or magazine report is going to give you all that information. Usually they don't, because they think no one cares about that stuff, or if they do give any of it, it gets buried in a footnote somewhere. Television in particular does a terrible job with this, but then television does a terrible job with most things. All of these are factors that affect how you interpret what you see.

So I did a quick look-up, and I will put the link to this story in my show notes. This was a story from a political news site called Politico; they tend to lean somewhat conservative. They report two polls on something called Obamacare, which is a major healthcare initiative here in the United States, and as I'm recording this in the first half of December 2013, these polls were just done: one of them finished December 8th and the other finished December 9th, so this is very current stuff. So what does Politico say? "The Pew survey of 2,001 adults was conducted December 3rd to December 8th and has a margin of error of plus or minus 2.6 percentage points."
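That "plus or minus 2.6" is worth pausing on, because you can see where such figures come from. For a simple random sample, the 95% margin of error is at most about 1.96 times the square root of 0.25/n. Here is a sketch of that formula (the formula itself is standard; treating these polls as simple random samples is my simplifying assumption, since real pollsters weight their data, which typically nudges the reported margin up a bit):

```python
from math import sqrt

Z = 1.96  # the multiplier that corresponds to 95% confidence

def margin_of_error(n: int, p: float = 0.5) -> float:
    """Margin of error in percentage points for a simple random sample
    of size n; p = 0.5 is the worst case, so it's the usual quote."""
    return Z * sqrt(p * (1 - p) / n) * 100

for n in (500, 1000, 2000, 2001, 2692, 4000, 8000):
    print(f"n = {n:>5}: +/- {margin_of_error(n):.1f} points")
```

Two things jump out. Going from 1,000 to 2,000 interviews only shrinks the margin from about ±3.1 to about ±2.2 points, and halving the margin takes four times the interviews: that is the diminishing return behind the 1,000-to-3,000 sweet spot. Also, n = 2,692 gives about ±1.9 points, matching the second poll quoted below, while Pew's reported ±2.6 on 2,001 adults is somewhat larger than the naive ±2.2, presumably reflecting design-effect adjustments of the kind just mentioned.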
That quote gets at a lot of the stuff we were talking about. When was the poll taken? The interviews were done December 3rd to December 8th, so I can ask whether any big news event happened before or after that window that I would want to take into account. How big was the sample? It says 2,001. What kinds of people were sampled? It says adults. Was there an attempt to limit it to likely voters? No, I don't think so. What is the margin of error? Plus or minus 2.6 percentage points. What is the confidence interval? That I do not see here, but I could probably go to the pollster's website and find it.

What about the other poll? Politico says, "The Quinnipiac survey of 2,692 voters was conducted from December 3rd to December 9th and has a margin of error of plus or minus 1.9 percentage points." Very similar information. This is Politico deciding what to report on each of these, and they reported them equivalently; good for them. What are the differences? Well, the first poll, the Pew survey, was a poll of adults, while the Quinnipiac survey was a survey of voters. That could make a big difference, and in fact the polls did have somewhat different results: they were sampling different populations, so the results are not really comparable. At this point you have to ask what the purpose of the survey was. If the purpose is to look at how people in general feel about this, a survey of adults probably makes pretty good sense. If the purpose is to forecast how this will affect candidates in the 2014 elections, the second poll, the survey of voters, might be more relevant. You see, you need to pay attention to these things to interpret what's going on. Notice also that the second poll had a slightly larger sample size, 2,692 versus 2,001, and a smaller margin of error, plus or minus 1.9 points compared to plus or minus 2.6. That's exactly what we should expect to see; remember, the larger the sample size, the smaller the margin of error should be.
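As an aside: when two polls do sample the same population and give different toplines, there is a rough way to judge whether they genuinely disagree. This sketch (my own back-of-envelope helper, with invented example numbers; combining margins by root-sum-of-squares assumes two independent simple random samples) checks whether the gap exceeds the combined margin of error:

```python
from math import sqrt

def polls_disagree(p1: float, moe1: float, p2: float, moe2: float) -> bool:
    """Rough check: two poll estimates of the same quantity genuinely
    differ only if the gap exceeds their combined margin of error."""
    combined_moe = sqrt(moe1 ** 2 + moe2 ** 2)
    return abs(p1 - p2) > combined_moe

# Hypothetical toplines: 41% support in one poll, 45% in the other.
print(polls_disagree(41, 2.6, 45, 1.9))  # True: gap 4.0 > combined 3.2
print(polls_disagree(43, 2.6, 45, 1.9))  # False: gap 2.0 is within noise
```

With the Pew and Quinnipiac numbers above, though, no such comparison is legitimate in the first place, because adults and voters are different populations.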
Third, I note this (pollsters use the term "in the field"): the second poll was in the field one day longer than the first. They both started on December 3rd, but Pew did their last interview on December 8th and Quinnipiac did theirs on December 9th. One day may not matter, but if I'm doing political polling I'd ask: did anything happen on December 9th that would affect this? A very significant news event on December 9th could have affected the results.

Now, I don't see anything in the Politico story about how the people were contacted, but I went to the Quinnipiac website and got their analysis, and I'll give a brief quote: from December 3rd to 9th, Quinnipiac University surveyed 2,692 registered voters nationwide, with a margin of error of plus or minus 1.9 percentage points, and "live interviewers call land lines and cell phones." So the first thing I now know is that it was registered voters, as opposed to likely voters; that could be significant. And that last part, to me, is very significant in two ways. First, live interviewers. There are some polls that are what we call robo-polls: a completely automated system just starts calling numbers and asks people to punch things into their phones in response to pre-recorded questions. We know those have different results from polls with live interviewers. Part of that is what we call self-selection bias: some people, when they hear a robotic thing, just hang up the phone because they don't want to be bothered, and when people self-select, that is a form of sampling bias. You're getting a survey that is representative of the people willing to put up with your poll, but does that mean they are representative of the population in general? Perhaps not. So live interviewers are considered the gold standard and generally much superior. They also cost more, and there are places that like to do daily polling on particular races of great significance; in order to do that, they use robo-calling, and that can be valuable, since it keeps down the cost so they can poll much more frequently, but you may need to make some adjustment to the results you get. The second thing I see is that they called land lines and cell phones, so I know there was not an age-related bias due to calling land lines only, and that's worth knowing. The moral of the story: if you dig a little, you can get all of this information. You may need to go to the pollster's website, but you can do it.

Now, one last thing I want to get into. I mentioned the 95% confidence level, and I didn't see it stated in these reports, but I'm going to assume it, because that really is pretty much the industry standard for all of this stuff. It means that, on average, one out of every 20 polls will be, to use the technical term, bat-crap crazy. That's why you should never assign too much significance to any one poll, particularly if it gives you results different from all the other polls; you may well be looking at that one out of 20 that is just totally crazy. There's a human tendency to seize on a poll that tells you what you want to hear, but that is usually a mistake. It's when a number of pollsters do a number of polls and get roughly the same result that you should start to believe it. That does not mean they will agree exactly; there is still the usual margin of error. That's why, if you see a poll that says candidate A has 51% and her opponent 49%, they'll say it's a dead heat. Isn't one of them ahead? Well, margin of error: with a 2% margin of error, candidate A could be at 53% on one end or 49% on the other, assuming the poll is accurate and unbiased. You need to get outside the margin of error before you start believing a lead at all. And as I said, if every other poll out there is showing something very different from what your poll shows, you may have that one out of 20. The thing I want to emphasize is that this happens not because the pollster made a mistake; the nature of random sampling is such that, every once in a while, a random sample will just randomly come up with a very unrepresentative group. That's the nature of randomness. The mathematics of how we construct all of this lets us at least put boundaries around it: how often will that happen, and how different can the result be? We can put numbers on that, but it's still probabilities in the final analysis.
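Here is the dead-heat logic as a tiny function (my own sketch; it just encodes the interval-overlap reading used above, treating each candidate's figure as good to within the stated margin):

```python
def race_is_a_dead_heat(a: float, b: float, moe: float) -> bool:
    """Treat each candidate's share as accurate to +/- moe points;
    the race is a statistical tie if the two intervals overlap,
    i.e. if the gap between the candidates is within 2 * moe."""
    return abs(a - b) <= 2 * moe

print(race_is_a_dead_heat(51, 49, 2.0))  # True: 49-53 overlaps 47-51
print(race_is_a_dead_heat(55, 45, 2.0))  # False: a real lead
```

A statistician would run a slightly different test for the difference between two proportions from the same poll, but as a rule of thumb for reading poll coverage, this interval-overlap reading matches how the margin of error is quoted.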
One more example: in the United States we had our presidential election last year, and there was a lot of discussion about all of this. The polls were basically showing Obama leading, not by a huge margin, but he was up by, let's say, five points on average in most of the polls, and a lot of people said the polls were skewed. What they were actually saying was that the sampling was biased. You had all these people saying the pollsters weren't getting as many Republicans in the survey as they should have, and that if they corrected for that, the picture would change. What were they looking at? That likely-voter screen was a big part of it: how many people in the last election voted Republican, and were there as many Republicans in this sample as voted in the last election, and so on. All the things we've talked about were part of that discussion. The thing you need to bear in mind is that the polls were pretty much all saying the same thing, and it turns out we've had reports since then saying that the internal pollsters for the Republican campaign were telling them the same thing we were seeing from all the other polls. The people crying "skewed" were just making up stuff because it made them feel better. That can happen, but in general, if a number of reliable pollsters are telling you the same story, you probably want to believe that story. Occasionally a pollster will just have a really bad year, and usually what happens as a result is that they go back and ask where they went wrong, because their numbers were not matching. And of course you always do get the actual result of the election, and the actual result of that election was almost bang on what the polls said it was going to be. They really were accurate, particularly if you averaged out all of the polls. So maybe this gives you a little bit of understanding of how this stuff works and how to interpret it. This is Ahuka, for Hacker Public Radio, reminding everyone to support free software. Thank you.

You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever considered recording a podcast, then visit our website to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the binary revolution at binrev.com; all binrev projects are crowd-sponsored by Lunar Pages. From shared hosting to custom private clouds, go to lunarpages.com for all your hosting needs. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike license.