SCI199Y: October 8, 1996
Required for next week
Some technicalities on polls
The margin of error
In many newspaper articles on polls, the result of the poll is accompanied by a statement something like "polls of this size are accurate to within 3 percentage points, 19 times out of 20". None of the polls at Politics Now on the internet seems to include this, but they do give the number of people polled for each poll, which is usually enough to work out the margin of error.
The margin of error is computed using the assumption that each person in the population is one of two types: they will vote for either Clinton or Dole, they will vote either Yes or No on the referendum, they are either pro-choice or pro-life, and so on. Let's call the two types Red and Green. The margin of error is computed as

$$\text{margin of error} = 2\sqrt{\frac{(\text{fraction Red})\,(1 - \text{fraction Red})}{n}},$$

where $n$ is the number of people in the sample.
In fact, we don't know the fraction Red (that's why we're conducting the poll), so it is usually replaced by one-half.
So now the margin of error is a simple function of the sample size $n$:

$$\text{margin of error} = 2\sqrt{\frac{(1/2)(1/2)}{n}} = \frac{1}{\sqrt{n}}.$$
Here is a little table:
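As a rough sketch, values of this kind can be computed directly from the rule that the margin of error is 1/sqrt(n) once the fraction Red is replaced by one-half. The sample sizes below are illustrative assumptions, not figures from any particular poll, and the function name `margin_of_error` is made up for this example.

```python
import math


def margin_of_error(n, p=0.5):
    """Approximate 19-times-out-of-20 margin of error for a sample of size n.

    Uses 2 * sqrt(p * (1 - p) / n); with the default p = 0.5 this is 1 / sqrt(n).
    """
    return 2 * math.sqrt(p * (1 - p) / n)


# Illustrative sample sizes (assumed, not taken from the notes).
sample_sizes = [100, 400, 1000, 1600, 2500]

if __name__ == "__main__":
    print("sample size   margin of error")
    for n in sample_sizes:
        print(f"{n:11d}   {100 * margin_of_error(n):5.1f} percentage points")
```

A sample of about 1,000 people gives a margin of roughly 3 percentage points, which matches the newspaper statement quoted earlier.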

Three (or more) candidates
The formula given above doesn't work for more than two candidates, and although it's possible to compute a margin of error for such polls, it's a bit trickier. Since Perot is a really distant third, the more complicated formula would not change the Clinton-Dole comparison by more than a tenth of a percentage point or so.
Problems with polls
Of course, people aren't really red and green balls, and their opinions can change due to a variety of factors. Pollsters try to poll 'likely voters', rather than 'registered voters', but how they determine this is not always clear. In the Quebec referendum last year, a significant number (10-15%) of the respondents to the polls said they were 'undecided'. When the 'yes' tally was prepared for the headline articles, these undecided voters were allocated to either Yes or No, according to a complicated formula based on their answers to other questions. Typically that meant that undecideds were split about 70-30 for No.
Often the results of many polls are plotted in a time series, so people can play 'spot-the-trend'. However, the margin of error applies to each poll separately, and not to the whole sequence of polls. The luck of the draw would be expected to lead to a few 'rogue' polls. What is perhaps more worrisome, from a social policy perspective, is that the polls themselves might influence the outcome of later polls, and in fact the election.
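The 'luck of the draw' can be illustrated with a small simulation. This is only a sketch under simple assumptions: opinion does not change at all, the true fraction Red is one-half, every poll is an independent random sample, and the margin of error is taken as 1/sqrt(n). The function name and the numbers of polls and respondents are invented for the example.

```python
import random


def count_rogue_polls(n_polls, sample_size, true_fraction=0.5, seed=0):
    """Simulate independent polls; count those that miss by more than the margin of error."""
    rng = random.Random(seed)
    margin = 1 / sample_size ** 0.5  # 19-times-out-of-20 margin, using one-half
    rogue = 0
    for _ in range(n_polls):
        # Each respondent is Red with probability true_fraction.
        reds = sum(rng.random() < true_fraction for _ in range(sample_size))
        if abs(reds / sample_size - true_fraction) > margin:
            rogue += 1
    return rogue


if __name__ == "__main__":
    n_polls = 40
    rogue = count_rogue_polls(n_polls, sample_size=1000)
    print(f"{rogue} of {n_polls} simulated polls missed by more than the margin of error")
```

With a 19-in-20 margin, roughly one poll in twenty is expected to miss even when nothing in the population is changing, so an isolated jump in a time series is weak evidence of a trend.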
In the Globe & Mail this week
Monty Hall again... This rather elegant solution was posted on the internet. I think it's okay...
There are two strategies:
1. You stay with your initial pick all the time.
2. You switch doors every time.
In case 1, the probability of winning is 1/3. In case 2, the probability of losing is 1/3, because you will lose only if you had picked the winning door first, so the probability of winning is 2/3.
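The argument can be checked with a short simulation. The sketch below assumes the usual rules, which the posted solution does not spell out: three doors, one prize, and a host who always opens a door that is neither the contestant's pick nor the prize, choosing at random when he has two such doors. The function names `play` and `win_rate` are made up for the example.

```python
import random


def play(switch, rng):
    """Play one game with three doors; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch, trials=20000, seed=0):
    """Fraction of simulated games won under the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

The two rates should come out near 1/3 for staying and near 2/3 for switching, up to simulation noise.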
Technical note: theory behind the margin of error
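If the true fraction Red is $p$, the fraction Red in a sample of size $n$ has standard deviation $\sqrt{p(1-p)/n}$. We don't know $p$, but use the sample estimate $\hat{p}$ in its place (or just use $p = 1/2$, which gives the largest possible margin of error for $0 < p < 1$). The interval

$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

catches the true proportion 95% of the time (19 times out of 20). The factor 2 and the 95% come from the 'bell curve', or normal distribution, used here as an approximation to the binomial distribution.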
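The link between the factor 2 and the 95% can be checked numerically with the standard normal distribution. This is a small sketch using Python's `statistics.NormalDist` (available from Python 3.8); the helper names `coverage` and `multiplier` are invented for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # bell curve with mean 0 and standard deviation 1


def coverage(k):
    """Probability that a normal quantity falls within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def multiplier(level):
    """Number of standard deviations needed for a two-sided interval with the given coverage."""
    return standard_normal.inv_cdf((1 + level) / 2)


if __name__ == "__main__":
    print(f"coverage with factor 2:   {coverage(2):.4f}")       # about 0.9545
    print(f"factor for exactly 95%:   {multiplier(0.95):.3f}")  # about 1.960
```

So the factor 2 is a rounded version of 1.96, and "19 times out of 20" is a rounded version of the coverage it gives.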