Required for next week: February 13, 1996

Quebec referendum: addendum

On January 27, 1996, a story on the front page of the Globe and Mail said ``Most Quebeckers expect separation within 10 years''. The article summarized results from a poll conducted by Léger and Léger for the Journal de Montréal and the Globe and Mail. In the continuation of the article on p.A8, a graph plots the proportion of respondents supporting the ``yes'' and ``no'' sides in every poll taken by Léger and Léger since last July. We saw most of this graph before: the two new points are a post-referendum poll conducted in November 1995 and the poll discussed in this article.

The first paragraph says ``three in four Quebeckers believe the province will become a sovereign country some day and about 60 per cent expect that the change will occur within 10 years''. The poll was reported carefully, though, and on p.A8 we find ``When asked `Do you believe that Quebec will become a sovereign country within...', 2.9\% replied within 1 year, 20.2\% said within 2 or 3 years, 24.2\% answered within 4 or 5 years, 14.5\% predicted it would be within 6 to 10 years and 12\% thought it would take more than 10 years. Only 21.9\% thought Quebec would never become sovereign.''
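To see how the two headline figures follow from the detailed breakdown, the reported percentages can simply be added up. The short Python sketch below does this arithmetic; the variable names are illustrative only, and the leftover share is just what remains after the quoted categories (the quote does not say how those respondents answered).

```python
# Percentages quoted from the article for the question
# "Do you believe that Quebec will become a sovereign country within..."
responses = {
    "within 1 year": 2.9,
    "within 2 or 3 years": 20.2,
    "within 4 or 5 years": 24.2,
    "within 6 to 10 years": 14.5,
    "more than 10 years": 12.0,
    "never": 21.9,
}

# "About 60 per cent": everyone who expects sovereignty within 10 years.
within_10 = sum(
    pct for label, pct in responses.items()
    if label not in ("more than 10 years", "never")
)

# "Three in four": add those who expect it to take more than 10 years.
some_day = within_10 + responses["more than 10 years"]

# Share not covered by any quoted category (not described in the quote).
unaccounted = 100 - (some_day + responses["never"])

print(f"within 10 years: {within_10:.1f}")
print(f"some day:        {some_day:.1f}")
print(f"unaccounted:     {unaccounted:.1f}")
```

Both headline numbers are rounded summaries of the same single question, so how the time-horizon categories were offered to respondents matters for interpreting them.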

What do you think about the phrasing of the question?

A series of questions was asked about whether Quebeckers feel they get a good financial deal from the rest of the country. Here's an example: ``Does Quebec receive or not receive less than its share of federal government spending in the provinces?'' Result: 50\% said less; 37\% said not less.

(Statistics Canada data quoted in the same article indicate that Ottawa in fact raises \$4,107 per capita in Quebec and spends \$4,286 per capita there.)

Technical Note: Regression

In the simplest version of regression, we have just two variables, usually called $y$ and $x$. Examples are given in the Paulos handout of last week, and in the Chance article for this week. Sometimes this is called ``simple linear regression''.

The least squares line that is fitted to a graph of $y$ against $x$ is meant to summarize the relationship between $y$ and $x$. In particular, if this line has a slope of zero, then there is no linear association between the two variables. Since there is always some variability in the measurements, the fitted slope will almost never be exactly zero, so the statistical question of interest is whether or not the slope is ``significantly different from zero''. This is usually assessed by a hypothesis test, as described in the notes of November 21 and December 5. If the slope is significantly greater than (less than) zero, then this is evidence that an increase in the $x$ variable is associated with an increase (decrease) in the $y$ variable. (This is equivalent to the assertion that $x$ and $y$ are positively (negatively) correlated.)
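As a rough illustration of the slope test just described, the following Python sketch fits a least squares line and computes the usual $t$ statistic (slope divided by its standard error). The data are made up purely for illustration, the function name is hypothetical, and the test assumes the standard simple linear regression model with independent errors of constant variance.

```python
import math

def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, se_slope, t), where t = slope / se_slope
    is the statistic for testing whether the slope is zero.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Residual sum of squares; n - 2 degrees of freedom remain after
    # estimating the slope and the intercept.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se_slope, slope / se_slope

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]

slope, intercept, se_slope, t = simple_linear_regression(x, y)

# With n = 8 there are 6 degrees of freedom; the two-sided 5% critical
# value of the t distribution with 6 degrees of freedom is 2.447.
significant = abs(t) > 2.447
print(f"slope = {slope:.3f}, se = {se_slope:.3f}, t = {t:.1f}")
print("significantly different from zero at the 5% level:", significant)
```

A slope that is large relative to its standard error gives a large $|t|$, which is what ``significantly different from zero'' means in practice.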

This type of analysis can be generalized in several ways. In the marathon runners example, least squares was used to fit a cubic polynomial instead of a straight line. There was a reference to the cubic coefficient being ``marginally significant'' for the Osaka races, meaning that the estimate for the cubic term was only just distinguishable from zero once its variability is taken into account. A second very common generalization is to use least squares to summarize the relationship between a single $y$ variable, such as health status, and several different $x$ variables, such as treatment, age, sex, prior history of disease, and so on. This usually goes under the name ``multiple regression'', because there are multiple $x$ variables that are potentially associated with the variable $y$. Although it is hard to draw a picture of a multiple regression, it is basically a fairly straightforward extension of simple linear regression. There are a number of important technical aspects to both these generalizations, though: there is a third-year half course (STA302) on regression.
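Both generalizations use the same least squares machinery: only the columns of $x$ values change. The Python sketch below is one minimal way to show this, solving the normal equations directly. The function name and all the data are hypothetical, and the $y$ values are generated without noise so that the fit simply recovers the coefficients used to create them.

```python
def least_squares(X, y):
    """Least squares coefficients b solving the normal equations (X'X) b = X'y.

    X is a list of rows, one per observation; each row holds the values
    of the explanatory columns (including a 1 for the intercept).
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        rest = sum(A[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (A[i][p] - rest) / A[i][i]
    return b

# Cubic polynomial: columns are 1, x, x^2, x^3 (made-up, noise-free data).
x = [-3, -2, -1, 0, 1, 2, 3]
y_cubic = [1 + 2 * xi - 0.5 * xi**2 + 0.1 * xi**3 for xi in x]
cubic_coef = least_squares([[1, xi, xi**2, xi**3] for xi in x], y_cubic)

# Multiple regression: columns are 1, age, treatment (0 or 1); again made up.
age = [30, 40, 50, 60, 35, 45]
treatment = [0, 1, 0, 1, 1, 0]
y_health = [10 + 0.5 * a - 3 * tr for a, tr in zip(age, treatment)]
multi_coef = least_squares(
    [[1, a, tr] for a, tr in zip(age, treatment)], y_health
)

print("cubic coefficients:   ", [round(c, 6) for c in cubic_coef])
print("multiple regression:  ", [round(c, 6) for c in multi_coef])
```

With real, noisy data each coefficient would also come with a standard error, and the question of whether a term such as the cubic coefficient is significant is answered in the same way as for the slope of a straight line.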

In the Globe and Mail this week