Required for next week: March 5, 1996
- Chapters 3 and 4 of Tainted Truth by Cynthia Crossen.
Come prepared to discuss these and ask questions about the
parts you didn't understand.
- Short Project 5.
Technical Note: multiple regression
In the cookie experiment, I did a simple linear regression of 'taste'
on 'price'; that is, I found the least squares line for a plot of
taste (on the $y$ axis) against price (on the $x$ axis). The line was
not a very good fit to the data, although it did at least have a positive
slope. The variable chosen for the $y$ axis is usually called the
dependent variable: it is the variable assumed to be influenced
by the so-called independent variable that is on the $x$ axis.
A very common generalization of simple linear regression, called multiple
regression, allows several independent variables, all thought to potentially
influence the dependent variable. An example mentioned in
Tainted Truth (p. 61) is the use of multiple regression in
the analysis of the association between coffee intake and, for example,
heart disease. The dependent variable would typically be the
probability of heart disease (more precisely, the log of the odds
of heart disease), and Crossen suggests several independent variables in
addition to coffee consumption, such as cigarette consumption, amount of
exercise, and fat consumption.
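For reference, the log of the odds can be written out as follows. This is a sketch assuming the natural logarithm (the usual convention; the base is not specified here) and writing $p$ for the probability of heart disease, a symbol introduced only for this illustration:

```latex
\[
\text{odds} = \frac{p}{1-p}, \qquad
\text{log odds} = \log\frac{p}{1-p}.
\]
% Example: p = 0.2 gives odds 0.2/0.8 = 0.25 and log odds \log 0.25 \approx -1.39.
```

The log odds can take any real value, whereas a probability must lie between 0 and 1, which makes the log odds the more convenient quantity to model with a linear equation.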
The standard expression for a multiple regression equation is
y = b0 + b1 x1 + b2 x2 + ... + bp xp
where x1, x2, ... , xp are the independent variables,
and b0, b1, b2, ... , bp are the coefficients of the
regression model (b0 is the intercept), which are estimated using least squares.
(In our cookie example, we had p=1, x1=price, and an
estimated value of b1 of 0.89.)
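A minimal sketch of how the coefficients might be estimated by least squares, using Python with NumPy. The data below are made up (they are not the cookie data), and the helper name fit_multiple_regression is hypothetical, chosen only for this illustration:

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Return least squares estimates [b0, b1, ..., bp] for y = b0 + b1 x1 + ... + bp xp."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the intercept b0 is estimated too.
    design = np.column_stack([np.ones(len(y)), X])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Made-up data with p = 2, generated exactly from y = 1 + 2*x1 - 3*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

b = fit_multiple_regression(X, y)
print(np.round(b, 3))  # estimates of b0, b1, b2
```

Because the made-up y values lie exactly on the plane, the estimates should reproduce the coefficients 1, 2 and -3 up to rounding error; with real data the fit would only be approximate.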
Each coefficient bi can be interpreted as a marginal rate:
on average y increases (decreases) by bi
for every unit increase (decrease) in xi, when
all other variables are held fixed. In our taste test,
the average taste rating went up by 0.89
for every 1 dollar increase in price per 100 grams.
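The marginal-rate interpretation can be checked directly from the equation. Writing $b_i$ and $x_i$ for bi and xi, compare the fitted value before and after a one-unit increase in $x_i$, with every other variable unchanged:

```latex
\begin{align*}
y(x_1,\dots,x_i+1,\dots,x_p) - y(x_1,\dots,x_i,\dots,x_p)
  &= b_i (x_i + 1) - b_i x_i \\
  &= b_i .
\end{align*}
```

All the other terms cancel because they are the same in both fitted values. As an arithmetic illustration with the cookie estimate of 0.89, two cookies whose prices differ by 50 cents per 100 grams would be predicted to differ by 0.89 x 0.5 = 0.445 in average taste rating.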
When several independent variables are used, the
hope is that the coefficient for the variable of interest, coffee
consumption, say, is an accurate measure of the effect of coffee consumption
that is not contaminated by other factors, such as smoking, because they
have already been 'controlled for' in the equation.
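The idea of 'controlling for' another factor can be illustrated with a small simulation. This is only a sketch on simulated data, not real health data: it assumes a continuous outcome fitted by ordinary least squares (rather than log odds), and it builds in the assumption that smoking influences both coffee drinking and the outcome while coffee has no effect at all. The helper name least_squares is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 5000

# Simulated (not real) data: smoking drives both coffee drinking and the
# outcome, while coffee itself has NO effect on the outcome.
smoking = rng.normal(size=n)
coffee = smoking + rng.normal(size=n)
outcome = 2.0 * smoking + rng.normal(size=n)

def least_squares(columns, y):
    """Least squares coefficients [b0, b1, ...] for y on the given columns."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return coef

simple = least_squares([coffee], outcome)             # coffee only
multiple = least_squares([coffee, smoking], outcome)  # coffee and smoking

print("coffee coefficient, coffee only:       ", round(simple[1], 2))
print("coffee coefficient, smoking controlled:", round(multiple[1], 2))
```

In this setup the coffee-only regression picks up a clearly positive coefficient that really belongs to smoking, while the coefficient for coffee in the two-variable regression should be close to zero. The correction only works for factors that are actually included in the equation.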
Of course, if the data don't follow a line, at least on average,
then linear regression doesn't make much sense. The same thing
is true for multiple regression: if the equation doesn't fit the data,
then the equation isn't telling you much. However, regression is
a pretty reliable and simple technique in a lot of cases, and there
are lots of data sets for which the model does fit reasonably well.
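One common numerical summary of how well a line fits is the fraction of the variation in y that the line explains, usually called R-squared. It is not discussed in these notes, so the following is offered only as a sketch, with made-up data and a hypothetical helper name:

```python
import numpy as np

def r_squared_of_line(x, y):
    """Fit the least squares line to (x, y); return the fraction of variation it explains."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 fit: [slope, intercept]
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)          # variation left over after the fit
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation in y
    return 1.0 - ss_res / ss_tot

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
r2_line = r_squared_of_line(x, 3.0 + 2.0 * x)  # data exactly on a line
r2_curve = r_squared_of_line(x, x ** 2)        # data on a symmetric curve
print(round(r2_line, 3), round(r2_curve, 3))
```

The first data set lies exactly on a line, so the line explains essentially all of the variation. The second follows a curve that the best-fitting line cannot track, so the line explains essentially none of it. A plot of the data remains the most direct check.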
In the Globe and Mail this week
- ``Leukemia lab finds selective cell-killer'', Feb. 15, A1\&10 (Wallace Immen).
A report in the Feb.~15 issue of Nature by Dr. C. Roifman of Toronto's
Hospital for Sick Children, with collaborators from the Hebrew University
in Jerusalem, describes the discovery of an enzyme that stops uncontrolled
growth of leukemia cells. It has proved successful in mouse experiments,
but is not yet ready for testing in humans.
- ``Cancer debate is revived'', Feb. 15, A10 (Reuters).
A study published by physicists at the University of Bristol (the article
does not say where the study was published) shows that electrical power
lines can attract radon gas, which has been linked to cancer. The article
mentions several earlier studies that indicate increased risk of cancer
associated with electromagnetic fields, but does not provide any detailed
references.
- ``Mint goes for broke'', Feb. 21, A3 (Canadian Press).
An article on the new two dollar coin, which has a gold-coloured core
and a silver outer ring. Apparently one coin broke in quite ordinary
use (it fell to the ground), and this has led to a spate of coin-breaking
games around the country. The communications director for the mint,
Diane Reardon, was quoted as saying ``It's one in 60 million''. The article
goes on to state that ``the coins were put through a quality control process,
but a random one, Ms. Reardon said. `When we mass produce, we don't test every one'.''
- ``Alzheimer's may begin early in life, study finds'', Feb. 21, A11 (Brenda
Coleman).
A study published in the Journal of the American Medical Association
has correlated death from Alzheimer's disease with low linguistic ability
in early adulthood. The researchers are studying a group of nuns
who have agreed to donate their brains to medical research at their death.
A summary of the article is available at
{\tt http://www.ama-assn.org/sci-pubs/journals/}.
- ``Mercury fillings defended as safe'', Feb. 21, A6 (Wallace Immen).
``A controversial report that suggested having more than four
dental fillings containing mercury poses potential health risks has
been rejected as unscientific. `The study was good, but the data
was soft', said Dr. Philip Newfeld, a researcher with the health protection
branch.'' I didn't find anything in the article to explain what Dr.
Newfeld meant by that.
- ``Cream reduces wrinkles in study'', Feb. 22, A6 (Wallace Immen).
A prescription cream called Renova was tested in clinical
trials and brought ``measurable improvement in 78 per cent
of patients, beginning after about four weeks''.
- ``Poll finds Quebeckers proud of Canada'', Feb. 24, A3 (Richard Mackie).
Just in case you thought Léger and Léger had run out of things to do.