**1.** Answer this letter: "Dear Abby: My husband and I just had our eighth child. Another girl, and I am really one disappointed woman. I suppose I should thank God she was healthy, but, Abby, this one was supposed to have been a boy. Even the doctor told me that the law of averages were in our favor 100 to 1."

**2.** In 1942, a seed company sold 20 bags of rye seeds with labels indicating a 90 percent germination rate. The War Food Administration took a sample of seeds from these bags and found that only 240 of the 400 seeds that they tested germinated. [E. K. Hardison Seed v. Jones, 129 F.2d 252 (6th Cir. 1945).] Use a normal approximation to calculate a two-sided P value for testing the null hypothesis that each seed has a 0.90 probability of germinating.
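A minimal sketch of how the normal approximation in problem 2 might be set up, using only the Python standard library. The variable names are illustrative, no continuity correction is applied, and the seeds are assumed to germinate independently.

```python
from math import erfc, sqrt

n = 400            # seeds tested
germinated = 240   # seeds that germinated
p0 = 0.90          # germination rate under the null hypothesis

p_hat = germinated / n              # sample proportion
se = sqrt(p0 * (1 - p0) / n)        # standard error under the null hypothesis
z = (p_hat - p0) / se               # standardized distance from p0

# Two-sided tail area of the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
# erfc is used instead of 1 - cdf so that very small tail areas do not round to 0.
p_value = erfc(abs(z) / sqrt(2))

print(f"z = {z:.2f}, two-sided P value = {p_value:.3g}")
```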

**3.** After observing that a tennis player had won two of eight tournaments
played on grass, a researcher assumed that this player had a p = 0.25 probability
of winning a tournament played on grass and calculated the probability of this
player winning at least one of the next five tournaments played on grass to
be

Assuming the binomial model to be appropriate with a p = 0.25 probability of winning, what is wrong with this calculation?

**4.** Explain how these graphs give the misleading impression that the
improvements in math and reading scores were quite similar over this four-year
period:

[adapted from Jay P. Greene and Paul E. Peterson, “School Choice Data Rescued From Bad Science,” The Wall Street Journal, August 14, 1996.]

**5.** Defendants awaiting trial are often allowed to leave jail if they
leave a cash amount (a "bond") that is forfeited if they do not return for their
trial. Bail bondsmen will put up the requisite cash in return for a payment
from the defendant. For example, a bail bondsman might put up a $10,000 bond
after the defendant pays the bondsman $1,000, which the bondsman keeps whether
or not the defendant returns for trial. In this example, for what values of
P, the probability that the defendant will disappear (causing the bondsman to
lose the $10,000 bond), is the expected value of the bondsman's profit greater
than 0?
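One way to explore problem 5 numerically is to write the bondsman's expected profit as a function of P. This is only a sketch: it assumes the $1,000 fee and the $10,000 bond are the only cash flows, and the function name `expected_profit` is illustrative.

```python
fee = 1_000     # payment the bondsman keeps in every case
bond = 10_000   # amount lost if the defendant disappears

def expected_profit(p_disappear: float) -> float:
    """Expected profit: the fee minus the expected loss of the bond."""
    return fee - p_disappear * bond

# The expected profit is zero where fee = P * bond.
break_even = fee / bond

for p in (0.05, 0.10, 0.15):
    print(f"P = {p:.2f}: expected profit = {expected_profit(p):,.0f}")
```

Trying values of P on either side of `break_even` shows where the sign of the expected profit changes.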

**6.** Answer this letter [Marilyn vos Savant, "Ask Marilyn," Parade Magazine, November 16, 1997.]: "If I repeatedly flip a pair of coins until at least one of them lands heads, what are the chances that the other coin also has landed heads?" --Jim Sandy, Crofton, Md.
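The situation in problem 6 can be checked both by listing the equally likely outcomes and by simulation. The sketch below assumes fair, independent coins; the helper name `simulate` and the seed are illustrative choices.

```python
from itertools import product
import random

# Enumerate the equally likely pairs that contain at least one head.
outcomes = [pair for pair in product("HT", repeat=2) if "H" in pair]
exact = sum(pair == ("H", "H") for pair in outcomes) / len(outcomes)

def simulate(trials: int, seed: int = 0) -> float:
    """Flip two fair coins until at least one is heads; return the share of trials with two heads."""
    rng = random.Random(seed)
    both = 0
    for _ in range(trials):
        while True:
            a = rng.random() < 0.5   # True means heads
            b = rng.random() < 0.5
            if a or b:
                break
        both += a and b
    return both / trials

print(f"enumeration: {exact:.4f}, simulation: {simulate(200_000):.4f}")
```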

**7.** A 1976 study of 225 female and 100 male residents of a Florida retirement community (all of whom were over the age of 65) obtained the data below on cholesterol level. The average ages were 73.7 for females and 74.1 for males, each with a standard deviation of 5.3 years. If the cholesterol level of elderly U.S. women is normally distributed with a mean of 228.9 and a standard deviation of 37.1, what is the probability that a randomly selected woman from this population has a cholesterol level above 250 mg/dl? If the cholesterol level of elderly U.S. men is normally distributed with a mean of 202.4 and a standard deviation of 36.1, what is the probability that a randomly selected man from this population has a cholesterol level above 250 mg/dl?

| | Females | Males |
|---|---|---|
| Mean cholesterol (mg/dl) | 228.9 | 202.4 |
| Standard deviation | 37.1 | 36.1 |

[Craig J. Newschaffer, Trudy L. Bush, and William E. Hale, “Aging and Total Cholesterol Levels: Cohort, Period, and Survivorship Effects,” American Journal of Epidemiology, 136, No. 1, 1992, pp. 23-31.]
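The two tail probabilities in problem 7 can be computed with the standard library's `statistics.NormalDist`. This is a sketch under the problem's own assumption that cholesterol levels are normally distributed with the stated means and standard deviations; the variable names are illustrative.

```python
from statistics import NormalDist

threshold = 250  # mg/dl

women = NormalDist(mu=228.9, sigma=37.1)
men = NormalDist(mu=202.4, sigma=36.1)

# P(X > 250) is one minus the cumulative probability at 250.
p_women = 1 - women.cdf(threshold)
p_men = 1 - men.cdf(threshold)

print(f"women: z = {(threshold - women.mean) / women.stdev:.2f}, P = {p_women:.4f}")
print(f"men:   z = {(threshold - men.mean) / men.stdev:.2f}, P = {p_men:.4f}")
```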

**8.** Twenty-five male college students and 25 female college students
were each asked to name their favorite singer; 21 of the males named a male
singer and 16 of the females named a male singer. Explain the error in this
test of the null hypothesis that male and female students are equally likely
to name a male singer: "The pooled proportion is (21 + 16)/(25 + 25) = 0.74.
We tested the null hypothesis p = 0.74 with the following z statistic:"

**9.** A statistics midterm and final examination are given to 1000 students,
from which a random sample of 50 students is selected. Explain carefully why
an ANOVA F test might show statistical significance at the 1 percent level,
while the simple regression model does not show statistical significance even
at the 5 percent level.

**10.** In order to see whether women are more successful at single-sex
or coeducational colleges, samples of women attending a women's college and
a coeducational college were asked, "Do you feel you are successful at your
college?" The results were as follows:

| | Women's College | Coeducational College |
|---|---|---|
| Yes | 37 | 30 |
| No | 15 | 13 |

The researcher explained that, “A chi-square test was used to examine the data to see if they are statistically significant. We will do this by assuming that a positive answer is a valid determinate of true success, so that the null hypothesis is that the probability equals 0.5, because there is a 50-percent chance of agreeing that either yes, one is successful or no, one is not successful.” The observed values were compared to these expected values:

| | Women's College | Coeducational College |
|---|---|---|
| Yes | 26 | 21.5 |
| No | 26 | 21.5 |

Explain why this procedure is not persuasive, and then conduct an appropriate statistical test.
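One candidate for the test in problem 10 is a chi-square test of independence, with expected counts taken from the row and column totals of the observed table rather than from an assumed probability of 0.5. The sketch below uses no continuity correction and relies on the fact that, with 1 degree of freedom, the chi-square tail area equals a two-sided standard normal tail area; the variable names are illustrative.

```python
from math import erfc, sqrt

# Observed counts: rows are Yes / No, columns are women's college / coeducational college.
observed = [[37, 30],
            [15, 13]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Under independence, expected count = row total * column total / grand total.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# For 1 degree of freedom: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.4f}, P value = {p_value:.3f}")
```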

**11.** Provide an alternative explanation for this observation:
"At retirement, a person can choose to take a single lump-sum payment [for
example, $100,000] or a fixed annual income [for example, $10,000] until death.
Those who choose the annual income live, on average, 2 1/2 years longer than
the general population. This shows that peace of mind is very important to
one's health."

**12.** In three careful studies, lie-detector experts examined several
persons, some known to be truthful and the others known to be lying, to see
if the experts could tell which were which. Overall, 83 percent of the liars
were pronounced "deceptive" and 57 percent of the truthful people were judged
"honest." Using these data and assuming that 80 percent of the people tested
are truthful and 20 percent are lying, what is the probability that a person
pronounced "deceptive" is in fact truthful? What is the probability that a person
judged "honest" is in fact lying?
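The reversal of conditional probabilities in problem 12 follows Bayes' rule. The sketch below assumes that every person tested is pronounced either "deceptive" or "honest" (no inconclusive results), so that 43 percent of truthful people and 17 percent of liars receive the wrong label; the variable names are illustrative.

```python
p_truthful = 0.80
p_lying = 0.20

p_deceptive_given_lying = 0.83
p_honest_given_truthful = 0.57

# Complements, assuming only two possible verdicts.
p_deceptive_given_truthful = 1 - p_honest_given_truthful
p_honest_given_lying = 1 - p_deceptive_given_lying

# Total probability of each verdict.
p_deceptive = p_truthful * p_deceptive_given_truthful + p_lying * p_deceptive_given_lying
p_honest = p_truthful * p_honest_given_truthful + p_lying * p_honest_given_lying

# Bayes' rule.
p_truthful_given_deceptive = p_truthful * p_deceptive_given_truthful / p_deceptive
p_lying_given_honest = p_lying * p_honest_given_lying / p_honest

print(f"P(truthful | deceptive) = {p_truthful_given_deceptive:.4f}")
print(f"P(lying | honest)       = {p_lying_given_honest:.4f}")
```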

**13.** The following regression equation was estimated using data on college
applicants in the spring of 1992 who were admitted both to Pomona College and
to another college that was ranked by U.S. News & World Report as among the
top twenty small liberal arts colleges (the standard errors are in parentheses):

where

y = fraction of students who were admitted to both this college and Pomona College that enrolled at Pomona, average value = 0.602.

x = U.S. News & World Report ranking of this college.

a. Draw the estimated regression line in this scatter diagram of the data:

b. Use this graph to explain how the estimate 0.0293 was obtained. (Do not show the formula for calculating this estimate; explain in words the basis for this formula.)

c. Explain why you are not surprised that the R² for this equation is not 1.0.

d. Does the estimated coefficient of x have a plausible value? Explain.

e. Is the estimated coefficient of x statistically significant at the 5 percent level? What is the null hypothesis?

f. What is the predicted value of y for x = 30? Why should we not take this prediction seriously? Be specific.

**14.** A researcher wanted to see whether the graduates of women's colleges
or the female graduates of comparable coeducational colleges are more likely
to obtain a postgraduate degree. Survey data were obtained in the fall of 1990
from 2,680 women who had graduated from Smith College (a women's college) during
the years 1980-1985; of these, 1,527 had obtained a postgraduate degree by September
1990. Similar data were obtained for Pomona College, a coeducational college
whose students are considered comparable to those at Smith College by U.S. News
& World Report and other independent groups. Of 1,191 people surveyed who had
graduated from Pomona during the years 1980-1985, 402 males and 343 females
had obtained a postgraduate degree by September 1990. This researcher calculated
the following z value for a difference-in-means test and concluded that the
observed difference is highly statistically significant. Explain the critical
error in this calculation.

**15.** Researchers calculated the difference between the
actual and predicted daily high temperatures at the Los Angeles Civic Center
for every day in 1996. Explain the error in this interpretation of their
results. "The t value for the null hypothesis μ = 0 was 6.47, revealing
that there is zero percent chance that the population mean is zero. Thus
at the 1 percent level, we disproved the null hypotheses that weather forecasters
make no errors in their predictions."

**16.** College dining halls use attendance data to predict the number
of diners at each meal. Daily weekday lunch data for 11 weeks were used to estimate
the following regression equation (standard errors in parentheses):

a. What is predicted attendance on Wednesday during the tenth week of the semester?

b. Interpret the estimated coefficient of x3.

**17.** A study in the 1950s found that heart-disease patients who were
given bypass operations reported substantial relief from the pain of angina;
however, a control group that received surgical incisions without being given
a bypass operation reported as much relief from angina as did the bypass recipients.
How would you explain these results?

**18.** A 1989 Wall Street Journal editorial criticized a ruling by New
York state's highest court that called marriage "a fictitious legal distinction."
The Journal's promarriage editorial noted that "Adult single men are five times
more likely to commit violent crimes than married men," suggesting that marriage
is a major factor in reducing violent crime. How else might these data be explained?
[editorial, "A Legal Fiction," The Wall Street Journal, July 18, 1989.]

**19.** A Pomona graduate applied to three MBA programs, and was wait-listed
at all three. Because she had been told that a wait-listed student has a 1-in-3
chance of being accepted, she decided that she was certain to get into at least
one of these three programs and did not apply to any others. (This is a true
story!) If she has a 1/3 chance of being accepted by each program, and their
decisions are independent, what is the probability that she will get into at
least one of these three programs?
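The complement rule gives the probability asked for in problem 19: the chance of at least one acceptance is one minus the chance of three rejections. A short sketch with exact fractions, under the independence assumption stated in the problem:

```python
from fractions import Fraction

p_accept = Fraction(1, 3)   # chance of acceptance at each program
programs = 3

# P(at least one acceptance) = 1 - P(rejected by every program).
p_all_reject = (1 - p_accept) ** programs
p_at_least_one = 1 - p_all_reject

print(f"P(at least one acceptance) = {p_at_least_one} = {float(p_at_least_one):.4f}")
```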

**20.** A student observer attended 8 different Pomona classes and
recorded the number of female and male students in each classroom; the number
of spoken remarks, questions, and comments that were directed at the professor;
and the gender of the student making each spoken remark. She found that there
were 88 females and 107 males in these classes and a total of 276 comments directed
at the professor, of which 138 were made by females and 138 by males. Are these
observed differences statistically persuasive?
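One possible way to frame problem 20 is a one-sample test of whether the share of comments made by females matches the share of females in the classrooms. This sketch treats each comment as an independent observation, which is a strong assumption because a few talkative students may account for many comments; the variable names are illustrative.

```python
from math import erfc, sqrt

females, males = 88, 107
comments_by_females, comments_by_males = 138, 138

n = comments_by_females + comments_by_males   # total comments
p0 = females / (females + males)              # null: comment share equals enrollment share
p_hat = comments_by_females / n               # observed share of comments by females

se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_value = erfc(abs(z) / sqrt(2))              # two-sided normal tail area

print(f"p0 = {p0:.4f}, p_hat = {p_hat:.4f}, z = {z:.2f}, two-sided P value = {p_value:.3f}")
```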