10.2 Were weight and accident frequency independent, the expected number of accidents in each weight category would equal that category's share of registered cars times the total number of accidents. For instance, since 21.04 percent of the cars weigh less than 3,000 pounds, the probability that a car involved in an accident weighs less than 3,000 pounds is 0.2104, and the expected number of such cars among the 1,204 involved in accidents is 0.2104(1,204) = 253.3. Here are the expected and observed values for the four categories:
Weight | Expected E | Observed O | (O - E)^2/E |
Less than 3,000 | 253.3 | 162 | 32.9 |
3,000 - 3,999 | 555.4 | 318 | 101.5 |
4,000 - 4,999 | 374.8 | 689 | 263.4 |
5,000 or more | 20.5 | 35 | 10.3 |
Total | 1204.0 | 1204 | 408.1 |
With 4 - 1 = 3 degrees of freedom, the chi-square value of 408.1 has a P value less than 0.005, the smallest given in Table 6 in the text, so the null hypothesis that weight and accident frequency are independent is decisively rejected. (Statistical software shows the P value to be less than 0.000001.)
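The arithmetic in the table can be checked with a few lines of Python. This is a minimal sketch that takes the expected and observed counts from the table as given; the variable names are illustrative, and because the expected counts are rounded to one decimal place, the total can differ slightly from the tabulated 408.1.

```python
# Expected and observed accident counts by weight category, copied from the table.
expected = [253.3, 555.4, 374.8, 20.5]
observed = [162, 318, 689, 35]

# Each category contributes (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Goodness-of-fit test: degrees of freedom = number of categories - 1.
degrees_of_freedom = len(observed) - 1

for e, o, c in zip(expected, observed, contributions):
    print(f"E = {e:7.1f}  O = {o:4d}  (O - E)^2/E = {c:6.1f}")
print(f"chi-square = {chi_square:.1f} with {degrees_of_freedom} degrees of freedom")
```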
10.6 Here are the column percentages:
| Fighters | Nonfighters | Total |
Natural | 27 (36.5%) | 140 (60.1%) | 167 (54.4%) |
Foster | 47 (63.5%) | 93 (39.9%) | 140 (45.6%) |
Total | 74 | 233 | 307 |
These show a disproportionate number of foster mice among the fighters and natural mice among the nonfighters. The expected values, assuming independence, are
| Fighters | Nonfighters | Total |
Natural | 40.25 | 126.75 | 167 |
Foster | 33.75 | 106.25 | 140 |
Total | 74 | 233 | 307 |
The value of the chi-square statistic works out to be 12.61, which is far larger than the 6.63 cutoff for a 1 percent test with (2 - 1)(2 - 1) = 1 degree of freedom. (Statistical software shows the P value to be 0.0003.)
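The expected values and the chi-square statistic for a two-way table follow directly from the row and column totals. The sketch below, with illustrative variable names, uses the observed counts from the column-percentage table and assumes nothing beyond the standard rule E = (row total)(column total)/(grand total).

```python
# Observed counts: rows are Natural and Foster, columns are Fighters and Nonfighters.
observed = [[27, 140],
            [47, 93]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Under independence, E = (row total)(column total) / (grand total).
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over all four cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

print([[round(e, 2) for e in row] for row in expected])
print(f"chi-square = {chi_square:.2f} with {degrees_of_freedom} degree of freedom")
```

The same code works for the other 2-by-2 tables in this chapter once their observed counts are substituted.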
10.32 For the female data, the expected values, assuming independence, are
| Short | Long | Total |
First 10 minutes | 20.79 | 20.21 | 41 |
Last 10 minutes | 15.21 | 14.79 | 30 |
Total | 36 | 35 | 71 |
As anticipated, missed shots were more likely to bounce short during the last 10 minutes and to bounce long during the first 10 minutes. The chi-square value, the sum of (O - E)^2/E over the four cells, is 5.296.
The observed 5.296 chi-square value is statistically significant at the 5 percent level since it is larger than the 3.84 cutoff for a test with (2 - 1)(2 - 1) = 1 degree of freedom; the P value is 0.021.
For the male data, the expected values, assuming independence, are
| Short | Long | Total |
First 10 minutes | 43.74 | 44.26 | 88 |
Last 10 minutes | 40.26 | 40.74 | 81 |
Total | 84 | 85 | 169 |
In contrast to the female data, missed shots were more likely to bounce short during the first 10 minutes and to bounce long during the last 10 minutes. The chi-square value, the sum of (O - E)^2/E over the four cells, is 2.624.
This time, the observed 2.624 chi-square value is not statistically significant at the 5 percent level since it is less than the 3.84 cutoff for a test with (2 - 1)(2 - 1) = 1 degree of freedom; the P value is 0.105.
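The two P values can be reproduced without statistical tables. A chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function. The sketch below relies on that identity; the function name `p_value_one_df` is made up for illustration, and the shortcut applies only to tests with 1 degree of freedom.

```python
import math

def p_value_one_df(chi_square):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Such a variable is the square of a standard normal Z, so
    P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))

# Chi-square values reported above for the female and male data.
for label, value in [("female", 5.296), ("male", 2.624)]:
    print(f"{label}: chi-square = {value}, P value = {p_value_one_df(value):.3f}")
```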
10.36 The natural null hypothesis is that each of these three brands is equally likely to be picked by a randomly selected college student. If so, the expected values are
Chips Ahoy! | 23.33 |
Lady Lee | 23.33 |
Bakery (soft) | 23.33 |
The chi-square value, the sum of (O - E)^2/E over the three brands, is 36.03.
With 3 - 1 = 2 degrees of freedom, the probability of such a large (or larger) chi-square value is about 0.000000015. The null hypothesis is strongly rejected.
10.40 To assess whether the effect is substantial, we can compare the fraction of hot tub users whose babies had neural tube defects with the fraction of nonusers whose babies had neural tube defects: 7/1,254 = 0.0056 for hot tub users and 41/21,445 = 0.0019 for nonusers.
While the fraction of hot tub users whose babies had neural tube defects is small (about half of one percent), the relative risk is 0.0056/0.0019 = 2.93; that is, women who used a hot tub during the first two months of pregnancy were nearly three times as likely to have a baby with a neural tube defect. To assess statistical significance, we can determine the expected values if there were no relationship between hot tub use and neural tube defects:
| Hot Tub | No Hot Tub | Total |
Neural tube defects | 2.65 | 45.35 | 48 |
No neural tube defects | 1,251.35 | 21,399.65 | 22,651 |
Total | 1,254 | 21,445 | 22,699 |
The chi-square value is 7.563. With (2 - 1)(2 - 1) = 1 degree of freedom, the P value is 0.00596. This P value is only a rough approximation, however, because of the small expected value in one cell.
We can calculate the exact P value by using the hypergeometric distribution (which is beyond the scope of this textbook) with a population of N = 22,699, S = 48 successes in the population, and a sample size of n = 1,254, to calculate the probability that more than 6 would have defects (x > 6); this probability turns out to be 0.015. Thus these data reject at the 5 percent level the null hypothesis that the chances of neural tube defects are unrelated to exposure to a hot tub during the first two months of pregnancy. Based on the results of this study, including sauna data, the March of Dimes recommended that women who are contemplating pregnancy or are in the first three months of pregnancy avoid hot tubs and saunas.
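For readers who want to see the exact calculation, the hypergeometric tail probability can be computed with exact integer arithmetic. This is a sketch under stated assumptions: the population is taken to be all 22,699 women in the table, of whom 48 had babies with neural tube defects, and the sample is the 1,254 hot tub users. The function name `hypergeom_upper_tail` is made up for illustration, and the result may differ in the last digit from the rounded value quoted above.

```python
from math import comb

def hypergeom_upper_tail(N, S, n, x):
    """P(X > x) when n items are drawn without replacement from a population
    of N items that contains S successes."""
    total = comb(N, n)
    # Sum the numerators of P(X = k) for k = 0..x, then take the complement.
    lower = sum(comb(S, k) * comb(N - S, n - k) for k in range(0, min(x, n) + 1))
    return 1 - lower / total

# Assumed inputs, read from the totals in the table: 22,699 women in all,
# 48 with neural tube defects, 1,254 hot tub users; tail event is x > 6.
p_exact = hypergeom_upper_tail(N=22699, S=48, n=1254, x=6)
print(f"exact P value = {p_exact:.4f}")
```

Summing the lower tail and subtracting from 1 keeps the loop short, since only seven terms are needed rather than the forty-two in the upper tail.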