Econ 57 Fall 1999 Final Examination
Answer all 15 questions. Always explain your reasoning and show your work.

1. Health Maintenance Organization (HMO) member surveys sometimes find that more than 90 percent of their members are satisfied. Identify two kinds of survivor bias that may affect these results. Will the reported satisfaction rate be biased upward or downward?

2. Use a histogram with intervals 0-9.99, 10-14.99, 15-19.99, and 20-34.99 to summarize these data on annual inches of precipitation from 1961-1990 at the Los Angeles Civic Center. Be sure to show your work.


5.83 15.37 12.31 7.98 26.81 12.91 23.66 7.58 26.32 16.54
9.26 6.54 17.45 16.69 10.70 11.01 14.97 30.57 17.00 26.33
10.92 14.41 34.04 8.90 8.92 18.00 9.11 9.98 4.56 6.49
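
Because the intervals in Question 2 have unequal widths, bar heights are usually drawn as densities (relative frequency divided by interval width) rather than raw counts. The following is a minimal sketch of that bookkeeping, not an official solution; it assumes the intervals are read as [0, 10), [10, 15), [15, 20), and [20, 35), and the variable names are illustrative.

```python
# Annual precipitation (inches), Los Angeles Civic Center, copied from the data above.
rain = [
    5.83, 15.37, 12.31, 7.98, 26.81, 12.91, 23.66, 7.58, 26.32, 16.54,
    9.26, 6.54, 17.45, 16.69, 10.70, 11.01, 14.97, 30.57, 17.00, 26.33,
    10.92, 14.41, 34.04, 8.90, 8.92, 18.00, 9.11, 9.98, 4.56, 6.49,
]

# Assumed reading of the intervals: [0, 10), [10, 15), [15, 20), [20, 35).
edges = [0, 10, 15, 20, 35]

counts = []
densities = []
for left, right in zip(edges[:-1], edges[1:]):
    n = sum(1 for x in rain if left <= x < right)
    counts.append(n)
    # Height = relative frequency / width, so each bar's area is its relative frequency.
    densities.append(n / len(rain) / (right - left))

for left, right, n, d in zip(edges[:-1], edges[1:], counts, densities):
    print(f"{left:>2} to {right:<2}  count = {n:2d}  height = {d:.4f}")
```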

3. Suppose that 5 percent of the population use a certain drug. If a person uses this drug, a drug test gives a positive result 95 percent of the time; if she or he doesn't, it gives a negative result 95 percent of the time. What percentage of those who test positive do, in fact, use this drug?
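
One way to check an answer to Question 3 is to apply Bayes' rule directly to the three numbers given. This is a sketch rather than an official solution; the variable names are illustrative.

```python
prevalence = 0.05    # P(user)
sensitivity = 0.95   # P(positive | user)
specificity = 0.95   # P(negative | non-user)

# Joint probabilities of the two ways to test positive.
true_positive = prevalence * sensitivity
false_positive = (1 - prevalence) * (1 - specificity)

# Bayes' rule: P(user | positive).
p_user_given_positive = true_positive / (true_positive + false_positive)
print(round(p_user_given_positive, 4))
```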

4. You have two large cans and fifty red marbles and fifty blue marbles that you can divide however you want between the two cans, as long as you put exactly fifty marbles in each can. Afterward, the marbles in each can will be shuffled thoroughly; you will be blindfolded and then select one can at random from which you will select one marble at random. If this marble is red, you will be paid $100. How do you divide the marbles in order to maximize your chances of winning the $100?
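
A proposed answer to Question 4 can be checked by brute force. The sketch below tabulates the probability of drawing a red marble for every split that keeps exactly fifty marbles in each can, as the question requires. The function name `p_red` and its parameters are illustrative.

```python
def p_red(red_in_can_1, can_size=50, total_red=50):
    """Probability of drawing red when each can is chosen with probability 1/2."""
    red_in_can_2 = total_red - red_in_can_1
    return 0.5 * red_in_can_1 / can_size + 0.5 * red_in_can_2 / can_size

# Every allowed split: can 1 holds r red marbles (and 50 - r blue ones).
probs = {r: p_red(r) for r in range(51)}
best = max(probs, key=probs.get)
print("smallest:", min(probs.values()), "largest:", max(probs.values()))
```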

5. Answer this letter to newspaper columnist Marilyn vos Savant: "I have a really confusing one for you. Let's say my friend puts six playing cards face-down on a table. He tells me that exactly two of them are aces. Then I get to pick up two of the cards. Which of the following choices is more likely? (A) That I'll get one or both of the aces, or (B) That I'll get no aces?"
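
The two choices in Question 5 can be compared by counting equally likely two-card hands. This is a sketch of that count, assuming the two cards are picked at random from the six.

```python
from math import comb

total_hands = comb(6, 2)      # ways to pick 2 of the 6 cards
no_ace_hands = comb(4, 2)     # ways to pick both cards from the 4 non-aces

p_no_aces = no_ace_hands / total_hands    # choice (B)
p_at_least_one_ace = 1 - p_no_aces        # choice (A)
print(f"(A) {p_at_least_one_ace:.3f}   (B) {p_no_aces:.3f}")
```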

6. A random sample of 30 college students yielded data on y = grade in an introductory college chemistry course (4-point scale), x1 = high school GPA (4-point scale), and x2 = combined math and verbal SAT scores (1600 maximum). The standard errors are in parentheses and the t values are in brackets.

     a. Explain why you either agree or disagree with this interpretation: "The primary implication of these results is that high school GPA and SAT scores are poor predictors of a student's success in this chemistry course."
     b. Suppose that an important omitted variable is x3 = 1 if the student has taken a high school chemistry course, 0 otherwise; that the true coefficients of x1, x2, and x3 are all positive, and that x3 is negatively correlated with x1 and uncorrelated with x2. If so, is the estimated coefficient of x1 in the above equation biased upward or downward? Be sure to explain your reasoning.

7. A random sample of 100 college students yielded data on y = student's height, x1 = student's gender (0 if female, 1 if male), x2 = mother's height, and x3 = father's height. All heights were measured in feet; for example, a 5 foot, 3-inch person is 5.25 feet tall.

     a. Explain why the following interpretation of the 0.74 R-squared value is misleading: "74% of a student's height can be explained by their gender and their parents' height."
     b. Explain why this conclusion is misleading: "The student's gender is important in determining height. In general, males are taller than females, which is shown in the t values."
     c. Do these results indicate that heights regress to the mean?

8. An April 1997 study collected data on the 15 bestselling fiction and 15 bestselling non-fiction hardcover books (according to the New York Times bestseller list): y = price, x1 = number of pages, and x2 = category (0 if fiction, 1 if nonfiction). The standard errors are in parentheses and the t values are in brackets.

     This equation was then estimated, where x3 = (number of pages)(category):

     a. Explain the conceptual difference between these two models, using the parameter estimates to illustrate your reasoning.
     b. Using the second equation, what is the predicted price of a nonfiction book with 300 pages?
     c. The given t values are for testing the null hypothesis that the coefficients equal 0. Use the second equation to test the null hypothesis that the coefficient of x1 is 0.05. What is the alternative hypothesis?

9. New Jersey's Juvenile Awareness Project sent juveniles to the maximum security prison in Rahway, New Jersey, for two-hour meetings with hardened convicts who told them scary tales of prison life. Scared Straight, an Oscar-winning film about this project, claimed that 90 percent of the youths who participated "went straight." Many states subsequently adopted variations of this program. A scientific evaluation of this program compared the arrest records of 81 participants with a control group of 81 other juveniles from the agencies that referred these participants. During the six months following the program, 41 percent of the treatment group had at least one arrest, compared to only 11 percent in the control group. Furthermore, the treatment-group arrests tended to be for more serious crimes.
     a. Is the observed difference statistically persuasive?
     b. How would you explain the difference?
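
For part a of Question 9, one common approach is a difference-in-proportions z test with a pooled standard error. The sketch below assumes that approach and works from the reported percentages, since exact arrest counts are not given; it is not an official solution.

```python
from math import sqrt
from statistics import NormalDist

n_treat = n_control = 81
p_treat, p_control = 0.41, 0.11   # reported arrest rates (rounded percentages)

# Pooled proportion under the null hypothesis of equal arrest rates.
pooled = (n_treat * p_treat + n_control * p_control) / (n_treat + n_control)
se = sqrt(pooled * (1 - pooled) * (1 / n_treat + 1 / n_control))

z = (p_treat - p_control) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided
print(f"z = {z:.2f}, two-sided P value = {p_value:.2g}")
```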

10. Ninety-nine break shots on a full rack of billiard balls resulted in 54 sunk balls, allocated among the 15 balls as follows:

     Thus the balls in the three corners of the rack were sunk 9, 6, and 10 times. Dividing the rack into three categories (corners, adjacent, and other),

     is there a statistically significant relationship between a ball's position in the rack and its likelihood of being sunk on the break?

11. The management of an office building monitors the temperature control system by taking four independent readings of the temperature at four locations where, if the system is functioning properly, the temperature is normally distributed with a mean of 68 degrees Fahrenheit and a standard deviation of 3 degrees. What are the mean and standard deviation of the average of these four readings if the system is functioning properly? What upper and lower control limits should the building's management set for the average of these four temperature readings for there to be a 0.01 probability of violating a control limit when the temperature control system is functioning properly?
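
The calculation in Question 11 can be sketched as follows. It assumes symmetric two-sided limits, so that the 0.01 probability is split equally between the upper and lower tails; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 68.0, 3.0, 4
alpha = 0.01

# Standard deviation of the average of n independent readings.
sd_mean = sigma / sqrt(n)

# Assumed: symmetric limits with alpha / 2 in each tail.
z = NormalDist().inv_cdf(1 - alpha / 2)
lower = mu - z * sd_mean
upper = mu + z * sd_mean
print(f"mean = {mu}, sd = {sd_mean}, limits = ({lower:.2f}, {upper:.2f})")
```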

12. A home scientist claims that people are weakened when they wear watches. To test this claim, ten volunteers have their strength measured with and without watches. As a statistical test, what is the null hypothesis and, if this null hypothesis is true, what is the (exact) probability that we will incorrectly reject it if we agree beforehand that we will be persuaded that the home scientist is correct if seven or more of the volunteers show a decrease in strength while wearing a watch? (Assume that the measuring equipment is so precise that no one ever gets exactly the same strength reading twice in a row.)
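
The exact probability asked for in Question 12 is a binomial tail. The sketch below assumes the null hypothesis that each volunteer is equally likely to show an increase or a decrease in strength, independently of the others.

```python
from math import comb

n = 10          # volunteers
threshold = 7   # reject the null if this many or more show a decrease

# Under the assumed null, the number of decreases is binomial(n, 0.5).
p_type_1_error = sum(comb(n, k) for k in range(threshold, n + 1)) / 2**n
print(p_type_1_error)
```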

13. Consumer Sports magazine tested the gasoline mileage of three subcompact cars by driving 10 of each make from Claremont, California, to Brewster, Massachusetts. Do not do an ANOVA F test, but do explain why you believe that such a test either would or would not have a P value less than 0.05. What is the null hypothesis?

A: 32 31 30 27 31 33 31 34 28 31
B: 36 41 43 41 41 42 37 40 39 42
C: 39 37 33 35 32 36 36 34 37 33


14. From the 30 stocks in the Dow Jones Industrial Average, students identified the 10 stocks with the highest dividend-price (D/P) ratio and the 10 stocks with the lowest D/P ratio. The percentage returns on each stock were then recorded over the next year:

High D/P    Low D/P
  39.77     -37.74
  32.01      46.83
  28.93      -5.81
  32.30      48.04
  31.28      20.91
  12.17      20.24
  40.68       1.05
  14.33     -28.10
  40.39      17.03
  37.74      -1.55

     a. Display these data in two box plots in a single graph. Do the data appear to have similar or dissimilar means and standard deviations?
     b. Considering these data to be two independent random samples of size 10, calculate the two-sided P value for a test of the null hypothesis that the population means are equal. Do not assume that the population standard deviations are equal.
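
For part b of Question 14, a test that does not assume equal population standard deviations is commonly done with Welch's t statistic and the Welch-Satterthwaite degrees of freedom. The sketch below computes those two quantities with the standard library only; the P value would then come from a t table or statistical software. It assumes Welch's method is the intended one.

```python
from math import sqrt
from statistics import mean, variance

high = [39.77, 32.01, 28.93, 32.30, 31.28, 12.17, 40.68, 14.33, 40.39, 37.74]
low = [-37.74, 46.83, -5.81, 48.04, 20.91, 20.24, 1.05, -28.10, 17.03, -1.55]

# Squared standard error of each sample mean (sample variance / n).
v_high = variance(high) / len(high)
v_low = variance(low) / len(low)

# Welch's t statistic for the difference in means.
t_stat = (mean(high) - mean(low)) / sqrt(v_high + v_low)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v_high + v_low) ** 2 / (
    v_high**2 / (len(high) - 1) + v_low**2 / (len(low) - 1)
)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```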

15. A certain grade of apples has a weight that is normally distributed with a mean of 10 ounces and a standard deviation of 2 ounces.
     a. What is the probability that a randomly selected apple will weigh less than 8 ounces?
     b. The apples are sold by the bag with 20 apples in each bag. What is the probability that a bag containing 20 randomly selected apples has a net weight of less than 160 ounces?
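
Both parts of Question 15 reduce to normal probabilities. The sketch below assumes the 20 apple weights in a bag are independent, so the bag's net weight is normal with mean 20(10) ounces and standard deviation 2 times the square root of 20 ounces.

```python
from math import sqrt
from statistics import NormalDist

# Part a: a single apple.
apple = NormalDist(mu=10, sigma=2)
p_apple_under_8 = apple.cdf(8)

# Part b: total weight of 20 independent apples (assumed independence).
bag = NormalDist(mu=20 * 10, sigma=2 * sqrt(20))
p_bag_under_160 = bag.cdf(160)

print(f"a: {p_apple_under_8:.4f}   b: {p_bag_under_160:.2g}")
```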