Career Trajectories in Baseball
Teddy Schall and Gary Smith
Pomona College, Claremont, CA 91711
Major league baseball rookies are scrutinized carefully for evidence of their ability to succeed in the big leagues. Many are weeded out. Others have long careers. We investigate here the typical performance record for baseball hitters and pitchers during the course of their careers. We also consider whether the length of a player’s career can be predicted accurately from his first-year performance. When a team offers a promising young player a long-term contract, is it buying a star or rolling dice? It turns out that first-year performance is an important predictor but that there is a great deal of individual variation.
Data
We examined the season batting averages (BAs) and earned run averages (ERAs) for all major league baseball players from 1901 through 1999, using the data in the sixth edition of Total Baseball, which is available on the web site www.baseball.com. To reduce the influence of outliers on our performance measures, we excluded any season in which a player had fewer than 50 official times at bat or 25 innings pitched. In order to study career lengths, we further restricted our attention to persons who did not play in the 1999 season, presumably because they had retired. A few players sat out the 1999 season because of injuries or were temporarily relegated to the minors, and will resume their careers at a later date. However, the exclusion of these few exceptions seemed preferable to restricting our analysis to persons who had not played in, say, the last 5 years.
Our final database contained 4728 hitters, with careers ranging from 1 to 24 seasons, and 3803 pitchers, with careers ranging from 1 to 26 seasons. The mean career length is 5.6 years for hitters and 4.8 years for pitchers. To compare performances in different seasons, each player’s performance was standardized by subtracting the mean for all players that year and dividing this difference by the standard deviation of performance across players that year. We reversed the signs of the standardized ERAs so that, like BAs, high standardized values represent above-average performance.
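The standardization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `standardize_by_season`, the tuple layout of `records`, and the use of the population standard deviation are all assumptions, and the sample values are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_season(records, higher_is_better=True):
    """Convert raw season values to within-season z-scores.

    records: list of (player, year, value) tuples (hypothetical layout).
    higher_is_better: pass False for ERA so that, after the sign flip,
    positive z-scores mean above-average performance for both statistics.
    Assumes every season has at least two distinct values.
    """
    by_year = defaultdict(list)
    for _, year, value in records:
        by_year[year].append(value)

    # Mean and standard deviation across all players in each season.
    # Population SD is an assumption; the paper does not say which was used.
    season_stats = {y: (mean(v), pstdev(v)) for y, v in by_year.items()}

    sign = 1.0 if higher_is_better else -1.0
    z_scores = {}
    for player, year, value in records:
        mu, sd = season_stats[year]
        z_scores[(player, year)] = sign * (value - mu) / sd
    return z_scores


# Made-up batting averages for one season
sample = [("A", 1950, 0.300), ("B", 1950, 0.250), ("C", 1950, 0.200)]
z = standardize_by_season(sample)
```

For earned run averages the same function would be called with `higher_is_better=False`, which reproduces the sign reversal mentioned in the text.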
Figure 1 shows the standardized season batting averages of three extraordinary ballplayers: Ty Cobb, Willie Mays, and Bob Uecker. Cobb’s standardized BA was -0.173 his rookie year; this was followed by an unparalleled run of 23 seasons in which his lowest standardized BA was 1.018, in his last season. Mays batted 0.465 standard deviations above the major league average in his first year and 0.389 standard deviations below it in his second. After many years of outstanding performances, his standardized BA fell to -0.963 in his last season. Uecker is noteworthy for being allowed to play five below-average years; his worst year was his last, when he batted 2.017 standard deviations below the major league average. Uecker later achieved fame as a baseball announcer and personality, often telling stories from his "illustrious" career.
Figure 2 shows the standardized ERAs of two Dodger teammates, Don Drysdale and Sandy Koufax. As is true of most pitchers with relatively long careers, each outperformed established players in his rookie season. Drysdale was a better-than-average pitcher for 11 of the next 12 years, then fell to 0.627 standard deviations below average in his final year. Koufax is a remarkable exception to the typical pitching career. After a good rookie year, he had five mediocre years, followed by six very, very good years. His best season was his last, when his ERA was 1.925 standard deviations better than the major league average.
Survival Probabilities
Figure 3 shows the yearly survival rates for baseball players; that is, the probability of playing at least one more year if a player has played a given number of years. (The 32 hitters and 31 pitchers with careers of 20 years or longer were consolidated, with the reported fraction at year 20 showing the average survival rate for all of the remaining years.) The survival rate is below average in the first year (77% for hitters, 73% for pitchers), strong for the next several years (peaking at 87% in year 7 for hitters, and at 84% in year 5 for pitchers), and then generally declines. The relatively low survival rate in the first year is perhaps due to the fact that the additional information gained about a player’s skills during his first year playing against major league competition is much greater than the additional information gained during, say, a player’s fifth season. The somewhat higher survival rates for the next several years for those who make it past the first year may be because their skills are improving during this period. We will return to this conjecture.
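The survival rates plotted in Figure 3 can be computed directly from a list of career lengths. The sketch below is illustrative only: the function name `survival_rates` is hypothetical, and pooling all years at or beyond the cap into one ratio is just one plausible reading of how the long careers were consolidated.

```python
def survival_rates(career_lengths, cap=20):
    """Fraction of players with at least t seasons who play at least t + 1.

    career_lengths: list of career lengths in seasons, one per player.
    cap: years at or beyond this point are pooled into a single rate
    (an assumed interpretation of the consolidation at year 20).
    """
    max_len = max(career_lengths)

    # at_least[t] = number of players whose careers lasted t or more seasons
    at_least = [0] * (max_len + 2)
    for length in career_lengths:
        for t in range(1, length + 1):
            at_least[t] += 1

    rates = {}
    for t in range(1, min(cap, max_len + 1)):
        rates[t] = at_least[t + 1] / at_least[t]

    if max_len >= cap:
        # Pool every player-year from the cap onward into one ratio.
        at_risk = sum(at_least[cap:max_len + 1])
        survived = sum(at_least[cap + 1:max_len + 2])
        rates[cap] = survived / at_risk
    return rates


# Made-up career lengths for five players
rates = survival_rates([1, 1, 2, 3, 3])
```

With real data, `rates[1]` would correspond to the first-year survival rate quoted in the text (77% for hitters, 73% for pitchers).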
A logit model was used to predict the probability P of playing another season in the major leagues as a function of the number of seasons t that a person has played in the major leagues, the player’s current-season performance Z, and the average performance over the player’s career $\bar{Z}$:

\[
\ln\left(\frac{P}{1-P}\right) = b_0 + b_1 t + b_2 Z + b_3 \bar{Z}
\]
The estimates of the coefficients (and their standard errors) are shown in Table 1. These estimates are derived using the method of maximum likelihood assuming that the multiple observations for a single player are independent. The effects of seasons are small but highly statistically significant. The effects of current performance are substantial and statistically compelling; the effects of career performance are much smaller and, for hitters, not statistically significant.
Evaluated at the mean values of the explanatory variables, the estimated coefficients imply that each additional year in the major leagues reduces a player’s survival probability by 0.008 for hitters and 0.005 for pitchers. A one-standard-deviation improvement in current performance increases the survival probability by 0.103 for hitters and 0.102 for pitchers; a one-standard-deviation improvement in career performance increases a player’s survival probability by 0.003 for hitters and 0.015 for pitchers.
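In a logit model the effect of a one-unit change in a regressor on the probability is the coefficient times P(1 - P). The paper does not report the mean values at which P was evaluated, so the sketch below instead backs out the implied P(1 - P) from the reported current-performance effect and checks that the other reported effects follow from the Table 1 coefficients. The names `TABLE1`, `REPORTED_PERFORMANCE_EFFECT`, and `implied_effects` are hypothetical, and the results are magnitudes only; the text describes the seasons effect as a reduction.

```python
import math


def logistic(index):
    """Probability implied by a logit index b0 + b1*t + b2*Z + b3*Zbar."""
    return 1.0 / (1.0 + math.exp(-index))


def marginal_effect(coef, p):
    """Change in probability per unit change in a regressor: b * P * (1 - P)."""
    return coef * p * (1.0 - p)


# Coefficient magnitudes as printed in Table 1
TABLE1 = {
    "hitters": {"seasons": 0.058, "performance": 0.794, "average performance": 0.022},
    "pitchers": {"seasons": 0.031, "performance": 0.677, "average performance": 0.100},
}

# Effects of current performance as reported in the text
REPORTED_PERFORMANCE_EFFECT = {"hitters": 0.103, "pitchers": 0.102}


def implied_effects(group):
    """Scale each coefficient by the P(1 - P) implied by the reported effect."""
    coefs = TABLE1[group]
    scale = REPORTED_PERFORMANCE_EFFECT[group] / coefs["performance"]
    return {name: round(b * scale, 3) for name, b in coefs.items()}


hitters = implied_effects("hitters")
pitchers = implied_effects("pitchers")
```

Under these assumptions the implied magnitudes for seasons and career performance round to the values quoted in the text, which suggests all three effects were evaluated at the same baseline probability.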
Career Performance Patterns
Figure 4 shows the mean value of the standardized BA and ERA in each year of the careers of the players in our database, again consolidating players with careers of 20 years or longer. Players typically perform below the league average in their first year in the major leagues: on average, 0.337 standard deviations below for hitters and 0.176 standard deviations below for pitchers. The fact that the average standardized first-year ERA is better than the average standardized first-year BA means that, in comparison to established players, pitchers tend to do better in their rookie seasons than do hitters. Perhaps it is easier to predict a pitcher’s initial success in the major leagues. Or a rookie pitcher’s style may present more of a challenge to opposing hitters than does a rookie hitter’s style to opposing pitchers.
Figure 4 shows that the mean performance increases after the first year, though this may be due more to weak players being weeded out than to players improving with experience. Remember, 23% of the rookie hitters and 27% of the rookie pitchers don’t play a second year. In order to disentangle the confounding effects of player skills changing during their careers and career lengths depending on player skills, we grouped the players by career length and examined the average career trajectory for the players in each career group. Other influences surely matter too: a player’s health, teammates, ballpark, and skills that are not measured by BA or ERA. In reasonably sized samples, these idiosyncratic factors will average out and allow us to see the systematic patterns. We consequently restricted our analysis to career lengths for which we have data on at least 100 players: up to 13 years for batting averages, 10 years for earned run averages. Figures 5 and 6 show the results.
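The grouping step can be sketched as follows. This is an illustrative outline rather than the authors' procedure in detail: the function name `trajectories_by_career_length` and the dictionary layout of `careers` are assumptions, and the toy data are made up.

```python
from collections import defaultdict
from statistics import mean


def trajectories_by_career_length(careers, min_players=100):
    """Average standardized performance by career year, within career-length groups.

    careers: dict mapping player -> list of standardized season values in
    career order (hypothetical layout).
    min_players: groups with fewer players are dropped, mirroring the
    100-player threshold described in the text.
    Returns {career_length: [mean z in year 1, mean z in year 2, ...]}.
    """
    groups = defaultdict(list)
    for seasons in careers.values():
        groups[len(seasons)].append(seasons)

    result = {}
    for length, members in groups.items():
        if len(members) >= min_players:
            result[length] = [
                mean(player[year] for player in members) for year in range(length)
            ]
    return result


# Toy example with a threshold of 2 instead of 100
toy = {"a": [-0.5, 0.1], "b": [-0.3, -0.1], "c": [0.2, 0.4, -0.6]}
paths = trajectories_by_career_length(toy, min_players=2)
```

Because every player in a group has the same career length, a rise in the group mean from one year to the next cannot be produced by weaker players dropping out, which is the confounding the grouping is meant to remove.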
First, notice that for hitters and pitchers of all career lengths, the average standardized terminal-year performance is substantially below 0, the average performance of all major league players. The best average terminal-year values are -0.502 for ERAs and -0.539 for BAs, both for players terminated in their tenth year. Termination after such poor performances is not surprising because the decision to terminate is based on perceived inability to perform. Remember our earlier finding that the average first-year performance is -0.337 for hitters and -0.176 for pitchers. It makes sense to replace an older player with a standardized performance of -0.5 with a rookie whose expected standardized performance is -0.3 or -0.2, particularly if the older player is more expensive. The precipitous decline in performance near career end challenges the wisdom of long-term contracts and confirms the wisdom of Branch Rickey’s dictum that it is better to trade a veteran a few years early than one year too late.
Second, hitters who last more than a few seasons tend to improve each year until the decline that precedes their termination. Pitchers, in contrast, typically do not improve with experience; pitchers with long careers just seem more successful in postponing the deteriorating performance that will end their careers. Perhaps hitters improve as they become accustomed to major league pitchers, while pitchers wear out their arms.
It should be emphasized that the patterns in Figures 5 and 6 do not say that every player with a standardized performance below -0.5 should be terminated. There is considerable variability in performance from year to year, as we see in the next section.
First-Year Performance and the Length of a Player’s Career
For both hitters and pitchers, there seems to be a positive relationship between first-year performance and the length of a player’s career. We examine this more closely. Figure 7 shows a scatterplot of career length versus first-year performance for batting averages; the pattern for earned run averages is similar. Although these data decisively reject the null hypothesis that there is no relationship between career length and first-year performance (the correlation is 0.23 for batting averages and 0.21 for earned run averages), the relationship is far from perfect, explaining only 5 percent of the variation in career length. The estimated standard error of prediction of career length for a given first-year performance is approximately 4 years. An individual player’s career cannot be predicted confidently from his first-year batting average or earned run average.
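The "5 percent" figure follows from squaring the reported correlations, since in a simple regression the share of variance explained equals the squared correlation. The second line uses the standard large-sample approximation for the standard error of the regression; the symbols s_e and s_y (the standard deviation of career length) are introduced here for illustration and do not appear in the original.

```latex
\[
r^2_{\mathrm{BA}} = 0.23^2 \approx 0.053, \qquad
r^2_{\mathrm{ERA}} = 0.21^2 \approx 0.044
\]
\[
s_e \approx s_y \sqrt{1 - r^2}, \qquad
\sqrt{1 - 0.053} \approx 0.973
\]
```

In other words, knowing a hitter's first-year batting average shrinks the spread of the career-length prediction by less than 3 percent relative to knowing nothing at all.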
The reason is not hard to find. The correlation between player performance from one season to the next is highly statistically significant, but the correlation coefficients are far from one: 0.38 for batting averages and 0.24 for earned run averages.
Conclusions
Part of the charm of baseball is that there are clear tendencies that are logical and statistically persuasive, even while specific outcomes deviate substantially from these tendencies. Baseball seasons have far more games than seasons in other sports, allowing the law of large numbers to work its magic and allowing plentiful opportunities for short-term departures from long-run tendencies. A batter can go 0 for 4 one game and 4 for 4 the next. A pitcher can throw a shutout one game and give up 5 earned runs in the first inning of the next game. A hitter can bat .350 one season and .250 the next; a pitcher can have a 1.50 earned run average one season and 5.00 the next. A team can win 100 games in a season and lose a World Series in 4 straight games. So it is with baseball careers: logical, statistically persuasive tendencies from which individual players can deviate greatly.
Table 1 Logit Predictions of a Player’s Survival Probability

                        Batting Averages      Earned Run Averages
                        b_i       SE          b_i       SE
constant                2.010     0.030       1.625     0.031
seasons                 0.058     0.004       0.031     0.005
performance             0.794     0.025       0.677     0.026
average performance     0.022     0.033       0.100     0.036

Figure 1 Three Remarkable Hitters
Figure 2 Two Pitching Teammates
Figure 3 Fraction of Those Playing t Years Who Play at Least One Additional Year
Figure 4 Mean Standardized Performance, by Year of Career
Figure 5 Batting Average Patterns Over a Career, by Career Length
Figure 6 Earned Run Average Patterns Over a Career, by Career Length
Figure 7 Batting Averages, Predicting Career Length from First-Year Performance