Baseball’s Black Swans

A black swan is a highly improbable but impactful event. While the term is most frequently used in a financial context, it can apply to other fields as well. Baseball is a good example, especially given its long history and emphasis on individual statistics. What were baseball’s greatest black swans? I’ll look at some notable single-season accomplishments and use basic statistical concepts to estimate which were the most improbable.

The idea of a black swan goes back nearly two thousand years. The Roman poet Juvenal used the phrase, “rara avis in terris nigroque simillima cygno,” or “a bird as rare upon the earth as a black swan.” The ancient Romans believed black swans were not merely rare but nonexistent. It was not until the 17th century that Dutch explorers discovered black swans in Western Australia. Until that time, Europeans had no way of knowing that black swans were even a thing.

Nassim Nicholas Taleb popularized the term in his book of the same name. In Taleb’s usage, a black swan is an event that was nearly impossible to predict based on prior experience. The book came out just before the Global Financial Crisis, which made it appear especially prescient. (The book itself is good, but the metaphor is better.) I’ll discuss the application of black swans to financial risk management in an upcoming post.

Black Swans in Baseball

“Black swan” can be a relative term, since some events are bigger outliers than others. A classic case of a black swan in baseball was Babe Ruth’s 54 home runs in 1920. Ruth’s closest competitor hit only 19. The prior record (set by Ruth himself in 1919) was only 29. Hall of Famer Frank “Home Run” Baker never hit more than 12 homers in a season.

But how much of an outlier was Ruth’s 1920 season? And how does it compare to other record-setting performances? First, some ground rules. I compared each record-setting (or otherwise memorable) season to the league leaders for previous seasons. Outliers were measured by the number of standard deviations (SDs) by which they exceeded the average (mean) for previous league leaders. More standard deviations mean a greater black swan.
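Here is a minimal sketch of that measurement in Python. The function name and the sample numbers are my own illustration, and the sample numbers are placeholders rather than real league-leading totals. It also assumes the sample standard deviation (dividing by n - 1); using the population version instead would shift the results somewhat.

```python
from statistics import mean, stdev


def sds_above_mean(season_value, prior_leaders):
    """Return how many SDs season_value sits above the mean of prior_leaders."""
    if len(prior_leaders) < 2:
        raise ValueError("need at least two prior seasons to estimate an SD")
    spread = stdev(prior_leaders)  # sample SD (n - 1); this choice is an assumption
    if spread == 0:
        raise ValueError("prior leaders show no variation")
    return (season_value - mean(prior_leaders)) / spread


# Placeholder league-leading totals, NOT real data.
hypothetical_prior_leaders = [10, 12, 14]
print(sds_above_mean(20, hypothetical_prior_leaders))
```

For a stat where lower is better, such as ERA, the result comes out negative for a standout season, so you would compare absolute values or flip the sign.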

Standard deviations are useful since they adjust for scale and variability. While home runs, ERA, and batting average have different scales, the number of standard deviations from the average means roughly the same thing for each. In addition, a particular season may stand out more if league-leading totals are stable rather than volatile year-to-year.

The historical comparison for Ruth used the birth of the American League (1901) as the starting point, or 19 seasons in all. I also used the prior 19 seasons for the other record-setters to allow an apples-to-apples comparison. There’s nothing magic about using 19 seasons, but it appears reasonable. It provides enough data to make some statistical inferences while still being roughly contemporary to the record-setting season. As an initial cut, the comparison focused on individual leagues rather than Major League leaders, since rules, ballpark dimensions, and playing style can vary by league. Results comparing standout seasons to previous Major League leaders come out slightly differently, as we’ll discuss later.

The table below compares some black swan seasons. Under a normal (bell-shaped) probability curve, 2 standard deviations on either side of the mean should cover about 95% of the population; 3 SDs should cover 99.7%; and 4 SDs should cover 99.99%. A result more than 5 SDs from the mean is off-the-charts improbable. You wouldn’t necessarily expect these records to follow a normal distribution, but the number of standard deviations provides some sense of orders of magnitude.
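Those coverage figures can be checked with a few lines of Python. This is just a sketch using the standard library’s error function; the function name is mine, and it assumes an exactly normal distribution, which these records may well not follow.

```python
import math


def coverage_within(k):
    """Share of a normal population lying within k SDs of the mean (two-sided)."""
    return math.erf(k / math.sqrt(2))


for k in (2, 3, 4, 5):
    print(f"{k} SDs: {coverage_within(k):.5%}")
```

The 2 SD figure comes out slightly above 95%, which is why 1.96 SDs is the usual cutoff for exactly 95%.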

A possible surprise here is that the biggest black swan wasn’t Babe Ruth in 1920 but Maury Wills’ breaking the stolen base record in 1962. Wills didn’t pass Ty Cobb’s old record of 96 by that much, especially after taking the longer season into account. But stolen bases had become much less frequent since Cobb’s time.

Bill James’ otherwise terrific Historical Baseball Abstract (2001 edition) takes a somewhat dismissive view towards Wills’ stolen base accomplishments. James claims “the stolen base revolution began while Wills was in the minor leagues, and did not accelerate after Wills stole 104 bases in 1962.” It didn’t help that James has a low opinion of Wills as a person. (“Maury Wills is a creep.”) Singles hitters who rarely walk also don’t receive a lot of love from the sabermetrics community.

Wills stole at least twice as many bases as any previous National Leaguer since the Dead Ball Era. That sounds revolutionary to me. Wills stands out less when you also consider the American League, where Luis Aparicio stole more than 50 bases in each of the three preceding years, but Aparicio’s high mark was only 56.

James also claims that catchers of Wills’ era “couldn’t throw.” I don’t know about that, but it highlights a significant point. Ruth’s 54 home runs came with the introduction of the “lively ball.” What this really meant was that baseball adopted new rules around reuse and intentional scuffing of baseballs by pitchers. This made the baseball easier to hit and batting averages increased by 15 points that year. Ruth was the first and most successful in going for the home run in the new, more favorable environment. Likewise, if catchers indeed couldn’t throw, Wills exploited this weakness more than anyone else in his era.

Baseball’s Dodo Birds

A surprise on the other end was that Ted Williams’ .406 came in at only 1.5 SDs above the league-leader average, which under a normal curve would happen about once every 15 years. Hitting .400 was still quite common in the 1920s and batters made serious runs at .400 in the 1930s. Rod Carew’s .388 in 1977 rates as a bigger black swan.
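The “once every 15 years” figure follows from the upper tail of the normal curve. Here is a rough sketch; the function names are mine, and it assumes league-leading averages are normally distributed and independent from one season to the next.

```python
import math


def upper_tail_probability(z):
    """Chance that a normal value lands more than z SDs above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def seasons_between(z):
    """Expected number of seasons between results at least z SDs above the mean."""
    return 1.0 / upper_tail_probability(z)


print(round(upper_tail_probability(1.5), 4))
print(round(seasons_between(1.5), 1))
```

Only the upper tail counts here, since a league leader falling 1.5 SDs below the average is not the kind of outlier we care about.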

While Williams’ average wasn’t that much higher than his recent predecessors, few have even come close to .400 since 1941. Williams was less a black swan than a dodo bird. Like the dodo, .400 hitters have become extinct. If we look at American League leaders for the 19 years following Williams’ .406 season, Williams stands out much more, at 3.47 SDs above the league mean. In contrast, Ruth broke his own homer record the next year, and again in 1927. Lou Brock broke Wills’ stolen base record in 1974, and Rickey Henderson broke Brock’s record in 1982.

Major League Black Swans

We can also compare a black swan’s record to Major League leaders. As shown below, the leader board looks a bit different.

This time, Ruth comes out a little ahead of Wills. The average total for AL stolen base leaders was slightly higher than for NL leaders but also a lot more variable. Ruth and Wills remain well above the pack. Bob Feller’s 348-strikeout season presents an interesting contrast. While Feller’s strikeout total was 4.02 SDs above the American League mean, it was 4.90 SDs above the Major League mean. The average for ML leaders was obviously higher than for those who merely led the AL, but there was also less variation. Feller himself may account for much of the variation. For the 19 preceding years, no American Leaguer other than Feller had even 210 strikeouts. Feller beat that total four other times.

Not An Exhaustive List

The cases shown here won’t capture every standout season. The focus was on notable seasons that even a casual fan might recognize. I also relied largely on old school stats. One exception is the power–speed (P/S) number, which equals (2*HR*SB)/(HR+SB). Fans might not know that specific formula, but Ohtani’s 50/50 season generated plenty of publicity.
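For anyone who wants to play with the power–speed formula, here is a small Python sketch. The function name is mine, and returning zero for a player with no homers and no steals is my own convention, since the formula is undefined in that case.

```python
def power_speed(home_runs, stolen_bases):
    """Power-speed number: (2 * HR * SB) / (HR + SB)."""
    total = home_runs + stolen_bases
    if total == 0:
        return 0.0  # formula is undefined here; returning 0 is an assumption
    return 2 * home_runs * stolen_bases / total


print(power_speed(50, 50))  # an exact 50/50 season
print(power_speed(40, 10))  # a lopsided season with the same combined total of 50
```

The formula is the harmonic mean of the two totals, so it rewards balance: a player needs both homers and steals to score well, and a lopsided season rates far below an even one.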

A black swan is supposed to be not just unexpected but also impactful. Home runs, stolen bases, and strikeouts fall into that category. There are more obscure stats that are nonetheless interesting. Consider Ron Hunt’s 1971 season, where he was hit by a pitch 50 times. Hunt’s modern record still stands. (Hughie Jennings had 51 in 1896.) And it would stack up with the biggest black swans, at 5.9 SDs above NL leaders. Hunt’s record stands out even more compared to Major League leaders, at 7.45 SDs, higher even than Wills and Ruth.

Although crowding the plate and not getting out of the way of oncoming pitches won’t make many highlight reels, Hunt’s 1971 season was very productive. Hunt hit only .279, but his on-base percentage was .402. That placed him fourth in the league, especially impressive for a middle infielder. As I noted in an earlier post on Dick Selma’s 1970 season, Hunt’s 1971 season deserves more recognition.



Comments

One response to “Baseball’s Black Swans”

  1. Scott McKinstry

    Wow! I’m very impressed!!!

    I love baseball stats and I take pride in a lot of my findings. I like to tell people they should use baseball stats to teach math in school.

    This research is absolutely awesome! I still don’t understand some of the math of your findings, but that’s on me.

    I can get lost for hours doing baseball stats.

    Thank you so much!!!
