Was Bob Welch’s 1990 Cy Young Award a Travesty?

Baseball fans love to engage in retrospective arguments. Who really deserved the MVP or Cy Young Award (CYA) twenty, thirty, or fifty years ago? These arguments often pit old-school statistics against sabermetrics. One especially controversial choice was Bob Welch winning the 1990 AL Cy Young Award over Roger Clemens. But was Welch’s selection really such a travesty? Let’s look at the evidence.

Welch led the AL in wins that year, but Clemens was better in nearly every other category: ERA, ERA+, complete games, shutouts, and wins above replacement (WAR). Clemens probably had the better year, but there’s a tendency to diminish Welch’s accomplishments. A 2014 article from the Sporting News is typical: “Winning 27 games is an achievement. It means you won 27 games. It does not mean you were the best pitcher in the league, or even on your own team. Tony La Russa knew this, which was why Dave Stewart started Games 1 and 4 of both the ALCS and the World Series.” (The A’s were swept in four games in the World Series.)

Welch’s award is also seen as the product of a benighted era when award voting focused only on wins and ignored everything else. Voters today give much more weight to sabermetrics, but voting back then wasn’t quite as simplistic as some seem to assume. Otherwise, Bob Gibson (22-9) wouldn’t have won the 1968 CYA over Juan Marichal (26-9) and Tom Seaver (19-10) wouldn’t have won the 1973 award over Ron Bryant (24-12).

The Case Against Welch

Higher ERA in a pitcher’s park. Welch had a solid ERA of 2.95, good for sixth in the league, but well above Clemens’s 1.93. Welch also benefited from pitching his home games in a pitcher’s park: his ERA was a full two runs higher on the road.

Few complete games. Welch pitched only two complete games that year. The game had already changed considerably since the 1970s, when pitchers like Steve Carlton, Fergie Jenkins, and Catfish Hunter could pitch 30 complete games in a season. Still, Welch’s complete game total was well below that of Clemens (7) or teammate Dave Stewart (11). He also didn’t pitch as deep into games. He averaged 6.8 innings per start, about half an inning less than Clemens (7.36) or Stewart (7.42).

A WAR of only 2.9. Wins Above Replacement (WAR) shows a huge advantage for Clemens (10.4 to 2.9). A WAR of 2.9 is respectable (about what you’d expect for a starting pitcher), but it’s well below the All-Star level (5+). WAR can be a useful way to compare players across teams and even across eras, but it’s hardly infallible. As discussed further below, the 2.9 WAR for Welch just isn’t very credible.

The Case for Welch

He went 27-6! Sure, a pitcher’s won-loss record isn’t everything, but it’s still an important consideration. After all, the purpose of the game is to win, and the most direct measure of that is wins rather than wins above replacement. The A’s were a very good team and that certainly helped. But they won 103 games, not the 132 games that would be consistent with Welch’s .818 winning percentage.
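To put numbers on that comparison, here is a minimal Python sketch that converts a won-loss record into a winning percentage and scales it to a full schedule. The function name is illustrative, and the 162-game schedule is an assumption about how the full-season pace was computed. For a 27-6 record it works out to roughly an .818 winning percentage and between 132 and 133 wins.

```python
def full_season_pace(wins: int, losses: int, schedule: int = 162) -> tuple[float, float]:
    """Return (winning percentage, wins over a full schedule at that rate)."""
    pct = wins / (wins + losses)
    return pct, pct * schedule


# Welch's 1990 record, as given in the text.
pct, pace = full_season_pace(27, 6)
print(f"{pct:.3f} winning percentage, about {pace:.1f} wins over 162 games")
```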

The Sporting News comment that Welch wasn’t even the best pitcher on his team is a common one. But it’s not as though the A’s sent out the first team for Welch and the junior varsity for Stewart. Some of the gap between their records can be put down to luck, but five more wins and five fewer losses on the same team in the same season is quite a difference. Stewart’s other pitching metrics (ERA, complete games) were better. Perhaps if you ran 10,000 simulated seasons based on these metrics, Stewart would come up with a better record most of the time. But in the actual 1990 season, Welch won five more games.

Was it Luck? Implicit in the dismissive view of Welch’s 27 wins is the idea that Welch was pretty good that year and very lucky. But how lucky would he need to be? The binomial distribution gives the probability of a given number of “successes” in a fixed number of independent binary trials (heads-tails, win-lose). In other words, it’s a way to quantify luck. In a simple coin flip, each outcome is equally likely, but you can make the model more realistic by weighting the probabilities. The 1990 A’s had a winning percentage of 63.6%, so they would be expected to win 22.26 of Welch’s 35 starts. They actually won 29. The chance of winning 29 or more is only 1.1%. That would be a lot of luck.

It’s also a bigger outlier than Clemens’s record. The Red Sox were a weaker team, with a winning percentage of 54.3%. They went 22-9 in Clemens’s starts, against an expected 16.83 wins in 31 starts. The chance of winning 22 or more was about 4.5%.
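For readers who want to check these figures, the binomial tail probability can be computed directly. The sketch below is one way to do it in Python, not necessarily the method used for the numbers above. The function name is illustrative, and the model assumes every start is an independent trial with the team’s season-long winning percentage as the chance of winning. It should land close to the 1.1% and 4.5% figures quoted above.

```python
from math import comb


def prob_at_least(wins: int, starts: int, p: float) -> float:
    """P(X >= wins) when X follows a Binomial(starts, p) distribution."""
    return sum(
        comb(starts, k) * p**k * (1 - p) ** (starts - k)
        for k in range(wins, starts + 1)
    )


# Team winning percentages and records in each pitcher's starts, from the text.
welch = prob_at_least(29, 35, 0.636)    # A's went 29-6 in Welch's starts
clemens = prob_at_least(22, 31, 0.543)  # Red Sox went 22-9 in Clemens's starts

print(f"Welch:   {welch:.1%}")
print(f"Clemens: {clemens:.1%}")
```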

As with any statistic, the binomial distribution has its limitations. It doesn’t control for the quality of the rest of the pitching staff. Welch could have merely been the best of a bad lot. But that doesn’t seem to be the case. The other four A’s starters (Stewart, Scott Sanderson, Mike Moore, and Curt Young) ranged in age between 30 and 33. They combined for 561 career wins with a cumulative WAR of 86.2.

Wins in Historical Context

Welch not only led the league in wins that year but had the most wins by any American League pitcher since Denny McLain’s 31 in 1968. An earlier post discussed the concept of a black swan performance, which identifies outlier seasons by comparing season totals to recent league leaders in the same statistical category. Welch wasn’t that much of an outlier, in that 24 or 25 wins in a season weren’t especially unusual during the preceding two decades. Welch’s 27 wins were 1.64 standard deviations from the mean for AL leaders between 1971 and 1989. That makes Welch’s season about as much of an outlier as Ted Williams’s .406 in 1941 (1.51 SDs).

As with Williams, Welch’s season really stands out when compared to league leaders in the seasons that followed. Williams and Welch were less black swans than dodo birds – the last of their kind. Welch’s 27 wins are 4.25 SDs above the mean for league leaders from 1991-2008. That’s even better than Williams’s 3.46 SDs.
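The standard-deviation comparisons above follow a simple recipe: subtract the mean of the league leaders’ totals from the season in question, then divide by their standard deviation. The Python sketch below shows the idea. It is not the calculation from the earlier post: the function name is hypothetical, the use of the sample standard deviation is an assumption, and the totals in the example are placeholders rather than actual league-leader figures.

```python
from statistics import mean, stdev


def sds_above_leaders(value: float, leader_totals: list[float]) -> float:
    """Number of sample standard deviations `value` sits above the leaders' mean."""
    mu = mean(leader_totals)
    sigma = stdev(leader_totals)  # assumption: sample, not population, SD
    return (value - mu) / sigma


# Placeholder numbers only, NOT real league-leading win totals.
placeholder_leaders = [20, 22, 24]
z = sds_above_leaders(26, placeholder_leaders)
print(f"{z:.2f} SDs above the mean")
```

Swapping in the actual league-leading totals for a given window would give the figure for that window.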

Welch’s 2.9 WAR just isn’t intuitive. WAR is a model of sorts that is supposed to summarize a player’s performance. Models should be both statistically sound and produce realistic, intuitive outputs. As the name implies, the WAR total indicates how many of a team’s wins that player contributes over a replacement level player (i.e., a borderline major leaguer). Taken literally, a 2.9 WAR implies that a replacement level pitcher would have won about 24 games in Welch’s place. It’s hard to see some pitcher the A’s pulled off the waiver wire going 24-9.

We can also compare Welch to himself. Specifically, let’s compare his 1990 season to his 1987 season, when Welch led National League pitchers with a 7.1 WAR. The 1987 Welch had more complete games, shutouts, and strikeouts. Welch also won 12 fewer games in 1987. The 1987 Dodgers had a losing record, but Welch’s ERA was also higher at 3.22. ERA+, which adjusts for league and ballpark factors, was identical. One interesting difference was that Welch allowed 12 unearned runs in 1990 while allowing only 4 in 1987. Strangely enough, the 1987 Dodgers led the league with 155 errors while the 1990 A’s had only 87. A good case can be made that Welch was better than his 1987 won-loss record indicated and not as good as his 1990 record. But it’s hard to see how 1987 Welch was twice as valuable as the 1990 version.

A Game-by-Game Analysis

Many pitching metrics use totals and averages and may not fully reflect what happened game by game. Welch’s 1990 record was obviously helped by solid run support, but that doesn’t mean he won a lot of 7-5 games. Welch’s 27 wins include 13 where he allowed one or fewer earned runs. He allowed 2 earned runs in an additional five games. Welch allowed 4 earned runs in five of his wins; his worst effort was 4 earned runs over 5 2/3 innings. He didn’t allow more than 4 earned runs in any of his wins.

Contrast that with Bob Gibson’s 1970 season. Gibson won the CYA that season with a 23-7 record. He also led Major League pitchers with an 8.9 WAR. Gibson won one game that year allowing eight earned runs, another allowing five, and three more allowing four. Gibson had a lot more complete games, but there wasn’t a big difference in the average level of run support (5.21 for Welch vs. 4.94 for Gibson). Gibson in 1970 was probably better than Welch in 1990, but three times better? Hard to see it.

Right Place at the Right Time

A lot of baseball statistics, particularly the more advanced ones, look at a player’s performance independent of that of his team. In other words, how well would a player have performed if dropped in on a random team? This can be a good way to compare players across teams, leagues, and even eras. But it doesn’t tell the whole story. There are times when someone is the right player, for the right team, at the right time.

Welch was an especially good fit for the 1990 A’s. He didn’t get a lot of strikeouts. But the A’s had a good fielding team and sometimes it makes sense to let your teammates do their jobs. Opposing hitters grounded into 25 double plays against Welch in 1990, compared to only 10 in 1987. Welch didn’t pitch many complete games, but did he really need to? Closer Dennis Eckersley was lights out with an ERA of 0.61. The entire A’s bullpen performed well that year, with Eckersley, Todd Burns, Gene Nelson, Rick Honeycutt, and Joe Klink combining for a 1.95 ERA over 317 2/3 innings.[1]

Spreading Around the Recognition

There is also a tendency to spread recognition around when it comes to awards. That’s not a great reason but certainly an understandable one. Willie Mays led NL position players in WAR ten times from 1954 to 1966 and won the MVP only twice. And Mays was a beloved player. Clemens, on the other hand, had a reputation for being, shall we say, prickly. By 1990 Clemens had also already won two CYAs, and he would go on to win five more. Welch was 33 and had never won the Cy Young or any other major award. But he had a great elevator pitch: the winningest season in the AL since they lowered the pitching mound (in 1969). I could certainly see choosing a 33-year-old who had never won the award over a 27-year-old who had already won twice, especially if the 33-year-old also had six more wins.

[1] This analysis excluded Burns’s two starts.
