Several weeks ago in this space I took a look at batters that PECOTA has habitually overrated or underappreciated over a period of several seasons. Today I’ll take a look at starting pitchers to see if we can identify those that continually flummox PECOTA by making a mockery of their pre-season forecasts year after year.
When comparing hitters I used Equivalent Average, a metric that has the advantage of being specifically forecast by PECOTA and is translated to account for contexts such as ballpark and league difficulty. For pitchers, finding such a straightforward comparison between PECOTA and actual performance is a little trickier. For this article I’ve used PECOTA’s projected Equivalent ERA (EqERA) and compared it to the “translated” ERA as shown on a pitcher’s DT Card. Both numbers are adjusted to account for differences in league and ballpark and are calibrated to fit a fictional league with an average ERA of 4.50, so they lend themselves quite well to comparison. On its own, ERA can be a pretty blunt instrument, dependent to some extent on factors beyond a pitcher’s control, but the translated version should be good enough for our purpose here, which is to identify those pitchers that PECOTA has habitually misread.
The charts below are based on the 153 pitchers that pitched at least 100 “translated” innings (per their DT Card, which adjusts usage somewhat) in 2006, and then follow the 2006 PECOTA “misses” to see whether PECOTA improves its accuracy over time. Very few relievers reach the century mark in innings, and even fewer do so in multiple seasons, so this should make our sample almost entirely starting pitchers. I’m using a benchmark of 0.33 runs of EqERA to identify a “missed” projection; there’s no complex statistical reason for that number, other than that one-third of a run seemed about right.
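For readers who want to apply the same benchmark to their own numbers, here is a minimal Python sketch of the rule described above. The function name `classify` and its labels are my own shorthand, not anything PECOTA produces. It assumes the cutoff is strict ("more than" a third of a run), which is how the results are worded in this article.

```python
THRESHOLD = 0.33  # runs of EqERA; the article's benchmark for a "missed" projection

def classify(actual_eqera, pecota_eqera, threshold=THRESHOLD):
    """Label one pitcher-season relative to its PECOTA projection.

    Assumption: a miss requires a gap strictly greater than the threshold.
    """
    # Round to two decimals so floating-point noise cannot flip a borderline case.
    gap = round(actual_eqera - pecota_eqera, 2)
    if gap > threshold:
        return "optimistic"   # PECOTA's EqERA was too low; the pitcher was worse
    if gap < -threshold:
        return "pessimistic"  # PECOTA's EqERA was too high; the pitcher was better
    return "hit"

# Gil Meche's 2006 figures, from the eight-pitcher table later in this article.
print(classify(4.85, 5.37))
# Two hypothetical pitcher-seasons, for illustration only.
print(classify(5.00, 4.50))
print(classify(4.50, 4.40))
```

Lower is better for ERA, so an "optimistic" miss means the pitcher's actual EqERA came in higher than the forecast.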
First up are the players that underperformed their PECOTA projection by that magical 0.33 runs:
Pitchers with 100+ Translated IP During Season: PECOTA Optimism

                                 Sample   PECOTA EqERA
Year   Sample Description        Size     0.33 runs Low   Pct.
2006   All Players               153      36              24%
2007   Optimistic in 2006        18       2               11%
2008   Optimistic in 2006-07     1        0               0%
During the 2006 season, only 24 percent of pitchers that reached the 100-inning threshold were more than a third of a run worse than PECOTA’s projection. Note that there is a selection bias at play here: pitchers that underperform their projections are far more likely to lose their spot on a staff, and thus not meet the innings threshold, than those that meet or exceed expectations. Of those 36 pitchers who disappointed in 2006, 18 went on to pitch 100 innings in 2007, with only two able to spend significant time in a major league rotation while continuing to significantly underperform their projection. By 2008, only one two-time disappointment logged the required 100 innings yet again, finally validating PECOTA’s trust by exceeding his forecast.
Disappoint PECOTA twice at your peril; do so three times, and it’s highly unlikely you’ll continue to be entrusted with a major league rotation spot. Byung-Hyun Kim was only able to leverage the belief that he could morph back into his early-career Snake form for two seasons before the wishcasting came to an end. Only Felix Hernandez, the object of PECOTA’s longest-running unrequited bot-crush, was given a third chance to match PECOTA’s great expectations. It’s good to be the King.
So PECOTA almost never overhypes a starting pitcher three times, due to baseball’s natural culling of the pitching herd. What about players that outperform PECOTA’s pessimistic forecasts?
Pitchers with 100+ Translated IP During Season: PECOTA Pessimism

                                 Sample   PECOTA EqERA
Year   Sample Description        Size     0.33 runs High   Pct.
2006   All Players               153      82               54%
2007   Pessimistic in 2006       48       26               54%
2008   Pessimistic in 2006-07    19       8                42%
During the 2006 season, fully 54 percent of pitchers that reached the 100-inning threshold were more than a third of a run better than PECOTA’s projection. This may seem high, but again the selection bias is at work here: you usually get to stay in the rotation if you’re pitching well. Of those 82 go-getters, 48 pitchers then went on to toss 100 innings in 2007, with PECOTA again underestimating 54 percent of them. By 2008, only 19 pitchers that had twice been underestimated were able to log 100 innings, and eight of them were dissed by PECOTA a third time. A little over five percent of the pitchers in the initial sample (eight of 153) beat their projections by a fair amount three times in a row. For hitters the number was a little under five percent, which is quite comparable.
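The percentages in the two tables are simple ratios of misses to sample size, and they are easy to reproduce. The short Python sketch below does the arithmetic using only the counts reported in this article. The variable and function names are mine, and rounding to the nearest whole percent is an assumption about how the published figures were rounded.

```python
# (sample size, misses) per season, taken from the two tables in this article
OPTIMISM = {2006: (153, 36), 2007: (18, 2), 2008: (1, 0)}
PESSIMISM = {2006: (153, 82), 2007: (48, 26), 2008: (19, 8)}

def miss_rate(sample_size, misses):
    """Misses as a whole-number percentage of the sample."""
    return round(100 * misses / sample_size)

for label, table in (("Optimism", OPTIMISM), ("Pessimism", PESSIMISM)):
    for year, (sample_size, misses) in sorted(table.items()):
        print(f"{label} {year}: {misses} of {sample_size} = {miss_rate(sample_size, misses)}%")

# Share of the original 153 pitchers that PECOTA underestimated three years running.
three_time_share = 100 * PESSIMISM[2008][1] / PESSIMISM[2006][0]
print(f"Three-time pessimistic misses: {three_time_share:.1f}% of the 2006 sample")
```

Note that the 2007 and 2008 rates use the surviving pitchers as the denominator, not the original 153, which is why selection bias matters so much when reading them.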
What is it about these pitchers that habitually gives PECOTA indigestion?
                    2006     2006   |  2007     2007   |  2008     2008
                    Actual   PECOTA |  Actual   PECOTA |  Actual   PECOTA
Player              EqERA    EqERA  |  EqERA    EqERA  |  EqERA    EqERA
Gil Meche           4.85     5.37   |  3.67     4.94   |  3.86     4.32
Ted Lilly           4.38     5.10   |  3.65     4.39   |  4.06     4.44
Chad Billingsley    3.81     4.62   |  3.12     4.74   |  3.47     4.13
Matt Cain           3.84     4.78   |  3.41     4.49   |  3.68     4.29
Wandy Rodriguez     5.85     6.87   |  4.38     5.74   |  4.32     4.84
Derek Lowe          3.50     4.59   |  3.94     4.56   |  3.48     4.41
Chris Young         3.57     4.76   |  3.34     4.39   |  3.76     4.25
Chien-Ming Wang     3.68     4.98   |  3.64     4.27   |  3.58     4.47
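To see how far off each projection was, the table above can be reduced to a gap (PECOTA EqERA minus actual EqERA) per pitcher-season. The Python sketch below does that using the table's figures as printed. The names `PITCHERS`, `gaps`, `closest`, and `widest` are illustrative choices of mine.

```python
SEASONS = (2006, 2007, 2008)

# (actual EqERA, PECOTA EqERA) for each season, copied from the table above
PITCHERS = {
    "Gil Meche":        [(4.85, 5.37), (3.67, 4.94), (3.86, 4.32)],
    "Ted Lilly":        [(4.38, 5.10), (3.65, 4.39), (4.06, 4.44)],
    "Chad Billingsley": [(3.81, 4.62), (3.12, 4.74), (3.47, 4.13)],
    "Matt Cain":        [(3.84, 4.78), (3.41, 4.49), (3.68, 4.29)],
    "Wandy Rodriguez":  [(5.85, 6.87), (4.38, 5.74), (4.32, 4.84)],
    "Derek Lowe":       [(3.50, 4.59), (3.94, 4.56), (3.48, 4.41)],
    "Chris Young":      [(3.57, 4.76), (3.34, 4.39), (3.76, 4.25)],
    "Chien-Ming Wang":  [(3.68, 4.98), (3.64, 4.27), (3.58, 4.47)],
}

# Positive gap = the pitcher beat his projection by that many runs of EqERA.
gaps = {
    name: [round(pecota - actual, 2) for actual, pecota in rows]
    for name, rows in PITCHERS.items()
}

for name, row in gaps.items():
    print(f"{name:<18}" + "  ".join(f"{year}: {gap:+.2f}" for year, gap in zip(SEASONS, row)))

# Flatten to (gap, pitcher, season) tuples to find the narrowest and widest misses.
all_gaps = [(gap, name, year) for name, row in gaps.items() for gap, year in zip(row, SEASONS)]
closest = min(all_gaps)
widest = max(all_gaps)
print("Narrowest miss:", closest)
print("Widest miss:", widest)
```

Every one of the 24 pitcher-seasons clears the 0.33-run benchmark, though a few of the 2008 gaps do so only narrowly.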
This is a prime example of what Steven Goldman might call “a congeries of unlike players.” Worm-killers like Lowe and Wang are balanced out by the soft-tossing fly-ball artistry of Young and Lilly. There are youngsters like Cain and Billingsley who seemingly matured ahead of PECOTA’s anticipated timetable for them, and late-bloomers like Rodriguez or a re-bloomer like Meche, whose sudden successes belied a fairly well-established previous pattern of mediocrity. Even diving in from this 30,000-foot view to review a little more detail reveals very little. Many of these pitcher-seasons feature a relatively low BABIP, yet that doesn’t really explain much, as PECOTA often predicted an even lower BABIP rate. No matter how long I stare at the list above, the secret Magic Eye picture never reveals itself. The only unifying fact is this: PECOTA initially projected each player as being subpar (in some cases well below par), then in most cases improved the projection in the following years, but never by enough to match the player’s actual production.
Will any of these players make PECOTA out to be a four-time loser? Right now, Cain (Projected 4.14/Actual 3.19) continues to be an icon of misunderstood youth, while PECOTA has even less faith in the continuing effectiveness of Rodriguez (Projected 4.57/Actual 3.65). No one else seems likely to greatly exceed their projections.
Traditionally, pitcher performance is considered to be more variable and harder to predict than batting production. While PECOTA may seem to have similar counts of hits and misses for both pitchers and hitters over time using the criteria spelled out in these two articles, that point isn’t proven; the “0.33 points of ERA / 10 points of EqA” benchmarks used aren’t necessarily equivalent margins of error. Lists of PECOTA’s recurring misses are somewhat like lava lamps: interesting to look at, but only marginally illuminating. Further research is needed to throw more light on the types of players that are more likely to be badly misread, and in which direction. But if, like me, you once took Dan Meyer in an early round of your sim league draft, perhaps you can find comfort in the thought that even PECOTA can sometimes be very, very wrong.
Also, Wandy's a guy who has always shown great peripherals aside from his struggles with gopheritis.
Will Christina let you get away with that?
But since it's been 13 days since the first article, personally I think "several weeks ago" is reasonably accurate (albeit one day short), though admittedly not particularly inspired.
If PECOTA is a tabletop, it may appear smooth to the eye, but take a scanning microscope to it, and one will see bumps and holes. However, there is really no way to observe the small bumps without zooming in, and thereby losing the original units of measurement, and even the original laws of physics. To complete the analogy, the naked eye’s view is the broad statistical trends, while case-specific accounts take into account the information that is introduced by zooming in with the microscope. By this analogy, there will not likely be a statistical solution to the problem of individual outliers, because the sources of explanation for them likely won’t be statistical but mechanical/scouty (unless we invent a new set of base statistics that can take into account the mechanical/scouty points; this is, however, a dismal prospect).
How many pitchers would we expect PECOTA to underestimate 3 times running just by chance, given the standard error of the projections? I have a hunch the answer is very close to 8...
That PECOTA is the best of the projection systems tells us that, even with all the great research being done, computers still can't predict player performance any better than can a reasonably sentient fan. With all due respect, BP authors ought to keep that in mind when the urge arises to treat PECOTA projections as inevitable.
Let's just say that I and my PECOTA-drafted fantasy teams doubt your "any sentient fan" imaginary construct. I don't see that "fan" putting his work out in April year after year to be judged.
PECOTA and its forecasting brethren do not hold themselves out as infallible oracles; and only an idiot would use them as such. They're tools, a baseline to assist a "sentient fan" in making his own judgments about what a player's performance will be. Those tools are only as good as the person using them - if the predictive house you build comes crashing down, don't blame PECOTA.
Moreover, BP authors often rely on PECOTA to evaluate signings, trades, etc., as if PECOTA is destiny. It's still a pretty weak predictor.
None of this should be construed as a repudiation of the work done at BP. One can be a fan without being blind to the flaws. A little humility when it comes to predicting performance is indicated here.
Not to mention, from a player perspective, someone who hits .300 will get paid more than someone who hits .286.
Besides, a person's "best guess" requires some knowledge with which to make a guess. PECOTA, along with other predictive tools, are sources of information that can help make your "best guess" better.