
PECOTA is back for 2025, and as usual, we have improvements to announce. This season, the improvements are notable both for where we chose to make them and for where we chose not to.

Pitchers

Although we think PECOTA already does a fine job projecting pitchers, it is about to become even better. In fact, we expect PECOTA to improve pitcher projection accuracy in 2025 by hundreds of runs, rookie pitchers included. In back-testing of the 2024 season using only data through 2023, we found that the updated approach improved accuracy by over 341 runs overall, with rookies alone improving by over 113 runs, thanks in part to the StuffPro ratings of Triple-A pitchers we already make publicly available.
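For readers who want to picture how a back-test tally like this might be computed, here is a minimal Python sketch. The article does not say which error metric was used, so summed absolute error in runs is an assumption here, and the function names and numbers are hypothetical.

```python
def total_abs_error(projected, actual):
    """Sum of absolute projection errors, in runs, across pitchers."""
    return sum(abs(p - a) for p, a in zip(projected, actual))


def runs_improved(old_proj, new_proj, actual):
    """Positive result means the new projections were closer overall."""
    return total_abs_error(old_proj, actual) - total_abs_error(new_proj, actual)


# Made-up runs allowed for three pitchers (illustration only, not real data).
actual = [70.0, 45.0, 90.0]
old_proj = [80.0, 40.0, 75.0]  # absolute errors: 10 + 5 + 15 = 30
new_proj = [74.0, 43.0, 82.0]  # absolute errors: 4 + 2 + 8 = 14

improvement = runs_improved(old_proj, new_proj, actual)
print(improvement)
```

The key discipline in a real back-test is the one stated above: the projections being graded may only see data from before the season being projected.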

Two changes drive these improvements. Most of the gain comes from integrating our StuffPro metric into the projection of future balls in play, and only balls in play. The rest comes from a small but important tweak to how we define the next season.

StuffPro Integration

Projecting pitcher balls in play has always been the most difficult part of the task. Unlike strikeouts and walks, balls in play are not composite events: you don’t get two or three do-overs when the batter makes contact. Rather, balls in play are singular events that can dump the ball anywhere inside (or perhaps outside) the field, at locations where fielders may or may not be waiting. Pitcher batted ball outcomes are thus inherently volatile, which is problematic because balls in play drive run-scoring. On an individual season basis, our metrics Deserved Run Average (DRA) and Contextual FIP (cFIP) seem to cut through this noise better than other pitcher run estimators, but all run estimators still leave plenty to be desired.

We are further challenged by the reality that batters are more responsible than pitchers for batted ball outcomes. Although this is good news for evaluating batters, it means that even on their best days, pitchers tend to swim upstream. According to the models I recently introduced, pitchers have a slight advantage in controlling launch angle (10% or so), but when it comes to the ball’s launch speed (exit velocity), I estimate that the average batter has 2.6 times (260%) the influence that a pitcher does. Yikes.

Nonetheless, pitchers have somehow managed to gain the upper hand. Even with infield shifts limited, pitchers continue to contain baseball offense to historically inept levels. The reason is that the pitcher does control one important thing: they throw the pitch that starts everything. The batter can only swing at the pitch that is thrown to them at the location where it crosses the plate. Thus, if a pitcher can maximize the relevant qualities of their pitches, they in theory can better limit the damage caused by launch speed and at least hold their own on launch angle. The trick is to figure out how the successful ones do this, and the unsuccessful ones do not.

Boosted tree models help us grade individual pitches by their inherent “stuff” (movement, velocity, etc.), and further analysis has helped us control for context, such as pitch location and circumstances (e.g., the count). At Baseball Prospectus, our metrics measuring these qualities are StuffPro and PitchPro, respectively. Although many thoughtful, competing metrics exist, our analysis shows StuffPro and PitchPro to best reflect both pitcher skill generally (through reliability) and pitcher skill at preventing runs (this season and next). The same article explains how StuffPro and PitchPro also tend to better resist confounders that can frustrate these metrics, such as pitchers changing teams and inconsistency in location rating.

Being composite events, strikeouts and walks largely speak for themselves “as is,” particularly after being shrunk with rigorous statistical methods. But balls in play remain desperate for projection improvement, and this is where StuffPro helps us out, albeit through an unusual process. (PitchPro doesn’t perform as well for projection, likely because the inherent qualities of a pitch persist longer than the context in which it was thrown. Our new arsenal metrics were not finished at the time PECOTA had to be, so their possible future inclusion is on the menu.)

So how do StuffPro’s batted ball ratings affect PECOTA? Traditionally, PECOTA projects individual batting events: not only strikeouts and walks, but singles, doubles, triples, home runs, etc. However, a projection is usually graded on composite run value. If the final run value is not accurate enough, you need to identify the problematic event and improve that model. Perfectly doable, but it always requires this intermediate step.

The thing is, I am confident that teams themselves do not do this: they fit to run value and probably only run value. The same is true for pitch quality metrics, which also typically grade balls in play by run value. In fact, nobody except weirdos who run baseball hobbyist websites and/or play in made-up baseball leagues cares how many singles a pitcher allows versus triples. Unfortunately, we are weirdos who run a baseball website, and our writers and readers adore their made-up baseball leagues. So we have two masters to serve, and while proven accuracy is our top priority, we can’t just ignore the ultimate ways in which a pitcher’s run values connect to various batting events.

Thankfully, we found a solution: projecting a run value directly from a pitcher’s historical and projected platoon-adjusted StuffPro ratings for balls in play, and then apportioning that run value down to individual batting events, essentially reversing the previous process. This second step is not easy to do well: You can try to force the events to balance through some general purpose optimizer, but the reality is that every player has their own tendency to bucket events in different places. We want to capture that unique signal for each player. We do that through a multinomial model that can correlate by pitcher across all batting events to apportion them as each pitcher naturally tends to do, with an eye toward typical league correlations when information is more scarce. This is no mean feat, but we have many more options than we used to, and Stan is a wonderful thing.
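To make the “run value first, events second” idea concrete, here is a toy Python sketch. It is not the multinomial Stan model described above. It swaps in two much simpler stand-ins: shrinking a pitcher’s observed event shares toward league shares, then exponentially tilting those shares until their expected run value matches a projected target. The event run values, league shares, prior strength, counts, and target are all made-up assumptions for illustration.

```python
import math

# Illustrative run values per ball-in-play event (assumed, not BP's weights).
RUN_VALUES = {"OUT": -0.25, "1B": 0.45, "2B": 0.75, "3B": 1.05, "HR": 1.40}


def shrunk_shares(counts, league_shares, prior_strength=200.0):
    """Blend a pitcher's observed event shares with league shares.

    With few observed balls in play, the league shares dominate.
    """
    total = sum(counts.values())
    return {
        e: (counts[e] + prior_strength * league_shares[e]) / (total + prior_strength)
        for e in RUN_VALUES
    }


def expected_run_value(shares):
    """Average run value per ball in play implied by the event shares."""
    return sum(shares[e] * RUN_VALUES[e] for e in RUN_VALUES)


def tilt(shares, t):
    """Reweight shares by exp(t * run value) and renormalize to sum to 1."""
    weights = {e: shares[e] * math.exp(t * RUN_VALUES[e]) for e in RUN_VALUES}
    z = sum(weights.values())
    return {e: w / z for e, w in weights.items()}


def apportion(shares, target_rv, lo=-20.0, hi=20.0, iters=100):
    """Bisect on t so the tilted shares hit the target run value.

    Expected run value rises monotonically with t, so bisection is safe
    as long as the target lies between the smallest and largest run value.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_run_value(tilt(shares, mid)) < target_rv:
            lo = mid
        else:
            hi = mid
    return tilt(shares, (lo + hi) / 2)


# Hypothetical league shares and one pitcher's ball-in-play history.
league = {"OUT": 0.675, "1B": 0.21, "2B": 0.065, "3B": 0.005, "HR": 0.045}
pitcher_counts = {"OUT": 300, "1B": 80, "2B": 30, "3B": 1, "HR": 25}

target = 0.03  # projected run value per ball in play (made up)
base = shrunk_shares(pitcher_counts, league)
result = apportion(base, target)
print({e: round(s, 4) for e, s in result.items()})
```

The tilt preserves the relative shape of the pitcher’s own event mix while forcing the total to agree with the run value projection, which is the spirit of the apportioning step. A real model would estimate the pitcher-level tendencies and their correlations jointly rather than in two separate passes.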

The dream of every analyst incorporating an important new statistic is to forecast the total nobody who will stun the world and receive Cy Young votes. Unfortunately, the truth is less glamorous. In fact, some of the largest improvements we see are from somebody who otherwise would have been written off as terrible (say, a 6.50 ERA) who now gets projected for something closer to average, say 4.30. That may not seem like a sexy upgrade, but if you end up needing 100 innings from somebody, and correctly choose the supposedly replacement level pitcher who hits that better projection instead, it makes a big difference.
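The arithmetic behind that example is easy to check. The short Python snippet below uses only the ERA figures and the 100 innings mentioned above; the function name is hypothetical.

```python
def runs_saved(era_bad, era_good, innings):
    """Earned runs separating two ERAs over a given number of innings."""
    return (era_bad - era_good) * innings / 9.0


saved = runs_saved(6.50, 4.30, 100)
print(round(saved, 1))  # 24.4
```

That is roughly 24 earned runs over 100 innings. Under the common rule of thumb of about ten runs per win (an assumption, not something stated in this article), the gap is in the neighborhood of two wins.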

Pitchers for whom we lack pitch quality data receive a traditional PECOTA projection for their batting events.

Projecting “2025”

The second major change for pitchers applies to all events, strikeouts and walks included. This change involves thinking further about what it means to project a future season in a rigorous way. Many factors come into play: (1) the improvement or decline we expect from individual players; (2) aging effects on players; (3) the run environment we expect, which is affected by the previous two factors; and (4) the players we expect to play and how much. Certainly one can go back and forth with homebrewed combinations until one looks good enough / least crazy, but often you effectively end up projecting that the current “season” goes on forever, with players aging a year at a time during this infinite concept. We prefer to do it rigorously, so the system is not just taking our word for it on how these factors balance out.

So, an important tweak was to ensure that our batting event models now natively project not just each pitcher’s performance at a new age and in a new season, but also the likely effect of those changes on the overall run environment, which the model also gets the first crack at determining. To be sure, we still have to decide which pitchers we think will show up and adjust the final environment accordingly. But leaving one less recipe up to the cook produces improved results. In particular, PECOTA’s strikeout rate accuracy has jumped even higher, both for MLB veteran and rookie pitchers.
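One piece of this can be sketched simply: once each pitcher has a projected rate and projected playing time, the implied league environment is the playing-time-weighted average, and it shifts depending on who is expected to show up. The Python below is only an illustration of that bookkeeping, not PECOTA’s actual procedure; the function name, rates, and batters-faced figures are hypothetical.

```python
def implied_league_rate(projections):
    """Playing-time-weighted average of individual projected rates.

    projections: list of (projected_rate, projected_batters_faced) pairs.
    """
    total_bf = sum(bf for _, bf in projections)
    if total_bf == 0:
        raise ValueError("no projected playing time")
    return sum(rate * bf for rate, bf in projections) / total_bf


# Made-up projected strikeout rates and batters faced for three pitchers.
staff = [(0.30, 700), (0.22, 600), (0.18, 200)]
league_k_rate = implied_league_rate(staff)

# Dropping the low-strikeout pitcher from the pool raises the environment.
without_third = implied_league_rate(staff[:2])
print(round(league_k_rate, 3), round(without_third, 3))
```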

Batters

You no doubt are wondering what these new processes can mean for batters. The answer we found, curiously, was not much. No matter how hard we tried, we could not improve projected batter run accuracy by predicting directly to runs and then backing out to batting events. And launch speed and angle did not seem to be telling us much more than they already were. Why?

Perhaps it is because batters already have so much more responsibility for the actual outcome of each play, as compared to its expected or typical outcome. Indeed, their individual speed and baserunning drive whether they take an extra base or reach base at all. If they hit a ball out of the park, it doesn’t matter where the fielders were standing. Thus, it could be that directly predicting those outcomes already substantially captures the batter’s ultimate run contribution, particularly when you have years of past data for most batters. The bottom line is that batting event outcomes are just less random for hitters than they are for pitchers.

Although there is always room for improvement, it also feels relevant that PECOTA seems to be doing just fine projecting batters in its current form. Later this week, Rob Mains will demonstrate that throughout the 2024 season, you were better off using PECOTA to project rest-of-season results rather than the same season’s statistics for batters. Most people would probably expect otherwise, that projections yield in relevance at some point to what has actually been happening during the new season, but at least when it came to PECOTA and batters for 2024, that does not seem to be the case.

Other new metrics, such as bat speed, are also worth monitoring. It is hard to say whether these upstream measurements, although genuinely interesting, will turn out to have actual value for projection. Unlike teams or trainers, most of us do not have the ability to intervene and try to make bat contact more useful. Player measurements upstream of the resulting batted ball are by definition less reliable, and if a batter cannot reliably translate bat speed into launch speed, it probably does not matter how hard they swing. But more information is preferable to less, and as we get more of it, we’ll know better what, if anything, this data is useful for.

Defense

Lastly, subscribers will note that our projected DRP (Defense Run Prevented) numbers have been adjusted since our “early PECOTA launch” in December, the result of additional study into how we could better project future defense, including, you guessed it, having our models predict directly into their anticipated 2025 environment. We are pleased with the updates, although we do apologize to Yoan Moncada, one of the players most affected; admittedly not great timing with him still looking for a roster spot.

These updates are reflected in both the Team Standings released today as well as the current PECOTA spreadsheets.

specialkman
2/03
Dear lord, the Dodgers.
Ryan Dekker
2/03
Did a Blue Jays season ticket salesperson hack your system to say they're projected for 84.5 wins? At least they didn't get too greedy and say 90...
Tim Mitchell
2/04
Are there any fantasy leagues that use PECOTA?
Craig Goldstein
2/04
Not sure what you mean?