Ahead in the Count: Why SIERA Doesn't Throw BABIP Out with the Bath Water

It sometimes seems as if the main reason people are wary of Defense Independent Pitching Statistics as a way to measure pitching performance is that they are reluctant to believe the theory that pitchers do not control the hit rate on balls in play (BABIP). It does not make intuitive sense, and it isn't even entirely true. Certainly, fans who disagree loudly with these theories should be reassured by the knowledge that defense-neutral ERA estimators are usually much closer to next year's ERA than the previous year's ERA, but many fans still can't get past the point that ERA estimators usually assume that pitchers do not have control over the outcome of balls in play. That is because these estimators simply look to interpret the effect on scoring of a strikeout, walk, and home run. This gives them the strength to predict ERA well because they are able to explicitly state the effect of each of these outcomes.

Alternatively, you can now find Baseball Prospectus' new defense-neutral ERA estimator, SIERA, in the stat reports. SIERA does not make this assumption. That's because we know that pitchers do control their BABIP to a certain degree. In any given season, the average starting pitcher who can keep his job will have his BABIP determined roughly 75 percent by luck, 13 percent by his team's defense/park, and 12 percent by his own skill.* How do we actually figure out the 12 percent that is skill, when we know that the variance in BABIPs and the limits of sample size imply that such a large fraction is luck? Fortunately, J.C. Bradbury found in 2005 that much of BABIP skill from pitchers can actually be explained by their defense-independent skills. In fact, about 86 percent of the pitcher portion of BABIP skill is explained by these statistics.**

When I say that these defense-independent pitching statistics "explain" BABIP skill, I mean that you can figure out how good pitchers are at preventing hits on balls in play by knowing how good they are at these skills. Specifically, pitchers who strike out a lot of hitters also induce a lot of weak contact. Randy Johnson was good at preventing hits on balls in play, and Tim Lincecum looks pretty good, too. Additionally, pitchers who walk a lot of hitters (those who miss the corners of the plate by a few inches) also tend to leave the ball a few inches toward the middle of the plate, and are more likely to give up hits. Greg Maddux is an example of the opposite extreme, as his impeccable control allowed him to keep the hits down. Pitchers who allow a lot of fly balls also induce a lot of pop-ups, easy outs that keep BABIP down. Both Jered Weaver and Ted Lilly are fly ball-prone, and while that hurts their home run numbers, they also induce a fair number of pop-ups. The ability to control the ball, miss the bat, and hit the top of the bat all correlate very highly and explain most of the pitcher's ability to control BABIP, and SIERA makes use of all of this.

What SIERA does is simply ask the question, "How much did teams score off pitchers with these whiffing, control, and grounder skills?" instead of the question that many estimators ask: "How much do these whiffing, control, and grounder outcomes affect scoring precisely?" The latter is certainly a valuable question, but if you are interested in checking how much pitchers' BABIP skills might affect their ERA, SIERA is the statistic for you. SIERA does not precisely estimate the exact run-scoring impact of those strikeouts and other outcomes because there are only seven years' worth of reliable batted-ball data, so the coefficients will certainly require some fine-tuning as we get more data. However, the benefit of a statistic like SIERA is that we are no longer asking how well a pitcher pitched if we ignore his BABIP skill. Instead, we are incorporating the vast majority of BABIP skill in our estimate.

For the Mathematically Inclined

This section is for those people who want to see the math involved in determining my numbers above, and also provides some transparency. It is not necessary to get the main point of the article: SIERA incorporates most of the skill that pitchers do have to influence BABIP.

*: To figure out the percentage of BABIP explainable by skill, luck, and defense, we do not need to guess. We know that the variance of BABIP should equal the sum of the variances in BABIP explainable by luck, defense/park, and the pitcher himself. Because each ball in play is a binomial trial, and binomial variables have a known variance, we can pin down the luck component fairly exactly. First, the standard deviation of BABIP among all pitchers with 150 innings in a season from 2003-09 was .02125 (variance = 4.52*10^-4). The average number of balls in play for pitchers in the sample was 610, with an average BABIP of .295. Thus, we can figure the standard deviation that we would observe due to luck alone if all teams, pitchers, and stadiums had the same true BABIP: sqrt((.295)*(1-.295)/610) = .0184 (variance = 3.40*10^-4). This gives us the fraction of BABIP variance that is luck: (3.40*10^-4)/(4.52*10^-4) = 75%. The other 25 percent is going to be defense, park, or skill.
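The luck calculation above can be checked in a few lines. This is a minimal Python sketch that uses only the figures quoted in the paragraph; the function and variable names are illustrative and not part of any BP tooling.

```python
import math

def binomial_variance(p, n):
    """Variance of an observed rate when each of n trials succeeds with probability p."""
    return p * (1 - p) / n

# Figures quoted in the text for pitchers with 150+ innings, 2003-09.
observed_sd = 0.02125        # SD of observed pitcher BABIP
mean_babip = 0.295           # average BABIP in the sample
avg_balls_in_play = 610      # average balls in play per pitcher-season

observed_var = observed_sd ** 2
luck_var = binomial_variance(mean_babip, avg_balls_in_play)
luck_sd = math.sqrt(luck_var)

# Share of the observed spread that pure binomial luck would produce.
luck_share = luck_var / observed_var

print(f"luck SD = {luck_sd:.4f}")
print(f"luck share of variance = {luck_share:.1%}")
```

Working from unrounded variances, the share lands within a fraction of a percentage point of the 75 percent quoted above; the small gap comes from rounding the two variances before dividing.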

Fortunately, we can figure out the defense and park effects together by looking at overall team BABIPs in this time period. Teams had a standard deviation of BABIP within each season of about .0104 (variance = 1.08*10^-4). However, the standard deviation that we would expect from luck alone among teams with an average BABIP of .298 and about 4,300 balls in play would be, as before, sqrt((.298)*(1-.298)/4300) = .00698 (variance = 4.86*10^-5). That means the variance in actual skill level for the defenses specifically is 1.08*10^-4 - 4.86*10^-5 = 5.93*10^-5, so the standard deviation in team BABIP skill should be .0077. Thus, two-thirds of teams should be between .290 and .306 in BABIP skill level, which sounds reasonable. It also means that team defense and park effects combine to explain 13 percent of the variance in pitcher BABIP ((5.93*10^-5)/(4.52*10^-4)).
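The team-level step follows the same pattern: subtract the variance that luck alone would produce from the observed variance, and what remains is attributed to defense and park. The sketch below is a hedged illustration built only from the numbers in the text; the variable names are my own.

```python
import math

# Team-level figures quoted in the text.
team_observed_sd = 0.0104     # SD of team BABIP within a season
team_mean_babip = 0.298
team_balls_in_play = 4300

# Observed variance of individual pitcher BABIP (SD of .02125 from the text).
pitcher_observed_var = 0.02125 ** 2

team_observed_var = team_observed_sd ** 2
team_luck_var = team_mean_babip * (1 - team_mean_babip) / team_balls_in_play

# Whatever luck cannot account for is treated as true defense/park skill.
team_skill_var = team_observed_var - team_luck_var
team_skill_sd = math.sqrt(team_skill_var)

# Range covering roughly two-thirds of teams (one SD either side of the mean).
low = team_mean_babip - team_skill_sd
high = team_mean_babip + team_skill_sd

# Defense/park share of the spread in individual pitcher BABIP.
team_share = team_skill_var / pitcher_observed_var

print(f"team skill SD = {team_skill_sd:.4f}")
print(f"one-SD range = {low:.3f} to {high:.3f}")
print(f"defense/park share = {team_share:.1%}")
```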

This means that we have 12 percent of BABIP that cannot be explained by luck, defense, or park effects, and that should mean that 12 percent of BABIP is pitcher skill. Thus, pitcher BABIP skill should have a standard deviation of about .00721 (variance = 5.20*10^-5), and two-thirds of pitchers probably fall between .291 and .305 in their actual ability to prevent hits on balls in play. In fact, 95 percent of pitchers should fall between .283 and .313 in their BABIP prevention abilities. That certainly is not a complete lack of skill difference, but it is small compared to the skill difference in strikeouts, walks, and ground balls.
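As a rough check on these ranges, the sketch below takes the pitcher-skill variance of 5.20*10^-5 used in this article and asks what spread it implies. Two things here are assumptions rather than statements from the text: that true BABIP skill is roughly normally distributed, and that the ranges are centered on .298 (inferred from the quoted intervals).

```python
from statistics import NormalDist

pitcher_skill_var = 5.20e-5      # pitcher BABIP skill variance used in the article
total_var = 0.02125 ** 2         # observed variance of pitcher BABIP
center = 0.298                   # ASSUMPTION: midpoint inferred from the quoted ranges

skill_sd = pitcher_skill_var ** 0.5
skill_share = pitcher_skill_var / total_var

# ASSUMPTION: true skill is approximately normal around the center.
skill = NormalDist(mu=center, sigma=skill_sd)

one_sd = (center - skill_sd, center + skill_sd)               # about two-thirds of pitchers
central_95 = (skill.inv_cdf(0.025), skill.inv_cdf(0.975))     # middle 95 percent

print(f"skill SD = {skill_sd:.5f}, share of variance = {skill_share:.1%}")
print(f"two-thirds range = {one_sd[0]:.3f} to {one_sd[1]:.3f}")
print(f"95 percent range = {central_95[0]:.3f} to {central_95[1]:.3f}")
```

Under those assumptions the intervals come out close to the ones quoted above, with small differences in the third decimal place from rounding.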

**: To determine the amount of pitcher BABIP that can be explained by strikeout, walk, and ground-ball skills, I simply ran a regression of pitcher BABIP (weighted by PA) on the three main variables used in SIERA (SO/PA, BB/PA, and (GB-FB-PU)/PA) for all pitchers with over 40 innings in a season. This gave me the following formula for pitcher BABIP skill: .304 - .077*(SO/PA) + .018*(BB/PA) + .052*((GB-FB-PU)/PA). The variance in projected BABIPs for all pitchers with 150 innings was 4.49*10^-5. Since actual pitcher BABIP skill should have a variance of 5.20*10^-5, that means we can explain about 86 percent of pitcher BABIP skill by looking at their three primary skills. Thus, SIERA picks up on the majority of pitchers' actual BABIP skill, which explains its strong estimation abilities even with only seven years of data to work with.
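To make the regression formula concrete, here is a small Python sketch that applies the coefficients quoted above. The two pitcher profiles are hypothetical rate lines invented for illustration, not real players, and the function name is my own.

```python
def projected_babip_skill(so_per_pa, bb_per_pa, net_gb_per_pa):
    """Apply the regression quoted in the text.

    net_gb_per_pa is (GB - FB - PU) / PA, so it is negative for fly-ball pitchers.
    """
    return 0.304 - 0.077 * so_per_pa + 0.018 * bb_per_pa + 0.052 * net_gb_per_pa

# HYPOTHETICAL profiles, chosen only to show the direction of each effect.
strikeout_flyball = projected_babip_skill(so_per_pa=0.25, bb_per_pa=0.07, net_gb_per_pa=-0.05)
contact_groundball = projected_babip_skill(so_per_pa=0.12, bb_per_pa=0.07, net_gb_per_pa=0.10)

# Share of pitcher BABIP skill variance captured by the projection,
# using the two variances quoted in the text.
explained_share = 4.49e-5 / 5.20e-5

print(f"strikeout/fly-ball profile: {strikeout_flyball:.3f}")
print(f"contact/ground-ball profile: {contact_groundball:.3f}")
print(f"explained share = {explained_share:.0%}")
```

The high-strikeout, fly-ball profile projects to a lower BABIP than the contact, ground-ball profile, which is the pattern the article describes.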

roughcarrigan

3/17

"Additionally, pitchers who walk a lot of hitters, who miss the corners of the plate by a few inches, also have trouble with leaving the ball a few inches towards the middle of the plate, too, and give up more hits. Greg Maddux is the obvious example."

Um, did I accidentally click on Bizarro Prospectus?

Greg Maddux is the obvious example of a pitcher who walks a lot of hitters and who misses the corners of the plate?

swartzm

Greg Maddux is the obvious example of a pitcher who had a low BABIP because he didn't do that. I apologize if that wasn't clearer.

BurrRutledge

I had the same interpretation as roughcarrigan. Perhaps you can edit the article above to be more explicit of Maddux not missing the corners, and add in another "control pitcher" who did miss the corners.

Richie

No such thing as a control pitcher who misses corners. Unless you're studying rookie league washouts.

cjones06

or Tom Glavine

ZacharyRD

You know, it'd be incredibly helpful if BP changed how it displayed math such that it'd be easier to read visually.

iolair00

Agreed. Commercial and free solutions for rendering equations have existed for more than a decade. LaTeX and MS Word spring to mind.

rockyoursox

Just a vote that the way the more detailed math section was split off worked nicely for me. I'm not really able to follow the more detailed tech stuff, but this really allowed me to get the main point of the article before diving into the data with a little more knowledge of what to look for.

greensox

3/18

The reply button doesn't work for me.
I think Glavine and to a lesser extent Maddux missed a TON of corners....they just lived in the era of Frank Pulli et al and "Americas team" and were given the benefit of strike zone largesse.....

NathanJM

So.. that 12%... it seems like that may include some measurable components worth digging further, such as the pitcher's own defensive ability. The Greg Maddux example pops to mind again here. I think your method of looking at overall team BABIP may account for some of this, but not all. I think team BABIP makes sense for looking at the 8 guys behind him (ok.. 7 behind, 1 in front) but the fielder at the P position will always be himself, not the team's average P. I'm not sure how I'd split this out, but it's something to think about.

(That said, as I re-read the article, it sounds like previous studies show 84% of that 12% may be from the repertoire, leaving 16% of 12% as the possible gain from pitcher defense... so it might just end up being a lot of work to gain 2%)

mbodell

3/19

There are lots of other things that could be factors too including strength of opponents faced, park factors for IP faced since not everyone pitches at home/on the road the same or against the same offenses.

Also, it could involve handedness, both for amount of hits but also for which part of team defense is involved in more of the contact plays.

It isn't clear that there is a skill part here just from this to me.

ericmvan

Just a note: the historical correlation of manager to team BABIP is very high, seemingly much higher than you could explain by changes in defensive personnel. So what you're identifying here as "team" includes not only the defensive skill of the fielders, but the quality of their coaching.

At about the same time as I did that study I did a correlation study of BABIP for pitchers who changed teams -- which in fact was the first post-Voros study of any kind to demonstrate that BABIP was a pitching skill (all of this work is buried in the bowels of rec.sport.baseball). I seem to recall that the luck % that I came up with (based on the r^2 of the regression) was less than 75%. I'll have to dig up that study and think about the results.

Matt Swartz
