In my last article, I looked at the career path of Albert Pujols from the perspective of PITCHf/x. Given the extreme fluctuations in Pujols’ skill over the last five years, I suspected that he would be a good test case to understand how batters are handled differently as their skills change. I found that pitchers approached Prince Albert more and more aggressively as his skills fell off, throwing him pitches closer to the center of the zone.
Even before Pujols’ results began to decline, pitchers were attacking his strike zone ever more audaciously. Consider this graph, which looks at the trend in Pujols’ zone distance in 2009 (left of the blue line) and 2010 (right), years in which he posted TAvs of .373 and .357.
Throughout both years, pitchers were pitching Pujols closer to the center of the zone. In other words, as Pujols was at his most dangerous, pitchers were paradoxically more willing to come after him in the zone. One interpretation of this graph might be that pitchers and teams had noticed an underlying flaw in Pujols’ plate discipline, the same flaw which later caused his historic drop-off. In the broader context of understanding baseball, the above graph prompts the following question: Can one predict a hitter’s performance based on the trends in how pitchers approach that hitter?
Ben Lindbergh and Sam Miller raised this very question on a recent episode of Effectively Wild. I took their discussion as a challenge and set out to address the relationship between changes in zone distance and changes in batter ability.
The Year of Chris Davis
I’ll start by trying to predict breakouts across seasons. I’ll look for patterns of hitters with increasing zone distances in 2012 and then see if those hitters outperformed expectations in the next year. As a baseline for comparison, I’ll use PECOTA’s predictions for a hitter’s true average. Anything above that projection indicates overperformance, and a significant increase constitutes a breakout.
As a side note, for now, I’m going to limit myself to looking for breakouts, rather than breakdowns (unexpected decreases in performance). Even though Pujols’ breakdown prompted this piece, it turns out that breakdowns are more difficult to examine and predict, largely because they often involve the added complication of injury. I’ll return to the subject of breakdowns in a future article.
The graph below shows the kind of pattern we are looking for, one in which pitchers gradually learn not to provoke a hitter and begin shading ever farther away from the zone’s center.
As it turns out, this graph belongs to Chris Davis, circa 2012. Sometime around midseason, there appears to have been a dramatic change in the behavior of opposing pitchers, such that they decided to stop throwing him strikes.
Using Davis as a template, I looked at the slope of the linear regression line of each hitter’s zone distance profile over the course of the 2012 season (min. 1500 pitches). This method is a little crude, but also simple: the higher the slope, the more pitchers threw away from the zone as the season progressed. Here are the hitters with the most significant zone distance shifts for the year 2012, and how they performed in 2013 (min. 500 PAs).
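For readers who want to try the slope calculation themselves, here is a minimal sketch in Python. It is not the code behind the numbers in this article. It assumes each hitter's zone distances are already available as a chronologically ordered list, one value per pitch, and it regresses distance on pitch order (whether the original regression used pitch order or calendar date is not stated). The function names and the input layout are hypothetical.

```python
import numpy as np

MIN_PITCHES = 1500  # qualification threshold quoted in the article


def zone_distance_slope(distances, min_pitches=MIN_PITCHES):
    """Least-squares slope of zone distance against pitch order.

    `distances` is assumed to be chronologically ordered, one value per pitch.
    Returns None when the hitter saw fewer than `min_pitches` pitches.
    """
    y = np.asarray(distances, dtype=float)
    if y.size < min_pitches:
        return None
    x = np.arange(y.size)  # assumption: pitch order stands in for time
    slope, _intercept = np.polyfit(x, y, 1)  # degree-1 fit: slope, intercept
    return float(slope)


def rank_by_slope(hitters, min_pitches=MIN_PITCHES):
    """Sort qualifying hitters from largest to smallest slope.

    `hitters` is a hypothetical dict mapping a name to that hitter's distances.
    """
    slopes = {name: zone_distance_slope(d, min_pitches) for name, d in hitters.items()}
    qualified = [(name, s) for name, s in slopes.items() if s is not None]
    return sorted(qualified, key=lambda item: item[1], reverse=True)
```

A positive slope means pitchers worked farther from the center of the zone as the season went on. Calling `rank_by_slope` on a full season of data and taking the first ten entries would give a list comparable in spirit to the one discussed here.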
It appears that changes in zone distance do a good job of predicting breakouts. Eight of the top 10 saw their 2013 performances exceed PECOTA’s projections, by margins ranging from .079 (the aforementioned Mr. Davis) to .003 (Giancarlo Stanton). MVP Andrew McCutchen makes an appearance with a solid (+.031) increase in TAv. Chris Davis is undoubtedly the great success of the method, given that his 2013 could well be regarded as the definition of a breakout season. Recall that Davis’ breakout was a huge surprise, as he had formerly been regarded as a typical Quad-A type player.
Only two players on this list underperformed their projections: David Freese and Buster Posey. Both suffered injuries in the course of the 2013 season, in both cases severe enough for them to miss games. Freese dealt with a lingering lower-back injury throughout the season, at one point visiting the 15-day disabled list. Posey’s injury was more minor (a fracture in his ring finger) but potentially affected his hitting as well.
Even considering these two cases as “misses” for the predictions, changes in zone distance are remarkably accurate overall. Collectively, the above 10 players outperformed their PECOTA projections by an average of .0237 points of TAv, or roughly the difference between Paul Goldschmidt and Todd Frazier this year. That’s no small margin. The probability of randomly picking a group of 10 players who overperformed their PECOTA projections by that amount is something like .005, suggesting that changes in zone distance are a statistically significant predictor of breakouts (although excluding Davis increases that probability to ~.05). Overall, there’s a statistically significant relationship between change in zone distance and difference from the PECOTA forecast, with or without Chris Davis included.
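The article does not say how the "randomly picking a group of 10" probability was computed. One plausible way to estimate such a number is a simple resampling test: repeatedly draw 10 players at random from the full pool of qualifying hitters and count how often their average over-performance matches or beats the observed one. The sketch below illustrates that idea under stated assumptions. The function name is hypothetical, and `all_diffs` stands for a list of (actual TAv minus PECOTA-projected TAv) values that is not provided in this article.

```python
import numpy as np


def random_group_pvalue(all_diffs, observed_mean, group_size=10, n_draws=10000, seed=0):
    """Estimate how often a random group does at least as well as the observed group.

    all_diffs     : actual-minus-projected TAv for every qualifying hitter (assumed input)
    observed_mean : average over-performance of the group being tested
    Returns the share of random groups whose mean is >= observed_mean.
    """
    diffs = np.asarray(all_diffs, dtype=float)
    if diffs.size < group_size:
        raise ValueError("pool is smaller than the group size")
    rng = np.random.default_rng(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(n_draws):
        # Sample without replacement: a player can appear only once per group.
        group = rng.choice(diffs, size=group_size, replace=False)
        if group.mean() >= observed_mean:
            hits += 1
    return hits / n_draws
```

With the real pool of 2013 hitters, one would call `random_group_pvalue(all_diffs, 0.0237)` and compare the result with the roughly .005 figure quoted above. Rerunning it on a pool and group that exclude Davis would be the analogue of the ~.05 figure.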
This Year
Since increases in zone distance seem to be an efficient way to predict breakouts, let’s take the method for a spin on last year’s data and see which players are predicted to break out this year. Here are the top 12 (because I wanted to include Chase Utley) risers in zone distance from 2013, as well as their preseason projected TAvs and their actual TAvs for this year.
So far, the breakout candidates this year are outperforming their projections by about .012 points of TAv (or .0079 for just the top 10), which is down substantially from last year, but in line with the non-Chris Davis breakouts of 2013. Generally, TAv numbers are going to be a lot more volatile in ~100 at-bats, so consider these over-performance numbers provisional for now.
The successes so far this year include David Ortiz, who continues his highly successful rebellion against Father Time, as well as Victor Martinez and Chase Utley, who appear to be joining Ortiz in that struggle. Indeed, the mean age of the group is an ancient 32, which is worth further investigation; maybe radical changes in zone distance mostly occur for older hitters. Among non-geriatrics, Anthony Rendon has been raking and would have been predicted to do so by this method, while Starling Marte seems not to have improved as much as expected.
Raul Ibanez can safely be counted as a huge failure of the method, since he’s been absolutely putrid so far this season and shows no signs of improvement. Since the season started, he’s also seen a dramatic decrease in zone distance, partially reversing the increase from last year. He’s probably not as bad as his current TAv, since his BABIP stands at an absurdly unlucky .172. With that said, given the sub-Mendoza batting average, Ibanez is either going to improve substantially or fail to get many more plate appearances.
That fact highlights a survivor bias issue: generally, good players are going to see more pitches and get more playing time than bad players, so I’m missing players who were so bad as to be demoted. This bias could artificially inflate the accuracy of the breakout predictions. However, if I go back to the 2013 data and instead allow players with any (>0) PAs in 2013, increases in zone distance are still significantly associated with over-performance relative to PECOTA projections, so survivor bias is probably not sufficient to explain the predictive power of zone distance changes.
Conclusions
It’s interesting to see how much more rapidly teams become aware of changes in player skill than the baseball-watching public. I don’t think many tabbed Davis as a breakout candidate prior to the 2013 season, even though he finished 2012 strong (Russell Carleton found that such a late-season surge is more often than not a mirage). Yet PITCHf/x suggests that pitchers were aware of Davis’ newfound strength as early as midseason 2012.
There are at least two direct ways in which teams may gather and employ information to which we don’t have access. The first is via the scouting report, which involves lots of careful observation of a hitter: tendencies, quirks, holes in their swing, and so on. That kind of assiduous study could certainly reveal weaknesses in a hitter before the general public catches on (or the results catch up). A well-trained scout could notice that Chris Davis is suddenly showing an improved mentality at the plate and relay that to the pitcher, resulting in the pitcher avoiding the zone marginally more.
The second way this change could happen is in-game. Besides scouts, both the pitcher and catcher are meticulously watching the hitter for any sign of unexpected weakness or strength. Imagine the following scenario: a pitcher takes aim at the outside edge of the plate and misses towards the middle, and the hitter punishes the mistake with a towering homer. The pitcher or catcher will take note and be a little less likely to challenge the hitter inside the zone in the following at bat.
Most likely, adjustments are made at both the scout- and player-level. Both sources probably contribute to the observed tendency of hitters with zone distance increases to break out in the following year. Anecdotally, it appears that the converse idea—that hitters with zone distance decreases are in line for decreases in performance—is also true. However, these breakdowns often involve the extra complication of injuries, and so require a more nuanced approach for prediction.
Both breakouts and breakdowns presumably involve some additional adjustment by the batter, as well, which I will examine in the future. For example, even if pitchers were hesitant to throw Chris Davis a strike in 2013, Davis had to capitalize on this by curtailing his swing rate on those out-of-zone pitches. Presumably, some cases of missed breakouts (or breakdowns) are the result of batters failing to make adjustments to their approaches to suit the new pitch mixtures they were receiving.
This response was notably the case with Albert Pujols, who took cuts at pitches farther out of the zone on average even as he was seeing pitches closer to the zone. So there’s a whole other side to this issue to address, because even as the pitcher is matching his strategy to the batter, the batter is countering that with a new set of tactics in response. Considering the matchup as a head-to-head battle of wits and execution, it’s not surprising that the pitchers would become aware of a hitter’s changes in ability long before the hitter’s outcomes improve enough to be noticed.
If you are going to claim that these players "broke out" because they exceeded their 2013 TAv projections, then you need to defend the claim that the TAv projections (which assume regression toward the mean) were an accurate reflection of each batter's baseline ability heading into the 2013 season.
Looking at your first chart, these batters are all good hitters who had very strong 2012 seasons. Is it possible that the pitchers were reacting to their strong 2012 seasons rather than employing some mystical method to identify that these hitters would break out in 2013? I would argue that this might be the case.
Looking at the 2012 TAvs, it is apparent that most of these players actually had worse TAvs in 2013 (the "breakout year"). On the whole, TAv went down by .010 per batter in 2013. Only 3 of 10 players posted better TAvs in 2013 (than 2012), and only 2 of 10 players exceeded 2012 by more than .005.
For example, looking at David Ortiz, who had the highest zone distance trend, his projected TAv was .297. He exceeded this projection in 2013 with a .309. You could argue that his zone distance trend of .0012 predicted this breakout. However, his TAv in 2012 was .343, which is substantially higher than the TAv he produced in 2013. You could argue that pitchers recognized the strong 2012 season and reacted by pitching him further away from the center, causing him to REGRESS in 2013 back to a TAv of .309.
It may be that pitchers are not using some unidentified proprietary information to project improvements in hitting ability. It may be that they are doing a better job than PECOTA does of recognizing improvement after it occurs.
David Ortiz's TAv in 2013 was .332, compared to a projected 2013 TAv of .297 and an actual 2012 TAv of .343.
Your zone distance trend is for 2012 data, and I had incorrectly attributed it to 2013. This does make a difference.
However, since most of these players performed better in 2012 than 2013, the questions still remain as to when the breakout occurred and whether the pitchers were predicting it or reacting to it. It seems as though some of these players had their better year in 2012 and that pitchers changed their approach in 2012. So at the very least, the pitchers adjusted very rapidly.
Sorry about the confusion.
That's not necessarily the case. Check out Albert Pujols' graph above; in his best year (and one of the best years anyone has ever had!), his zone distance was trending downward. You see that with plenty of hitters having good years, especially when BABIP-aided. With that said, Pujols is only anecdotal evidence, so I'll look at the league-wide trend and get back to you later with some gory math (either in the comments or in an article).
I tend to think of "breakout" as coinciding with a player changing his skill set, but I do see the utility of defining it in terms of PECOTA projections, understanding that a breakout for 2013 might actually mean a skill improvement which occurred in 2012, but which hasn't yet accrued enough data to be reflected in PECOTA projections.