I know, I’m not supposed to, but I believe in clutch hitting.
By clutch hitting, I mean that certain players have some sort of ability to perform better in higher leverage situations. Leverage, for the uninitiated, is a concept formalized by sabermetrician Tom Tango. We know that some situations in a game are more important than others. When it’s 15-1, no one cares what happens in a plate appearance. When it’s the bottom of the ninth with runners on second and third, two outs, and the home team is down by one, pretty much the entire game rides on this at-bat. Leverage index is a mathematical model of how much more important that late game situation is.
Leverage is based on the idea of win probability. We can look at each game situation (let’s say, bottom of the third, one out, runner on first, and the home team down by two) and figure out over some past time frame how often the home and visiting team won. More to the point, we can figure out how much that win probability can change based on whatever is about to happen next. In the 15-1 situation, whatever the batter does is going to move the needle very little. In the bottom of the ninth, the win probability could go from roughly 50–50 to 100–0 in a hurry. When a batter does something positive that increases his team’s chances of winning, we give him credit for adding win probability (even if giving him all the credit is silly). In a high-leverage situation, a batter can accumulate a lot of win probability in a single at-bat.
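To make the idea concrete, here is a minimal sketch that treats leverage as the expected size of the win probability swing in a situation, divided by the average swing across all situations. That is a simplified reading of the concept, not Tango's published tables. Every number in it (the win probabilities, the outcome chances, and `AVERAGE_SWING`) is invented for illustration.

```python
def expected_swing(current_wp, outcomes):
    """Average absolute change in win probability from one plate appearance.

    outcomes is a list of (probability, win_prob_after) pairs.
    """
    return sum(p * abs(wp_after - current_wp) for p, wp_after in outcomes)


def leverage_index(current_wp, outcomes, average_swing):
    """Expected swing in this situation relative to the average situation."""
    return expected_swing(current_wp, outcomes) / average_swing


# All numbers below are made up for illustration, not taken from real games.
AVERAGE_SWING = 0.035  # hypothetical average swing across all situations

# A blowout: whatever the batter does barely moves the needle.
blowout = leverage_index(0.995, [(0.68, 0.994), (0.32, 0.997)], AVERAGE_SWING)

# Late and close: an out ends the game, a hit wins it.
late_and_close = leverage_index(0.30, [(0.70, 0.0), (0.30, 1.0)], AVERAGE_SWING)

print(f"blowout: {blowout:.2f}, late and close: {late_and_close:.2f}")
```

The blowout comes out far below 1 and the late, close situation far above it, which is all the index is meant to capture.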
The standard test for whether there is such a thing as clutch hitting has been to look at the win probability that a player records over the course of a season and compare it to what his win probability would have been in situations where the leverage index was 1. (This is the basis of how our friends at FanGraphs calculate clutch.) From season to season, players show very little correlation on this measure of clutch. In general, the interpretation has been “clutch doesn’t exist” rather than “we had a poor measure of clutch to begin with.” Indeed, I have found that this measure of clutch eventually does become reliable. It just takes a while. Maybe there is signal in all that noise; maybe we need a better antenna.
Warning! Gory Mathematical Details Ahead!
In the 2014 Baseball Prospectus Annual, I introduced the idea of looking a little more closely at individual players to see how they reacted to pressure situations. I examined how, for each player, the leverage of a situation affected his tendencies to swing at the first pitch. There’s a separate regression equation for Daniel Murphy, David Murphy, and Donnie Murphy. Since every plate appearance has a first pitch and the count is always 0-0 when it happens, I’m able to hold a few things constant. But my program runs a logistic regression only looking at Daniel’s at-bats and what he did in them, creates an equation describing his behavior, and then does it again for David, and again for Donnie.
I then took each equation and calculated the chances that each player would swing at a first pitch when the leverage index was 1 (average) and 2 (a situation twice as important as the average situation). Then, I subtracted the two and got a rough indicator of how high leverage began to affect a player (at least on this one behavior). I used a minimum of 250 plate appearances in a season and looked at players from 2009 to 2013. In the past, I’d found that clutch, as described above, had a year-to-year correlation of .074. (I used a method known as auto-regressive intra-class correlation.) For this group, across the five years, the ICC was .30. That’s not huge, but we call home runs a true outcome for pitchers with year-to-year correlations in the same neighborhood. I termed this difference between predicted first-pitch swing rates “swing difference.” Some players swing a lot more when the leverage goes up. Some barely notice. A few start to freeze.
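As a rough sketch of that subtraction, assume each player's fitted equation reduces to an intercept and a single slope on leverage index. The real regressions may carry other terms, and the coefficients and hitter labels below are hypothetical.

```python
import math


def swing_probability(intercept, leverage_slope, leverage_index):
    """Predicted chance of a first-pitch swing from a one-variable logistic equation."""
    z = intercept + leverage_slope * leverage_index
    return 1.0 / (1.0 + math.exp(-z))


def swing_difference(intercept, leverage_slope):
    """Predicted swing rate at leverage 2 minus predicted swing rate at leverage 1."""
    at_two = swing_probability(intercept, leverage_slope, 2.0)
    at_one = swing_probability(intercept, leverage_slope, 1.0)
    return at_two - at_one


# Hypothetical (intercept, slope) pairs for three made-up hitters.
hitters = {
    "swings more": (-1.2, 0.40),
    "barely notices": (-1.0, 0.00),
    "freezes": (-0.8, -0.30),
}

for label, (b0, b1) in hitters.items():
    print(f"{label}: {swing_difference(b0, b1):+.3f}")
```

A positive value means the hitter swings at the first pitch more often as leverage rises, zero means no change, and a negative value means he swings less.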
Next, I wanted to see if swing difference predicted changes in outcomes. For the years 2009 to 2013, I used the log-odds ratio method (which I have used multiple times before) to create a predicted percentage that each plate appearance would end in a strikeout based on the batter and pitcher’s usual rates in that area. I did the same for walks and singles and home runs and the rest of it. Next, I looked at all plate appearances in which a batter with 250 PA in that season faced a pitcher with 250 batters faced in that season. I created a binary logit regression in which I had my predicted percentage of a strikeout (for the initiated, expressed in a log of the odds ratio), and then entered in the leverage index for each plate appearance, the swing difference stat for the batter and the multiplicative interaction of swing difference and leverage.
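For readers who have not seen it, one common form of the odds ratio method combines a batter's rate, a pitcher's rate, and the league rate on the log-odds scale. The sketch below assumes that form; the exact variant used for these analyses may differ, and the function name `expected_rate` and the sample rates are illustrative only.

```python
import math


def logit(p):
    """Log of the odds for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def expected_rate(batter_rate, pitcher_rate, league_rate):
    """Matchup expectation: batter and pitcher log-odds, minus the league baseline."""
    z = logit(batter_rate) + logit(pitcher_rate) - logit(league_rate)
    return inv_logit(z)


# Hypothetical strikeout rates: both batter and pitcher are above a 20 percent league rate.
matchup = expected_rate(0.25, 0.25, 0.20)
print(f"expected strikeout rate: {matchup:.3f}")
```

When either player is exactly league average, the expectation collapses to the other player's rate, which is a quick sanity check on the formula. The log-odds of that expectation is what would go into the regression as the baseline predictor.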
This type of analysis, called a moderator analysis, is well-suited to answering the “clutch question.” If certain players have some sort of clutch factor (and here, we’re using swing difference as a rough measure of clutch) then as leverage increases, we would expect to see those who are higher on this clutch factor to show greater increases (or sharper decreases). That’s what the interaction term between swing difference and leverage does. If it’s significant, it means that as leverage goes up (or down), the effect it has will depend, at least in part, on that clutch factor.
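Here is a small sketch of what the interaction term does mechanically. The model form (baseline log-odds, leverage, swing difference, and their product) follows the description above, but the coefficients in `COEF` are invented for illustration and are not the fitted values from these analyses.

```python
import math

# Hypothetical coefficients, chosen only to show how an interaction term behaves.
COEF = {
    "intercept": 0.0,
    "expected": 1.0,      # weight on the matchup's baseline log-odds
    "leverage": -0.05,
    "swing_diff": 0.0,
    "interaction": -0.40,  # leverage x swing difference
}


def outcome_probability(expected_logit, leverage, swing_diff, coef=COEF):
    """Predicted probability from a logit model with a leverage x swing-difference term."""
    z = (coef["intercept"]
         + coef["expected"] * expected_logit
         + coef["leverage"] * leverage
         + coef["swing_diff"] * swing_diff
         + coef["interaction"] * leverage * swing_diff)
    return 1.0 / (1.0 + math.exp(-z))


def leverage_slope(swing_diff, coef=COEF):
    """Effect of one unit of leverage on the log-odds, which depends on swing difference."""
    return coef["leverage"] + coef["interaction"] * swing_diff


baseline = math.log(0.20 / 0.80)  # a matchup with a 20 percent baseline chance
for sd in (0.0, 0.10):
    p1 = outcome_probability(baseline, 1.0, sd)
    p2 = outcome_probability(baseline, 2.0, sd)
    print(f"swing diff {sd:.2f}: slope {leverage_slope(sd):+.3f}, "
          f"LI=1 {p1:.3f}, LI=2 {p2:.3f}")
```

With a nonzero interaction coefficient, the slope on leverage is different for each value of swing difference. That dependence is what the significance test on the interaction term is checking for.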
What I found is that for hitters who show more of an effect on swing difference (leverage makes them swing at the first pitch more), they were less likely than expected to walk and less likely to strike out as leverage went up. Instead, they showed higher rates of both extra base hits and outs in play. To show some sense of how much of an effect this could have, here are the numbers for strikeout rate.
Let’s say that our pitcher-batter matchup stats alone would suggest that the chances of a strikeout are 20 percent. Now, let’s take a look at what would happen in situations with leverage values of 1 and 2, and compare a batter who has a swing difference of .10 (he swings at first pitches ten percentage points more often when the leverage index is 2 than when it is 1) and a batter who has a swing difference of 0 (he swings equally in both situations). The values are the likelihood of a strikeout happening.
Situation | High Swing Difference | No Swing Difference |
---|---|---|
Leverage = 1 | 19.3% | 19.3% |
Leverage = 2 | 17.7% | 18.3% |
In an average-leverage situation, the two hitters are about the same (they differ at the fourth decimal place), but once the leverage is turned up a bit, they get different results. Not by a lot, but it’s there. You get the same basic effect sizes for the other outcomes.
Before we go further, the careful observer will note that there’s a certain tautology that goes along with these analyses. I think it doubles as both a feature and a bug. A batter who is more likely to swing at the first pitch in high-leverage situations is probably just more likely to swing in high-leverage situations. It’s no wonder he sees a drop in his expected walk rate (and in some sense his expected strikeout rate). And if we’re saying that his swing rate changes because of leverage (or at least in accordance with leverage), then it’s not surprising that the effect appears. We’ll talk about this more in a bit.
Clutch. Heart. Grit. Myocardial Infarction.
Let’s clear a few things up. Clutch is not a result of having superior moral character, notwithstanding the plot of every sports movie. It is also not a guarantee that a hitter will always come through. My contention is a much more reserved one. Clutch is likely some combination of an ability to deal with pressure and some particular change in approach, whether conscious or unconscious, that results in slight variations from what we might otherwise expect. For some hitters, that change makes them better, and for others it makes them worse.
These analyses may not completely prove that clutch ability exists, but they do lay what I hope is a foundation for how we might continue the search. “Clutch” is a way of saying that the situation matters because players are human. What we have here is an indicator that has reasonable (if not great) consistency across years, and it explains differences between players in how leverage affects them. More searching might find something with more consistency. Even then, year-to-year consistency is not the only way to establish that a measure is reflective of a player’s true talent level. Using a more tracking-based approach might help. Players can and do change, even within a season. There’s no reason clutch needs to be an enduring trait, rather than a state we can detect with some reliability. The rest is simply showing that the factor, whatever it is, can explain some of the differences between players’ performances in different leverage situations.
As to these specific analyses, it might very well be that what’s driving things is that some players are looking at the sorts of relievers they face in high-leverage situations and saying “Well, he usually comes right at me, so no point in messing around. I might as well swing when I see something interesting.” It might not be a mystical force at work, but a very reasonable reaction to the circumstances. In that case, clutch isn’t even something psychological, but a mental skill. Still, there could be problems with multicollinearity. What this might be showing is that some players swing more in high-leverage situations, and so we would expect them to take fewer walks, somewhat by definition. Then again, even knowing that information could have strategic value. Maybe when we have other data sets to work with, we might be able to look at measures of how leverage affects a player that aren’t based on game results.
The other piece of this, and it’s one that I tried to drive home in the piece in the Annual that started everything, is that knowing that a player swings more (or less) often in high-leverage situations might be good within the context of one skill set and bad within another. These analyses fall into the large-N trap that assumes that more swinging is better (or seems to be) for everyone. But if nothing else, I’d present these analyses as a way of re-opening what had been assumed to be a closed debate. Clutch hitting might just exist.
For example, could it be that some players react to stress so that they're more focused, but without that stress they're actually under-performing?
That is hardly praise for them. I think they should be trying hard all the time.
You state positive swing differences correspond to striking out less. I find this one of the more surprising discoveries here. Taken to its extreme, it would suggest that hitters with no discipline would seldom strike out, but that isn't what happens. More swinging would only guarantee more contact if the mix of pitches faced remained the same.
And in lieu of treating that as a binary clutch/not-clutch variable you could use LI in those "obvious" situations and some baseline for any non-clutch situation.
Also would be interesting if those clutch situations could be weighted by the importance of the game itself (i.e. a game for a team in a pennant race means more (has higher "game leverage"(?)) than a Rangers game).
Also, lasers! You could include lasers!
Is it possible, or even likely, that the batter outcomes are similar even if the player changes approach?
You could argue that a better analysis would be to categorize the PA into more controlled outcomes (BB, SO, GB, FB, LD), and see how the predicted results of an outcome of that type would affect Win Probability, with respect to the maximum (positive & negative) effect they could have. So if the worst you can do is drop the probability from .4 to .35, but the best takes it from .4 to .7, then you get a little credit for leaving it at .4; if the worst takes it from .4 to .25 and the best from .4 to .5, you get more credit for leaving it at .4. Or, in other words, some of clutch hitting is avoiding negative results at important moments.
The other aspect that I'm a little less clear on is the probability of a result. WPA should take into account the typical distribution of results in that situation, but it has less to say about the rarity of a particular outcome. Even normalizing for leverage doesn't exactly remove that as a factor - a rare outcome probably changes the WP by more, but not necessarily in proportion. This could actually cut both ways - perhaps HRs are a powerful enough outcome that they are actually disproportionately represented in WP.
I think if we are talking about fans watching certain players and saying that guy is clutch, we have to assume that for that to really be true in a way that is meaningful, the effect has to be fairly large (at least, on the order of perceptible difference in BA). But when a fan watches a situation, they don't see "well, a walk only adds .01 of a win, a double adds .05, and a homer adds .2" and then judge the double as a mediocre outcome - they see the possibilities of making an out, driving in a run, or getting on base. They are more likely to call a guy who drives in runs consistently a clutch hitter, obviously, but a guy who strikes out, grounds into a double play or hits a lazy fly ball is going to seem a lot less clutch than a guy who hits it hard somewhere. So maybe the question we've been asking is "do hitters accomplish anything useful with clutch hitting, and can they repeat it?" and we should really be asking only one of those questions at once.
BP to acknowledge that PEDs actually help performance?
Shocking.
I am confused. You found somewhat of a "talent" for swinging more or less at the first pitch depending on leverage, and you also found that swinging more or less at the first pitch affects walks, K, outs, and extra base hits, right? But you don't know if this actually makes the batter better or worse - it could be that just their approach changes, but not necessarily the win impact, right?
So, why, in your first sentence, do you say:
"By clutch hitting, I mean that certain players have some sort of ability to perform better in higher leverage situations."
Did you find that some players do perform "better" by virtue of this swing change, or not? "Better" has to be that their win impact goes up. It can't just be a change in behavior without knowing how that affects their win impact, right?
It must be said though that even a rigidly data-driven person on this subject will remember Ortiz's 2013 World Series for a long time.