September 29, 2005
Still Not Clutch
As expected, last week's discussion of clutch hitting generated more than the usual share of reader e-mail. BP readers, an intelligent group, pointed out a few problems with both the methodology and conclusions drawn from the data. Thus, this week will be a follow-up, taking those suggestions into account.
First, and most critically, when attempting to create an expected WINS total for each player, I used VORP. VORP is inappropriate here because it includes a positional adjustment. Thus, players at tougher positions who post poorer batting lines will have VORP totals similar to those of players at easier positions. As a result, players at positions like shortstop, second base and catcher were assigned higher expected WINS (PrjWINS) than their raw batting statistics would indicate, while those at positions like designated hitter, first base and the corner-outfield slots were in the opposite situation. Because of this discrepancy, players at more difficult defensive positions were assigned lower "Clutch" totals than players at other positions. With that in mind, you can probably guess what's wrong with this conclusion from the original article:
"Adam Dunn usually finds himself on the wrong end of discussions of doing what it takes to win--cutting down on strikeouts and the like--but he's among the league leaders on the clutch list. Perhaps even more surprising is the 16th-least clutch player in the major leagues, Derek Jeter."
Dunn plays left field while Jeter plays shortstop, so while the two had similar VORP totals--Jeter was at 60.1 while Dunn was at 55.3--Dunn's raw batting statistics were superior to Jeter's. Their PrjWINS totals were therefore incorrect.
To adjust for this oversight, we can simply substitute Marginal Lineup Value for VORP. MLV is one of the bases for VORP but, like WINS, does not include any positional adjustment, so the two stats make for a good match. In fact, the correlation between MLV and WINS is much higher than that of VORP and WINS, meaning we can place more confidence in the new PrjWINS.
Another criticism was that it is possible players do not appear in similar situations over the course of a season; some players could come to the plate in a disproportionate number of high-leverage situations while others come up in their fair share of blowouts. This isn't the fault of the hitters, so they should not be criticized for failing to contribute their expected number of wins over the course of a season.
To respond to this point, we can use Leverage (LEV), another of the tools Keith Woolner developed from the win expectancy framework in Baseball Prospectus 2005, the same framework used to calculate WINS. LEV is the final column on the Relievers Expected Wins Added report, where it shows whether certain relievers are being used in situations that have a greater impact on win expectation. LEV is defined as "the change in the probability of winning the game from scoring (or allowing) one additional run in the current game situation divided by the change in probability from scoring (or allowing) one run at the start of the game." A LEV of 1.00 is league average.
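The quoted definition is a simple ratio, which a short sketch can make concrete. The function name and the win-probability changes below are hypothetical values chosen for illustration; they are not taken from the actual win expectancy tables.

```python
def leverage(delta_now: float, delta_start: float) -> float:
    """Ratio of the win-probability swing from one run in the current
    situation to the swing from one run at the start of the game."""
    if delta_start <= 0:
        raise ValueError("delta_start must be positive")
    return delta_now / delta_start


# Assumed for illustration only: one run at the start of the game
# moves the probability of winning by 10 percentage points.
START_OF_GAME = 0.10

# Hypothetical late, close situation: one run is worth 20 points.
print(round(leverage(0.20, START_OF_GAME), 2))  # 2.0

# Hypothetical blowout: one run is worth only 2 points.
print(round(leverage(0.02, START_OF_GAME), 2))  # 0.2
```

Under these made-up inputs, the first situation matters twice as much as the start of a game and the second one-fifth as much.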
In the reliever report, Francisco Rodriguez is tops among relievers with at least 30 innings with a LEV of 2.26 while James Baldwin trails with 0.20. What's more important is that the standard deviation of LEV among relievers with at least 30.0 IP is 0.41. Among hitters with at least 200 PA, it's 0.07. Obviously, this is a result of the fact that batter usage is constrained by the batting order while pitcher usage is a result of the manager's decisions. As a result, 95% of batters are within 0.14 LEV of the league average (in any normal distribution, roughly 95% of the sample can be found within two standard deviations of the mean). There are a few outliers here and there--Eric Chavez had a LEV of 1.22 in 2004 while Mark Loretta was at 0.78 in 2002--but those outliers are significantly smaller than those found among relief pitchers. As such, almost no batters come to the plate in a disproportionate number of high-leverage situations. What's more, seeing more high-leverage situations doesn't necessarily mean that a batter will have a higher or lower Clutch; it just means that he will likely have a more extreme Clutch than batters in lower-leverage situations. Batter LEV will be included in the tables below, but PrjWINS will not be adjusted for it because the difference is so small.
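The two-standard-deviation rule of thumb can be checked with Python's standard library. This is a minimal sketch that assumes hitter LEV is roughly normal around 1.00 with the 0.07 standard deviation quoted above; the helper names are made up for the example.

```python
from statistics import NormalDist

LEAGUE_AVG = 1.00  # a LEV of 1.00 is league average
HITTER_SD = 0.07   # from the text: hitters with at least 200 PA


def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    return std.cdf(k) - std.cdf(-k)


def z_score(lev: float, sd: float, mean: float = LEAGUE_AVG) -> float:
    """How many standard deviations a LEV sits from the league average."""
    return (lev - mean) / sd


print(round(share_within(2), 4))             # 0.9545
print(round(2 * HITTER_SD, 2))               # 0.14, the band quoted in the text
print(round(z_score(1.22, HITTER_SD), 2))    # Eric Chavez, 2004: 3.14
print(round(z_score(0.78, HITTER_SD), 2))    # Mark Loretta, 2002: -3.14
```

Under the normality assumption, the Chavez and Loretta seasons sit a little more than three standard deviations from average, which is why they stand out as outliers.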
With those out of the way, let's check on how things have changed from last week with a look at the new Clutch measure, using MLV instead of VORP. I've included each player's rank on last week's lists in the final column (players in the "worst" list have ranks listed from worst to first, so 1 would mean the least clutch player):
Name              WINS   LEV    MLV  PrjWINS  Clutch  PrevRank
Chipper Jones     5.50  1.02   30.3     2.25    3.26         1
David Ortiz       7.01  1.03   55.4     3.97    3.04         2
Lyle Overbay      4.08  1.03   20.0     1.41    2.67         6
Tony Clark        5.28  0.97   37.2     2.66    2.63         3
Shannon Stewart   1.88  1.12   -8.3    -0.64    2.52        13
Jose Guillen      3.83  1.09   19.0     1.33    2.50         8
Moises Alou       4.62  0.99   31.3     2.22    2.39         5
Darin Erstad      1.35  1.09  -13.1    -0.99    2.34        21
Garret Anderson   2.14  1.08   -1.4    -0.15    2.29        15
Lew Ford          1.62  1.13   -7.5    -0.58    2.21        30

Jorge Posada     -1.27  0.99    8.4     0.57   -1.83        16
Jimmy Rollins    -1.76  1.02    2.1     0.11   -1.87        20
Mark Teixeira     0.95  1.01   40.1     2.87   -1.92        35
Juan Rivera      -2.05  1.05    0.1    -0.03   -2.02        30
Miguel Tejada     0.20  0.96   31.5     2.24   -2.04        12
Albert Pujols     3.61  1.04   79.3     5.70   -2.09        34
David Bell       -3.30  0.99  -16.0    -1.20   -2.10        18
Alex Gonzalez    -2.78  1.04   -7.0    -0.55   -2.23        19
Shawn Green      -0.55  0.99   24.9     1.76   -2.31         4
Casey Blake      -2.53  1.07   -2.2    -0.20   -2.33        17
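The columns above are consistent with Clutch being actual WINS minus PrjWINS, give or take rounding in the published figures. Here is a small sketch that checks a few rows; the rounding tolerance is an assumption, and the function name is made up for the example.

```python
# (name, WINS, PrjWINS, published Clutch), copied from the table above.
ROWS = [
    ("Chipper Jones", 5.50, 2.25, 3.26),
    ("David Ortiz", 7.01, 3.97, 3.04),
    ("Albert Pujols", 3.61, 5.70, -2.09),
    ("Shawn Green", -0.55, 1.76, -2.31),
]

# Assumed tolerance: the published columns are rounded to two decimals,
# so the difference of two rounded numbers can be off by about 0.01.
TOLERANCE = 0.015


def clutch(wins: float, prj_wins: float) -> float:
    """Wins contributed beyond what the batting line projected."""
    return wins - prj_wins


for name, wins, prj_wins, published in ROWS:
    computed = clutch(wins, prj_wins)
    matches = abs(computed - published) <= TOLERANCE
    print(f"{name:15s} computed {computed:+.2f}  published {published:+.2f}  match={matches}")
```

Pujols is the instructive case: his 3.61 WINS is one of the better totals on the page, but his batting line projected 5.70, so he lands on the "worst" list.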
As expected, no player has a LEV different from 1.00 by more than 0.13, and most are within 0.05. Jeter is notably absent from this list, but he's still 916th out of 935 players. Likewise Dunn drops to 40th: off the list, but still among the league leaders. On the whole, players don't move too far up or down the list by switching from VORP to MLV.
Additionally, I stated last week--without showing solid proof--that "Clutch" was not consistent from year to year. There are a couple of ways to look at this without simply posting long lists of previous seasons of Clutch lists. The first is to point out that the correlation between consecutive seasons of Clutch/PA--simply correlating Clutch would subject the sample to variations in playing time--among batters with at least 200 PA in consecutive seasons is .098; the r-squared is .010. (R-squared is the conventional nomenclature for the coefficient of determination. It's an indication of how much of the variance in one variable can be explained by another, with 1 being perfect explanation and 0 being total independence or randomness. In this case, an r-squared of .010 means that 1% of the variation in Clutch from one season to the next can be explained by a batter's Clutch from the previous season, or, conversely, that 99% of the variation is unexplained.)
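For readers who want to see the arithmetic, here is a minimal sketch of the correlation and r-squared calculation. The Clutch/PA values are invented purely for illustration and are not the actual player data; only the .098 figure comes from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical Clutch/PA for five batters in consecutive seasons.
season_one = [0.004, -0.002, 0.001, -0.003, 0.002]
season_two = [-0.001, 0.003, 0.002, -0.002, 0.000]

r = pearson_r(season_one, season_two)
print(f"r = {r:.3f}, r-squared = {r * r:.3f}")

# The figure reported in the text: r of .098 squares to roughly .010.
print(round(0.098 ** 2, 3))  # 0.01
```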
The second way to see if there is any consistency to Clutch is to attempt to generate larger sample sizes by breaking up player careers into even-numbered and odd-numbered seasons and then compare the two halves of a career, an analysis technique borrowed from Keith Woolner's discussion of pitcher control on balls in play. Comparing these larger sample sizes, the r-squared between career halves is .025, still nearly completely random. This indicates that even when comparing extremely large samples, batters show no consistent ability to over- or under-perform their expected WINS from year to year.
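The even/odd split can be sketched as follows. This assumes "even-numbered" refers to the calendar year and that each half is expressed as Clutch per PA; neither detail is spelled out above, and the player records are invented for the example.

```python
from collections import defaultdict


def split_halves(seasons):
    """Group each player's seasons into even and odd years.

    seasons: iterable of (player, year, clutch, pa) tuples.
    Returns {player: (even_rate, odd_rate)} in Clutch per PA, keeping only
    players with plate appearances in both halves.
    """
    # For each player: index 0 holds even years, index 1 holds odd years,
    # each as a running [clutch_total, pa_total] pair.
    totals = defaultdict(lambda: [[0.0, 0], [0.0, 0]])
    for player, year, clutch, pa in seasons:
        half = totals[player][year % 2]
        half[0] += clutch
        half[1] += pa

    rates = {}
    for player, (even, odd) in totals.items():
        if even[1] > 0 and odd[1] > 0:
            rates[player] = (even[0] / even[1], odd[0] / odd[1])
    return rates


# Hypothetical records, for illustration only.
records = [
    ("Player A", 2000, 1.2, 600),
    ("Player A", 2001, -0.6, 600),
    ("Player A", 2002, 0.6, 600),
    ("Player A", 2003, 0.0, 600),
    ("Player B", 2002, 0.9, 450),  # no odd-year season, so dropped
]

for player, (even_rate, odd_rate) in split_halves(records).items():
    print(f"{player}: even {even_rate:+.4f}, odd {odd_rate:+.4f}")
```

The resulting pairs of even and odd rates, one pair per player, are what would then be correlated to get the r-squared between career halves.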
Finally, some readers suggested merging the Clutch stat with the Playoff Odds Report in an attempt to map performance not only to the odds of winning a particular game but to getting to the playoffs. This idea is something I'm looking into, but running alternate versions of the Playoff Odds Report takes a long time--remember, we're playing the rest of the season one million times. So we'll leave that for next time and wrap up with a list of the highest and lowest Clutch totals for the last six seasons:
Name              Years    LEV  Clutch  Clutch/Yr
Jeromy Burnitz        6  0.991    9.08       1.51
Fred McGriff          5  0.960    8.30       1.66
Mike Sweeney          6  0.968    7.77       1.29
Jose Vidro            6  1.015    6.83       1.14
Jason Kendall         6  0.977    6.80       1.13
Ryan Klesko           6  0.991    6.74       1.12
Matt Lawton           6  0.982    6.57       1.09
Jacque Jones          6  1.041    6.48       1.08
Randy Winn            6  0.968    6.24       1.04
Greg Vaughn           4  0.895    6.23       1.56
Brian Giles           6  0.955    5.92       0.99
Tony Clark            6  0.959    5.85       0.97
Richie Sexson         6  0.955    5.52       0.92
Carlos Delgado        6  0.993    5.47       0.91
Lance Berkman         6  1.012    5.30       0.88

Neifi Perez           6  1.013   -5.08      -0.85
Alex Gonzalez         6  1.015   -5.40      -0.90
Eric Young            6  1.003   -5.43      -0.91
Michael Barrett       6  0.989   -5.46      -0.91
Marvin Benard         4  1.046   -5.55      -1.39
Magglio Ordonez       6  1.003   -5.56      -0.93
Damian Miller         6  1.084   -5.87      -0.98
Brad Ausmus           6  1.048   -5.87      -0.98
Mike Lieberthal       6  1.041   -6.00      -1.00
Derek Jeter           6  1.049   -6.28      -1.05
Bill Mueller          6  1.028   -6.65      -1.11
Ivan Rodriguez        6  1.027   -6.67      -1.11
Albert Pujols         5  1.043   -6.88      -1.38
Richard Hidalgo       6  1.020   -7.15      -1.19
Alfonso Soriano       6  1.035   -7.26      -1.21
Normally I'm happy when numbers disagree with public or personal perception; it fosters debate and makes us question the way we watch the game and the context of what we're seeing. But I'm shocked at those two lists. I cannot fathom how players like Jeromy Burnitz and Fred McGriff are at the top and Jeter and Albert Pujols are at the bottom. At first glance, it may appear that the players at the bottom have higher LEV numbers than those at the top, but there is no correlation between the two metrics. I'm also curious why there are so many catchers at the bottom of the list--of course, Jason Kendall of all people is near the top--but I can't find a reason for that, either. In the end, we'll just have to be content with the fact that there's still no evidence that certain batters consistently out-perform their expected WINS based on MLV and say that this list, while interesting, has no bearing on future performance.
But Greg Vaughn? Seriously?