Baseball Therapy: The Viability of Burying a Bad Bat

March 17, 2014

Team captain and 39-year-old farewell tour participant Derek Jeter is currently the starting shortstop for the New York Yankees. That is the way of things and has been since I was in high school. But the Yankees also have Brendan Ryan on their roster. Ryan is a noted defensive wizard while Jeter is [must…not…make…Jeter fielding joke]. However, Ryan “hit” only .197/.255/.273 last year in 349 plate appearances. Is there a case to be made for Ryan as the starting shortstop based on his defensive prowess? Keep in mind that the Yankees could bury Ryan in the batting order to limit his exposure, move the ever-under-appreciated Brett Gardner up to the two-spot, pinch hit for Ryan late in the game, and enjoy that sweet glove for eight innings a night. Is that enough to overtake De-rek Je-ter?

Let’s go one step further and assume that Jeter will return to his 2011 and 2012 form. In those years, he was worth 1.4 and 3.0 Wins Above Replacement Player, respectively. Ryan, in those same years, was worth 3.5 and 1.9 wins, based mostly on his stellar defense. Thanks to Jeter’s injuries and Ryan’s offensive nosedive, the two checked in at roughly replacement level last season. Even discounting Ryan’s expectations a bit, could we not make the case that while they have two very different skillsets, they are at least in the same ZIP code when it comes to overall value?

Okay, so the Yankees aren’t actually going to bench Captain America in favor of Brendan Ryan, but the Yankees aren’t the only team facing this sort of a decision. This is a classic bat vs. glove positional battle. The Dodgers seem confused about whether to play Alexander Guerrero at second base, despite the fact that he does not appear to own a glove. Their other option is to play some utility type there who has a decent glove, but not much of a bat. Michael Morse and Gregor Blanco have a similar dynamic going in San Francisco.

What’s the cost of carrying a starter who can’t hit? It’s true that a team really can bury him in the nine-hole if they want. But what if a team tried carrying two of these players? Three?

It’s only in the last decade that advanced defensive metrics have been publicly available to give us a full understanding of how much defense matters. With the implementation of WAR(P), it’s become easier to roll both defensive and offensive numbers into one uber-metric. Now we can directly compare players with very different skillsets against a common baseline, although there is a weakness in WAR(P) for which we need to account. The offensive component in WAR(P) is generally based on the idea that each event that a player generates has a certain run value (e.g., a home run is worth roughly 1.4 runs). The idea is that we pretend that all players live on “average teams” and that they always bat in “average situations,” with an average number of runners on base. Home runs are worth more when there are runners on, and some teams employ hitters who are better at getting on base than others. For WAR(P), where the goal is to create a context-neutral common baseline to use for comparison, pretending that everything is average is a feature. But for teams making decisions about their specific circumstances, it’s a bug.

Consider our Jeter vs. Ryan debate. Let’s return to the halcyon days of 2012, before Jeter hurt his ankle (and before Brendan Ryan was a member of the Yankees), when Jeter posted a line of .316/.362/.429. In 2012, Ryan put up a slash line of .194/.277/.278. Had Ryan been a member of the Yankees and their only option at short, he would have been hitting ninth, meaning that other hitters would have been moved up higher in the batting order out of necessity. Not only that, but Ryan’s general aversion to getting on base would have meant that there would have been fewer runners on when the lineup flipped over.

The guys at the top of the lineup are good, and you want them to have runners on base to knock in. A bad hitter at the bottom of a lineup makes the good things that the top-of-the-lineup guys do less valuable by robbing them of men-on-base situations. On top of that, hitters who otherwise would have been hitting eighth would have hit seventh, meaning that they would get more at-bats over the course of a season. In one game, the effects might not shine through. But over the course of 162 games, the little losses at the margins always add up to something. How much does it matter here?

Warning! Gory Mathematical Details Ahead!
Before we get into the math, we need to start with a very important point. All of the analyses that we’re about to conduct are done with real live data (2009-2013), and so they reflect the zeitgeist of how lineups have been constructed over the past few years. Teams generally put together lineups with on-base at the top, power in the middle, and leftovers at the end. (Or alternatively, speed at the top, strikeouts in the middle, and scrappy guys at the bottom.) “What is the best way to construct a lineup ex nihilo?” is a different question, and one that we won’t be answering today.

I looked at the starting lineups for all games played from 2009-2013 and calculated the in-season OBP for each member of that lineup. I used only games played in American League parks because the pitcher’s spot in the NL becomes just a string of pinch-hitters by the sixth inning. I also “ended” each game at the end of the eighth inning. This is because half the time, the home team doesn’t bat in the bottom of the ninth, so in some games, I’d be getting eight innings worth of data and in some, nine. (And in some, 10 or 11 or if we want to get all #weirdbaseball, 16). I found how many runs the team scored that day in those eight innings.

I ran a regression using the OBPs for each of the nine spots in the lineup to see what contribution each one made to the number of runs that each team scored (in the first eight innings). Your coefficients from that regression:

Batting Order Position	Coefficient
1	3.727
2	3.153
3	2.227
4	2.797
5	3.719
6	3.094
7	4.391
8	3.062
9	2.989
Constant	-5.358
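For readers who want to see the mechanics, here is a minimal sketch of that kind of regression. It does not use the real 2009-2013 game logs. Instead it fabricates games from the coefficients in the table above (an assumption made purely so there is something to fit), adds artificially small noise, and then recovers the coefficients with ordinary least squares. The function names `simulate_games` and `fit_ols` are hypothetical, and real game data would be far noisier than this.

```python
import random

# Coefficients from the table above, reused only as the "true" values
# behind a synthetic data set (constant first, then spots 1-9).
TRUE_COEFS = [-5.358, 3.727, 3.153, 2.227, 2.797, 3.719, 3.094, 4.391, 3.062, 2.989]

def simulate_games(n_games, seed=0):
    """Return made-up games: nine lineup OBPs each, plus a noisy run total."""
    rng = random.Random(seed)
    rows, runs = [], []
    for _ in range(n_games):
        obps = [rng.gauss(0.330, 0.030) for _ in range(9)]
        row = [1.0] + obps  # the leading 1.0 carries the constant term
        expected = sum(c * x for c, x in zip(TRUE_COEFS, row))
        rows.append(row)
        runs.append(expected + rng.gauss(0.0, 0.3))  # noise kept small on purpose
    return rows, runs

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        known = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - known) / aug[i][i]
    return beta

rows, runs = simulate_games(5000)
fitted = fit_ols(rows, runs)
labels = ["Constant"] + [str(spot) for spot in range(1, 10)]
for label, coef in zip(labels, fitted):
    print(f"{label:>8}: {coef:7.3f}")
```

Because the synthetic OBPs are drawn independently for each spot, this sketch cannot reproduce the real-world tangle in which a good seventh hitter signals a good lineup overall. It only shows how the coefficients are estimated.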

First off, what’s up with the seventh spot in the lineup? That one has the heaviest weight in terms of runs scored. So the guy with the best OBP should be placed there? Not exactly. You have to think of the context around these numbers. Teams do not end up with a weak no. 7 hitter at random. If they are putting a guy with a toothpick bat in the seven hole, it probably means that they have two even weaker hitters in the eighth and ninth spots. If a team has a good seven-hole hitter, it probably means that they have six other good hitters in the lineup.

The importance of the seven-hole hitter is only partly about what he actually does. There’s another chunk of that value that’s based on the fact that the quality of the seventh hitter is an indicator of the quality of the other guys in the lineup. But structurally, that seven spot does hold an important function too. If you have three duds all bunched together, it creates a nice little valley for the pitcher to coast through and makes it even more likely that when the good hitters come up (nos. 1, 2, and 3), there won’t be anyone on base for them to play with.

Let’s take an example of a realistic decision that a manager might have. He is trying to decide between a good-glove guy with a .290 OBP hitting ninth vs. a good hitter with a .340 OBP who would hit fifth. Let’s assume that if he goes with the glove guy, everyone else moves up a spot in line. First, leaving the .340 OBP guy in (and looking only at the spots that will change).

Batting Order Position	Coefficient	OBP	Value
1	3.727	—	—
2	3.153	—	—
3	2.227	—	—
4	2.797	—	—
5	3.719	.340	1.264
6	3.094	.330	1.021
7	4.391	.320	1.405
8	3.062	.310	0.949
9	2.989	.300	0.897
Constant	-5.358	—	—
Total	—	—	5.536

And now for using the glove guy.

Batting Order Position	Coefficient	OBP	Value
1	3.727	—	—
2	3.153	—	—
3	2.227	—	—
4	2.797	—	—
5	3.719	.330	1.227
6	3.094	.320	0.990
7	4.391	.310	1.361
8	3.062	.300	0.919
9	2.989	.290	0.867
Constant	-5.358	—	—
Total	—	—	5.364

The difference is 0.172 runs per game (over eight innings). Pro-rating that out to nine innings and 162 games, we get 31.3 runs that we estimate that the offense will bleed away as the result of the weaker bat. If the upgrade in defense is that good, it might be worth it. Then again, aside from Andrelton Simmons, the spread between the best regular shortstop in baseball last year (Pedro Florimon, according to DRS) and the worst (Jed Lowrie) was 30 runs. At most other positions, there were cases where you could find pairs of players at the same position separated by 30 runs or more, so they’re out there, but they aren’t very common.
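The arithmetic behind those two tables can be sketched in a few lines. This is only an illustration using the article's coefficients and made-up OBPs. The helper names `lineup_value` and `season_runs` are hypothetical. Because the code carries unrounded products rather than the three-decimal table values, it lands a hair above the 0.172 and 31.3 quoted above.

```python
# Regression coefficients from the article, keyed by batting order spot.
COEF = {5: 3.719, 6: 3.094, 7: 4.391, 8: 3.062, 9: 2.989}

# Only spots 5-9 differ between the two hypothetical lineups.
WITH_BAT = {5: 0.340, 6: 0.330, 7: 0.320, 8: 0.310, 9: 0.300}
WITH_GLOVE = {5: 0.330, 6: 0.320, 7: 0.310, 8: 0.300, 9: 0.290}

def lineup_value(obp_by_spot):
    """Sum coefficient * OBP over the listed spots (runs per eight innings)."""
    return sum(COEF[spot] * obp for spot, obp in obp_by_spot.items())

def season_runs(per_game_eight_innings, games=162):
    """Scale an eight-inning figure to nine innings over a full season."""
    return per_game_eight_innings * 9 / 8 * games

gap = lineup_value(WITH_BAT) - lineup_value(WITH_GLOVE)
print(f"gap per game (eight innings): {gap:.3f}")
print(f"gap per season: {season_runs(gap):.1f}")
```

Swapping in real OBPs for a specific roster only requires editing the two dictionaries.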

Of course, this is a contrived scenario using fake numbers that are artificially round. A team making this decision would want to plug in actual players. If the good glove guy would be replacing a guy who can’t hit anyway, it’s not as big a deal. Also, there will be those who wonder why I used OBP (because it makes things easier to understand) when instead I should have used a better indicator of offensive value. I tried the same calculations with a linear weights (per PA) approach and got the same basic message.

So yes, you can bury a guy in the nine hole, and it might actually make sense. But the more you mess with the lineup, the more “echo” effects there are. Suppose that for some reason, the .340 guy whom the manager was considering replacing was batting in the ninth spot legitimately (hello 1995 Indians!) and so subbing him out doesn’t disrupt the lineup at all. Replacing him with the .290 OBP guy ends up costing his team only 27.2 runs over the season, a difference of about four runs from what we calculated above. Another way to say that would be 13 percent of the initial effect. Context matters.
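A quick sketch of where the 27.2 runs and the 13 percent come from, assuming the same ninth-spot coefficient and the 0.172 runs-per-game figure from the cascading scenario. The variable names are hypothetical.

```python
NINTH_SPOT_COEF = 2.989       # article's coefficient for the ninth spot
CASCADE_GAP_PER_GAME = 0.172  # article's gap when everyone moves up a spot

def season_runs(per_game_eight_innings, games=162):
    """Scale an eight-inning figure to nine innings over a full season."""
    return per_game_eight_innings * 9 / 8 * games

# Straight swap in the ninth spot: nobody else changes position.
direct = season_runs(NINTH_SPOT_COEF * (0.340 - 0.290))

# Swap that forces spots 5-8 to shuffle as well.
cascade = season_runs(CASCADE_GAP_PER_GAME)

echo = cascade - direct
print(f"direct cost: {direct:.1f} runs")
print(f"cost with reshuffle: {cascade:.1f} runs")
print(f"echo effect: {echo:.1f} runs ({echo / cascade:.0%} of the total)")
```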

The other takeaway from this is that while you can bury a good defender in the nine hole, and perhaps another in the eight hole, the “cost” of punting batting order spots grows. You can’t pinch hit for everyone late in the game, and there are batting order inter-dependencies. If a team wants to go max defense and punt offense, the cost of each move isn’t just Player One’s expected offensive value minus Player Two’s. A baseball game is a dynamic system with lots of moving parts. You can’t just swap one thing out for the other and assume that everything else will function the same way.

Let’s go back to Jeter vs. Ryan. If the Yankees were to actually start Ryan at shortstop on a consistent basis, they would move Brett Gardner (and his career .352 OBP—we’ll round to .350) from the seven- to the two-spot (see, ready-made Jeter replacement!), bump Kelly Johnson (stuck around .310 over the past three years) and Brian Roberts (who posted a .312 mark last year in limited duty—we’ll call him a .300 guy) up a couple of notches each, and “bury” Ryan at the bottom. We’ll assume that Jeter can still get on base at a .350 clip, consistent with his performance from 2010-2012 (when he was healthy), and that Ryan is every bit the .260 OBP guy that his 2012-2013 stats suggest he is. First, the lineup with Jeter.

Batting Order Position	Coefficient	OBP	Value
1	3.727	—	—
Jeter	3.153	.350	1.104
3	2.227	—	—
4	2.797	—	—
5	3.719	—	—
6	3.094	—	—
Gardner	4.391	.350	1.537
Johnson	3.062	.310	0.949
Roberts	2.989	.300	0.897
Total			4.487

Now, the lineup with Ryan:

Batting Order Position	Coefficient	OBP	Value
1	3.727	—	—
Gardner	3.153	.350	1.104
3	2.227	—	—
4	2.797	—	—
5	3.719	—	—
6	3.094	—	—
Johnson	4.391	.310	1.361
Roberts	3.062	.300	0.919
Ryan	2.989	.260	0.777
Total			4.161

If you pro-rate that to nine innings and 162 games, the difference between the two lineups is just shy of 60 runs (59.4). Even if you assume that Ryan is a .280 OBP guy, the difference shrinks only to 48 runs. Yes, Derek Jeter is 40, has a bad ankle, and was never a good fielder to start with, but the idea that Brendan Ryan is going to make up 50-60 runs worth of value over Derek Jeter with his glove is hard to swallow.
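Here is a small sketch of that comparison, using the article's coefficients and the rounded OBPs assumed above. The helper names `ryan_lineup`, `lineup_value`, and `season_gap` are hypothetical. Ryan's OBP is left as a parameter so that the .280 scenario, or any other guess, can be checked the same way.

```python
# Article's coefficients for the four spots that change hands.
COEF = {2: 3.153, 7: 4.391, 8: 3.062, 9: 2.989}

# Jeter second, Gardner seventh, Johnson eighth, Roberts ninth.
JETER_LINEUP = {2: 0.350, 7: 0.350, 8: 0.310, 9: 0.300}

def ryan_lineup(ryan_obp):
    """Gardner second, Johnson seventh, Roberts eighth, Ryan ninth."""
    return {2: 0.350, 7: 0.310, 8: 0.300, 9: ryan_obp}

def lineup_value(obp_by_spot):
    """Sum coefficient * OBP over the listed spots (runs per eight innings)."""
    return sum(COEF[spot] * obp for spot, obp in obp_by_spot.items())

def season_gap(ryan_obp, games=162):
    """Runs lost over a season by starting Ryan, scaled to nine innings."""
    per_game = lineup_value(JETER_LINEUP) - lineup_value(ryan_lineup(ryan_obp))
    return per_game * 9 / 8 * games

for ryan_obp in (0.260, 0.280):
    print(f"Ryan at {ryan_obp:.3f} OBP: {season_gap(ryan_obp):.1f} runs")
```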

What’s interesting to note here is that the biggest effect of this whole exercise comes not from the lost productivity in the ninth spot from replacing Roberts with Ryan, but from forcing Gardner to the two spot and shortening the lineup that way. In the past, Jeter has been roughly a -20 run defender over the course of a season and Ryan has checked in around +20. When you consider their performances out of context, as WAR(P) does, you could make the case that they might be each other’s equals or perhaps that Ryan is Jeter’s superior. Yet it seems more likely that the Yankees, in their situation, would be better off with Jeter at short, despite his defensive shortcomings. When you take a closer look, you can see that Jeter’s hitting ability within the context of what actually would happen to the Yankees’ lineup is more valuable because of how his hitting allows the team to set up the rest of their lineup.
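Breaking the gap down spot by spot makes that point concrete. This sketch assumes the same coefficients and rounded OBPs as the two lineup tables. The names `OBP_BY_SPOT` and `loss_by_spot` are hypothetical.

```python
# Article's coefficients for the four spots that change hands.
COEF = {2: 3.153, 7: 4.391, 8: 3.062, 9: 2.989}

# (OBP with Jeter starting, OBP with Ryan starting) at each spot.
OBP_BY_SPOT = {
    2: (0.350, 0.350),  # Jeter replaced by Gardner
    7: (0.350, 0.310),  # Gardner replaced by Johnson
    8: (0.310, 0.300),  # Johnson replaced by Roberts
    9: (0.300, 0.260),  # Roberts replaced by Ryan
}

SEASON_SCALE = 9 / 8 * 162  # eight innings to nine innings, over 162 games

loss_by_spot = {
    spot: COEF[spot] * (jeter_obp - ryan_obp) * SEASON_SCALE
    for spot, (jeter_obp, ryan_obp) in OBP_BY_SPOT.items()
}

for spot, loss in loss_by_spot.items():
    print(f"spot {spot}: {loss:5.1f} runs")
print(f"total:  {sum(loss_by_spot.values()):5.1f} runs")
```

Under these assumptions the two-spot itself costs nothing, since Gardner and Jeter are both treated as .350 hitters. The damage shows up in the seventh spot that Gardner vacates, which carries the heaviest coefficient.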

The Trouble with Max Defense
A few years ago, the Seattle Mariners ran their much-discussed “max defense” experiment. The general idea was that defense was undervalued and that players with similar WAR(P) values, but whose value was mostly tied up in their defense, would provide the same bang for a cheaper price. (The Mariners have since switched over to the “Defense? What’s that?” strategy.) What they found out the hard way was that when you trade offense for defense, it’s not a one-for-one trade. The team that they built struggled to score runs, even before they could blame their troubles on the woes of Justin Smoak, Jesus Montero, and Dustin Ackley. Yes, there will be cases where trading a bat for a glove makes sense, but it’s not as easy as just lining up predicted WAR(P) totals and seeing which one is bigger. There are things that teams can do to minimize the damage wrought by a poor hitter, but there are consequences, and the more bats you try to hide, the worse the consequences get, and the more you have to gain from the glove to make it worth your while.

My point isn’t that defense doesn’t matter. It most certainly does, and it should be considered in any valuation of a player. My point is that the mantra that a run is a run is a run falls apart a little bit when you take measurement out of the abstract and try to apply it to real-world situations. It’s not that WAR(P) needs to be scrapped. It’s that we need to understand what question it answers (and it’s a valuable and important one) and decide whether that’s the question that we are currently trying to answer. A bad offensive player makes the players around him less good. (There’s evidence to suggest that a good defensive player actually makes his fellow fielders worse as well!)

The case of Derek Jeter and Brendan Ryan shows this off rather nicely. You might make the case that the projections for Jeter returning to his pre-injury form are too optimistic or that Ryan might be a slightly better hitter than I give him credit for (though he’ll never be mistaken for Ted Williams). In that case, you could re-run the numbers and see how things shake out. But the point is that you have to be careful about relying on a single win-value stat.

Thank you for reading

This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.

Russell A. Carleton

Grasul

3/17

Great article.

rocket

3/17

Awesome analysis, thanks

kscdac1032

3/17

Nice work! Context is everything. Isolating everything into its proper context is, shall we say, difficult at the very least. Some sabermetricians have gotten to the point where they have convinced themselves that they have completed the task of removing all noise. You can tell by the tone of their work.

pizzacutter

3/17

There will always be noise. Which is good, because it means I always have something else to write about.

Ogremace

3/17

This article had a number of typos and grammatical errors, and I'm not trying to be irritating, but the high level of writing and editing has always been (for me) a hallmark of BP, something a lot of other sites can't match.

Somehow, in 2011, Jeter's .040 points of OBP over Ryan, as well as slight increases in HR and SB, netted only .5 wins of BWAR. Baffling, honestly.

matrueblood

3/17

Park factor is a major reason for the small WAR gap against the large absolute production gap, there.

matrueblood

3/17

This puts me on such a confirmation-bias high. I've been saying this forever. Russell, let me ask:

Is this one reason why the NL has fallen so badly behind the AL in terms of overall quality? It seems to me that the pitcher's presence at the bottom of the order makes it even more costly to hide a bad hitter in an NL lineup, and therefore, complicates any effort to slot in an all-glove guy for NL teams.

At the same time, of course, all-bat guys (or those who risk becoming same) also fit more easily into an AL lineup, thanks to the flexibility afforded by the DH. And because of DHs, pitchers get fewer breaks in the AL, so starting hurlers have higher utility for AL teams, too. The only things NL teams can afford to value more highly than American League clubs, it seems to me, are guys who do a little bit of everything, especially off the bench. It's a screaming inequality that can't be solved until the senior circuit gets with the times and adds the DH.

Anyway, obviously, thought-provoking, fun work. I love the detail you provide. While I had no trouble understanding or enjoying them, there was something fundamentally unsatisfying about the 'in a vacuum' models that dominated baseball research last decade. I love getting into how differently things can work in specific situations. Thanks for your usual excellence, RC.

pizzacutter

3/17

There's some amount of truth in this. I guess the way to think about it is to look at what happens when AL teams go to NL parks. Most place their DH in the field (at 1B or LF) in some sort of an attempt to bury him as a fielder (the defensive #9 hole?). It seems an implicit endorsement of the idea that maintaining the integrity of the batting order is better than maintaining the integrity of the fielding grid. Other fielders can sorta cover for the defensive weaknesses of the usual DH, while no one can help a batter as he stands there. If we accept that maintaining a good groove in the lineup is even slightly more important, then the league which can indulge in offense over defense, because of the DH and the lack of a need to maintain as much defensive flexibility, will have an advantage.

lyuchi

3/17

Cool article! Fun to think about some extensions of this work.

This can potentially be used to help make some in-season decisions: when the inevitable injuries befall the elder Yanks and the Jayson Nixes of the world begin getting regular duty, it might be worth putting Ryan out there to play defense every day.

cmaczkow

3/17

Russell, forgive me if I am missing something, but if I am managing the Yankees and trying to make this decision, do I really want to use the coefficients you've listed above? As you said, they are based on all teams, and they are heavily context dependent (hence the #7 spot being weighted so highly).

It seems that, were I to follow the logic here, I would be placing my best hitters in the lineup following the order of the magnitude of the co-efficient, in order to maximize the number I get in the Total Value cell - and as you said, that really isn't how this works.

So, I guess my question is, would it make more sense to estimate a coefficient for each spot in the lineup based on the quality of players I am starting each day, rather than the "generic" coefficients presented here?

pizzacutter

3/17

In a perfect world, a Yankee manager would have a Markov-type analysis at his disposal looking at different configurations of lineups based on the players at his disposal. My goal here is to show that there are structural issues to consider when making these sorts of decisions and that WAR doesn't take those into account. The magnitude of those coefficients is one part an observation on the environment and one part real structural effects. Pulling apart how much is what would take some deeper digging. Your point is well-taken though.

matrueblood

3/17

Russell, I would love to read, if you had occasion to write, what you think a team could do with this:

http://www.businessinsider.com/mlb-supercomputer-2014-3

pizzacutter

3/17

Ahhh... the super-computer...

cmaczkow

3/17

One additional question: given the wide disparity between the various measures of defensive value (relative to the fairly similar results different offensive measures provide), how confident would (could? should?) a manager be when using this technique to make a decision between a Jeter and a Ryan?

I imagine it comes down to which defensive measurement you feel the most confident in, but it would have to be frustrating if the math told you the offensive difference between the Jeter and the Ryan lineups was (for example) 10 runs, but the two most widely used defensive measures valued Ryan's defense as +5 and +15 runs versus Jeter's...

pizzacutter

3/17

Infield metrics are more reliable than outfield metrics, so we at least have that going for us here. In terms of tipping the analysis one way or the other though, the results should be big enough that even leaving room for some margin of error, the choice is fairly obvious.

tonytouch

3/17

Do you use 8 innings because the home team doesn't always bat in the 9th, defensive replacements late in games, and pinch hitters for poor hitters? All of the above?

Great article. I was hoping Brendan could contribute this year and maybe even give them a little more punch due to getting back to a hitter's ballpark. PECOTA is predicting a TAV closer to his career #'s, but his 2013 marks even in the Bronx were particularly abysmal.

pizzacutter

3/17

Because of the bottom of the ninth.

NJTomatoes

3/17

Statistical probabilities aside, just spend a season watching Ryan come up to the plate and you'll likely become willing to go back to Derek. Been there, done that...It's painful. I guess the big difference would be that in the Mariners' light hitting lineup, Ryan's plate appearances were more salt in the wounds. The same number of PAs with the Yankees would mean a break in the action rather than a continuance of the inaction. Ryan does do a mean Robert DeNiro, though.

StarkFist

3/17

Okay, but what if you were to run those numbers with a staff full of worm killers, like, say, 5 starters who each have a GB rate of 50% or higher? This would lead to more balls being hit to the shortstop, more double play opportunities, and greater value for Ryan, yes?

tnt9357

3/18

Ever since the 1997 AL WC playoff between the Yankees and Indians I've theorized that one's lineup becomes substantially worse once three bad hitters are at the bottom of it. That series, the Yankees' 7 through 9 hitters were Charlie Hayes, Joe Girardi, and Rey Sanchez.

pizzacutter

3/18

As someone who lived and died with that particular series in my native Cleveland...

Oleoay

3/19

Russell,

What are your thoughts when comparing a good offensive catcher versus a good framing catcher?
