This article was originally published on May 13.
We ask “replacement level” to be a lot of things. Sometimes contradictory things. Sometimes I wonder if we know what it even means anymore. The original idea was that it represented the level of production that a team could expect to get from “freely available talent”, including bench players, minor leaguers, and waiver wire pickups. It created a common benchmark to compare everyone to, and for that reason, it represented an advancement well beyond what was available at the time. In fact, it created a language and a framework for evaluating players that was not just better than but entirely different from what came before it.
But then we started mumbling in that language. The idea behind “wins above replacement” was one part sci-fi episode and one part mathematical exercise. Imagine that a player had disappeared before the season and suddenly, in an alternate timeline, his team would have had to replace him. The distance between him and that replacement line was his value. We need to talk about that alternate timeline.
Without getting too into 2:00 am “deep conversations” with extensive navel-gazing, it’s worth thinking about why one player might not be playing, while another might.
- A player might not be playing because he has a short-term injury or his manager believes that he needs a day off.
- A player might not be playing because he has a longer-term injury that requires him to be on the injured list.
There’s a difference here between these two situations. In particular, the first one generally doesn’t involve a compensatory roster move, while the second one does. It’s possible, though not guaranteed, that the person who will be replacing the injured/resting player would be the same in either case. That matters. Teams generally carry a spare part for all eight position players on the diamond, although in the era of a four-player bench, those spare parts usually are the backup plan for more than one spot.
A couple of years ago, I posed a hypothetical question. Suppose that a team had two players in its system fighting for a fourth outfielder spot. One of them was a league average hitter, but would be worth 20 runs below average if allowed to play center field for a full season. One of them was a perfectly average fielder, but would be 15 runs below average as a hitter, if allowed to play an entire season. Which of the two should the team roster? It’s tempting to say the second one, as overall, he is the better player. That misses the point. A league average hitter on the bench isn’t just a potential replacement for an injured outfielder. He might also pinch hit for the light-hitting shortstop in a key spot. You keep the average hitter on the roster, even though he isn’t a hand-in-glove fit for one specific place on the field, because being a bench player is a different job description than being a long-term fill-in for someone. If you find yourself in need of a longer-term fill-in, you can bring the other guy up from AAA.
When we’re determining the value of an everyday player, though, if he had disappeared before the season and a team would have had to replace his production, they likely would have done it with a player who was a long-term fill-in type, because they would have had to replace a guy who played every day. Maybe that’s the same guy that they would have rostered on their bench anyway, but we don’t know. It gets to the question of what we hope to accomplish with WAR. Are we looking for an accurate modeling of reality or are we looking for a common baseline to compare everyone to? Both have their uses, but they are somewhat different questions.
Let’s talk about another dichotomy.
- A player might not be playing because he isn’t very good and is a bench-level player.
- A player might not be playing because there is another player on the team who has a situational advantage that makes him the better choice today. The classic case of this is a handedness platoon. On another day, he might be a better choice.
When we think about player usage, I think we’re still stuck in the model that there are starters and there are scrubs. We have plenty of words for bench players or reserves or backups or utility guys. We do still have the word “platoon” in our collective vocabulary, but in the age of short benches, it’s hard to construct one. It’s always been hard to construct them. You have to find two players who hit with different hands, have skillsets that complement each other, and probably play the same position. In the era of the short bench, one of them had probably better double as a utility player in some way. Baseball has a two-tiered language geared toward the idea of regulars and reserves. The fact that it was so easy for me to find plenty of synonyms for “a player whose primary function is to come into a game to replace a regular player if he is injured or resting” should tell you something.
I’m always one to look for “unspoken words” in baseball. What is it called when someone is both half of a platoon and the utility infielder? That guy exists sometimes, but he reveals himself in that role—usually by accident. We don’t have a word for that, and whenever I find myself saying “we don’t have a word for that”, I look for new opportunities. What do you call it, further, when the job of being the utility infielder is decentralized across the whole infield with occasional contributions from the left fielder? It’s not even a “super-utility” player. What happens when you build your entire roster around the idea that everyone will be expected to be a triple major?
***
I think someone else beat me to this one, and on a grand scale. Platoons work because we know that hitters of the opposite hand to the pitcher get better results than hitters of the same hand, usually to the tune of about 20 points of OBP. If you want to express that in runs, it usually comes out to somewhere around 10 to 12 runs of linear weights value prorated across 650 PA. But hang on a second, now let’s say that we have two players who might start today, both of roughly equal merit with the bat. One has a handedness advantage, but is the worse fielder of the two. In that case, as long as his “over the course of a season” projection as a fielder at whatever position you want to slot him into is less than a 10-run drop from the guy he might replace, then he’s a better option today.
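The comparison in the paragraph above comes down to one subtraction. Here is a minimal sketch of it, assuming both numbers are expressed as runs prorated over a full season (about 650 PA) and that the two hitters are otherwise equal with the bat. The function names are mine, purely for illustration.

```python
def net_runs_from_platoon_swap(platoon_edge_runs, fielding_drop_runs):
    """Season-prorated net gain (in runs) from starting the platoon-advantaged bat.

    platoon_edge_runs: value of the handedness advantage (the article puts it
        at roughly 10 to 12 runs per 650 PA).
    fielding_drop_runs: how many runs worse he is with the glove than the
        player he would replace, over the same span.
    """
    return platoon_edge_runs - fielding_drop_runs


def start_platoon_bat(platoon_edge_runs, fielding_drop_runs):
    """True when the handedness edge outweighs the defensive cost."""
    return net_runs_from_platoon_swap(platoon_edge_runs, fielding_drop_runs) > 0


# Illustrative inputs only: a 10-run platoon edge against a 6-run fielding drop.
print(start_platoon_bat(10.0, 6.0))
```

The inputs are prorated season totals, so the answer for any single game is just the sign of that difference, not its size.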
We’re not used to thinking of utility players as bat-first options, who would play below-average defense at three different infield positions. That guy might hook on as a 2B/3B/LF type (Howie Kendrick, come on down!) but teams usually think to themselves that they need as their utility infielder someone who “can handle” shortstop, the toughest of the infield spots to play. If someone can do that and hit well, he’s probably already starting somewhere, so he’s not available as a utility infielder. It’s easier for those glove guys to find a job. In a world where the replacement for a shortstop has to be the designated utility infielder, that makes sense.
But as we talked about last week, we’re living in a different world. The rate at which a replacement for a regular starter turns out to be another starter shifting over to cover has gone way up over the last five years. There was always some of it in the game, but this has been a supernova of switcheroos. Now if your second baseman is capable of playing a decent shortstop, that 2B/3B/LF guy can swap in. He’s not actually playing shortstop, and maybe the defense suffers from the switch, but if he’s got enough of a bat, he might outhit those extra fielding miscues. And in doing so, he is effectively your backup shortstop.
Somewhere along the line, teams got hip to the idea of multi-positional play from their regulars. I’ve written before about how you can’t just put a player, however athletic, into a new position and expect much at first. The data tell us that. Eventually, players can learn to be multi-positionalists, but it takes time, roughly on the order of two months, before they’re OK. But there’s a hidden message in there. If you give a player some reps at a new spot, and he’s a reasonably gifted athlete who is somewhat smart and willing to learn, he could probably pick it up enough to get to “good enough,” and it doesn’t take forever. You just have to be purposeful about it. Maybe you get to the point where you can start to say “he’s still below average but we could move him there and get another bat into the lineup, and it’s a net win.”
Teams have started to build those extra lessons into their player development program. It used to be seen as a mark of weakness to be relegated to “utility player” because that meant that you were a bench player (all those synonyms above come with a side of stigma). Now, it’s a way of building a team. If you get a few reps in the minors (where it doesn’t count) at a spot, you’ll have at least played the spot at game speed before. There are limits to how far you can push that. A slow-footed “he’s out in left field because we don’t have the DH” guy is never going to play short, but maybe your third baseman can try second base and not look like a total moose out there.
***
Back to WAR. I’d argue that the world of starters and scrubs is slowly disintegrating, for good cause. In the event that a regular starter really does go down with an injury – ostensibly, the alternate universe scenario that WAR is attempting to model – it makes the team a little more resilient to replacing him. And the good news is that you’re more likely to be able to replace him with the best of the bench bunch, rather than the third-best guy, because the best guy doesn’t have to be an exact positional match for the guy who got hurt. And that’s what the manager would want to do. He’d want to replace that long-term production, not with an amalgam of everyone else who played that position, but with the best guy available from his reserves.
Now this is still WAR. We still want to retain the principle that we should be measuring a player, and not his teammates. We need some sort of common baseline, and despite what I just said, we’ll still need some sort of amalgam. To construct that, I give to you the idea of the tranche. The word, if you’ve not heard it before, refers to a piece of a whole that is somehow segmented off. It’s often used in finance to talk about layers of a financial instrument.
Here, I want you to consider that there are 30 starters at each of the seven non-battery positions (catchers should have their own WAR, since only a catcher can replace a catcher). We can identify them by playing time, and we can futz around with the definition a little bit if we need to. Next, among those who aren’t in that starting pool, we identify the top tranche of the 30 best bench players, which I would again identify by playing time, and then the second and third and fourth and so on. If a player were to disappear, his manager would probably want to take a guy from that top tranche of the bench to replace him. In a world where even the starters can slide around the field, that becomes more feasible.
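The sorting described above can be sketched in a few lines. This is one possible reading rather than a finished method: I am assuming each player has a single primary position, that plate appearances stand in for “playing time,” and that the bench tranches are drawn league-wide in groups of 30. The function name and the input format are hypothetical.

```python
from collections import defaultdict


def assign_tranches(players, teams=30):
    """Map each player name to a tranche number.

    players: iterable of (name, primary_position, plate_appearances).
    Tranche 0 is the starting pool (top `teams` by PA at each position).
    Everyone else is ranked by PA across all positions and cut into
    bench tranches 1, 2, 3, ... of `teams` players each.
    """
    by_position = defaultdict(list)
    for name, position, pa in players:
        by_position[position].append((pa, name))

    tranche = {}
    bench = []
    for group in by_position.values():
        group.sort(reverse=True)  # most playing time first
        for rank, (pa, name) in enumerate(group):
            if rank < teams:
                tranche[name] = 0
            else:
                bench.append((pa, name))

    bench.sort(reverse=True)
    for rank, (pa, name) in enumerate(bench):
        tranche[name] = 1 + rank // teams
    return tranche


# Toy league with one team, so each tranche holds a single player.
toy = [("A", "SS", 600), ("B", "SS", 300), ("C", "2B", 550),
       ("D", "2B", 200), ("E", "2B", 100)]
print(assign_tranches(toy, teams=1))
```

Catchers would be left out of the input entirely, and the playing-time cutoff is exactly the kind of definition we can futz around with.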
We can take a look at that top tranche and ask how many of them showed that they are able to play (first, second, etc.) and therefore could have been a direct substitute for our injured player. We don’t know whether one of them would be on a specific team, but we can say that, for instance, 40 percent of the time a manager would have been able to draw from tranche 1 in filling the role, and 35 percent of the time from tranche 2. Within tranche 1, we can also look at how many of those players played a position from which they could have shifted over to cover that spot. We’d need some eligibility criteria for all of this (probably a minimum number of games played), but it would just be a matter of multiplication. Shortstop would be harder to fill, and managers would probably be dipping a little further down in the talent pool, and so replacement level would be lower, as it is now.
Doing some quick analysis, I found that the difference in just batting linear weights (haven’t even gotten into running or fielding) between tranche 1 and tranche 2 in 2019 was about 6.5 runs, prorated across 650 PA. Between tranche 1 and tranche 3, it’s 10.8 runs. The ability to shift those plate appearances up the ladder has some real value.
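To make the “matter of multiplication” concrete, here is a rough sketch of a probability-weighted baseline. The batting gaps are the 2019 figures above, with tranche 1 as the zero point. The 40 and 35 percent draw rates are the illustrative ones from earlier. The remaining 25 percent going to tranche 3 is my own assumption to make the weights sum to one, and the function name is hypothetical.

```python
def replacement_baseline(draw_probs, tranche_runs):
    """Expected batting runs (per 650 PA) of the replacement a manager can find.

    draw_probs: chance the replacement comes from each tranche (must sum to 1).
    tranche_runs: batting runs for each tranche, relative to tranche 1.
    """
    if len(draw_probs) != len(tranche_runs):
        raise ValueError("need one probability per tranche")
    if abs(sum(draw_probs) - 1.0) > 1e-9:
        raise ValueError("draw probabilities must sum to 1")
    return sum(p * r for p, r in zip(draw_probs, tranche_runs))


# Tranche 1 is the reference point; tranches 2 and 3 trail by 6.5 and 10.8 runs.
gaps_2019 = [0.0, -6.5, -10.8]
# 0.40 and 0.35 are illustrative; 0.25 for tranche 3 is an assumption.
baseline = replacement_baseline([0.40, 0.35, 0.25], gaps_2019)
print(round(baseline, 3))
```

A position that is easier to cover would put more weight on tranche 1 and pull the baseline toward zero, while shortstop would push weight down the ladder and lower it.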
This part is important. We can also give credit to starters for the positions that they showed an ability to play, even if they didn’t play them (this is the guy fully capable of playing center, but who’s in a corner because the team already has a good center fielder) because he allows a team to carry a player who hits like a left fielder to functionally be the team’s backup center fielder. He facilitates that movement upward among the tranches. We can start to appreciate the difference between a left fielder who would never be able to hack it in center (and the compensatory move that his team would have to make) and the left fielder who could do it, but just didn’t have to very often.
Past that, you can continue to use whatever hitting and fielding and running metrics you like to determine a player’s value, but when we get down to constructing that baseline, I’d argue we need a better conceptual and mathematical framework. It’s going to require some more #GoryMath than we’re used to, but I’d argue it’s a better conceptualization of the way that MLB actually plays the game in 2020. If… y’know… MLB plays in 2020. If WAR is going to be our flagship statistic among the acronymati, then we need to acknowledge that it contains some old and starting-to-be-out-of-date assumptions about the game. We may need to tinker with it. Here’s my idea for how.