Back in the mid-1970s I was an infielder for the New York Yankees. I was pretty good, too, leading the team in a number of offensive categories and appearing in a few All-Star games. My best season was probably 1977, when I managed to finish the season with a .633 batting average. So for at least a few years, you could have said I was a better hitter than Rod Carew.
Of course, if you had actually said that, you’d be crazy. Intuitively, we all know that hitting success in the Elmhurst Baseball League isn’t exactly the same as hitting success in the American League. The differences between the two in terms of level of competition, playing environment, even the actual rules of the game are so vast that there’s no need to even try to quantify them. But what about when the differences aren’t quite so obvious, and the idea of comparing two sets of statistics isn’t so laughable: say, if we wanted to compare a prospect in the Midwest League to one in the Carolina League, or Roy Halladay to Jake Peavy, or even Ty Cobb to Pete Rose? Maybe that last one is laughable, but for the other cases there’s a whole category of metrics designed to help make such comparisons possible: translated statistics.
The idea behind a translated statistic (you’ll also frequently see the term “equivalent”; don’t get tripped up by the semantics, it’s pretty much the same thing) is to take a player’s “raw” stats, which accurately record what happened during a play, a game or a season, and translate them in such a way as to make them more directly comparable to those of other players who accumulated similar “raw” stats, but possibly in a very different environment.
“Raw” stats come in three delicious flavors:
- Counting Stats, which basically aggregate certain events that occur during games. Hits, RBIs and Earned Runs fall into this category. Counting stats are building blocks for most other statistics, but as fun as they are to look at and memorize, and as important as they often are for a fantasy baseball team, they lose a lot of their utility when comparing the actual productivity of two players. Counting stats don’t account for variance in playing time: the more plate appearances you have, the more chances you have to hit a home run. They also don’t account for variance in opportunity: the more frequently you bat with runners on base, the easier it is to accumulate high RBI totals. These problems can be partially resolved by instead using…
- Rate Stats, which are usually represented as a percentage. Batting Average, OBP and WHIP are all in this category, and they’re all calculated by dividing the number of times something happened (e.g., a Hit) by the number of opportunities for it to happen (e.g., an At-Bat). This makes rate stats much better to use for comparisons between players, though by no means perfect.
- Value Stats, which can be represented as either a number or a rate, but are still derived by performing straightforward math functions on counting stats. Two examples are Runs Created and Offensive Winning Percentage. These stats often try to combine various counting and rate stats to come up with a single number that represents the total offensive contribution a player is providing to his team.
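To make the step from counting stats to rate stats concrete, here is a minimal sketch in Python. The `BattingLine` class and its field names are invented for illustration; the numbers are the actual 1968 and 1998 lines for Ron Santo and Henry Rodriguez that appear in the comparison later in this article. It assumes the standard definitions of batting average (hits per at-bat) and slugging percentage (total bases per at-bat).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingLine:
    """Counting stats for one hitter-season (hypothetical container)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int

    @property
    def total_bases(self) -> int:
        # Every hit is worth one base; extra-base hits add their extra bases.
        return self.hits + self.doubles + 2 * self.triples + 3 * self.home_runs

    @property
    def batting_average(self) -> float:
        # Rate stat: times something happened / opportunities for it to happen.
        return self.hits / self.at_bats

    @property
    def slugging(self) -> float:
        return self.total_bases / self.at_bats


santo_1968 = BattingLine(at_bats=577, hits=142, doubles=17, triples=3, home_runs=26)
rodriguez_1998 = BattingLine(at_bats=415, hits=104, doubles=21, triples=1, home_runs=31)

for name, line in [("Santo 1968", santo_1968), ("Rodriguez 1998", rodriguez_1998)]:
    print(f"{name}: HR={line.home_runs} "
          f"BA={line.batting_average:.3f} SLG={line.slugging:.3f}")
```

Dividing by at-bats removes the playing-time difference between the two seasons, but nothing in this arithmetic knows anything about the ballpark, league or era in which the hits were recorded.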
While rate stats and value stats definitely make it easier to compare players than counting stats do, they still rely entirely on the counting stats themselves. The problem when it comes to comparing players is that the environment (often called the “context”) in which those hits, walks and strikeouts were recorded often varies wildly. Comparing Ken Funck’s ability to turn around 45 mph fastballs at a .633 clip to Rod Carew’s MVP season is an extreme example, of course, but even when looking at two major league players there is a plethora of external factors that might complicate the comparison: ballpark factors, league factors, quality of competition, weather, umpiring, etc., ad nauseam.
And that’s where translated statistics come in: their job is to try and strip away the external factors that muddy up “raw” stats, so that players can be compared more accurately. Baseball Prospectus hosts a king’s ransom of translated statistics; just take a stroll through the BP Glossary and you’ll see even Neifi Perez couldn’t swing a bat in there without hitting something that’s been adjusted for ballpark, league difficulty, era and/or quality of opponent.
A good example is Clay Davenport’s Equivalent Average (EqA), often used as the chassis for much of BP’s statistical work. The BP Glossary defines EqA thusly:
A measure of total offensive value per out, with corrections for league offensive level, home park, and team pitching. EQA considers batting as well as baserunning, but not the value of a position player’s defense. The EqA adjusted for all-time also has a correction for league difficulty. The scale is deliberately set to approximate that of batting average. League average EqA is always equal to .260. EqA is derived from Raw EqA, which is (H + TB + 1.5*(BB + HBP + SB) + SH + SF) divided by (AB + BB + HBP + SH + SF + CS + SB). REqA is then normalized to account for league difficulty and scale to create EqA.
Note that the main ingredient when cooking up EqA is REqA, and further note that you can prepare REqA yourself by merely sprinkling some simple math operators over a generous bed of counting stats. So REqA is a “raw” stat (it says so right on the box), subject to the same problems of context that apply to every other “raw” stat. It’s the next process, applying “corrections for league offensive level, home park and team pitching,” that turns REqA into EqA, and defines EqA as a translated stat. The mechanism for this is described in more (but not complete) detail in the above link, but here’s the 30,000-foot explanation: REqA is shape-shifted into runs produced, compared to the league’s run scoring environment to determine how much better or worse than average those runs produced are, and then that difference is applied to a base EqA of .260. The same process can be applied to any major league player in any season, translating their “raw” stats into a fictional league where the average player has an EqA of .260. Once translated, players from different teams, leagues and eras can be compared with ease.
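For anyone who wants to prepare REqA at home, the Raw EqA formula quoted from the glossary can be sketched in a few lines of Python. The function name `raw_eqa` and the sample stat line are made up for illustration, and only the raw step is shown: the glossary excerpt doesn't spell out the normalization that turns REqA into EqA, so no attempt is made to reproduce it here.

```python
def raw_eqa(*, h, tb, bb, hbp, sb, cs, sh, sf, ab):
    """Raw EqA as quoted from the BP Glossary:

    (H + TB + 1.5*(BB + HBP + SB) + SH + SF)
    divided by (AB + BB + HBP + SH + SF + CS + SB)
    """
    numerator = h + tb + 1.5 * (bb + hbp + sb) + sh + sf
    denominator = ab + bb + hbp + sh + sf + cs + sb
    return numerator / denominator


# A hypothetical season line, NOT any real player's stats.
sample = raw_eqa(h=150, tb=250, bb=60, hbp=5, sb=10, cs=4, sh=2, sf=5, ab=550)
print(f"Raw EqA: {sample:.3f}")
```

Notice that the raw figure doesn't land on the batting-average scale; per the glossary, it is the later normalization for league difficulty and scale that produces an EqA where league average is .260.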
Let’s take EqA for a spin by comparing the raw statistics of two Chicago Cub hitters from different eras:
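Actual Stats

| Player | Year | AB | H | 2B | 3B | HR | R | RBI | BA | OBP | SLG | OPS | EqA* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ron Santo | 1968 | 577 | 142 | 17 | 3 | 26 | 86 | 98 | 0.246 | 0.354 | 0.421 | 0.775 | 0.301 |
| Henry Rodriguez | 1998 | 415 | 104 | 21 | 1 | 31 | 56 | 85 | 0.251 | 0.334 | 0.530 | 0.864 | 0.284 |

*calculated for season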
Just eyeballing the counting and rate stats above would lead someone to the conclusion that Hammerin’ Hank’s 1998 season was far more impressive than Santo’s rather pedestrian summer 30 years before. Rodriguez hit more home runs in many fewer at bats, leading to a 100+-point edge in slugging percentage and nearly 90 points in OPS. But the elephant in the room is the final stat: EqA, calculated here to compare Santo and Rodriguez directly to their peers that season. Remember, EqA is calibrated to set an average player at .260; thus Santo’s .301 is very good. But to really see the difference, take a gander at the same statistics translated for all time (taken from each player’s DT card):
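Translated Stats

| Player | Year | AB | H | 2B | 3B | HR | R | RBI | BA | OBP | SLG | OPS | EqA** |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ron Santo | 1968 | 565 | 142 | 23 | 2 | 40 | 106 | 121 | 0.251 | 0.380 | 0.512 | 0.892 | 0.298 |
| Henry Rodriguez | 1998 | 404 | 98 | 18 | 1 | 31 | 52 | 76 | 0.243 | 0.326 | 0.522 | 0.848 | 0.279 |

**calculated for all time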
After this translation we can directly compare counting and rate stats. When accounting for the intimidating environment of the Swingin’ (and Missin’) Sixties, as well as the offensive fireworks of the late nineties, Santo makes up nearly the entire gap in slugging percentage, and opens up a 44-point lead in OPS. The value of translating statistics in this way should be pretty obvious to both the casual fan who wants to win a bar bet and the fantasy baseball owner who wants to see how much a player’s stats are being boosted or suppressed by their home ballpark.
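The size of the swing can be checked straight from the two tables. The short Python sketch below uses only the SLG and OPS figures printed above; the dictionary layout and the helper name `santo_edge_in_points` are invented for illustration, and a "point" here means one thousandth, as in everyday baseball usage.

```python
# (SLG, OPS) for each hitter, copied from the tables above.
actual = {"Santo": (0.421, 0.775), "Rodriguez": (0.530, 0.864)}
translated = {"Santo": (0.512, 0.892), "Rodriguez": (0.522, 0.848)}


def santo_edge_in_points(stats):
    """Return Santo's edge over Rodriguez in (SLG, OPS), in points.

    A negative number means Rodriguez is ahead.
    """
    santo, rodriguez = stats["Santo"], stats["Rodriguez"]
    return tuple(round((s - r) * 1000) for s, r in zip(santo, rodriguez))


actual_edge = santo_edge_in_points(actual)
translated_edge = santo_edge_in_points(translated)

print("Actual     (SLG, OPS):", actual_edge)
print("Translated (SLG, OPS):", translated_edge)
```

In the actual stats Rodriguez leads by 109 points of slugging and 89 points of OPS; after translation his slugging edge shrinks to 10 points and Santo leads in OPS by 44.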
Best of all, Equivalent Average is just one of many translated stats that have been developed, some focusing on total value, some on just a specific aspect of play (e.g., fielding, baserunning, relief pitching), some focused on comparing minor league and major league performance. Virtually anything can be considered a factor that might affect a stat. Armchair analysts have spent countless hours perfecting new and different translation methods to normalize for new and different variables; spend a quiet evening with your favorite search engine and a refreshing beverage, and you’ll find a wealth of equivalent stats that delight and amuse. If you’ve got the math chops and some database software, you can even roll your own. And if it’s worthwhile, easy to use and (most importantly) defensibly accurate, you too might get to see your name in lights at the top of a sortable stat column.
I also have a problem with the Santo/Rodriguez comparison... I'm a Cubs fan and have only heard "Hammerin' Hank" used in reference to Henry Aaron... Henry Rodriguez was "Oh Henry!". Also, if you're explaining equivalencies, why not pick two players who play the same position, or swing from the same side of the plate? It just seems like this is a case where, all things being equal, there were examples of equivalency that would be less open to questions.
Then, to cap it off, you say that "armchair analysts have spent countless hours perfecting new and different translation methods", which seemed to be a bit dismissive. There are other elements of a condescending tone in this piece.
The article, besides elements of the tone and the organization, was pretty understandable and the writing style was easy to read. I just wanted to see it structured better and for the focus to be sharper without the writing style trying to be so "cute".
But that's why I reread :)
Definitely a thumbs up.
One minor nitpick: what about translating for playing time? Some of the most simple projections give players an equal number of PAs. Translating playing time would obviously affect counting and value stats but not rate stats... if you added a sentence or two explaining this it might make both concepts ("raw" stats and translations) clearer.
But the whole article was concise and had a superb tone.
But the implied audience for this piece was the best yet, as the author started with a concept that anybody would clearly agree with, showed why we need to keep that idea in mind for major league baseball, and then finally showed one way that the concept is implemented. A thumbs up from me.
That said, I still want to buy you that beer!
Getting Henry Rodriguez's nickname wrong was a huge mistake - unacceptable, but I would hope BP has fact checkers who wouldn't let that through? (If an ironic comparison to Hank Aaron was intended, it fell short.)
Anyway, Ken's entry piece (TGF) was the most memorable, he has to go through.
Kudos as well for randomly pulling a Santo/Rodriguez comp...where did that come from?
I clicked the thumbs up right there.
These are things that most BP readers know, but for a Basics series you have to make _no_ assumptions about what a reader knows about anything. Imagine this as a chapter of a book that someone's bought on baseball basics. Hand-holding isn't just nice for articles like these, it's necessary.
Alphabet soup aside, this was a really well-done article and it gets my vote...
Cust playing in 1920 would almost certainly be better than Babe Ruth playing in 1920. But Cust playing in 2008 is not more productive than Babe Ruth playing in 1920. Baseball's a zero-sum game, so performance is only meaningfully measured relative to the average.
That Kevin's objection persists after reading the article suggests this point was insufficiently clear.
The main problem is that an individual's EqA in a given league in a given year only allows for comparison to other players in that league in that year. An average player in the Elmhurst Baseball League in 1977 had an EqA of .260. An average player in the AL in 2008 had an EqA of .260. We can't use this as a basis for comparison, because they're NOT translated at all! True, there is no need to actually make that comparison, but the author leans pretty heavily on that one example for it to amount to nothing.
But I also claim that even Santo's .301 EqA in 1968 and Rodriguez's .284 in 1998 cannot be meaningfully compared. Again, these numbers are not translated. We can't just assume the two leagues were of equal difficulty. We know (see Stephen Jay Gould's "The Extinction of the .400 Hitter") that as baseball has evolved, the spread in abilities between the best players and the worst players has steadily decreased. EqA does not adjust for that. Along the way there have been temporary irregularities in the level of competition in a league (see World War II). EqA does not adjust for that. I'm not saying don't use EqA. I'm saying if you start off making a comparison between Ken Funck and Rod Carew, in order to motivate the use of translated statistics, then you'd better actually demonstrate a translated statistic.
I disagree that "equal difficulty" is an issue. Evan said it well up above -- baseball is a zero sum game, and the only meaningful comparison is against your competition in the season at hand. The question of how Jack Cust would fare as a baseball player if Jack Finney taught him how to go back to 1882 is interesting after 2 beers, but not relevant to this article.
Very engaging, fun without working too obviously hard at it. Good job.
Pet peeve: "their job is to try and strip away..." That should be "try TO strip away". Just one of those common errors that drives me nuts. The fact that the writing is otherwise so flawless just makes me all the crazier.
The theory is legitimate.
He seemed to accomplish this week's assigned task well enough: take a stat and explain the basics for a general audience. The task though made all the articles a less fun read for your average BP subscriber, and that's no fault of any of the contestants. I still managed to learn something, or put something into context, from several of the articles, including this one.
But I absolutely loved your initial entry, Ken. The way it played off the narrative history of baseball analysis, the sabermetricians and scouts who are the characters in that story, and the typical BP reader's own vanity in placing himself in that narrative made it the most fun read of all. And hell, I'm still wondering if there's something to The Good Face.
So I write the following criticism as an unabashed fan: I wonder if you didn't take too much heed of some of the criticism of your first piece. Everyone else's initial entry was so very earnest; yours distinguished itself through its irreverence. And that's not to say that BP's primary mission shouldn't be serious baseball analysis, but I reckon BP has armies of free interns who can run regressions on the research topic of the day. It seems the winner of BP Idol should be someone who can really breathe life into the story that results from a research project, and you did that with your first piece on the hilariously trivial subject of facial dimensions. It seems that you were more earnest with this week's subject, and it was less fun as a result, although still among the top 2-3 entries.
So don't let The Man keep you down, is what I'm trying to say.