Prospectus Idol Entry: Dare to Compare

May 24, 2009

Back in the mid-1970s I was an infielder for the New York Yankees. I was pretty good, too, leading the team in a number of offensive categories and appearing in a few All-Star games. My best season was probably 1977, when I managed to finish with a .633 batting average. So for at least a few years, you could have said I was a better hitter than Rod Carew.

Of course, if you had actually said that, you’d be crazy. Intuitively, we all know that hitting success in the Elmhurst Baseball League isn’t exactly the same as hitting success in the American League. The differences between the two in terms of level of competition, playing environment, even the actual rules of the game are so vast that there’s no need to even try to quantify them. But what about when the differences aren’t quite so obvious, and the idea of comparing two sets of statistics isn’t so laughable: say, if we wanted to compare a prospect in the Midwest League to one in the Carolina League, or Roy Halladay to Jake Peavy, or even Ty Cobb to Pete Rose? Maybe that last one is laughable, but for the other cases there’s a whole category of metrics designed to make such comparisons possible: translated statistics.

The idea behind a translated statistic (you’ll also frequently see the term “equivalent”; don’t get tripped up by the semantics, it’s pretty much the same thing) is to take a player’s “raw” stats, which accurately record what happened during a play, a game or a season, and translate them in such a way as to make them more directly comparable to those of other players who accumulated similar “raw” stats, but possibly in a very different environment.

“Raw” stats come in three delicious flavors:

Counting Stats, which basically aggregate certain events that occur during games. Hits, RBIs and Earned Runs fall into this category. Counting stats are building blocks for most other statistics, but as fun as they are to look at and memorize, and as important as they often are for a fantasy baseball team, they lose a lot of their utility when comparing the actual productivity of two players. Counting stats don’t account for variance in playing time: the more plate appearances you have, the more chances you have to hit a home run. They also don’t account for variance in opportunity: the more frequently you bat with runners on base, the easier it is to accumulate high RBI totals. These problems can be partially resolved by instead using…
Rate Stats, which are usually expressed as a ratio or percentage. Batting Average, OBP and WHIP are all in this category, and they’re all calculated by dividing the number of times something happened (e.g., a Hit) by the number of opportunities for it to happen (e.g., an At-Bat). This makes rate stats much better to use for comparisons between players, though by no means perfect.
Value Stats, which can be represented as either a number or a rate, but are still derived by performing straightforward math functions on counting stats. Two examples are Runs Created and Offensive Winning Percentage. These stats often try to combine various counting and rate stats to come up with a single number that represents the total offensive contribution a player is providing to his team.
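The step from counting stats to a rate stat is simple enough to sketch in a few lines of Python. This is only an illustration: the function name `rate_stat` is made up for this example, and the hits and at-bats are the Santo and Rodriguez totals from the tables later in the article.

```python
def rate_stat(events: int, opportunities: int) -> float:
    """Divide a counting stat by its opportunities, rounded to three places."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return round(events / opportunities, 3)


# Hits and at-bats taken from the "Actual Stats" table in this article.
santo_ba = rate_stat(142, 577)      # Ron Santo, 1968
rodriguez_ba = rate_stat(104, 415)  # Henry Rodriguez, 1998

print(f"Santo BA:     {santo_ba:.3f}")
print(f"Rodriguez BA: {rodriguez_ba:.3f}")
```

Dividing by opportunities removes the playing-time problem, but nothing in this arithmetic knows anything about ballparks, leagues or eras.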

While rate stats and value stats definitely make it easier to compare players than counting stats do, they still rely entirely on the counting stats themselves. The problem when it comes to comparing players is that the environment (often called the “context”) in which those hits, walks and strikeouts were recorded often varies wildly. Comparing Ken Funck’s ability to turn around 45 mph fastballs at a .633 clip to Rod Carew’s MVP season is an extreme example, of course, but even when looking at two major league players there is a plethora of external factors that might complicate the comparison: ballpark factors, league factors, quality of competition, weather, umpiring, etc., ad nauseam.

And that’s where translated statistics come in: their job is to try and strip away the external factors that muddy up “raw” stats, so that players can be compared more accurately. Baseball Prospectus hosts a king’s ransom of translated statistics; just take a stroll through the BP Glossary and you’ll see even Neifi Perez couldn’t swing a bat in there without hitting something that’s been adjusted for ballpark, league difficulty, era and/or quality of opponent.

A good example is Clay Davenport’s Equivalent Average (EqA), often used as the chassis for much of BP’s statistical work. The BP Glossary defines EqA thusly:

A measure of total offensive value per out, with corrections for league offensive level, home park, and team pitching. EQA considers batting as well as baserunning, but not the value of a position player’s defense. The EqA adjusted for all-time also has a correction for league difficulty. The scale is deliberately set to approximate that of batting average. League average EqA is always equal to .260. EqA is derived from Raw EqA, which is (H + TB + 1.5*(BB + HBP + SB) + SH + SF) divided by (AB + BB + HBP + SH + SF + CS + SB). REqA is then normalized to account for league difficulty and scale to create EqA.

Note that the main ingredient when cooking up EqA is REqA, and further note that you can prepare REqA yourself by merely sprinkling some simple math operators over a generous bed of counting stats. So REqA is a “raw” stat (it says so right on the box), subject to the same problems of context that apply to every other “raw” stat. It’s the next process, applying “corrections for league offensive level, home park and team pitching,” that turns REqA into EqA and defines EqA as a translated stat. The mechanism for this is described in more (but not complete) detail in the above link, but here’s the 30,000-foot explanation: REqA is shape-shifted into runs produced, compared to the league’s run scoring environment to determine how much better or worse than average those runs produced are, and then that difference is applied to a base EqA of .260. The same process can be applied to any major league player in any season, translating their “raw” stats into a fictional league where the average player has an EqA of .260. Once translated, players from different teams, leagues and eras can be compared with ease.
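For anyone who wants to try the recipe at home, here is a minimal Python sketch of the Raw EqA formula exactly as the glossary states it. The function name and the sample batting line are hypothetical, invented for illustration; the sketch covers only the “raw” step and makes no attempt at the league, park and pitching corrections, which the glossary does not spell out.

```python
def raw_eqa(h, tb, bb, hbp, sb, cs, sh, sf, ab):
    """Raw EqA per the glossary:
    (H + TB + 1.5*(BB + HBP + SB) + SH + SF) / (AB + BB + HBP + SH + SF + CS + SB)
    """
    numerator = h + tb + 1.5 * (bb + hbp + sb) + sh + sf
    denominator = ab + bb + hbp + sh + sf + cs + sb
    if denominator <= 0:
        raise ValueError("batting line has no opportunities")
    return numerator / denominator


# Hypothetical batting line, NOT a real player's season.
sample = raw_eqa(h=150, tb=250, bb=60, hbp=5, sb=10, cs=5, sh=0, sf=5, ab=550)
print(f"Raw EqA: {sample:.3f}")
```

Note that the result is not on the batting-average scale; landing on the familiar .260-is-average scale is part of the normalization step that follows.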

Let’s take EqA for a spin by comparing the raw statistics of two Chicago Cub hitters from different eras:


Actual Stats     Year   AB    H  2B  3B  HR    R  RBI     BA    OBP    SLG    OPS   EqA*
Ron Santo        1968  577  142  17   3  26   86   98  0.246  0.354  0.421  0.775  0.301
Henry Rodriguez  1998  415  104  21   1  31   56   85  0.251  0.334  0.530  0.864  0.284
*calculated for season

Just eyeballing the counting and rate stats above would lead someone to the conclusion that Hammerin’ Hank’s 1998 season was far more impressive than Santo’s rather pedestrian summer 30 years before. Rodriguez hit more home runs in many fewer at bats, leading to an edge of more than 100 points in slugging percentage and nearly 90 points in OPS. But the elephant in the room is the final stat: EqA, calculated here to compare Santo and Rodriguez directly to their peers that season. Remember, EqA is calibrated to set an average player at .260, so Santo’s .301 is very good. But to really see the difference, take a gander at the same statistics translated for all time (taken from each player’s DT card):


Translated Stats  Year   AB    H  2B  3B  HR    R  RBI     BA    OBP    SLG    OPS  EqA**
Ron Santo         1968  565  142  23   2  40  106  121  0.251  0.380  0.512  0.892  0.298
Henry Rodriguez   1998  404   98  18   1  31   52   76  0.243  0.326  0.522  0.848  0.279
**calculated for all time

After this translation we can directly compare counting and rate stats. When accounting for the intimidating environment of the Swingin’ (and Missin’) Sixties, as well as the offensive fireworks of the late nineties, Santo has made up nearly the entire gap in slugging percentage, and opens up a lead of more than 40 points in OPS. The value of translating statistics in this way should be pretty obvious to both the casual fan who wants to win a bar bet and the fantasy baseball owner who wants to see how much a player’s stats are being boosted or suppressed by their home ballpark.
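To check the gaps for yourself, the short Python sketch below measures the Rodriguez-minus-Santo difference in “points” (thousandths) before and after translation. The SLG and OPS values come straight from the two tables above; the helper name `point_gap` and the dictionary layout are just one convenient way to arrange them.

```python
def point_gap(a: float, b: float) -> int:
    """Difference between two rate stats in points (thousandths)."""
    return round((a - b) * 1000)


# SLG and OPS copied from the "Actual Stats" and "Translated Stats" tables.
actual = {
    "Santo": {"SLG": 0.421, "OPS": 0.775},
    "Rodriguez": {"SLG": 0.530, "OPS": 0.864},
}
translated = {
    "Santo": {"SLG": 0.512, "OPS": 0.892},
    "Rodriguez": {"SLG": 0.522, "OPS": 0.848},
}

# Positive numbers favor Rodriguez, negative numbers favor Santo.
gaps = {}
for label, table in (("actual", actual), ("translated", translated)):
    for stat in ("SLG", "OPS"):
        gaps[(label, stat)] = point_gap(table["Rodriguez"][stat], table["Santo"][stat])
        print(f"{label:>10} {stat}: {gaps[(label, stat)]:+d} points")
```

Run as written, Rodriguez’s 109-point slugging edge shrinks to 10 points after translation, and his 89-point OPS edge flips to a 44-point deficit.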

Best of all, Equivalent Average is just one of many translated stats that have been developed, some focusing on total value, some on just a specific aspect of play (e.g., fielding, baserunning, relief pitching), some focused on comparing minor league and major league performance. Virtually anything can be considered a factor that might affect a stat. Armchair analysts have spent countless hours perfecting new and different translation methods to normalize for new and different variables. Spend a quiet evening with your favorite search engine and a refreshing beverage, and you’ll find a wealth of equivalent stats that delight and amuse. If you’ve got the math chops and some database software, you can even roll your own. And if it’s worthwhile, easy to use and (most importantly) defensibly accurate, you too might get to see your name in lights at the top of a sortable stat column.

Ken Funck

kgoldstein

5/24

This one is hard for me because it's a little more basic, with a lot more hand holding, and also because I have my own problems with translations where I don't think we have apples to apples comparisons. For example, I think if you dropped Jack Cust into the 1920s, he'd be better than Babe Ruth. But that's just me, and has nothing to do with this piece. Getting away from the baseball for a second, I think you are a good WRITER, and I'm interested in seeing what you do with a different subject.

Oleoay

5/24

I don't think this article was "bad", but something about it just wasn't quite right. If the idea is to explain "the basics", then why move in, out, then back in to discussing EqA? It seems you were discussing the concepts of equivalencies instead of EqA itself. EqA itself isn't discussed until halfway through the article, then disappears for a few paragraphs to talk about a "DT Card" (Davenport Translation) thrown in there, then the article concludes with EqA. It seems to me it would have been best to stick with either EqA or with the DT Card. Also, if this article is intended for new people, it only makes sense to say what a "DT Card" is, who it is named after (Clay Davenport) and explain it to the reader...

I also have a problem with the Santo/Rodriguez comparison... I'm a Cubs fan and have only heard "Hammerin' Hank" used in reference to Henry Aaron... Henry Rodriguez was "Oh Henry!". Also, if you're explaining equivalencies, why not pick two players who play the same position, or swing from the same side of the plate? It just seems like this is a case where, all things being equal, there were examples of equivalency that would be less open to questions.

Then, to cap it off, you say that "armchair analysts have spent countless hours perfecting new and different translation methods", which seemed to be a bit dismissive. There are other elements of a condescending tone in this piece.

The article, besides elements of the tone and the organization, was pretty understandable and the writing style was easy to read. I just wanted to see it structured better and for the focus to be sharper without the writing style trying to be so "cute".

Oleoay

5/24

This was the first article I ended up reading so I might have been overly critical. While I also stand behind my previous comments, upon a reread (and reading the other submissions), I realize that I like the writing style and structure here more. It's got my thumbs-up now.

kenfunck

5/26

Glad you liked it better the second time through. Sorry if the "armchair analysts" paragraph sounds condescending -- it certainly wasn't intended that way. Much of the best baseball analysis of the last 30 years has been (and continues to be) performed by people who were, at least initially, working on their own time with nothing else to drive them than their love of baseball -- starting with Bill James, and continuing with the individuals that post here at BP and at the other terrific baseball websites to which I was obliquely referring.

Oleoay

5/26

I thought armchair analysts was a bit too close to armchair quarterbacks, which is at times considered a bit derogatory and implies laziness and/or people who don't know what they are talking about. It just ended up coloring the way I viewed the rest of the sentence/paragraph... which since it was the conclusion, left the odd taste in my mouth.

But that's why I reread :)

wcarroll

5/24

I love the construct here. He really draws me in from the first sentence and then takes me through the process in a clear fashion. He opens up a bit, quoting some from Clay that could be a bit intimidating to the intended audience, but pulls it right back. It works because it's like he's going "this is complex, but you're smart enough to understand. Let me show you." He nails the tone. There's some nitpicks here and there, but solid work.

ckahrl

5/24

This was exactly what I wanted from a Basics piece: a basic explanation of a complicated concept, yet engaging, because Ken's a writer who isn't afraid to mix up a bit of self-mockery with a confidence in his use of examples and (supported and supportable) assertions. Were I to have points to give, I'd assign them for the tidy and appropriate reductionism as far as the tasty types of stats folks can sample from.

rbross

5/24

This is outstanding. The writing is extraordinarily good; the analysis is tight and relevant; the take-home points are clear.

Definitely a thumbs up.

gersh22

5/24

Favorite and best-written article so far.

One minor nitpick: what about translating for playing time? Some of the most simple projections give players an equal number of PAs. Translating playing time would obviously affect counting and value stats but not rate stats... if you added a sentence or two explaining this it might make both concepts ("raw" stats and translations) clearer.

But the whole article was concise and had a superb tone.

Kongos

5/25

The best-written article I've read so far, but it keeps promising things it doesn't deliver. Where's the comparison between the author's stats and Carew's? Between Halladay and Peavy? The intro is cute but really misleading -- it is exactly the sort of apples-to-oranges comparison that translation stats can't handle.

SkyKing162

5/25

I agree that if this article was about EqA, it probably could have been a bit more obvious, and perhaps the percentage of words to certain topics could have been re-worked.

But the implied audience for this piece was the best yet, as the author started with a concept that anybody would clearly agree with, showed why we need to keep that idea in mind for major league baseball, and then finally showed one way that the concept is implemented. A thumbs up from me.

leez34

5/25

Best written of any of the pieces, and really the only one that I could see on the BP main page as is.

That said, I still want to buy you that beer!

kenfunck

5/26

Thanks. Busy time right now (I'm actually posting this from a fishing lodge in Western Ontario), but sure, I can always be talked into a pint of Ale Asylum Nut Brown!

hotstatrat

5/25

Yeah, Ken is a terrific writer, but I'm with Richard in that this was just a little too cute.

Getting Henry Rodriguez's nickname wrong was a huge mistake - unacceptable, but I would hope BP has fact checkers who wouldn't let that through? (If an ironic comparison to Hank Aaron was intended, it fell short.)

Anyway, Ken's entry piece (TGF) was the most memorable, he has to go through.

ckahrl

5/25

Again, we publish each author's piece as-is to give the audience a chance to see the raw content and judge accordingly. (William Hung didn't get a remix, after all.) That said, several people were referring to Rodriguez as Hank in the Windy City back in his Cubs heyday, albeit without the hammer; maybe Ken meant that as a joke, and perhaps it didn't land with a lot of people.

kenfunck

5/26

Hammerin' Hank: that's what my friends and I called him during his Cubs career, and I used that nickname to avoid the awkward plural construction "Rodriguez's". In retrospect, I could just as easily have used his actual nickname. Sorry if I offended any Hank Aaron fans out there, or Henry Rodriguez fans. BTW -- one of the best things I'm getting out of this process is that I'm starting to learn how to better edit my own writing. For that I'm grateful to the 1500ish-word limit.

pokeysplayers27

5/25

My favorite (so far) of both the initial entries and the first round. Vote him through!

jtrichey

5/25

Thumbs up. Nailed the basic approach in an interesting way. However, nothing really stands out about it either. Understood that it is hard to stand out in a "basics" piece.

aardvark

5/25

I'm going to echo the sentiment about the author trying to be a little too cute. This is something that is pervasive among sportswriters today and for me it really detracts from the reporting and analysis.

jtrichey

5/25

Also, your picture reminds me of Rick Sutcliffe, whose analysis I find very poor. :)

kenfunck

5/26

And yet I find his pocketbook to be very rich, so who's to say? ;)

Oleoay

5/26

Ooh, witty!

JoshC77

5/25

I totally disagree with those that think that the writer was being 'too cute'. Reading dry pieces that read like my old physics book isn't what I want. Baseball is a game and should be fun; writing (and reading) about it should be the same. A couple of jokes and barbs that bring a smile to one's face make it worthwhile.

Kudos as well for randomly pulling a Santo/Rodriguez comp...where did that come from?

hotstatrat

5/25

It's a matter of taste. Some people try too hard to be funny and it comes off irritating. I wouldn't go so far as to say that about Ken. His TGF piece was a perfect spoof - if anything it was so dry you are left wondering if he was actually serious. At least half of what made Bill James so great besides all his ground/rule breaking analysis was that he is a brilliant writer. He had just the right amount of pitch perfect humor.

Scartore

5/25

"Raw" stats come in three delicious flavors:"

I clicked the thumbs up right there.

jsnell

5/25

Good article, but some problems introducing terms that can be confusing even for a veteran BP reader. He mentions Raw EqA but then on next mention it's REqA -- an explicit explanation (or, quite frankly, sticking with the longer name for such an introductory article) would be better. Too many BP articles end up with alphabet-soup problems with stat names that hinder readability. (The author does something similar by tossing in the concept of DT cards without an explanation.)

These are things that most BP readers know, but for a Basics series you have to make _no_ assumptions about what a reader knows about anything. Imagine this as a chapter of a book that someone's bought on baseball basics. Hand-holding isn't just nice for articles like these, it's necessary.

Alphabet soup aside, this was a really well-done article and it gets my vote...

llewdor

5/25

Kevin's concerns about comparisons between eras are well-founded, but I think based on a misunderstanding of what it is we're comparing.

Cust playing in 1920 would almost certainly be better than Babe Ruth playing in 1920. But Cust playing in 2008 is not more productive than Babe Ruth playing in 1920. Baseball's a zero-sum game, so performance is only meaningfully measured relative to the average.

That Kevin's objection persists after reading the article suggests this point was insufficiently clear.

caprio84

5/25

I liked the piece, though the first paragraph I re-read several times...he went from first person in the opening to third person in the start of the body.

molnar

5/25

This is one of several submissions that just does not do what it claims to do. In light of that I find it hard to even weigh in the quality of the writing. Supposedly, we are going to learn how players from different leagues can be compared. We are introduced to EqA, which is given a decent explanation. I'd like to point out though that any statistic can be compared to league average; Ken does say EqA is "a good example", but the way the piece is structured one might get the impression that EqA is the only example. Minor point.

The main problem is that an individual's EqA in a given league in a given year only allows for comparison to other players in that league in that year. An average player in the Elmhurst Baseball League in 1977 had an EqA of .260. An average player in the AL in 2008 had an EqA of .260. We can't use this as a basis for comparison, because they're NOT translated at all! True, there is no need to actually make that comparison, but the author leans pretty heavily on that one example for it to amount to nothing.

But I also claim that even Santo's .301 EqA in 1968 and Rodriguez's .284 in 1998 cannot be meaningfully compared. Again, these numbers are not translated. We can't just assume the two leagues were of equal difficulty. We know (see Stephen Jay Gould's "The Extinction of the .400 Hitter") that as baseball has evolved, the spread in abilities between the best players and the worst players has steadily decreased. EqA does not adjust for that. Along the way there have been temporary irregularities in the level of competition in a league (see World War II). EqA does not adjust for that. I'm not saying don't use EqA. I'm saying if you start off making a comparison between Ken Funck and Rod Carew, in order to motivate the use of translated statistics, then you'd better actually demonstrate a translated statistic.

DrDave

5/26

I agree that leaving out any discussion of the league averages in the two years being compared was a mistake -- a new reader won't 'get' why Santo was better than Rodriguez without seeing that much context. That's my only complaint.

I disagree that "equal difficulty" is an issue. Evan said it well up above -- baseball is a zero sum game, and the only meaningful comparison is against your competition in the season at hand. The question of how Jack Cust would fare as a baseball player if Jack Finney taught him how to go back to 1882 is interesting after 2 beers, but not relevant to this article.

Very engaging, fun without working too obviously hard at it. Good job.

Magnum1799

5/26

Most engaging intro of any article by far. Definitely a thumbs up!

jdavlin

5/26

Ken executed this week's assignment to perfection. He is a genuinely gifted writer that I would love to continue reading.

Pet peeve: "their job is to try and strip away..." That should be "try TO strip away". Just one of those common errors that drives me nuts. The fact that the writing is otherwise so flawless just makes me all the crazier.

greensox

5/26

Well-written, clear.
The theory is legitimate.

abskippers

5/26

After the initial entry and first round, I have found Ken to be the best writer of all the contestants, and it seems necessary to state the obvious: that's important in a writing competition. He employs a creative use of language that far exceeds the pedestrian standard of typical print or ESPN articles. A few other contestants produce well-organized pieces with solid prose, but none of the others bring to life their articles like Ken does - his writing stands apart in that regard.

He seemed to accomplish this week's assigned task well enough: take a stat and explain the basics for a general audience. The task though made all the articles a less fun read for your average BP subscriber, and that's no fault of any of the contestants. I still managed to learn something, or put something into context, from several of the articles, including this one.

But I absolutely loved your initial entry Ken. The way it played off the narrative history of baseball analysis, the sabermetricians and scouts who are the characters in that story, and the typical BP reader's own vanity in placing himself in that narrative made it the most fun read of all. And hell, I'm still wondering if there's something to The Good Face.

So I write the following criticism as an unabashed fan: I wonder if you didn't take too much heed of some of the criticism of your first piece. Everyone else's initial entry was so very earnest; yours distinguished itself through its irreverence. And that's not to say that BP's primary mission shouldn't be serious baseball analysis, but I reckon BP has armies of free interns who can run regressions on the research topic of the day. It seems the winner of BP Idol should be someone who can really breathe life into the story that results from a research project, and you did that with your first piece on the hilariously trivial subject of facial dimensions. It seems that you were more earnest with this week's subject, and it was less fun as a result, although still among the top 2-3 entries.

So don't let The Man keep you down, is what I'm trying to say.

Oleoay

5/27

I loved his first article, probably the most out of any of the initial entries.

ronaghanj

5/26

The writing is terrific, the concept clearly explained. Well done.
