Future Shock: When Translating College Statistics Is a Bad Idea

It’s one of the great challenges of performance analysis: translating college stats. At first glance, it’s an incredibly difficult project to scope. The hardest aspect would be adjusting for the level of competition, which can vary greatly not only on a team-by-team basis, but on a game-by-game basis as well. Trying to adjust for ballpark effects when splits are rarely available only adds to the confusion. I’m not saying it can’t be done: with an incredible amount of work, one could come up with a series of coefficients for teams, stadiums, schedules, etc., that would balance the statistical playing field and let us better compare one player to another. However, I have one piece of advice for anyone starting to wrap their brain around the task:

Don’t do it. The data will be useless.

The purpose of such an effort is what I’m calling into question here. Translating college statistics is one thing; translating them in an attempt to give us a better idea of what a player will become is an exercise in futility. When we look at the college numbers of top prospects and big leaguers, they are, in general, very good. This creates the illusion that translated college statistics will give us a valuable tool in projecting professional performance. The missing piece, however, is all of the outstanding college players who never make it, and whose college careers we therefore never look back at. The majority of big league players who played in college put up big numbers there. However, the converse is anything but true: not all top statistical performers will make good pros. There are definite patterns to the ones who make it and the ones who don’t. The patterns, though, cannot be measured in raw statistics. They can, however, be measured in scouting reports.

At the big league level, performance is everything. Once we move down to the minors, we begin to split the values of performance and projection. The lower the level, the more important projection is, and the less important pure statistical performance becomes. At these lower levels, two things must be asked: What is the player doing (performance), and how is the player doing it (scouting)?

At the college level, how a player is accomplishing good offensive numbers is far more important than the raw numbers because of the presence of the metal bat. Metal bats can play a significant role in creating ‘false power,’ as a physically strong player can power a ball out of the park without making solid-centered contact, while that same contact off the handle or end of a wooden bat more often than not leads to an easily played fly ball.

So let’s look at some numbers. During this decade, college baseball has had four power conferences: the ACC, the Big 12, the Pac-10, and the SEC. Looking only at those conferences, in order to somewhat mitigate the level-of-competition issue, it becomes pretty clear that the top hitters do not necessarily equal top professional prospects. Here’s a list of all the players in those four conferences from 2003-2005 who accomplished the following four things:

Were among the top 60 in the nation in batting average
Hit at least one home run for every 20 at-bats
Walked at least once for every 10 at-bats
Struck out no more than twice for every 10 at-bats

YEAR  PLAYER (*=sophomore) SCHOOL       AVG   AB  HR  BB  SO
2003  Jeff Van Houten*     Arizona     .413  231  11  24  27
2003  Jeremy Cleveland     N.Carolina  .410  251  19  37  34
2003  Ryan Garko           Stanford    .402  259  18  28  17
2004  Jed Lowrie*          Stanford    .399  233  17  50  40
2004  Eddy Martinez-Esteve Florida St. .385  270  19  32  41
2005  Aaron Bates*         N.C. State  .425  214  12  37  27
2005  Ryan Braun           Miami       .388  219  18  33  39
2005  Chase Headley        Tennessee   .387  238  14  63  23
2005  Brian Pettway        Mississippi .383  266  21  35  47
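
For anyone who wants to rerun the three rate tests against the lines above, here is a minimal sketch in Python. The names (BattingLine, rate_checks, failed_checks) are illustrative only and not part of any existing tool. It reads each cutoff literally, as a strict ratio, which is an assumption about how the list was built. The batting-average test is left out because it depends on the national leaderboard, which the table doesn't carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingLine:
    year: int
    player: str
    school: str
    avg: float
    ab: int
    hr: int
    bb: int
    so: int


# Rows copied from the table above.
LINES = [
    BattingLine(2003, "Jeff Van Houten", "Arizona", .413, 231, 11, 24, 27),
    BattingLine(2003, "Jeremy Cleveland", "N.Carolina", .410, 251, 19, 37, 34),
    BattingLine(2003, "Ryan Garko", "Stanford", .402, 259, 18, 28, 17),
    BattingLine(2004, "Jed Lowrie", "Stanford", .399, 233, 17, 50, 40),
    BattingLine(2004, "Eddy Martinez-Esteve", "Florida St.", .385, 270, 19, 32, 41),
    BattingLine(2005, "Aaron Bates", "N.C. State", .425, 214, 12, 37, 27),
    BattingLine(2005, "Ryan Braun", "Miami", .388, 219, 18, 33, 39),
    BattingLine(2005, "Chase Headley", "Tennessee", .387, 238, 14, 63, 23),
    BattingLine(2005, "Brian Pettway", "Mississippi", .383, 266, 21, 35, 47),
]


def rate_checks(line):
    """Evaluate the three rate tests with integer math (no rounding)."""
    return {
        "hr": line.hr * 20 >= line.ab,      # at least one HR per 20 AB
        "bb": line.bb * 10 >= line.ab,      # at least one BB per 10 AB
        "so": line.so * 10 <= line.ab * 2,  # no more than two SO per 10 AB
    }


def failed_checks(line):
    """Names of the rate tests this batting line misses, if any."""
    return [name for name, ok in rate_checks(line).items() if not ok]


for line in LINES:
    failed = failed_checks(line)
    status = "passes all three" if not failed else "misses: " + ", ".join(failed)
    print(f"{line.year} {line.player:<22}{status}")
```

Read that strictly, eight of the nine lines clear all three tests. Jeff Van Houten's 11 home runs in 231 at-bats work out to about one every 21 at-bats, just short of one per 20. The sketch does not try to guess how the original cutoff was rounded.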

It’s a short list, as this is, statistically speaking, the cream of the crop. These are players whose raw numbers indicate an ability to hit for average, hit for power, draw walks, and make contact. But are they the best pro prospects? Certainly not to a man, and for some, it’s not even close. Take Jeremy Cleveland, who in 2003 absolutely dominated the ACC. He won the batting title by 39 points, walked more than he struck out, and was second in home runs, trailing only Wake Forest’s Jamie D’Antona. If we just did raw translations of college stats prior to the 2003 draft, we would probably be saying that Cleveland was one of the top three hitters available, maybe even the best. Scouts disagreed, seeing an unathletic body and, more importantly, a swing that was far more designed for aluminum than wood. So when the draft rolled around, nearly 60 college hitters went ahead of Cleveland, who was drafted by Texas in the eighth round and signed for $85,000. On an economic scale, he was worth somewhere between two and eight percent of the investment put into a first-round pick. And in the end, the scouts were proven right: Cleveland hit .322/.432/.512 in the Northwest League in his pro debut, but was released following the 2005 season after batting .253/.355/.298 at Double-A and just .263/.339/.379 after a demotion to the California League. Sure, it’s an extreme example, but it’s a clear case in which knowing how a college player accomplished something was far more important than knowing what he actually accomplished.

As stated before, the illusion of power created by the metal bat is the most difficult to deal with on a pure statistical level. Looking at the home run leaderboards from 2003 alone, we see players who had little or no shot at a pro career, like Nebraska’s Matt Hopper, who led the Big 12 with 22 home runs in 233 at-bats, Washington’s Chad Boudon, who led the Pac-10 with 22 home runs in 209 at-bats, and Alabama’s Beau Hearod, who led the SEC with 20 blasts in 231 at-bats. Hearod slugged 25 home runs in 2004 for Low Class A Lexington, but he was 23 years old at the time, and hit just .255 with 135 strikeouts. He retired the following spring. As for Hopper and Boudon, they’re both also out of baseball, combining to hit just .193 in 187 at-bats with a grand total of three home runs in the minors. Any purely statistical translation would have seen these players among the top power hitters available, but in reality, none of the three was selected in a single-digit round that June.
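
To put those three college lines on the same footing as the one-homer-per-20-at-bats cutoff used earlier, here is a small Python sketch that converts each to at-bats per home run. The helper name ab_per_hr is illustrative, and the figures are only the ones quoted in the paragraph above.

```python
# (player, school, conference, HR, AB) as quoted in the paragraph above.
LEADERS_2003 = [
    ("Matt Hopper", "Nebraska", "Big 12", 22, 233),
    ("Chad Boudon", "Washington", "Pac-10", 22, 209),
    ("Beau Hearod", "Alabama", "SEC", 20, 231),
]


def ab_per_hr(ab, hr):
    """At-bats per home run; lower means more frequent power."""
    return ab / hr


for player, school, conference, hr, ab in LEADERS_2003:
    print(f"{player:<12} {school:<11} {conference:<7} {ab_per_hr(ab, hr):.1f} AB per HR")
```

All three homered roughly once every 10 to 12 at-bats, far better than the one-per-20 bar. That is exactly why a translation built on the raw numbers alone would have rated them so highly.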

Believe me when I tell you that this is not an anti-statistics column, and that many mistakes have been made on college players who did not put up good numbers, but offered plenty of projection or athleticism. The point is that to base professional projections solely on amateur statistical information is futile, because without the necessary scouting information, you have just half of the puzzle. It’s like watching a black and white movie like Psycho and then being asked to name the color of Janet Leigh’s sweater. You saw the sweater, but you don’t have all of the necessary information to answer the question. The same is true when you just have college stats and not scouting reports. We’re back to beer and tacos–you want to evaluate amateur talent? Both have to be there, and both have to complement each other.

Next week I’ll take a look at how college pitching statistics play into the argument, and present some ideas about making the project of translating college statistics more worthwhile.


Kevin Goldstein
