The Cape Cod League is the premier summer baseball league for college players. A good summer on The Cape might just make you a million dollars at draft time. I’m told there’s also a local professional team in the New England area that has had some recent success too, so good for them. And yet, in scouting circles, New England is seen as something of a desert wasteland. The standard explanation is that sure, there are athletes good enough to play professional baseball in New England. The problem is that players in Stars Hollow, Connecticut just don’t get the reps that they do in Georgia, because there’s a lot more baseball weather (read: time that it isn’t snowing) in the South.
The geography of where baseball players come from is a fascinating topic (and makes for a great map!). Matt Swartz recently noted that counties with warmer weather (and bigger incomes) were more likely to produce major leaguers. New England actually turns out rather well on the income distribution, with Connecticut, Massachusetts, and New Hampshire ranking fourth, fifth, and sixth, respectively, among the 50 states in median income, so it must be the cold and snow that’s holding the region back from producing MLB talent. Or is it?
Is it really that hard to scout in New England? A few weeks ago, I studied how well teams were doing when it came to properly evaluating prospects for the MLB draft. The answer was that teams weren’t doing as well as we might think. The links between signing bonuses and draft positions and basic outcomes like whether the draftee made it to the majors or produced five career WAR were actually only moderate. I choose to interpret that as “Prospecting is hard” rather than “Teams are doing a bad job.” But I got to wondering whether New England’s reputation is actually well-earned. Do teams have a harder time scouting cold climes than warmer ones? Is there something else at work here?
Warning! Gory Mathematical Details Ahead!
Similar to the method I used in my previous article, I used a database of signing bonuses obtained here and career WAR stats (to date) from Baseball Reference. I studied the results of the first 10 rounds of the drafts from 2003-2008. I standardized all bonuses to represent the percentage of that year’s league-wide bonus spending that the player got. If the league spent $100 million and the player got $3 million of that, his standardized bonus was three percent. I coded players for meeting a couple of thresholds: appearing in a major league game and collecting five career WAR. (I tried a few other cutoffs, and the results generally came out the same.)
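For readers who like to see the bookkeeping, here is a minimal sketch of the bonus standardization and milestone coding described above. The field names (`year`, `bonus`, `games`, `war`) are hypothetical placeholders rather than the layout of the actual bonus database, the yearly total is simply the sum of whatever records are passed in, and treating "five career WAR" as "at least 5.0" is an assumption.

```python
from collections import defaultdict


def standardize_bonuses(draftees):
    """Express each bonus as a percentage of that draft year's total bonus spending."""
    totals = defaultdict(float)
    for d in draftees:
        totals[d["year"]] += d["bonus"]
    return [100.0 * d["bonus"] / totals[d["year"]] for d in draftees]


def code_milestones(draftee, war_cutoff=5.0):
    """Flag the two thresholds: appeared in an MLB game, reached the WAR cutoff.

    Using >= for the cutoff is an assumption, not something the article specifies.
    """
    return {
        "appeared": draftee["games"] > 0,
        "five_war": draftee["war"] >= war_cutoff,
    }


# Toy data: the league spends $100 million in 2003 and one player gets $3 million.
draftees = [
    {"year": 2003, "bonus": 3_000_000, "games": 120, "war": 6.1},
    {"year": 2003, "bonus": 97_000_000, "games": 0, "war": 0.0},
]
std = standardize_bonuses(draftees)  # the first player's share is 3 percent
```

Standardizing within each draft year keeps a 2003 bonus comparable to a 2008 bonus even if overall spending changed between those drafts.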
I ran a couple of different analyses. In one, I ran a correlation between a player’s standardized bonus and his career WAR total (to date) among those who had made it to MLB. I also ran a logistic regression predicting whether he met the two other milestones, with signing bonus as a predictor. In my previous work, I used signing bonus as a proxy for how highly a team thought of a player. I found that the correlation was stronger for some categories (first-round picks, college players) than others (anything after the first round, high school players). In theory, a high correlation shows that teams (in general) are good at assessing players. Low correlations mean that teams are paying money and have no idea what they’re getting for it. That could work out in their favor (getting a really good player for a $10k bonus) or against them (Brien Taylor), but it’s the sign of an inefficient market.
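As a rough illustration of the first analysis, the sketch below computes a Pearson correlation between standardized bonus and career WAR, restricted to players who reached MLB. The record fields (`std_bonus`, `war`, `appeared`) and the toy numbers are made up for the example; this is not the code or the data behind the numbers reported here.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bonus_war_correlation(players):
    """Correlate standardized bonus with career WAR among players who made MLB."""
    reached = [p for p in players if p["appeared"]]
    return pearson_r(
        [p["std_bonus"] for p in reached],
        [p["war"] for p in reached],
    )


# Toy records (hypothetical): the last player never appeared and is excluded.
players = [
    {"std_bonus": 0.5, "war": 1.0, "appeared": True},
    {"std_bonus": 1.0, "war": 2.0, "appeared": True},
    {"std_bonus": 2.0, "war": 4.0, "appeared": True},
    {"std_bonus": 3.0, "war": 0.0, "appeared": False},
]
r = bonus_war_correlation(players)
```

In the toy data WAR is exactly twice the standardized bonus for the three players who appeared, so the correlation comes out at essentially 1. Real draft data is nowhere near that tidy.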
This time, I split things up geographically and focused on where draftees were from. Whether it was from high school or college, Baseball Reference kindly provided the state in which the player’s school was located. This is convenient because teams often assign scouts to specific states or, depending on the size of the state and how baseball-rich the area is, clusters of states. If it’s true that New England is harder to scout because the weather is worse and the competition is more uneven, then we should see teams guessing more on players from New England than from other areas, like the all-baseball, all-the-time state of Florida.
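The geographic split amounts to a lookup from school state to region. A minimal sketch, using the groupings listed in the results table, is below; the names `REGIONS`, `STATE_TO_REGION`, and `region_for` are hypothetical, and states that are not in any listed grouping simply return `None`.

```python
# Region groupings as listed in the results table.
REGIONS = {
    "New England": ["ME", "MA", "NH", "VT", "RI", "CT"],
    "Tri-State": ["NY", "NJ", "PA"],
    "South Atlantic": ["DE", "MD", "WV", "VA", "NC", "SC", "GA", "FL"],
    "The Midwest": ["OH", "MI", "IN", "IL", "WI"],
    "Mid-South": ["KY", "TN", "MS", "AL"],
    "Texas and Friends": ["TX", "OK", "AR", "LA"],
    "Mountain West": ["MT", "ID", "WY", "NV", "UT", "CO"],
    "Southwest": ["AZ", "NM"],
    "West Coast": ["CA", "OR", "WA"],
}

# Invert the grouping so each state abbreviation points at its region.
STATE_TO_REGION = {
    state: region for region, states in REGIONS.items() for state in states
}


def region_for(school_state):
    """Return the region for a school's state, or None if the state is ungrouped."""
    return STATE_TO_REGION.get(school_state.upper())
```

For example, `region_for("VT")` gives `"New England"`, while `region_for("MN")` gives `None` because Minnesota does not appear in any of the listed groupings.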
Here, similar to my original article, I present the value of the correlation between signing bonus and career WAR. For the logistic regressions, I took the Nagelkerke’s R-squared for the model and took the square root, to bring it to the same scale as the correlation. (If you aren’t super-initiated, just nod your head and know that “higher is better.”)
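For the curious, the conversion can be sketched as follows, assuming the usual definition of Nagelkerke's R-squared (Cox and Snell's R-squared divided by its maximum possible value). The hand-rolled Newton-Raphson fit and the toy data are illustrative stand-ins, not the software or data actually used for this article.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p             # gradient with respect to the intercept
            g1 += (yi - p) * xi      # gradient with respect to the slope
            h00 += w                 # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(x, y, b0, b1):
    """Bernoulli log-likelihood of the outcomes under the logistic model."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        ll += math.log(p) if yi else math.log(1.0 - p)
    return ll


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox and Snell R-squared rescaled so that a perfect model scores 1."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Toy data (hypothetical): standardized bonus and whether the player appeared in MLB.
bonus = [0.1, 0.2, 0.3, 0.5, 1.0, 2.0, 3.0, 4.0]
appeared = [0, 0, 1, 0, 0, 1, 1, 1]

b0, b1 = fit_logistic(bonus, appeared)
n = len(appeared)
base_rate = sum(appeared) / n
# The null model has an intercept only, set to the log-odds of the base rate.
ll_null = log_likelihood(bonus, appeared, math.log(base_rate / (1.0 - base_rate)), 0.0)
ll_model = log_likelihood(bonus, appeared, b0, b1)
r_equivalent = math.sqrt(nagelkerke_r2(ll_null, ll_model, n))
```

Taking the square root puts the value on the same 0-to-1 scale as the absolute value of a correlation, which is what allows the logistic columns to sit next to the correlation column in the table.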
I also present the total number of players from each region who signed during those years, and the percentage of them who appeared in an MLB game. Finally, I used a logistic regression to create an expected rate of MLB appearance. We would expect a player who got a $3 million signing bonus to be more likely to get to the bigs than a guy who got 10 grand. I looked to see how many major leaguers each region should have produced (if their signing bonuses are any indication) and what percentage of that number actually showed up.
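One way to read the "expected" calculation: add up each player's predicted probability of reaching MLB within a region, then compare that sum to the number who actually got there. The sketch below assumes the probabilities have already been produced by a logistic regression on signing bonus; the tuple layout and the toy numbers are hypothetical.

```python
from collections import defaultdict


def percent_of_expected(players):
    """Return actual MLB players as a percentage of the expected count, per region.

    `players` is an iterable of (region, predicted_probability, appeared) tuples.
    """
    expected = defaultdict(float)
    actual = defaultdict(int)
    for region, prob, appeared in players:
        expected[region] += prob                 # expected count is the sum of probabilities
        actual[region] += 1 if appeared else 0   # actual count of players who appeared
    return {r: 100.0 * actual[r] / expected[r] for r in expected}


# Toy data: Region A matches expectations, Region B doubles them.
players = [
    ("Region A", 0.6, True),
    ("Region A", 0.4, False),
    ("Region B", 0.25, True),
    ("Region B", 0.25, False),
]
result = percent_of_expected(players)
```

A value under 100 percent means a region produced fewer major leaguers than its signing bonuses implied, and a value over 100 percent means it produced more.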
| States | Signing Bonus – WAR | Signing Bonus – Appeared | Signing Bonus – 5 WAR | Total Players Who Signed | Percent Who Made MLB | Percent of Expected MLB Players |
| --- | --- | --- | --- | --- | --- | --- |
| New England (ME, MA, NH, VT, RI, CT) | .085 | .288 | ** | 31 | 22.6% | 57.5% |
| Tri-State (NY, NJ, PA) | .314 | .292 | .387 | 63 | 34.9% | 98.3% |
| South Atlantic (DE, MD, WV, VA, NC, SC, GA, FL) | .421 | .438 | .437 | 466 | 32.8% | 89.4% |
| The Midwest (OH, MI, IN, IL, WI) | .362 | .429 | .507 | 124 | 36.3% | 115.4% |
| Mid-South (KY, TN, MS, AL) | .351 | .338 | .400 | 152 | 40.1% | 113.4% |
| Texas and Friends (TX, OK, AR, LA) | .226 | .486 | .400 [sic] | 295 | 37.6% | 101.2% |
| Mountain West (MT, ID, WY, NV, UT, CO) | -.340 | .011 | .327 | 48 | 22.9% | 79.7% |
| Southwest (AZ and NM) | -.017 | .453 | .232 | 65 | 47.7% | 134.03% |
| West Coast (CA, OR, WA) | .381 | .495 | .356 | 400 | 38.0% | 109.3% |
| Texas only | .177 | .530 | .406 | 177 | 36.7% | 94.2% |
| Florida only | .438 | .407 | .365 | 191 | 33.0% | 89.9% |
| California only | .375 | .485 | .359 | 345 | 39.1% | 111.6% |
** – There were no players from New England drafted from 2003-2008 who put up more than five WAR.
The worst results in measures of how efficient the market is (that is, how good teams are at pricing eventual performance) came from New England and from the Mountain West (where MLB teams actually seem to have it backward). Those two regions also produced the fewest draftees and the lowest ratio of major leaguers to draft picks, as well as the lowest yield of major leaguers when adjusting for expectations (read: signing bonuses). Teams didn’t find a lot that was interesting in these areas, and when they did, they had almost no idea how to price it and it usually ended up disappointing them.
Missing Persons Report
Blame it on the weather, I guess. Except…look at the numbers for the Midwest. Having spent 30 years of my life in Cleveland and Chicago (and probably a grand total of three months on the Indiana Toll Road), I assure you that there is plenty of snow and cold in those states. Yet major league teams seem to do about as well as most other regions in figuring out how to price draftees. The New York-New Jersey-Pennsylvania area also seems to be about right, and it snows plenty there as well. New England is an outlier. Where are all the missing New England major leaguers?
We can interpret the lagging New England numbers in a couple of different ways. It may very well be that good players from New England high schools choose to go to SEC and Pac-10 colleges that are perceived as better places to hone their craft, and then they get drafted from there (and so my model lists them as being from North Carolina or California). That can turn into a spiral where those programs really do become better programs because they get all the good talent, and that could depress the number of players drafted out of New England schools. There are plenty of those transplanted New Englanders in the database, by the way, but that doesn’t explain the whole problem.
Why is it that the ones who are left behind—the high school players from Boston or the college kids from Boston College—are so poorly priced? Certainly, if a team is interested in a player from Vermont, they send a scout or two to go see him play in the same way that they would send a scout to Texas. In theory, they would evaluate both on the same criteria, and do the same interviews. Why are teams so much worse at guessing what the Vermont kid will become?
One answer that we can rule out is that because there’s a talent drain from New England high schools into colleges in other areas, the players who do get drafted are more likely to be high-risk high schoolers. In my original article, I found that high school players really are riskier bets, in that the market does a worse job of figuring out what they will become than it does with college draftees. However, while 31.2 percent of all draftees from 2003-2008 were high school students, only 25.8 percent of New Englanders were drafted out of high school, so that explanation doesn’t seem likely. We also saw that players drafted after the first round were, as a group, mis-priced. Maybe teams see New England as a nice place to find a fifth rounder? That wasn’t the case, either, as 22.6 percent of the New Englanders chosen were first-round picks (compared to 15.4 percent of all picks; I counted supplemental picks as first-rounders).
Maybe it’s just the fact that because scouts don’t have as many chances to get good looks, they’re going on less information. Less information always means more risk. Maybe it’s because the talent drain means that the opposing hitters/pitchers that the player is going up against aren’t as good, and so the scout doesn’t get a chance to see what he can do against “real” competition as easily. I suppose that’s what the showcase circuit is for, but even that’s an even smaller sample.
I spoke to a few of the scouting folks here at Baseball Prospectus (I know, they’re Capulet and I’m Montague…and that’s how it actually works, people. We’re supposed to hate each other, but we actually stand on each other’s balconies and make out all the time) and several of them chimed in with theories. Some mentioned a couple of specific cases of draftees who turned into busts. In a sample size of 31, that can go a long way toward messing up a correlation. Ryan Parker had an interesting theory about how the Cape League might actually be to blame. If a scout has New England as his territory, should he work the high school circuit or just park on the Cape and see all kinds of fun college kids? In that way, teams get fewer looks at the high school talent.
Chris Mellen pointed out that New England is made up of states with relatively low populations, so it might be that there’s not a lot of talent to begin with, simply because of raw numbers. Al Skorupa observed that because New England states tend to have higher incomes and higher concentrations of college graduates, high school students are more likely to become college students—not because it will increase their chances at MLB, but because they are more likely to be the children of college graduates who want them to graduate as well. Of course, they go to schools where the baseball team will get some coverage.
A lot of the potential explanations came down to “bad weather, which means fewer reps, which means talent that’s more raw, which means bigger risks for major league teams.” All of these theories make sense, and maybe one or two of them are actually true. Or maybe this is a fluky thing that just kinda happened; not everything has an explanation. But if this thing does, it might point to some sort of inefficiency (nay, opportunity!) in how scouts approach the northeast corner of the country. Right now, I have to confess that it’s not entirely clear to me what that opportunity is. So I leave you with a mystery. Where are the missing New Englanders?
Also, for the sake of draft completeness, what about draftees from Canada or Puerto Rico?
I think college baseball might be the answer: the Midwest gets well-represented because scouts watch the Big Ten closely, but there are only two big programs in New England (UConn and BC).
I also think there's a population effect you've missed, especially since the other region that significantly underperformed (Mountain West) has a population even lower than New England's.
There are probably a lot of New England Ex-Patriots (*rimshot*) out there in college at NC State and Clemson. Maybe I should go back to see what happens by place of birth.
I agree college recruiting is largely regional, but the elite products that do come out of New England are all afforded the opportunity to play at warm weather schools. The second and third tier players that don't blossom until later on are much more likely to wind up at regional colleges.
http://www.higheredinfo.org/dbrowser/?year=2004&level=nation&mode=data&state=0&submeasure=63
College-Going Rates of High School Graduates - Directly from High School
Highest states: NY (ok, makes sense), SD (what?), SC, ND (what?), MN.
CT and MA are pretty high but VT and NH really aren't.
1. Players that are good college prospects but have not developed enough for the pro game due to lack of reps, so:
A. They go to a quality college program (south/west) and are drafted there two to four years later, or
B. They require an above-slot bonus (or more than their profile would require based on risk/upside) in order to get them to sign, skewing the valuation of the player away from what the scouting would otherwise indicate
This effect is further skewed during the years in which the data was compiled, as teams were then able to spend without limit, so you were more likely to see a team roll the dice and give $700K to a projectable New England high schooler that was a comparative long shot when lined up with a $700K second rounder from Southern California. It will be interesting to see what the numbers look like in several years now that teams are forced to make more difficult decisions when it comes to handing out signing bonuses.
A smaller point -- I think the "value" you are seeing in areas like Chicago, New York, Philadelphia, Boston, etc. is probably in part skewed by the fact that, as a general matter, inner-city kids as a grouping tend to be more willing to sign for slot or below and get started with their pro careers, as compared to smaller private school kids in New England (for the various reasons noted in the article). That's of course a generalization, but I do believe it has an effect on the numbers when we are talking about the money doled out to these kids once drafted.
You mentioned your prior article found less efficiency with HS players which may well be an artifact of this structural impact (though you may have tried to correct for such in the prior article - I forget).
This does suggest that where drafted college players played in HS has the potential to smooth out some of the variances - also mentioned by both you and Nick.
I don't think most of the elite high school talent is being swallowed up by outside colleges. From my experience in Western Mass, most of the elite high school talent went on to college programs in New England, followed by Tri-State.
After growing up in that baseball world, and watching my peers move on, I have come to believe that players from New England are flat-out just not as good as players from other areas. I think this is due to a lack of reps and of proper training in the area. While this is speculative, one thing you may want to research would be the quality of coaches for high school and prep leagues, as that could be another variable to consider.
I would be interested to see if adding the choice of school adds any additional perspective beyond conference. I know many of the NE schools will spend the first month of their season traveling the South in preparation for their conference season and to thaw out -- I wonder if those players end up benefiting not only from the stereotypically better competition seen in warmer climes, but also from the extra scouting eyes laid upon them in the process.
As a guy who spent some of his teenage years in the Upper Peninsula of Michigan -- hooray for northern arms!
- Maybe youth baseball in New England is not as well developed as elsewhere?
- New England is among the most suburbanized regions in the country. Anecdotally it seems a disproportionate number of ballplayers come from small towns.
- Might there be an ethnic component? There aren't as many Blacks or Hispanics proportionately up there as in other parts of the country.
One could also speculate that growing up a Red Sox fan (as most of these kids do) messes up your mind (I'm saying that as a Bosox fan since 1968...)