Last week, I wrote a piece on the social development of young baseball players (and humans in general). In the piece, I suggested that one reason that teams might employ older players who are well past their prime, to the point where they are barely replacement level, is that there might be something to the "clubhouse guy" effect, particularly on young players. Players in their early 20s are going through a seldom recognized and only recently understood period of neurological development, and in addition to being baseball players are also trying to figure out how to be adults. There might be some value to having a guy around who is… well, already an adult. Someone who could take a young player under his wing.
When I wrote that, I was thinking mostly in terms of the minor leagues, particularly for the age 18-21 set. During those early years, a player might need guidance not only on how to hit a curveball, but also on how to be a fully-grown man. An older player who has been there might be a good person to approach. Teams often talk about older veterans in terms of their possible contributions to the team off the field, even at the major-league level. Those guys are a calming influence in the clubhouse. They help keep the younger players focused.
Well, if I'm going to propose a theory like that, I should be willing to test it against the evidence. In the past, I have found some evidence that older catchers may help young pitchers in a meaningful, if minor way. Does it work for hitters?
(As always, if you think math is witchcraft, please skip the next section and go to "The Results.")
Warning! Gory Mathematical Details Ahead!
I just realized that I haven't done a gory details piece in a while. Sorry.
Let's define some terms. I looked at all players under the age of 25 from 1993-2011 who had 250 PA in a season and also in the season before. I calculated their on-base percentage for both years, determining the change between years using the standardized method that I have described before. This method controls for differences in the reliability of a statistic given different sampling frames (i.e. OBP, like any stat, is more reliable after 500 PA rather than 250 PA) and produces a z-score to tell how much a player has changed over time. Only players who remained with one team all year were eligible (although a player might change teams between years).
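The standardized method itself is described elsewhere and is not reproduced here. As a rough stand-in only, the sketch below treats each season's OBP as a binomial proportion and divides the year-to-year difference by its combined standard error. The function names, the binomial assumption, and the sample numbers are all illustrative assumptions, not the method actually used in this piece.

```python
import math

def obp_change_z(obp_prev, pa_prev, obp_curr, pa_curr):
    """Approximate z-score for the change in OBP between two seasons.

    Assumption: each season's OBP is a binomial proportion over its PA,
    so more PA means a smaller standard error and a more reliable OBP.
    """
    var_prev = obp_prev * (1 - obp_prev) / pa_prev
    var_curr = obp_curr * (1 - obp_curr) / pa_curr
    return (obp_curr - obp_prev) / math.sqrt(var_prev + var_curr)

def is_eligible(age, pa_prev, pa_curr, max_age=25, min_pa=250):
    """Sample filter from the text: under 25, with 250 PA in both seasons."""
    return age < max_age and pa_prev >= min_pa and pa_curr >= min_pa

# Hypothetical player: the same 30-point OBP gain counts for more over 500 PA.
print(obp_change_z(0.320, 500, 0.350, 500))
print(obp_change_z(0.320, 250, 0.350, 250))
```

The same raw change in OBP yields a larger z-score when it is backed by more plate appearances, which is the property the standardized method is meant to capture.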
I calculated how many hitters on our greenhorn's team were 32 or over that season (and who got 250 PA for the team during the year… no fading guys who stopped by for a quick cup of coffee), and what percentage of the hitters on the team this represented. I also coded (yes/no) whether anyone was of an advanced age. Since hitters and pitchers tend to socialize within their own groups, counting only hitters seemed appropriate.
Now, there are certain players who are playing into their late 30s not because they are good at helping out younger players, but because they are still excellent at hitting baseballs. So, I slimmed back the pool of possible "mentors" to those who are clearly subpar hitters. I limited the pool to those who had an OBP of less than .310 in the season in question. I calculated how many of those were around (and whether any were around).
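The veteran predictors described above can be sketched as a small function over a team's hitters. The record layout (a dict with `age`, `pa`, and `obp`), the function name, and the roster are hypothetical. Using the 250-PA hitters as the denominator for the percentage is also an assumption, since the text does not say exactly which hitters were counted.

```python
def veteran_predictors(hitters, vet_age=32, min_pa=250, subpar_obp=0.310):
    """Build the team-level veteran predictors for one team-season."""
    # Only hitters with a real role count; no cup-of-coffee appearances.
    regulars = [h for h in hitters if h["pa"] >= min_pa]
    vets = [h for h in regulars if h["age"] >= vet_age]
    # Possible "mentors": veterans who are not there for their bats.
    subpar = [h for h in vets if h["obp"] < subpar_obp]
    return {
        "n_vets": len(vets),
        "pct_vets": len(vets) / len(regulars) if regulars else 0.0,
        "any_vet": bool(vets),
        "n_subpar_vets": len(subpar),
        "any_subpar_vet": bool(subpar),
    }

# Made-up roster for illustration only.
roster = [
    {"age": 23, "pa": 500, "obp": 0.340},
    {"age": 33, "pa": 400, "obp": 0.300},
    {"age": 35, "pa": 300, "obp": 0.360},
    {"age": 36, "pa": 100, "obp": 0.250},  # too few PA to count
]
print(veteran_predictors(roster))
```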
I ran a few different regressions. For all of them, I controlled for the previous season's OBP. A really good way to have a sudden upswing in your year-to-year OBP is to have a really down season the year before. In one regression, my dependent variable was the z-score generated above. In this way, we have an idea of how far the player has progressed in standardized terms. For all guys under 25, I looked at whether the number of hitters over 32 on their team predicted change in OBP. I did the same for the percentage of hitters over 32, the presence or absence of such a teammate, and then the number of hitters over 32 who were subpar in their own performance.
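For readers who want to see the shape of one of these regressions, here is a minimal sketch: ordinary least squares with the z-score as the dependent variable and two predictors, the previous season's OBP and the number of veteran hitters. The solver, the variable names, and the toy numbers are all invented for illustration. It returns coefficients only and does not reproduce the significance tests.

```python
def ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Toy rows of (previous-season OBP, number of veteran hitters); not real data.
X = [(0.30, 0), (0.32, 1), (0.35, 2), (0.33, 0), (0.31, 3)]
z = [0.9, 0.1, -0.6, -0.2, 0.4]  # made-up z-scores of OBP change
intercept, b_prev_obp, b_n_vets = ols(X, z)
print(intercept, b_prev_obp, b_n_vets)
```

The coefficient on the veteran count is the quantity of interest. Including the previous season's OBP alongside it is what "controlling for" means here.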
In addition to looking at the influence on change over time, I looked at whether any of these factors might, even if they didn't lead to amazing development, keep a young hitter from a collapse. I coded whether a player had fallen in his difference score by more than one standard deviation and ran logistic regressions using the same predictors.
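The collapse analysis can be sketched the same way. Below, a player is flagged as having collapsed when his z-score falls below -1, and a one-predictor logistic regression is fit by plain gradient ascent. This is a simplified illustration under stated assumptions: the function names, the fitting routine, and the toy data are hypothetical, and the control for previous-season OBP is left out for brevity.

```python
import math

def collapsed(z, threshold=-1.0):
    """True if the player's change score fell by more than one standard deviation."""
    return z < threshold

def fit_logistic(xs, ys, lr=1.0, steps=2000):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Toy data: x = 1 if a veteran was on the team, y = 1 if the young hitter collapsed.
xs = [0] * 10 + [1] * 10
ys = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 8
b0, b1 = fit_logistic(xs, ys)
print(b0, b1)
```

In this invented data set the slope comes out negative, meaning collapses are less common when a veteran is present. The real data showed no such effect.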
The Results
Nothing was significant. I even played around with the age at which someone was considered a veteran (raised it to 35), and how young a player had to be (moved it to 27). Nothing worked. So there's nothing to the "good guy in the clubhouse" theory, at least as far as actually affecting player performance goes. Right?
There's probably someone who read that nothing was significant and thought "Ha! Another myth bites the dust!" Not so fast. This is a mistake that I see a lot in sabermetrics and have probably made myself.
There probably are cases in which an older player takes a younger player under his wing. It's hard to believe that he would do so for everyone on the team (or that everyone would need it). Also, it's probably not the case that all older players are good at that sort of thing. But it's harder to believe that it never happens, that there aren't certain guys who are good at it, and that somewhere along the line, someone benefited from that sort of mentorship. Maybe we don't know fully how to identify the good ones, but it's a bigger stretch of the imagination to deny that this effect is out there.
This exercise neatly illustrates the problem with large N database research. It assumes that all players will respond to some set of circumstances in the same way—young player plus veteran player equals some measurable bump in performance. There are times when everyone does react the same way, and it teaches us something interesting about the way that the game works. But that sort of one-size-fits-all effect is the only type of effect that a large-N database query is capable of finding.
We can say from the analyses that I just did that the mere fact that there are seasoned vets on the team does not mean that young players will all show a positive effect. This should silence the people who indiscriminately praise the signing of any over-the-hill player as a "good clubhouse move" and immediately dream of his "veteran leadership" blanketing the kids on the team in a warm glow of OBP. It doesn't work like that. But that doesn't mean that on the micro level, we wouldn't see some sort of major (and real) effect of mentorship. We just don't have the data set that would allow us to see it.
I suppose that there are a number of other objections that one could make to the methodology (I looked only at OBP, the 24-year-olds in MLB are a very selected group, the 34-year-olds in MLB are a very selected group, the problem might be that a team that plays a bunch of older vets might have done so only because their blue chippers had a bad few months and were sent down to the minors for more experience, etc.). Those problems might be getting in the way, but I think there's more to the results than that.
In the chat that I hosted here at BP a couple weeks ago, I was asked where I thought the next big advance in sabermetrics would come from. I answered that it would come from moving away from the large-N model and understanding each player as his own unique data set. It's an approach that has been oddly missing from the sabermetric worldview, and I think that we're the worse for it.
It also might vary by team. If the team has several bad influences, it might be very important to have some good ones, too. If the team lacks bad influences, it might be less important.
Since this outcome is going to depend on the young player, the veteran, and the overall team environment, my guess is that statistics offer little help with this, as there are too many variables that we lack information on and that are hard to quantify.
I can't imagine an easy way to statistically test this, what with the thousands of other variables involved, but one could start by dividing teams into age-distribution categories and comparing records.
Veterans, clubhouse chemistry, and so on and so forth: these and umpteen other factors affect different players in different ways. Perhaps a veteran leader shows a kid a couple of tricks to deal with the daily grind, extending his longevity. Perhaps being in a series of good clubhouses encourages a player to keep his love of baseball and hence good habits... Perhaps... Perhaps... Perhaps...
I love digging into the fog of this game we all love, but as a wise sabermetrician advised, it must be respected. As highly trained, professional, and mechanically perfect as baseball players may be, they're still people, with emotions, preferences, tastes, quirks, and flaws.
When you controlled for the first season's OBP, did you regress the change in OBP (as a dependent variable) against the first year (explanatory variable)? This will give a biased result. The correct way is to use the sum (or average) of the OBP for both years as the explanatory variable.
You probably know better, but I see this error all the time.
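The concern raised in this comment can be illustrated with a small simulation. It is a sketch under simple assumptions: each player's true OBP talent never changes, and both seasons add independent noise of equal size. The talent spread, noise level, and function names are all made up. Under those assumptions, the change is negatively correlated with the first-year value even though nobody actually changed, while its correlation with the two-year average is close to zero.

```python
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def simulate(n=5000, seed=0):
    """Return (corr of change with year 1, corr of change with the two-year average)."""
    rng = random.Random(seed)
    # Assumed values: talent centered on .330, with equal noise in both seasons.
    talent = [rng.gauss(0.330, 0.020) for _ in range(n)]
    year1 = [t + rng.gauss(0, 0.020) for t in talent]
    year2 = [t + rng.gauss(0, 0.020) for t in talent]
    change = [b - a for a, b in zip(year1, year2)]
    average = [(a + b) / 2 for a, b in zip(year1, year2)]
    return corr(change, year1), corr(change, average)

r_first, r_avg = simulate()
print(r_first, r_avg)
```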
On the flip side, look at the recent stuff emerging about 21 year old hockey star, Tyler Seguin - http://sports.yahoo.com/blogs/nhl-puck-daddy/tyler-seguin-spends-lockout-living-abject-squalor-leaves-210150618--nhl.html
One reason I am confident that people have gone off the deep end in applying generalizations to tail cases is that I have made a living betting on baseball for the past 10 years, and it's been in no small part due to taking the other side of these inaccurate predictions of mean reversion (provided the sample is large enough to refute it).
I hope someone here or elsewhere can devote an entire series to examining serious career outliers (in FIP vs. ERA, handedness splits, etc.) apart from the rest of the population and determine if such players should be lumped in with the rest of the player universe, or rather if we should be applying different predictions of regression.
Well said, Evo, and in the case of Hellickson, Jason Collette wrote an awesome article on his particular tail-wagging, to which I chimed in with a couple of additional details (with GIFs!).
Also, watch Ricky Nolasco pitch if you want to see an outlier example of the difference between ERA and FIP - his (ERA-FIP) for the past four seasons has been +0.56, +1.17, +0.62, and +1.75. The guy doesn't miss his targets by much, but he misses them all the time, and missing your spots within the strike zone will get you hammered in the show. The result is few walks but a ton of hard-hit baseballs, a factor that slips beneath the radar of any metric that is rooted in box-score stats.
Speaking of which, I think the greatest limitation in today's statistical playground is the error inherent in our input variables - how can we properly tease out the roles of pitching and defense on balls-in-play when we are using a data set that treats a bunt single the same as a screaming liner over the second-base bag?
Luckily, Mike Fast covered that topic for us before bolting for more humid pastures.
Excellent work, as per usual. I meant to ping you when I saw the chat comment, and I couldn't agree more about the large-N model.
I remember learning something in stats class about applying the results of large-N studies towards similarly robust samples - if 500 players tend to regress toward a mean in one season, then one might expect a similar sample to follow the same trend. But the model breaks down on a case-by-case basis.
Besides, individualized analysis opens up so many fun research questions, many of which don't enter the analytic framework of a large-N model.
But do you see people moving away from these studies toward more case-by-case research? It's hard for me to imagine how to even structure such studies. Longitudinal data focused on individual performance, perhaps?