What is a good manager worth? More to the point, how do we tell who the good ones are? We can measure what a manager does during the game, but that’s only a small part of his job description. A manager does decide who pinch-hits when, but he’s also in charge of making sure that everything is cool in the locker room. He manages the men as well as the game. We’re pretty sure that the answer isn’t zero, but what is it?
Last week, I looked into one major job that a manager has. It’s a long season and it’s the manager’s job to fight against The Grind. Right now, teams are assembling in Arizona and Florida for spring training and hopes are high all over the place. But by August, when that new season smell is gone and players are hurting physically and emotionally, it’s hard to keep going day after day. That’s The Grind. The manager has to put a stop to that. We know that over time hitters see their plate discipline suffer a little bit. It’s a small effect, but it builds up and ends up costing a fair bit in terms of lost strikes.
We know that if hitters are dragging in their plate discipline, over time there must be an edge to being a pitcher. It’s not surprising given that while everyday players play every day, starting pitchers only have to worry about playing once every five days, and relievers can at least count on not being in every single game. Unless it’s Eddie Guardado. But the travel still wears on the pitchers and there is still a lot of mentally taxing work to be done. The hitters might feel it more, but that doesn’t mean that the pitchers don’t. And it means that if managers do have some talent at fighting The Grind (and last week, we seemed to find that they are pretty consistent from year to year in their Grind-fighting abilities), they might be able to help pitchers just as much.
Warning! Gory Mathematical Details Ahead!
Well, the nice part is that you can basically go back to last week’s article and just replace “hitter” with “pitcher” to get an idea of the methods.
Quick recap: I used data from 2010-2014. First, we control for the batter and the pitcher and how likely an individual pitch is to result in a swing, or how likely a swing is to result in contact. Some hitters are good at making contact. Some pitchers are good at avoiding contact. Next, we look at how many days it’s been since Opening Day for that team. It’s a rough proxy for The Grind, but a reasonable one. Then, we look at how well that control variable and the time since Opening Day predict whether a certain pitch will end up inducing a swing, or a swing producing contact. I used a binary logistic regression, for the initiated. Next, I added in a term that was the interaction of the manager and the number of days since Opening Day. We assume that there will be a certain slope overall on how The Grind affects these plate discipline stats, but this term will tell us how much we should adjust it for each manager.
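For the initiated, one way to write that model down is sketched here. The symbols are assumptions introduced for illustration, not notation from the original analysis: y_i is 1 if pitch i produces the outcome of interest, c_i is the batter-pitcher control, d_i is days since Opening Day, and m(i) is the manager in charge for that pitch.

```latex
\operatorname{logit}\,\Pr(y_i = 1) = \beta_0 + \beta_1 c_i + \beta_2 d_i + \gamma_{m(i)}\, d_i
```

Here \beta_2 would be the overall Grind slope and \gamma_{m(i)} the per-manager adjustment to that slope.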
Once I had all of that output, for each manager, I assumed that he was shepherding a player who had a 50 percent chance of making contact on a given swing on Opening Day, and then looked to see what the regression would predict was the chance of contact on Day 90 of the season. The stat that I was most interested in was whether a particular pitch ended up producing a strike or not.
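The Day 90 projection can be sketched in a few lines. Because a 50 percent starting point is zero on the log-odds scale, the prediction depends only on the per-day slopes. The function names and coefficient values below are made up for illustration; they are not the fitted values from the regression.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def day_n_probability(base_prob, grind_slope, manager_adjustment, days=90):
    """Predicted probability after `days`, starting from base_prob on Opening Day.

    grind_slope is the overall per-day change in log-odds and
    manager_adjustment is the manager-specific tweak to that slope.
    """
    log_odds = logit(base_prob) + (grind_slope + manager_adjustment) * days
    return inv_logit(log_odds)


# Hypothetical coefficients, chosen only to show the mechanics.
overall_slope = -0.0001
manager_slope = -0.0002
p90 = day_n_probability(0.5, overall_slope, manager_slope)
print(round(p90, 4))
```

A manager whose adjustment exactly cancels the overall slope would leave the player at .500 on Day 90, which is why the leaderboard numbers cluster so tightly around that value.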
I first looked to see whether a manager’s ability to help (or hinder) his pitchers in fighting The Grind was stable over the years. I treated each manager-year as an independent entity, and for managers who were active in at least four of the past five years, used an AR(1) intraclass correlation (ICC) to see whether the effect was stable over time. Like last week, the effects were pretty stable. The ICCs all came in around .60. (Note: I separated out starters and relievers and got the same basic findings.)
The manager effects correlated as you might expect them to, although with hitters, we saw that a manager being able to keep his hitters making contact was more important than a manager keeping his hitters taking called balls. For pitchers, a manager who can help his staff avoid contact and who can help his staff get more called balls had about equal effects.
So, here’s the leaderboard for the past five years (min. three years of those five managed). This starts with an assumption that a given pitch has a 50/50 shot at being a strike vs. not being a strike on Opening Day, given the batter-pitcher matchup (so that managers aren’t getting credit for having better players). The regression then adjusts for what that percentage would look like on Day 90 of the season. In this case, lower numbers are better, because the pitcher wants strikes to happen.
| Manager | Percentage chance |
| --- | --- |
| Ozzie! | .492 |
| | .492 |
| | .493 |
| | .493 |
| | .493 |
| | .495 |
| Joe Maddon | .495 |
| | .495 |
| Not Jim Tracy | .496 |
| | .497 |
| | .498 |
| | .498 |
| | .498 |
| | .498 |
| | .499 |
| | .499 |
| | .499 |
| | .499 |
| | .500 |
| | .500 |
| | .500 |
| | .501 |
| | .501 |
| | .501 |
| | .502 |
| Jim Leyland | .502 |
| | .502 |
| | .503 |
| | .505 |
I broke it down by just starters and just relievers to see whether certain managers were better or worse when it came to handling starters vs. handling the bullpen. The left column shows the leaderboard for starters only. The right shows the leaderboard for relievers. The correlation between the two is .27, suggesting that being able to keep the starters on track isn’t a guarantee that a manager will do well with the relievers. He might, but he might not.
| Manager | Percentage chance (Starters) | Manager | Percentage chance (Relievers) |
| --- | --- | --- | --- |
| Ozzie! | .492 | Joe Maddon | .490 |
| Brad Mills | .492 | Ozzie! | .492 |
| Bud Black | .492 | Terry Francona | .493 |
| Terry Francona | .493 | Bob Melvin | .493 |
| Buck Showalter | .493 | Mike Matheny | .494 |
| Terry Collins | .496 | Charlie Manuel | .494 |
| Fredi Gonzalez | .496 | Don Mattingly | .496 |
| Charlie Manuel | .496 | Jim Tracy | .496 |
| Joe Maddon | .496 | Ned Yost | .496 |
| Bruce Bochy | .497 | Terry Collins | .499 |
| Manny Acta | .497 | Bud Black | .499 |
| Not Jim Tracy | .497 | Clint Hurdle | .500 |
| Clint Hurdle | .498 | Bruce Bochy | .500 |
| Kirk Gibson | .498 | Ron Gardenhire | .500 |
| Davey Johnson | .499 | Kirk Gibson | .500 |
| Ron Gardenhire | .499 | Dusty Baker | .501 |
| Bob Melvin | .499 | Davey Johnson | .501 |
| Eric Wedge | .499 | Brad Mills | .501 |
| Ned Yost | .499 | Robin Ventura | .501 |
| Ron Washington | .499 | Manny Acta | .502 |
| Don Mattingly | .500 | Buck Showalter | .502 |
| Ron Roenicke | .500 | John Farrell | .503 |
| Joe Girardi | .500 | Joe Girardi | .503 |
| Jim Leyland | .501 | Ron Roenicke | .503 |
| Dusty Baker | .501 | Eric Wedge | .505 |
| Mike Scioscia | .502 | Mike Scioscia | .505 |
| Robin Ventura | .503 | Ron Washington | .505 |
| Mike Matheny | .504 | Jim Leyland | .506 |
| John Farrell | .506 | Fredi Gonzalez | .507 |
We see that Ozzie Guillen seems to have a special talent for handling pitchers, as do Terry Francona, and (there he is again) Joe Maddon. Buck Showalter, he who allegedly works magic with bullpens, actually gets much better marks working with his starters than relievers. Mike Matheny ranks high in handling a bullpen, but near the bottom with starters.
Now, let’s take a look at the overall pitching numbers and the overall hitting numbers (from last week). We know that an average team saw (and threw) 22,310 pitches in 2014. We know that a pitch turned from a strike into a not-strike is worth just shy of .10 runs (.097). These manager numbers cover 2010-2014, but we’ll use the 2014 value stats as a rough baseline. I looked at how many “extra” strikes (or non-strikes) a manager would have produced over the course of 22,310 pitches, compared to a baseline rate of 50 percent through the whole season, and multiplied that by .097.
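The arithmetic behind that conversion can be sketched as follows. The pitch count and run value come from the paragraph above; the function name, the sign convention (a rate below the baseline counts in the manager's favor, matching the pitching leaderboards), and the simplification of using one flat season-long rate are assumptions made for illustration.

```python
PITCHES_PER_SEASON = 22_310   # average team pitch count, 2014
RUNS_PER_PITCH = 0.097        # run value of flipping one pitch, 2014
BASELINE_RATE = 0.50          # assumed flat rate with no manager effect


def grind_runs(season_rate,
               pitches=PITCHES_PER_SEASON,
               run_value=RUNS_PER_PITCH,
               baseline=BASELINE_RATE):
    """Runs gained relative to the baseline over a season of pitches.

    Assumes season_rate is a single average rate for the whole season,
    which is a simplification of tracking the rate day by day.
    """
    extra_pitches = (baseline - season_rate) * pitches
    return extra_pitches * run_value


# Hypothetical manager whose staff averages .495 instead of .500.
example = grind_runs(0.495)
print(round(example, 2))
```

Half a percentage point over a full season of pitches works out to roughly 112 pitches, which is how such small rate differences turn into double-digit run totals.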
| Manager | Manager Grind Runs (per season managed, 2010-14) |
| --- | --- |
| Bud Black | 18.84 |
| Terry Francona | 14.68 |
| Joe Maddon | 14.35 |
| Charlie Manuel | 13.98 |
| Terry Collins | 12.83 |
| Buck Showalter | 11.06 |
| Jim Tracy | 10.92 |
| Ozzie Guillen | 8.68 |
| Manny Acta | 7.57 |
| Davey Johnson | 5.24 |
| Ron Roenicke | 5.01 |
| Clint Hurdle | 4.76 |
| Brad Mills | 3.62 |
| Bob Melvin | 2.18 |
| Dusty Baker | 1.35 |
| Ron Washington | .78 |
| Bruce Bochy | .61 |
| Mike Matheny | -.12 |
| Kirk Gibson | -2.30 |
| Mike Scioscia | -3.64 |
| Robin Ventura | -4.47 |
| Don Mattingly | -5.26 |
| Jim Leyland | -5.34 |
| Ned Yost | -7.25 |
| Ron Gardenhire | -7.55 |
| Eric Wedge | -7.91 |
| Fredi Gonzalez | -11.07 |
| Joe Girardi | -12.86 |
| John Farrell | -16.95 |
When they ask how much a manager is worth, you can tell them that the spread between the best and worst on this measure is about 35 runs, at least when it comes to how well a manager fights The Grind. Bud is the new Black.
And for the curious, the 2014 leaderboard:
| Manager | Manager Grind Runs (2014, offense and defense) |
| --- | --- |
| Buck Showalter | 33.40 |
| Terry Collins | 24.82 |
| | 23.37 |
| Ron Gardenhire | 23.07 |
| Kirk Gibson | 21.45 |
| Terry Francona | 18.28 |
| Joe Maddon | 14.16 |
| Joe Girardi | 12.79 |
| Clint Hurdle | 10.55 |
| Bruce Bochy | 6.97 |
| Ron Roenicke | 5.42 |
| Robin Ventura | 3.98 |
| Rick Renteria | 3.52 |
| | 2.33 |
| Bud Black | .89 |
| | .19 |
| | -.19 |
| Ron Washington | -5.14 |
| Bob Melvin | -7.90 |
| Mike Scioscia | -7.92 |
| | -8.44 |
| | -9.64 |
| John Gibbons | -10.46 |
| Fredi Gonzalez | -12.66 |
| Don Mattingly | -15.97 |
| John Farrell | -18.76 |
| Mike Matheny | -23.02 |
| Ned Yost | -23.50 |
| | -25.00 |
| | -32.42 |
The careful (and sloppy) reader will note that the spread of talent over one year is greater than the multi-year average. Even Bud Black had a merely average year last year. Still, we also see that perhaps Buck Showalter deserved that Manager of the Year Award after all. And the numbers suggest that over the course of a year, a good manager can actually be worth several wins.
Valuing Something That Never Happened
It’s hard to evaluate something that never happens, but that seems to be the manager’s primary job. We know that the baseball season is long. We know that it wears players down. Now we know that it can have some very real effects on a player’s plate discipline. And yet, we know that under certain managers, this isn’t as big of a deal. Maybe it’s not all the manager’s doing, but it’s hard to believe that he doesn’t have something to do with it. We’re conditioned to look for the value of a manager in the things that he actually does, rather than the strikes that never happen under his watch. But it’s prevention, not action, which appears to provide the lion’s share of his value to a team.
It’s not clear exactly how a manager might accomplish this. For example, with pitchers, we don’t yet know whether a manager who is more judicious in not over-working his bullpen is the one who doesn’t see the effects of wear and tear. It’s not an unreasonable hypothesis, but we don’t know that yet. Maybe it’s all in his personality. Maybe there’s room for a little bit of both.
We also know that while skill in working with individual types of players (hitters, starters, relievers) is consistent over time, the skills aren’t correlated with each other. That means a manager might be good at one and bad at another. When you find a manager who is good at all three, best to hold on to him. Or sign him away from another team. The nice thing is that we can now put some reasonable numbers to what’s been understood for a long time (even if teams aren’t paying like they believe it). Managers have real value to a team. It’s hard to see it without “big data” to hold up a magnifying glass, but that’s the beauty of big data. It can see those small effects that pile up over time.
These findings have one other interesting corollary. If a manager can help fight The Grind, then maybe there are other things that can help as well. Teams in MLB have already begun talking about how diet and sleep can help players to fight it. It seems that research might be worth much, much more than is generally believed.
This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.
The results don't seem to bear out that hypothesis, given that the managers between Buck Showalter and Tito Francona on the 2014 leaderboard all presided over poor teams. But I'm wondering: Might those managers be even better than this index suggests? To keep morale high in a down year for a team -- or at least to attenuate the effects of morale-killing losses -- seems like a more noteworthy accomplishment than to attenuate the Grind on a playoff team.
Conversely, it can't be good to be low on this list when your team is performing well (I'm looking at you, Ned Yost and Mike Matheny).
Also, although this may not be a popular thing to say, I'm not convinced that MLB is entirely PED-free. Managers that happen to preside over players that routinely pop greenies will be shown as handling the grind better. I'm not implying that managers on this list condone that behavior, rather that they may inadvertently benefit from it and we have no idea who that would be (spoiler: I'm targeting Chris Davis in my fantasy drafts this spring).
It's like noticing that offense is down from one time period to another. Did pitchers get better or batters get worse? You have to do some mathematical gyrations to try and figure that out, and even then, it might not work.
More seriously though, it sounds from the description that "manager effect" is largely interchangeable with "team effect"? Are the year to year correlations considerably higher for teams when the manager remains the same than when a team switches out their manager? Finding that team-level declines in contact rate have a sizable year-to-year correlation is interesting but I'd be pretty hesitant to assign that to the manager. If managers, teams/seasons and players were all random effects and the manager random effect turned out to be consistently substantial I'd find that more compelling.
But I am still hung up on how you could just look at the data and find that both batters AND pitchers get worse as the season wears on. That makes no sense from a data point of view. The batter/pitcher matchups have to go up for the pitchers to get worse and down for the batters to get worse. Which is it?
Re: Farrell vs. Black: That's a decent hypothesis and would represent added value. Not sure how to pull that one apart, but a good one to start thinking about.
Eyeballing it looks like minimal correlation so we got that going for us, which is nice.