November 1, 2013
Moments of Transition, Moments of Revelation
While working on cleaning out my house recently (more about that later—but long tangents before I get to the point are a tradition around here, and I can’t well abandon that at the end, can I?), I came across a book called Understanding Solid-State Electronics. I don’t think I’ve seen it in years before now. It’s a bit dated. Actually, it was a bit dated even when I was reading it as a kid—its illustration of something that fits the Universal System Organization of sense, decide, and act is a record player.
What can we learn from this, other than that the Texas Instruments Learning Center illustrators have a really weird idea of what people look like? The basic concept is that you have an input, you have some sort of action being taken upon that, and you have an output based on it.
But more importantly—it’s a model. And the point is to teach us something about how electronics work. It is not, in itself, electronics. It can’t play a record. But, at the same time, it allows us to transfer understanding. Because we understand how a record player works in this model, we can then use that to understand things that aren’t record players.
In sabermetrics, we tend to use models a lot. PECOTA is a model, or rather a series of interconnected models. WARP is much the same—it’s a set of models connected to each other to create an even larger model. I love models. And I love baseball. And I love that I get to spend a lot of time applying one to the other and vice versa.
I don’t know about you—and please, feel free to tell me later!—but I grew up surrounded by models. That solid-state electronics book. My various Radio Shack breadboard kits (if you want that story, I bet there are still copies of Best of Baseball Prospectus Volume 2 available for sale). Model cars. Model rockets. Pinewood derby cars and space derby ships. Things you can take apart and put back together again and see how they work.
A lot of sabermetrics these days likes to focus on how well a model can predict things. Now, predictive models are great and good, partly because there’s a lot of utility in predicting things, and there’s a lot of intrinsic value in using prediction to validate a model. But there’s also a lot of value in how a model can explain something.
Part of that is because explanatory models are going to be better predictive models in the long run. We can talk about overfitting, and how models based on a limited amount of data (so pretty much every model ever—some models are less limited than others, but there’s always less data than there is life) can come to some pretty odd and incorrect conclusions that break down when applied to additional data. There are statistical tools that you can use to avoid such problems, but creating a model that has explanatory power in addition to predictive power is another way to avoid such problems.
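To make the overfitting point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data points are invented, and `fit_memorizer` and `fit_line` are hypothetical stand-ins for an overfit model and a simpler one, not a description of how PECOTA or WARP work. It shows one of the statistical tools alluded to above: checking a model against data it was not fit on.

```python
# Invented (x, y) observations, roughly y = 2x plus noise. Illustration only.
TRAIN = [(0, 1), (2, 3), (4, 9), (6, 11), (8, 17)]
# A second, independent set of noisy observations at the same x values.
HOLDOUT = [(0, 0), (2, 5), (4, 7), (6, 12), (8, 15)]


def mse(model, points):
    """Mean squared error of a model (a function of x) over (x, y) points."""
    return sum((y - model(x)) ** 2 for x, y in points) / len(points)


def fit_memorizer(points):
    """Overfit stand-in: remember every training y exactly, noise included."""
    table = dict(points)
    return lambda x: table[x]


def fit_line(points):
    """Simpler stand-in: ordinary least-squares straight line."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


memorizer = fit_memorizer(TRAIN)
line = fit_line(TRAIN)

# Compare each model on the data it was fit to and on the data it never saw.
for name, model in (("memorizer", memorizer), ("line", line)):
    print(name, "train:", mse(model, TRAIN), "holdout:", mse(model, HOLDOUT))
```

On this toy data the memorizer is perfect on the numbers it has already seen and worse than the straight line on the ones it has not, which is the kind of breakdown described above.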
But explanatory models aren’t just useful in prediction. They are also useful in, well, explaining things. Explanatory models are how we learn new things about baseball, and they’re how we can communicate what we’ve learned about baseball to a larger audience. Nobody would ever dream of creating an explanatory model without any thought for how well it predicts, but sadly the converse doesn’t seem to be true. The most powerful models do both.
I’m talking a lot about models because I have some news to share with y’all. I am excited and a little sad to tell you that I’ve been hired by the Houston Astros as a mathematical modeler in their Decision Sciences department. Now, I’m sure you have a lot of questions, and I’ll start with the most important one first: no, I will not be wearing a lab coat and goggles, as much as I might want to. I’m sorry to disappoint all of you.
So now that we’ve gotten that out of the way, there are a few other questions I imagine you’re all asking, one of which is “What does this mean for Baseball Prospectus?” Another might be, “What does the continued brain drain mean for sabermetrics?” And another might be, “How might I get a job like that someday, if I work real hard and eat all my vegetables?” Let’s talk a while about those.
This sort of goodbye has gotten rather familiar to Baseball Prospectus readers, and I want to assure all of you that this isn’t because we don’t like you. I’ve written for a lot of websites and read many more, and I can say with absolute certainty that there’s no site of this size and subject matter that has a better community around it than Baseball Prospectus. The reader comments on BP articles are far and away the most intelligent, thoughtful, and interesting comments of any website of this size on the Internet. You all show up to support us at events—from the pizza feeds of old to the book signings to the ballpark events. Meeting with our readers at those events is a special joy for me, and I think for all the authors and contributors we’ve had over the years. You’re all great, and I’m blessed to have had such an audience to work for.
I’m sure that’s cold comfort. It doesn’t seem to keep people from leaving. Here’s the front page of BP from when I first started contributing. There are a lot of familiar names on there, and you’ve said goodbye to an awful lot of them (although Russell’s like a bad penny, he keeps turning up). So why keep supporting BP when the writers you come to know and love keep leaving?
In particular, I suspect BP readers are starting to feel like they root for an Astros farm team of sorts, as I’ll be following in the footsteps of Mike Fast and Kevin Goldstein. As luck would have it, I’ve spent the past year rooting for an actual Astros farm team, as the Quad Cities River Bandits became an Astros affiliate this season as part of the Great Midwest League Affiliation Shuffle. Rooting for minor league ball is different than major league ball in a lot of respects, but one that sticks in my mind is how rosters change from the start of the season to the end of the season. In major league baseball, teams will shuffle rosters to try and win—they’ll cast aside players who aren’t performing, and try to add new players who will perform better, for the most part. (The trade deadline is a notable exception.) In minor league ball, the biggest reward for success is a promotion. The players you fall in love with in April may not be there in August, even on a winning club. You learn to say goodbye a lot there, too.
Here’s the thing, though. They weren’t called the River Bandits back then (during my childhood, they were mostly known as the Angels, after their parent club), and the stadium was still called John O’Donnell Stadium (it’s still the same place, although a lot of renovation work has been done and the park looks gorgeous). But that park and those teams are how I truly fell in love with baseball. I remember sitting on a red and yellow afghan blanket out by the right field fence, watching players race around the field and seeing the ball soar above them and eating bratwurst and enjoying a cool breeze off the Mississippi River and knowing life would never get much better than that.
And this past season, I remember sitting in my seats by the third-base dugout with my daughter, watching the players race around the field and seeing the ball soar above them and eating taco nachos and enjoying a cool breeze off the Mississippi River and knowing that life in fact hadn’t gotten much better than that, but I was okay with that. I’m still in love with baseball, and despite how it might appear at times, baseball is still in love with me.
But if John O’Donnell Stadium is where I learned to love baseball, Baseball Prospectus is where I learned to love baseball the way I do now. I first developed an interest in baseball stats purely out of necessity; as it turns out, there’s no WGN in Iraq, so I was left to follow the 2003 Cubs season (probably the defining season of the modern Cubs fan’s life) through mostly box scores. Baseball’s statistics have a certain narrative power to them that allows one to follow the game pretty well this way, actually, although it certainly loses something without the visuals. But along the way, I became steeped in the numbers of the game, although as yet I didn’t know what to do with them.
So then the 2003 Cubs did that thing they did, and I was pretty not okay with that. And I needed some way to deal with that. So I turned back to the numbers, and along the way I was introduced to the work of Nate Silver—maybe you’ve heard of him. Necessity introduced me to the numbers; Nate made me understand that there was a deeper meaning to them. And you’re here, and you’ve been here, for what I imagine have been similar reasons. You’ve all come to us first and foremost because somewhere, in your own way, you’ve come to love baseball. And now you want to understand it better, and for a lot of you I think a big part of it is because that understanding gives you a new and deeper love for the game.
Now, I haven’t been the first person to leave BP, and I won’t be the last. I’m not the best or the brightest of the ones who have left so far, and I doubt I’ll be the best and the brightest of the ones leaving from now on. And throughout all the many changes, the people who have come and gone and (see again, Russell) come again, Baseball Prospectus has been that. And no amount of Ex scientia, astros (Latin for “from the knowledge, Astros”) is going to change that.
And BP isn’t alone here; the larger sabermetric community has much the same thing going on. (I remember talking to Dave Studeman when I left the Hardball Times for BP, and the notion that “there’s always a bigger fish” figured prominently—just as BP was there to scoop up his most annoying writer, others out there were and are ready to scoop up people from BP.) Now, I don’t want to give off the impression that I consider myself a major figure in this; the sabermetric brain drain well precedes me. But I kinda forgot to talk about it much before it happened to me, and I won’t have much of a chance to talk about it after this, so let’s proceed.
Let’s not sugarcoat things: the early days of sabermetrics were populated by the sort of publishing techniques popularized by crackpots of all stripes—the hand-cranked mimeograph machine, the occasional self-published book in the days before anyone could have a professionally typeset book published on-demand for cheap. Sabermetrics was very much an outsider thing. And yet, just this week, people like Bill James and Tom Tippett picked up yet another World Series ring for working in the front office of a major league team.
At some point, the outsiders became accepted and valued. That’s exciting for a lot of us who are a part of this community. At the same time, that presents certain challenges. And I think one of the reasons teams have adopted not just sabermetrics but actual sabermetricians is the nature of this field. Because there’s no formal discipline for sabermetrics, the field has become a grab-bag of backgrounds and trainings. It’s truly multidisciplinary. So the focus becomes less on the particular tools one uses, and more on understanding how the tools one has relate to the subject matter at hand. It’s kind of the inverse of the school of thought behind things like Freakonomics, where once one has mastered the tools of econometrics, one defines the study of econometrics as “using those tools to study whatever the heck I want to.”
Colleges can crank out people who know and understand the tools, but the sabermetrics community has given teams people who have demonstrated that they can use those tools to find useful insights into the game of baseball. So teams court them as part of their effort to win games. (And yes, “part of” is exactly the word for it—we aren’t replacing scouts, or player development guys, or anyone else who used to be part of a front office. The idea that sabermetrics was at war with traditional front offices, rather than being adopted as part of a cohesive whole, is more a creation of the media debate than the reality within baseball. There certainly has been conflict, but there’s been a lot of cooperation and collaboration along the way as well.)
But the brain drain doesn’t seem to be even all-around. There have always been two sides to the sabermetric movement—some, like Bill James, could and did both, but James is an exceptional talent. There have been the hard-core number crunchers, the ones who get their hands dirty in the numbers, but there’s also been what Jay Jaffe calls the “liberal arts wing,” who took the findings of those number crunchers and applied them to commenting about the game itself. (Although I must confess, I am constantly amused that the person who introduced me to that phrase as a self-description is one of the better quantitative analysts I’ve had the pleasure to read and work with, even as he’s moved on to success doing what that term describes.)
The quantitative, number-crunching folks are the ones who’ve largely moved on to jobs with teams, doing secret number-crunchy stuff. (You can hear the numbers and their brittle sounds as they’re crushed from behind closed doors at many MLB teams now.) The liberal arts types, on the other hand, are invading the mainstream media, taking jobs at places like ESPN, Sports Illustrated, and elsewhere.
Which, I mean, success is great. But the best sabermetrics has to offer comes from both parts of the field working together. Every so often, you get the total package in one person, but the Bill James types are rare. Normally that means collaboration. The sort of bifurcated success sabermetrics has had is starting to cause the public part of sabermetrics to tilt more heavily to the liberal arts wing.
Let’s go ahead and bring the first and second questions together. People like to compare Baseball Prospectus to Baseball-Reference and Fangraphs. Which in a lot of ways is fair and useful, but I think it misses something important. Fangraphs and Baseball-Reference have both been important to the sabermetrics movement, but they’ve been firmly in the liberal arts camp. They’ve been important in popularizing the work of people like Sean Smith, Tom Tango, Mitchel Lichtman and others. And they’ve contributed to and enhanced that work. But that was work that was started elsewhere and picked up and adapted by those sites.
Baseball Prospectus, on the other hand, has been dedicated to popularizing its own stats. A lot of people view that as a negative, and at times it hasn’t served BP as well as other strategies might have. But it also means that BP views the encouragement of developing new stats as core to its mission, rather than simply popularizing work that others have done. I don’t know what BP is going to do after I leave, and it would be unfair of me to commit them to things in my absence. But I know that baseball research is a part of its blood, and that makes Baseball Prospectus stand out from its peers in that respect. For that reason, I think that now more than ever Baseball Prospectus deserves your support and your encouragement.
Which brings us to our third question—how can you find yourself in a position like I’m in now, or at least, in the position I was in before now? Allow me to start with a few words of discouragement. If you’re doing this primarily for a career, you should probably stop now before you end up disappointed. If the work itself counts as reward enough to keep you going, then you should consider it. Because for a long time, the work itself is the only reward this offers. There are other rewards, but they’re infrequent and they never pay quite as much as similar work outside of baseball. You have to love this in order to be able to stick with it.
So, if that description fits you, how can you become a hardcore number muncher, the sort that deftly avoids Troggles… er, I mean not that sort but the sort that I am? First, start off by getting yourself a blog of some sort. Get on WordPress, or Tumblr, or Blogspot—I don’t care which. Start a blog. Now start writing on it. And then keep writing on it.
This accomplishes a few things. First off, you want to get better, right? (If you don’t, again, you should start reconsidering this idea.) There are a lot of ways to get better, but one of the most important is getting feedback from other people. You’re not going to get good feedback if you’re keeping your work to yourself. If something isn’t very good, a lot of people don’t want to publish it—and it’s understandable in that you want to put your best foot forward, which we’ll talk about in a minute. But if you don’t know why it’s not very good, publish it. Someone will tell you why. Then you can take that feedback and make it better, and then you can publish that.
In a practical sense, it also shows that you’re devoted, that you’re self-motivated, and that you’re interested. It seems counter-intuitive to think that people are willing to pay you for work that you’re demonstrating that you will do for free, but if you demonstrate the ability to stick with it even without monetary compensation, they’re more likely to trust your dedication to continue doing it even after they’ve hired you.
Get yourself some tools. You can certainly start off with something like Excel, but if you’re doing this sort of work a lot, you’re going to find yourself bumping up against the limits of a spreadsheet before too long. Set up an SQL database, and join the Baseball SQL Discussion group. Learn something like GNU R (BP’s own Max Marchi has a book that you may be interested in). And read up on these tools not just in a baseball context, but in a general context. It’ll make you better able to do analysis and come up with things you can write about. And let’s be honest, being able to say outright, “Yes, I know how to use an SQL database and I can write code in R or Python” is exactly the sort of thing that is going to make you look employable to an MLB team or an organization like Baseball Prospectus at some point down the road. Tools are your friend! Tools are what differentiate you from the majority of mammals on this planet! Learn your tools! Love your tools!
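As a rough idea of what setting up an SQL database looks like, here is a small sketch using Python’s built-in `sqlite3` module. The table layout, the player rows, the `on_base_leaders` helper, and the simplified on-base formula (it ignores hit-by-pitches and sacrifice flies) are all assumptions invented for this example; a real database would be loaded from a public data set rather than typed in by hand.

```python
import sqlite3

# Hypothetical batting lines, invented purely for illustration:
# (name, season, at-bats, hits, walks)
ROWS = [
    ("Player A", 2013, 500, 150, 60),
    ("Player B", 2013, 450, 110, 35),
    ("Player C", 2013, 600, 180, 50),
]


def on_base_leaders(rows, season=2013):
    """Load rows into an in-memory database and rank hitters by a simplified OBP."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE batting "
        "(name TEXT, season INTEGER, ab INTEGER, h INTEGER, bb INTEGER)"
    )
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?, ?, ?)", rows)
    # Simplified OBP: (H + BB) / (AB + BB); multiplying by 1.0 forces real division.
    query = """
        SELECT name, ROUND(1.0 * (h + bb) / (ab + bb), 3) AS obp
        FROM batting
        WHERE season = ?
        ORDER BY obp DESC
    """
    result = conn.execute(query, (season,)).fetchall()
    conn.close()
    return result


for name, obp in on_base_leaders(ROWS):
    print(name, obp)
```

The same query would run against a database file on disk by passing a file path to `sqlite3.connect` instead of `":memory:"`, which is where a spreadsheet starts to run out of room and a database does not.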
Now, let’s talk about the sorts of things you’re writing about and how you’re writing about them. You want to do two things—you want to learn things about baseball, and you want to communicate those things to someone else. A good way to do the first thing is to ask a question and then investigate what the answer is. That’s the core of the scientific method—formulate a question, come up with a hypothesis, make a prediction based on that hypothesis, then test to see if that prediction holds. Not only is that a good way to do research, it is also a great way to present your research to others. Start off with your question! That does a lot of things. It clarifies in the reader’s mind what the purpose of the study is. It helps the reader to relate the study you’ve done to the things that interest them. And it helps you and the reader have a way to see if the study you’ve done has accomplished something. And remember—a study that disproves your hypothesis may be just as interesting and useful as one that confirms it. When in doubt, publish.
But before you publish, you should see if someone else has written something that answers your question, or at least tries to answer it or is related to it. If they have, does that mean you shouldn’t publish? No. No, it very much does not mean that. But you should cite prior works up-front as much as you can. Even if all you’ve done is come up with the same findings someone else has to the same question, that’s often worth publishing—reproducibility is a key part of the scientific method, after all. So there’s nothing wrong with reproducing what someone else has done before, so long as you’re up front with it and discuss what it means. (In fact, this is a great way to learn stuff when you’re getting started—find a study that you’re interested in, and using your tools—love your tools!—attempt to reproduce that study.) And what you’ll come to find over time is that the more prior work you’ve read and had experience with, the easier it is to come up with new and interesting work of your own. That means reading as much as you can get your hands on.
You also want to find as many primary sources as you possibly can. Reading Tom Tango’s writing on wOBA is great. But especially when you’re first getting started, reading an article about linear weights is better. Older material tends to be more accessible than newer writing, partly because there are lower orders of sophistication involved and partly because a lot of newer material was written with the assumption that the reader was familiar with the older material.
The next thing you can do is start joining conversations. Hang out at Tom Tango’s blog. Get yourself a Twitter account, and start following the sort of people who talk about this stuff. Should you Tweet links to your blog? Yes, absolutely. Twitter should be an avenue of self-promotion for you. But it should also be a medium of discussion. Tweet links to other people’s articles you like, or articles you don’t like and explanations of why. Ask people questions. Answer other people’s questions. Develop relationships with other people in this community. Those are people who can help you get better.
Because in order to be successful, you’re going to need help. I’ve certainly had a lot. I don’t know that I can ever pay back everyone who’s helped me get here, but I can at least try to thank them. The problem with having a lot of help is that you have a lot more people to thank than you can manage or than your readers can endure; I apologize to anyone I don’t thank as much as I ought.
But there are some people to whom I owe particular thanks. Russell Carleton recruited me out of obscurity into slightly less obscurity to write for MVN’s Statistically Speaking blog, from where Dave Studeman recruited me to write for the Hardball Times. Both of them have been a great help to me even after those points, and I wouldn’t be where I am without both of them taking a chance on me. Thank you, guys.
I’ve had the pleasure of working with a lot of great people at BP—far too many to name—but I’d like to single out a few for special thanks. Kevin Goldstein took a crank who kept sending e-mails complaining about things not being done well enough, and challenged him to try and do better and gave him an opportunity to do it. Dave Pease saw the work I was doing for BP and offered me the chance to work on their most important stats projects full-time. Gentlemen, I owe you both more than I can say, and I’m more grateful for the chances I’ve had than I know how to say.
A lot of editors have had to put up with me over the years. I’d like to especially thank Steven Goldman, who knew what I was capable of doing and challenged me to actually do it instead of doing what I was doing, and Ben Lindbergh, for putting up with pretty much everything I’ve said to him.
And there are a lot of analysts I’ve had a chance to read, talk with, and argue with. In particular, I’d like to thank Tango, MGL, Mike Fast, GuyM, Patriot, Peter Jensen, Dan Turkenkopf, Sean Smith and Chris Dial. Best wishes to all of you going forward.
And finally, thanks to my parents for all their help and support in starting down this path, and to Liz Roscher for all her encouragement and support to keep going down it.
I would be lying if I said I wasn’t excited for where I’m heading now—I get a chance to work with smart people I’ve worked with before, like Mike and Kevin, and a lot of smart people I haven’t had a chance to work with yet. I think the organization the Astros have built has a chance to win a World Series sooner than some people might think, and I’m excited to be able to contribute to that effort. But at the same time, I know I’ve had an incredible run in a very weird niche even among baseball stats writers, and I owe it all to you, my readers. You’ve been incredible beyond measure, allowing me to carve out a space where I could write about Chesterton’s fence at length and call it baseball analysis. I’ve been incredibly blessed to have that chance to talk to you all about things I passionately believe in, and I thank all of you for that, and I’m going to miss you all.
But enough of all that—this isn’t supposed to be a sad occasion. Instead, I’d like to send you off with a song in your hearts. And I can’t think of a better song for that purpose than this:
Good luck to all of you, and may God stand between you and harm in all the empty places where you must walk.