I don’t believe in open source or the wisdom of crowds. The only peer reviews I need are the ones I get from editors and fact checkers. All that said, I do believe in explanation and kaizen. Since I started at BP, I’ve been publishing Team Health Reports that use a simple red/yellow/green coding system, one that I thought was a simple front end for everyone who isn’t colorblind. I have also realized that there are a vocal few of you who like looking under the hood and want to know all twelve of the factors that I use, and there are more than a few of you who are trying to reverse engineer the whole thing. Maybe being a Mac guy ever since the days of OS 6 and my Mac Classic has steered me in the other direction; I just want things to work. So let’s meet in the middle and take a look at how the Health Report system is put together, how it has evolved, and how we might make it even better.

The base of the system is an actuarial table. You can’t get more boring than that, and unfortunately, it’s the part I can’t share with you. The table is put together by an outside entity, and is… well, let’s just say it’s very similar to the one used to calculate the premium on insuring player contracts and to set workers’ comp payments. Like any actuarial table, it’s simply a matter of presenting risk based on various categories. The most basic categories here are age and position; a 26-year-old pitcher could have a 40 percent risk for injury (these aren’t the real numbers), while a 32-year-old pitcher might have a 38 percent risk. These injury risks are calculated to account for a severe injury, one that would put a player at risk of passing the elimination period of the policy, which is usually 90 days. The injury risk is also based on a three-year period. For a 29-year-old pitcher, the table doesn’t care if the pitcher is CC Sabathia or Jeremy Affeldt; it’s a baseline risk.
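
As a rough illustration only, a baseline lookup of this kind can be pictured as a table keyed on position and age. The sketch below uses the two made-up figures from the paragraph above; the real table is not public, and the names `BASELINE_RISK` and `baseline_risk` are hypothetical.

```python
# Hypothetical baseline table keyed on (position, age).
# The two entries are the illustrative numbers from the text, not real data.
BASELINE_RISK = {
    ("P", 26): 0.40,  # 26-year-old pitcher
    ("P", 32): 0.38,  # 32-year-old pitcher
}


def baseline_risk(position: str, age: int) -> float:
    """Return the three-year severe-injury risk for a position/age cell.

    The lookup ignores who the player is: every player in the same cell
    gets the same baseline. Raises KeyError for cells not in the table.
    """
    return BASELINE_RISK[(position, age)]


if __name__ == "__main__":
    print(baseline_risk("P", 26))
```

The point of the sketch is that identity never enters the lookup; anything player-specific has to come from later adjustments.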

Because that baseline risk is so broad, it requires adjustment. Sabathia does not carry the same risk as Affeldt, and because of that, I make eleven separate adjustments to the baseline figure. Most of these are very incremental, which is a bit counterintuitive. The baseline risk is broad, yes, but it’s also accurate in most cases. It’s not perfect; outliers like Tim Wakefield and Jamie Moyer throw things off some years. But when you consider that their health status over the past couple of seasons has been roughly a coin flip, these isolated cases don’t really indicate any faults in the adjustments themselves.

The first major adjustment from the baseline comes from PECOTA. The Attrition Rate isn’t necessarily predictive of injury, but it is an accurate predictor of playing time, which is often limited by injury. I add half the attrition rate to the baseline risk before making the smaller fine adjustments. Those adjustments come from ten factors, including team, body mass (instead of height/weight), position change, injury history, recovery time, role change, conditioning, and a small subjective number that I only use when a player is “on the edge,” allowing me to use subjective information to push a player from a “low red” to a yellow (for example) when I can find evidence for doing so. For pitchers, I also factor in workload and a mechanical adjustment based on discussions with scouts and pitching coaches. Because that mechanical adjustment is subjective, it’s very small, but it’s my hope that I’ve come up with some kind of consistent number, and so far it’s proven to be very accurate.
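
The arithmetic described above (baseline, plus half the attrition rate, plus a set of small adjustments, mapped to a color) can be sketched roughly as follows. This is not the actual formula. The adjustment sizes, the color cutoffs `YELLOW_CUTOFF` and `RED_CUTOFF`, the clamping to the 0-1 range, and every name in the code are assumptions made for illustration, since none of them are published.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical cutoffs: the article does not say where yellow or red begins.
YELLOW_CUTOFF = 0.35
RED_CUTOFF = 0.50


@dataclass
class PlayerRisk:
    baseline: float        # three-year risk from the actuarial table (0..1)
    attrition_rate: float  # PECOTA Attrition Rate (0..1)
    # Small signed adjustments, e.g. team, body mass, injury history.
    adjustments: Dict[str, float] = field(default_factory=dict)


def adjusted_risk(player: PlayerRisk) -> float:
    """Baseline + half the attrition rate + the sum of the fine adjustments."""
    risk = player.baseline + 0.5 * player.attrition_rate
    risk += sum(player.adjustments.values())
    # Assumption: keep the result a valid probability.
    return min(max(risk, 0.0), 1.0)


def color(risk: float) -> str:
    """Map a risk value to the red/yellow/green code (cutoffs are made up)."""
    if risk >= RED_CUTOFF:
        return "red"
    if risk >= YELLOW_CUTOFF:
        return "yellow"
    return "green"


if __name__ == "__main__":
    pitcher = PlayerRisk(
        baseline=0.40,
        attrition_rate=0.20,
        adjustments={"injury_history": 0.03, "team": -0.01, "workload": 0.02},
    )
    risk = adjusted_risk(pitcher)
    print(round(risk, 2), color(risk))
```

Keeping the adjustments in a named dictionary makes it easy to see which factor moved a player across a cutoff, which is where the small subjective number would come into play.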

This year, as in every year, there will be a new underlying table, and I also have a five-year table with data that shows how quickly players tend to return from injuries; this allows me to make the recovery-time adjustment more accurate. We’ll also be presenting the data to you in three different ways. First, we’ll have the normal Team Health Reports in the format that people are used to and seem to like. Second, we’ll be transferring all of the data to a positional health report for those who like that format. Each position will have a document, and once a Team Health Report is published, we’ll update the Positional List as well. Finally, we’ll have a spreadsheet showing the ratings for those of you who have early drafts or who just want a quick guide. As usual, I’ll remind you that the colors and the commentary don’t always agree. I may look at the system’s take on a player and flat out disagree, as I did with last year’s comment on Manny Ramirez. In the end, I think both were right.

I’m always looking to improve, so if you have suggestions, you know how to reach me. There’s still some time before you’ll start seeing the Team Health Reports published, but the work is already happening behind the scenes. There’s not only no other system like it out there, there’s nothing else close to it. For the BP subscriber, it might be second only to PECOTA as a reason why they win their fantasy leagues.

gankerken
1/09
I'm not sure if Will is in charge of the links in his column, but it seems a bit ironic to start off with the line "I don't believe in open source or the wisdom of crowds", and then include a link to Wikipedia two sentences later.
havybeaks
1/09
It's also ironic that less than a week ago, Will initiated a "wisdom of the crowds" article:

http://www.baseballprospectus.com/unfiltered/?p=1140

I appreciate that Will's success depends heavily on his ability to access closed source information, but open source knowledge - properly used and understood - is very powerful.

Hopefully he doesn't entirely dismiss the "peer reviews" provided by his readers! :)
scareduck
1/10
Yeah, good thing Will isn't in charge of running the servers:

http://uptime.netcraft.com/up/graph?site=http%3A%2F%2Fwww.baseballprospectus.com
jackalltogether
1/09
Thanks for the info Will--although I have to say if everyone in the statistical community echoed your opening statements, well, there would be no statistical community. It's certainly different because you pay money to access your data, but that's pretty dismissive. Anyhow, I found it pretty interesting because I'm sitting here in my job as a property insurance underwriter and let's just say the process sounds familiar.
wcarroll
1/09
I won't address the first, but the process ... well, if you looked at how we dealt with disability issues in my former employment, I might need to pay a licensing fee!
perhaps
1/09
Yeah--you really don't need to mention that you don't believe in open source to begin your article. And, believe me, that's an argument you don't want to get into, since baseball prospectus USES open source to function.
perhaps
1/09
I should perhaps have mentioned some proof for that last assertion--the ".php" in http://www.baseballprospectus.com/article.php?articleid=8410 indicates that PHP is used as a server-side scripting language, which is an open-source technology.
wcarroll
1/09
Just because I don't believe in open source doesn't mean our webmaster doesn't.
perhaps
1/09
Personally, and from the comments I believe this point should be clearly evident, when you said "I don't believe in open source", I think you meant to say "I don't have to share if I don't want to."

That's perfectly fine, and doesn't mean you can't believe in open source. Open source works fine for some things -- BP included. It just means you're not interested in letting everyone know the formula for THR, and if you aren't, then that's that. There's really no need to defend that position and, in my opinion, people who cite "open source" as a reason to share the formula are in the wrong.

After all, they're forgetting another favorite mantra used in open source -- if you don't like the way things are done, fork it and do it yourself. ;)
wcarroll
1/09
Yes, I agree with you, "perhaps." I mean "I'm not giving away my meal ticket!" I do hope to make it better.

I will say I don't believe in the wisdom of crowds, but I'm a BIG believer in crowdsourcing. I think people conflate the two.
scareduck
1/10
There is a story a friend of mine tells about Larry Niven, who has been active in LA-area science fiction fandom for some time. A number of years ago, they were part of a large group going to a Boston Market to get dinner; once they got to the counter, they immediately started to negotiate over whether it would be better to order items individually or just pick up large quantities of particular dinner items, and if so, how much of each. The haggling rapidly got out of hand, so much so that Larry apologized to the cashier, saying, "We're smarter individually."

The expression perfectly fits a lot of bureaucratic situations -- people are perfectly willing to make stupid decisions if they can evade or diffuse culpability. My friend started using this as a .sig online, and not long after she did, she was accosted by a Niven fan. This guy was one of those pestiferous jerks who claimed to have read everything that Larry had ever written (even scribblings on cocktail napkins, I suppose), and *he* wasn't able to find those words, and therefore *she was lying*. Rather than explain that, yes, she was friends with Larry, and yes, she had heard him say those words first-hand, she yanked her .sig, and that was that. I've used it myself a few times (a Google search will catch some of my usages), yet it has not caught on. Wisdom of crowds? Well, maybe.
veg9000
1/09
Not to mention Mac OS X, which "just works" with several open source components at its core. Thankfully, we all benefit from open source no matter what we believe about it.
dianagramr
1/09
Will .... no love for Mozilla Firefox ... ?
wcarroll
1/09
Nope, I'm a Safari guy. I'm open to Chrome if they ever bring it to Mac.
mtofias
1/09
Will,

I love you, but you sound like an idiot today. OS X is loaded with open source technologies.

Safari in particular traces its origins to the open source KHTML and to this day depends on the open source WebKit to render web pages (as does Chrome).

I'd also suggest that open source software is more like crowd sourcing than the "wisdom of crowds" since it's about contributing effort to a project and not just preferences.

And if you're against the wisdom of crowds does that mean that you are against democracy, (free agent) markets and the awesomely helpful Amazon.com star ratings?

---
Mac User since 1991.
samuelpage
1/10
Try Camino (http://caminobrowser.org/). I used it back in my Mac days. It has the speed of Safari and Chrome but uses the powerful Mozilla engine. It's a very exciting open-source browser.
mwball75
1/09
If you could combine the THR spreadsheet with the PECOTA one, that would be awesome. Regardless, keep up the good work.
mwball75
1/09
You guys also use MySQL, an open source database.

But I think what you're saying is that you don't want to distribute the THR recipe (correct me). Which is fine, you own it. I think you were very open about the ingredients, just not how much of each goes in.

You couldn't possibly be against open source, because if you purchased a web server, database, application server and operating system my subscription price would go up and I'd rather pay for the content, not the delivery charge.
chabels
1/09
"For the BP subscriber, it might be second only to PECOTA as a reason why they win their fantasy leagues."

So for those of us who don't win, are THR yet more evidence of how bad at fantasy we really are?
Nickus
1/09
I am thrilled that I am not the only person who used OS 6.
scareduck
1/10
scareduck
1/10


(prior post was supposed to have content)
wcarroll
1/11
Not only did I use it, I still have a working Mac Classic (albeit on Sys 7).
BananaHammock
1/10
I think adding some sort of feedback loop to your rating system could be helpful. So you have this system that is largely fantastic, but does it actively compare last year's projections with last year's results? Perhaps as a whole your system is overly aggressive or the opposite. I would think there could be a way to incorporate a calibration factor based on last year/last 3 years results vs projections. Just a thought...
wcarroll
1/11
Tried that a couple years ago and what I tried didn't work. Problem I had was that there's an inherent "bad luck" factor that doesn't carry from year to year. Any lookback is going to carry it. I factor it in during the writeups, by saying "well, the system doesn't like this guy but it's been wrong before."
andland
1/10
I think it may be interesting to do a review of last year's good and bad predictions and the reasons you were right or wrong (or lucky or unlucky). I looked back at the Indians' THR from last year and you had Cliff Lee with a Red light and Travis Hafner with a Green light.
RiloBoxer
1/10
Have you thought of adding a "Dusty Baker variable" to the actuarial table for pitchers? THR might have been a little less optimistic about Aaron Harang last year if that was included.
pikapp383
1/11
Care to share which teams get an adjustment up and which ones get one down?

:)
wcarroll
1/11
That's pretty easy to figure out on your own.
dbrown
1/13
Have you read Surowiecki's _The Wisdom of Crowds_? I ask, because the central idea is one that is fundamentally different from that underlying open source and one that is orthogonal to whether or not you open the proverbial hood up on your Team Health Reports algorithm.

A "wisdom of the crowds" approach to THR would be more akin, for example, to conducting a secret poll of BP readers, who would each weigh in with a vote as to the injury risk presented by a certain player, and then aggregating those votes into a single value (though such a crowd may not be truly wise, in that it may lack what Surowiecki terms "Independence" since each of us, in theory, would be influenced by the same information we consume from BP).

Come to think of it, I wonder whether such a crowd-based approach would "beat" your secret formula... :)
chaseball
1/16
I think as a whole, "follow up" type articles need to happen much more often in this industry.

Especially when one claims to have an amazing recipe, but doesn't share the ingredients.