BP Unfiltered: PECOTA Update

March 12, 2010

2010 PECOTA Projection Analysis

As promised, here are the results of our preliminary analysis of the 2010 PECOTA projections. We took the methodology for multiple builds of the projections (which will be easier to identify when we start using versioning, which we will do when we've got everything else cleaned up) and ran all the inputs for the 2009 version of PECOTA through it. Then we compared those projections to the actual results of the 2009 season.

In all cases, lower numbers are better.

RMS Error results

System	Release Date	R	H	2B	3B	HR	RBI	BB	K	SB	CS	TOTAL
2009 PECOTA	3-28-2009	10.30	12.91	4.86	1.46	4.16	10.12	8.78	12.25	3.87	1.47	70.18
BP2010 PECOTA	1-5-2010	8.92	12.21	4.28	1.50	4.22	10.35	9.81	14.52	3.68	1.77	71.26
Current PECOTA	3-12-2010	9.12	14.16	4.30	1.50	4.43	9.98	9.09	10.42	3.68	1.50	68.18

Bias-adjusted RMS Error results

System	Release Date	R	H	2B	3B	HR	RBI	BB	K	SB	CS	TOTAL
2009 PECOTA	3-28-2009	8.10	9.50	4.16	1.46	4.06	8.94	7.73	9.73	3.72	1.43	58.53
BP2010 PECOTA	1-5-2010	7.11	8.68	4.10	1.46	3.93	8.46	7.83	10.07	3.67	1.50	56.81
Current PECOTA	3-12-2010	7.16	8.90	4.12	1.47	3.93	8.15	7.88	9.91	3.56	1.45	56.53
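
For readers who want to run this kind of comparison themselves, here is a minimal sketch of a per-category RMS error calculation. The function name `rmse` and the three-hitter sample are hypothetical. Only `current_row` comes from the first table above, where the TOTAL column appears to be the sum of the ten category errors.

```python
import math

def rmse(projected, actual):
    """Root-mean-square error between paired projected and actual values."""
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical home run projections and results for three hitters.
projected_hr = [20.0, 31.0, 8.0]
actual_hr = [24.0, 28.0, 8.0]
hr_error = rmse(projected_hr, actual_hr)  # sqrt((16 + 9 + 0) / 3)

# "Current PECOTA" row of the RMS Error table: R, H, 2B, 3B, HR, RBI, BB, K, SB, CS.
current_row = [9.12, 14.16, 4.30, 1.50, 4.43, 9.98, 9.09, 10.42, 3.68, 1.50]
current_total = round(sum(current_row), 2)  # matches the TOTAL column for that row
```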

BP used the 2010 PECOTA projections as the basis of our LABR draft strategy this weekend.

PECOTA Ten-Year Forecasts and Hitter Cards

We have diagnosed the main problem with our ten-year forecasts as reported on the PECOTA beta hitter cards. Nate Silver generated one set of comps in the original PECOTA process and used those comps to generate the long-term projections. We were trying to re-generate new comps in year n+1 based on the player's career thus far with his projections for year n included, and repeating. In addition to introducing considerable extra complication, this process generally created much less favorable long-term projections, as many readers noted.

We've adjusted the long-term projection process to work the way Nate originally designed it. Once we've got everything stabilized and released, we'll be revisiting this topic.

Another problem we had was that a player projected to be out of baseball entirely had a value of zero, while if he was good enough to be projected to get some playing time but bad enough to be below average, he'd have a negative value. For example, in a player's tenth percentile, he might be out of baseball entirely, returning 0 WARP, but in his fiftieth he performed well enough to stay in baseball, and rated -1.0 WARP in the playing time he was projected to get. Because we didn't distinguish between a 0 WARP in baseball and a 0 WARP being out of baseball, this enabled some very weird results for players with this condition. We've changed the process to differentiate between the two, and the values should now be much more reasonable.
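
As an illustration only (this is not the PECOTA code), one way to keep "out of baseball" separate from "0 WARP while playing" is to give the first case its own marker instead of a number. The convention below, `None` for out of baseball and ranked beneath every in-baseball outcome, is an assumption made for the sketch, as are the function names.

```python
from typing import Optional

# Hypothetical convention: None means "projected out of baseball";
# a float (including 0.0 or a negative value) means WARP earned while playing.
def outcome_sort_key(warp: Optional[float]):
    """Rank out-of-baseball outcomes below every in-baseball outcome."""
    if warp is None:
        return (0, 0.0)
    return (1, warp)

def worst_to_best(outcomes):
    """Order percentile outcomes from worst to best under that convention."""
    return sorted(outcomes, key=outcome_sort_key)

# With a plain 0.0 standing in for "out of baseball", it sorts above -1.0.
naive = sorted([0.0, -1.0, 1.5])
# With a distinct marker, the out-of-baseball outcome stays at the bottom.
distinguished = worst_to_best([None, -1.0, 1.5])
```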

The K bar in the player profile graph has been reversed, and now works as it did previously.

We still have a problem with some players having their higher percentile projections zeroed out, while the lower are filled–we've identified this internally as "the Koyie Hill problem". This is a similar issue to the reversed projections problem above, and it'll be fixed this weekend.

PFM Settings Update

We've made some small changes to the position adjustment setting of the PFM:

"Level 0" is now "OFF"
"Level 1" is now "ON".

Levels 2 and 3 are disabled. We are looking into bringing those levels back into play; let us know how much you used them and whether you miss them.

The default is now "ON", not OFF.

Depth Charts, Weighted Means Spreadsheet, and PFM

We've pushed two updates to the Depth Charts, Weighted Means Spreadsheet, and PFM this week.

The first one was sent out late March 9. With this update, we attempted to address some of the issues people had noted with Depth Chart team statistics that Clay mentions in this post. While this might have made the team projections more satisfactory, it quickly became apparent that it did so at the expense of the individual player projections, which are what the vast majority of subscribers are using PECOTA for this time of year. If you downloaded a spreadsheet or ran PFM between the evening of March 9 and now, please do it again and use the individual player stats you get from the current data.

As an aside, the modifications that were made to the March 9 data were more-or-less applied on a league-wide basis, so draft order and dollar value from the PFM probably won't change very much. The raw statistics that were predicted will be fairly different, though.

This morning, we pushed out another update which puts things largely back to their previous state. The RMSE analysis above is run on this version of PECOTA.

Depth charts will be updated at least every other weekday through the start of the season.

Weighted Means versus Fiftieth Percentile

Traditionally, PECOTA has used weighted means projections for its default projections: the Depth Charts, PFM, Weighted Means Spreadsheet (obviously), and player cards have all used or highlighted the weighted means stat line.

This year, we've been using the fiftieth percentile for these applications instead. Until recently, we haven't had the weighted means at all for PECOTA in 2010. We now have the weighted means in the cards–see the bottom row of the 2010 projections table–and the weighted means are being used for the ten-year projections in the cards. Everywhere else, though, we're sticking with the fiftieth percentile projections for now, so you'll see the projections in the fiftieth percentile line of the cards match the Depth Charts, PFM, and Weighted Means Spreadsheet (which I realize means the spreadsheet is now misnamed).
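
To make the difference between the two lines concrete, here is a small sketch that computes a weighted mean across percentile forecasts and compares it with the fiftieth percentile line. The home run figures and the equal weights are hypothetical; the weighting PECOTA actually applies is not described in this post.

```python
def weighted_mean(values, weights):
    """Weighted average of values; weights need not sum to one."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical home run forecasts at the 10th through 90th percentiles.
percentiles = [10, 20, 30, 40, 50, 60, 70, 80, 90]
hr_by_percentile = [4, 8, 11, 13, 15, 18, 22, 28, 36]
# Hypothetical weights (equal here); a real system could weight differently.
weights = [1.0] * len(percentiles)

fiftieth = hr_by_percentile[percentiles.index(50)]
mean_line = weighted_mean(hr_by_percentile, weights)
```

When the upper percentiles stretch further from the middle than the lower ones, as in this made-up example, the weighted mean lands above the fiftieth percentile.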

PECOTA Pitcher Cards

We're working on the pitcher cards now. They'll be available next week.

Dave Pease

dianagramr

3/13

Thanks for the update!

Is there a way for you to footnote new additions/eliminations to the PFM/Depth Charts/Weighted Means (in terms of players added/removed)?

It would make it easier to match up prior iterations which have been transferred into spreadsheets.

Cardinals645

3/13

When do you anticipate switching the Depth Charts, PFM, and Spreadsheet to Weighted Means?

alskor

3/13

Follow up: Will this be done? Should this be done? Why?

dpease

3/13

We don't know. The weighted means (at least for 2009) have more bias than the fiftieth. We're going to have to look at it more closely.

clayd

3/13

The data I have in hand right now says we won't. The 50th percentile data in the PFM already has a positive bias - projecting more singles, HR, BB, and runs than the 2009 major leagues did, on the order of 4%. I don't consider that to be a large problem - a forecast system that is optimized for individual elements isn't necessarily going to be an unbiased estimator of the system as a whole. Going to a weighted means forecast will double or, in some categories, triple those biases, into the 10-15% range, which is big enough for me to consider it a problem. And for those larger biases, you get no improvement in the individual errors - at least for 2008 and 2009.

kwilson2

3/14

Even if the weighted means seem more biased, I'd like to have access to a true weighted means spreadsheet. At the very least, I think you should change the name of the "Weighted Means Spreadsheet" to "Medians Spreadsheet" or something, so that it is not so misleading.

luftmich

3/13

"Thanks for writing and for your support of Baseball Prospectus.

Our crew has posted an update about some of the recent changes to PECOTA. You can find it at
http://www.baseballprospectus.com/article.php?articleid=10226.

We hope this is helpful and we're sorry for the delay in communicating with you. Thank you for your continued support of BP."

Seriously?! This is the response I get from your customer service when I complain that I paid for access to "just the numbers", but find most of those numbers are not available for my draft tomorrow.

It's mid-March and there are no long-term projections, no pitcher weighted means, no pitcher cards, the hitter cards are "beta", and the PFM is using different stats this year because the proper ones are not available. This is a giant problem for fantasy subscribers.

Sorry but a link to this post is not good enough, and does not address the problem.

mrharrier

3/13

I wrote a long reply. It somehow got lost.

Allow me to boil it down: this post is insufficient. BP acknowledged a huge problem with your core product, then went silent for two weeks. When you posted the above, an analysis of that problem, readers immediately identified further significant problems that BP hadn't considered. BP acknowledged that those problems should be examined.

What is missing, at this point, is an abject apology for soliciting customers on the basis of a core product that was never delivered, and a gigantic banner on the home page that tells it as it is: "We fucked up. We'll get it right next year. Enjoy our high-quality in-season content over the next nine months."

sam19041

3/13

Helpful update. Thanks for being transparent about it.

Wozzyck

3/13

Thanks for the update, though it still seems that there are issues with the PECOTA hitter cards that haven't been addressed. One is that several players have screwy things going on around the 90th percentile. Take Adam Jones for example.

His Eq rate stats are .305/.370/.509 and .298/.360/.509 in his 90th and 80th percentiles, respectively. But the adjusted rate stats are .306/.371/.519 and .308/.370/.535 for those same percentiles. Why is this happening (for him and others)?

Staying with Adam Jones, his Weighted Mean Projections have him at .291/.323/.483. This OBP is lower than that for his 10th percentile. It seems that a similar OBP-suppressing phenomenon is happening for most players. However Dexter Fowler's Weighted Mean OBP is .380 which is higher than his 60th percentile projection.

Just a couple of observations.

jpjazzman

3/13

Even worse -

Fowler's 80th percentile slash stats: 0.295/0.393/0.458 are better than his 90th percentile stats: 0.285/0.390/0.448

What a joke... BP, you guys have lost it I think. Two weeks after the initial problem is diagnosed, this is what you come back with?
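
A minimal sketch of an automated check for this kind of inversion, using the two Fowler slash lines quoted in this comment. The function name `find_inversions` and the dictionary layout are hypothetical. The individual rate stats need not rise at every step, since percentiles may be ordered by overall value, so anything flagged is a candidate for review rather than proof of a bug.

```python
def find_inversions(lines_by_percentile):
    """Return (lower_pct, higher_pct, stat) wherever a higher percentile
    has a strictly worse rate stat than the percentile just below it."""
    inversions = []
    ordered = sorted(lines_by_percentile.items())
    for (low_pct, low), (high_pct, high) in zip(ordered, ordered[1:]):
        for stat in ("avg", "obp", "slg"):
            if high[stat] < low[stat]:
                inversions.append((low_pct, high_pct, stat))
    return inversions

# Dexter Fowler's 80th and 90th percentile slash lines as quoted above.
fowler = {
    80: {"avg": 0.295, "obp": 0.393, "slg": 0.458},
    90: {"avg": 0.285, "obp": 0.390, "slg": 0.448},
}
problems = find_inversions(fowler)
```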

molnar

3/13

80th percentile?

dpease

3/13

We're looking at this. Thanks.

BurrRutledge

3/13

Dave, the PFM allows us to set "aggressive," "moderate," and "conservative" rankings. In the past I've selected "aggressive" because I want to draft players with the best chance to outperform.

Is that setting reliable at the moment?

Thanks in advance!

swarmee

3/13

You're not using that setting correctly. That's supposed to be how aggressive the other owners are in the league. As in, if you're in a single-season league and Pujols goes for $55, you're in an aggressive league, $48 and you're in a moderate league, and $43 and you're in a conservative league, basically.
If you want to add in propensity to overperform, consider making one of your categories "Upside."

BurrRutledge

3/13

Seriously? What's the setting do in a straight draft?

swarmee

3/13

It just reapportions the dollar amounts to fit better in with your league. It really shouldn't change the order of players.

BurrRutledge

3/13

So funny. As long as I've been using the PFM I've thought it was some way to tweak the PFM to evaluate the players by a slightly higher percentile of their forecast. Sigh.

At least I've finished in the money every season I've used it...

Thanks for setting me straight!

hessshaun

3/13

I never knew that either. The work instructions for that tool are horrible. I realize that those items can be helpful, but the tool burnt me as much as it saved me, two years in a row. I think it's more me sorting through PECOTA's biases than anything else.

swarmee

3/14

Disposition: allows the user to select from between a more conservative and a more aggressive approach when deriving dollar value calculations; the more aggressive approach places a higher value on the top players (e.g., "Stars & Scrubs")
http://www.baseballprospectus.com/article.php?articleid=4793
So you would use it to inflate the values of the studs and devalue the other players, i.e. my Albert Pujols example.

If you want what I would consider a true dollar value, use the conservative values without position adjustment turned on. Those will compare more closely with the magazine dollar values, IMO.

jrmayne

3/13

There's an ongoing discussion in the Depth Charts, for those interested in the thoughts of some who have followed the updates. Some of us are skeptical.

The depth charts now show the AL at 61 games over .500, which would be impressive, all the more if not for the fact that the NL is projected to be right around .500.

--JRM

dianagramr

3/13

Dear BP,

I suggested this a couple of weeks ago. I'll suggest it again now. Ask some of your subscribers to be Beta-testers for the Depth Charts, PFM and Weighted Means, PRIOR to release.

rawagman

3/13

Question - why not include projections for all first rounders from last year? Why are guys like Dustin Ackley and Kyle Gibson excluded, while other rookies with zero pro experience like Stephen Strasburg and Jose Iglesias are projected?

dpease

3/13

We'll get them in there. Thanks.

Wharton93

3/13

Should we be sending the books back to Amazon? Are THOSE Pecota numbers any good? I can't even tell anymore what's what. The season ended 5.5 months ago---that's a long time to have worked this all out. Hopefully somebody will spend the Summer fixing this. Many drafts are already over this year.

dpease

3/13

We have included the projections printed in Baseball Prospectus 2010 as the second lines in the error reports above.

rbross

3/13

Thanks for the update and for explaining the problems and the processes by which you are seeking to solve those problems.

Nevertheless (and I say this with all due respect), I think you should be offering some sort of refund to all of us. I can't speak for anyone else, but 90% of the reason I subscribe to BP is for PECOTA stats to prepare for my fantasy draft. In particular, I'm most interested in UPSIDE, as I am drafting almost exclusively prospects in a 100% keeper league. I pay $40 for a year's subscription so that I can have access to these statistical tools for basically one to two months of the year. I tried to delay the start of my draft further and further in hopes that PECOTA would be fixed before we started making picks, but the other guys in my league understandably wouldn't wait any longer. So while I will still enjoy reading the columns for the rest of the year, I feel like I just wasted $40 (or, to be fair, about $36) on this subscription.

I suggest a partial refund and/or an extension of our subscriptions.

geefsu

3/13

Ditto!
You took the words right out of my M(cL)outh.

vtadave

3/14

Don't hold your breath. Guessing no refund is forthcoming. BP has really lost not only my piddling respect this year, but respect across the fantasy landscape. There is no doubt that BP has some truly sharp minds behind the numbers/content, but the ball has been clearly dropped this year.

sfhubbard

3/14

Sadly, I must agree that a refund seems unlikely. Dave has said that non-keeper leagues should not be an issue, but there's little more than "oops" for those of us in long-term sim leagues.

I have long been concerned that PECOTA is one of the latest projections posted every season. I can get Shandler's stuff in December. If BP wants to make this right, let's get a good schedule for the fall where we can have access to this much sooner. One suggestion was Jan 15. I think that's too late.

PECOTA is the primary input into my draft list every year. That may have to change.

ferret

3/13

My two cents is this has been (and still is) a major disaster that will affect the BP brand very negatively. I would make a couple of suggestions for your consideration. They have been suggested previously but not acknowledged....

1. Get the Pecota Beta on line January 15 next year.
2. Consider getting customer input before posting.
3. Review the Pecota at basic levels so at least the math is correct. Not +60 wins in the AL and -15 in the NL.
4. Make a big public apology.
5. Offer a discounted subscription rate for renewals.

jivas21

3/13

"4. Make a big public apology."

THIS. I understand making mistakes, but I feel like nobody at BP has come close to acknowledging the extent to which this process has been screwed up this year.

Dave: with all due respect, this post ain't it.

ccmonter

3/13

I'd like the apology, but first of all I want the numbers. Let the team do their work. Let's get the numbers first and then we can worry about everyone's feelings.

markpadden

3/13

Exactly. It is one thing to screw up massively. Quite another to flatly ignore your customers once it occurs [yes, remaining silent for 12 days after the problem was acknowledged counts as ignoring customers]. This has been a case study in how NOT to handle a significant problem encountered by a consumer-facing business. The increasingly apparent arrogance of BP is bordering on astounding at this point.

dpease

3/13

I abjectly, completely apologize for the problems we have had with this process this year. Honestly, it kills me that we've had these issues. We so appreciate your support, and we want to bring you the best.

I've been working on that message in a larger format, with more background included. In the meantime, I understand and agree with many of the points people are making here. We're working hard to make things better than ever, and I'm sure we'll get there, but it's been much more difficult than we anticipated.

If you want to send me an email about PECOTA, or anything else, I'm at dpease@baseballprospectus.com.

jivas21

3/13

Thanks Dave. It really does help to have BP acknowledge the extent of the problem that is plainly evident to all of us.

You may also want to address *why* you've released a new update today, when looking at the projections both on the site and the spreadsheet, it seems abundantly clear that there are still a number of bugs in the system. While your post above seems to get at this - although, to me, it doesn't move the needle at all - why should we trust the current updates rather than the 2/25 projections?

For me, having access to another set of projections that likely contain a number of errors - different ones than prior versions - only further muddles the process of trying to identify what production I can reasonably forecast for players this year. In short, in my opinion providing the current updates (presuming that they're wrong, which I believe they are) only makes the matter worse.

I may only speak for myself, but I recommend getting it right and then giving us the final, ACCURATE product.

Thanks again.

jrmayne

3/13

I wouldn't mind a true-beta; one with some bugs here and there. But it simply has to be closer than every single release so far.

Dave, if you're writing a chronology, explaining why facially defective updates were put up is important. If that explanation is, "We're idiots," that's better than no explanation or an implausible or excuse-ridden one. The latest Depth Chart snafu is one of many that are obvious to moderately mathy folks. I've articulated some of the many other problems elsewhere.

I realize I have a pitchfork and a torch, here ("Some problems are best solved by angry mobs" - Homer Simpson, and if it wasn't him, it should have been) but I think the pitchfork and torch crowd are right.

And I didn't come with an axe to grind (to mix my metaphors); you can see my number, I've been to at least a half-dozen BP events, my Scoresheet league is called AL_BP_NorCal, I've bought every book since number two, y'all have kindly paid me for a couple of articles, and I've been acknowledged in the book (which I found quite flattering). It took *a lot* to get this torch lit.

But I think it's important to shed light on the unsupportable continued releases and continued reliance on PECOTA. Further, it's important to show that what you're saying now doesn't explain nearly enough, doesn't make up for continued obvious errors, and doesn't speak in a properly meaningful way to PECOTA 2010 v. 0000011's accuracy.

No more releases of Depth Charts until wins average 81 (or slightly under, if you want some rainouts.) No more releases of PECOTA that facially don't make sense. No more using 2009 data to determine if it accurately predicts 2009 outcomes. No more tinkering. Back-engineer it to 2008 PECOTA if you can; that was the last well-performing iteration. If not, well, there's 2011. But enough of this course of conduct, which people are rightfully quite angry about.

--JRM
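
A sketch of the pre-release sanity check suggested in this comment: average projected wins should sit at the break-even mark before Depth Charts go out. The function name, the half-win tolerance, and the four-team leagues are all hypothetical.

```python
def wins_check(projected_wins, games_per_team=162, tolerance=0.5):
    """Return (average wins, whether the average is within tolerance
    of the break-even mark of games_per_team / 2)."""
    average = sum(projected_wins.values()) / len(projected_wins)
    break_even = games_per_team / 2
    return average, abs(average - break_even) <= tolerance

# Hypothetical four-team leagues, for illustration only.
balanced = {"A": 90, "B": 84, "C": 78, "D": 72}   # averages exactly 81
inflated = {"A": 95, "B": 88, "C": 84, "D": 80}   # averages well above 81

balanced_result = wins_check(balanced)
inflated_result = wins_check(inflated)
```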

brokeslowly

3/13

I think that all the naysayers should lighten up. Progress is being made, and I'm betting that the projections available right now are up there with any in the business. BP is obviously trying hard to fix the problems. If you don't think that the non-PECOTA content of BP is worth your $40 for a year, either you're not a true baseball fan or you probably can't afford to dole out the money in the first place.

jrmayne

3/13

I'd note the comparisons appear designed to show the newer PECOTA is better.

First, back-testing on one year of data is insufficient.

Secondly, was the back-testing done after filtering out the known information (the 2009 season)? That is, doesn't the 2010 PECOTA have some 2009 season information in it? If you did filter it out, you should have clarified this. If not, the data's value is damaged.

Thirdly, what bias exactly was filtered out?

Fourthly, the BA/OBP/Slg are all worse now than they were with the book's method. You used a measuring system that a cynical unpleasant person might think is designed to hide this.

Fifthly, the changes keep coming. I have to say, if I kept changing a system, I'd think I could fix it up to predict the 2009 players right well by this time.

To quote myself from the Depth Charts thread, I don't believe you when you say PECOTA 2010 is fine now. More very simple errors are being generated with each iteration, so I don't think you believe you either.

--JRM

markpadden

3/13

All these points are valid, especially #1 and #2. It appears they used 2009 results both to create the new formula and to "test" its accuracy. That just doesn't work in any kind of forecasting business.

nosybrian

3/13

Not sure this is a valid criticism. I think the main thing BP has been trying to do with PECOTA this year is just to change the underlying technology from Excel to some other form of data base, while turning all of Nate's complicated macros into lines of code written in a totally different format. So it would be logical for them to see whether they come up with the 2009 PECOTAS that Nate developed, but this time using their new code. If they accurately "replicated" the 2009 PECOTAS using the new code, then they could feel reasonably confident that their translation from Excel to their new DB and code was a good one.

We've seen some evidence, however, that they were also changing the formulas -- not just trying to replicate Nate's original code in a different system. For example, in this article Dave mentioned using a different way of making 10 year forecasts than Nate did -- which, however, led to a host of unanticipated problems. He also talks about sometimes using the weighted mean PECOTAs and sometimes the 50th percentile. In an earlier article Clay also talked about using a different, larger base of historical player data for purposes of identifying similar players.

With so many moving parts, it's really hard to know what went wrong. But again, their looking at 2009 is a good thing to do.

markpadden

3/13

If they were simply interested in replicating the old PECOTA formula [which is what they *should* have done for this season], they would have simply compared the generated 2009 projections for each system on a player by player basis and noted the discrepancies. They clearly changed the way the algorithm works, and their post was an attempt to show that the new system is slightly better than the old one -- i.e., that the changes they implemented had some merit. It's a valid concept certainly. But the problem is, as a previous poster noted, that the sample size is tiny (one year) and they are using in-sample data to test the new algo. So the RMS error table for 2009 does not really establish anything meaningful.
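
The player-by-player comparison described here can be sketched as a simple diff of two projection sets. Everything in the example is hypothetical: the function name, the player ids, the stat values, and the 0.001 tolerance.

```python
def projection_discrepancies(old, new, tolerance=0.001):
    """List (player, stat, old_value, new_value) wherever two sets of
    projections differ by more than the tolerance."""
    rows = []
    for player in sorted(set(old) & set(new)):
        for stat in sorted(set(old[player]) & set(new[player])):
            a, b = old[player][stat], new[player][stat]
            if abs(a - b) > tolerance:
                rows.append((player, stat, a, b))
    return rows

# Hypothetical projections from the original and the ported system.
old_system = {"player_1": {"obp": 0.350, "slg": 0.480},
              "player_2": {"obp": 0.320, "slg": 0.410}}
new_system = {"player_1": {"obp": 0.350, "slg": 0.480},
              "player_2": {"obp": 0.320, "slg": 0.432}}
diffs = projection_discrepancies(old_system, new_system)
```

An empty result would suggest the port reproduces the old numbers; any rows returned point at the players and stats to investigate first.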

tbwhite

3/13

"With so many moving parts, it's really hard to know what went wrong."

No it's not, you just gave the answer. Too many moving parts. You can't port software to a new platform AND make changes at the same time. Port it. Test it. Then change it. If you only change one thing at a time, and constantly test after making each change, you'll know exactly what broke it. It's when you get in a hurry, and try to cut some corners that you end up screwing yourself. Unfortunately, I know this from experience.

cwyers

3/13

These are PECOTAs run with the current version of the system, but only the information available at the start of the '09 season.

I have slightly older runs of the new PECOTA system for three years of data, which I have tested against the originally published PECOTAs. Old and new PECOTA both return identical RMSEs for OPS, at least in the version of the test I'm currently looking at. (I'm currently working on ID mapping for some other projection systems, to give some basis for comparison.)
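
A minimal sketch of restricting a back-test to information available before the test season, so that no same-year or later data leaks into the projection. The record layout and the name `eligible_comparables` are hypothetical and are not taken from the PECOTA code.

```python
def eligible_comparables(seasons, test_year):
    """Keep only comparison seasons strictly earlier than the test year."""
    return [s for s in seasons if s["year"] < test_year]

# Hypothetical comparison records.
history = [
    {"player": "comp_a", "year": 2007},
    {"player": "comp_b", "year": 2008},
    {"player": "comp_c", "year": 2009},
]
# When testing 2009 projections, the 2009 record must be excluded.
usable = eligible_comparables(history, test_year=2009)
```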

markpadden

3/13

Thanks for the info. I understand that it was not run with any explicit knowledge of the future, but was the new PECOTA system originally developed (optimized) using any data from the 2009 season? That has not been made clear yet.

Would like to see RMSEs for SLG and OBP (not counting stats), new vs. old, for at least the last five seasons.

cwyers

3/13

If I throw out the '09 data, I still get nearly identical RMSEs for old and new PECOTAs, looking at only '07 and '08. (The difference is .001.)

markpadden

3/13

What years of data did the "old" PECOTA system use as its input for optimization of the algorithm? I.e., when was the last time PECOTA was changed prior to this season?

nosybrian

3/13

AFAIK there were some changes made in PECOTA every year, with the possible exception of 2009, e.g., in one year Nate added GB/FB information, in another he added platoon splits, in another league differences, etc.

jrmayne

3/13

Colin:

Thanks very much for this information. Comparing against actual PECOTA's for 2007-2008 (under the old Silver system) would be very, very helpful, too.

A full discussion of various iterations of current PECOTA vs. 2007-2008 PECOTA vs. (say) Marcel and CHONE, with complete explanations of the methodology and a direct comparison of the slash stats would be welcomed by many.

Combined with a full explanation of how we got here, and (critically) a good PECOTA performance in 2010, and that's dirt in the hole rather than dirt out.

--JRM

clayd

3/13

On number one, I agree, which is why I've added 2008 data to the mix and am in the process of adding 2007.

The results remain the same. The current iteration of the PFM data - what is in the PFM and the 'MjLgHitters' portion of the spreadsheet - has a lower RMSE than the 2008 Pecota on 8 of the 10 categories (losing triples and CS); the sum of RMSE across all 10 categories was 6.5% lower (better) than the 2008 system. In bias-adjusted terms, it was 1.5% better on the sum of all RMSEs.

Second, when back-testing for 2009 projections, all data was restricted to data available in spring 2009 (likewise for 2008). The target for the league and parks settings was always taken from previous year's data, and a section of code in the program prohibits any comparison data from an equal or later year than the test player from being used.

Third, the bias in question is the ratio between projected numbers and actual numbers. For the first set of numbers, all forecasts have been pro-rated to the actual number of plate appearances - we're testing the performance of the program, which means our real target is the rate of each stat. The second set of numbers pro-rates again for systemic bias - if forecast HR were 10% high across the board, we knock 10% off everybody and recompute.

Four is correct. The book version was entirely restricted to forecasting 2010 data, but I will certainly look into taking back what worked there - assuming it can be identified.

Five, as long as the system has identifiable improvements, changes will continue. I don't recognize any point to say "done, stop now". And more drastic improvements to 2009 were definitely possible - but not without overfitting and killing performance in other years.
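
The two adjustments described under "Third" can be sketched as follows: first pro-rate each counting-stat forecast to the plate appearances the player actually received, then rescale every forecast by the across-the-board ratio of actual to projected totals and recompute the error. This is an interpretation of the description above, not the actual test code, and the three hitters are hypothetical.

```python
import math

def rmse(projected, actual):
    """Root-mean-square error between paired projections and results."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def prorate(forecast_stat, forecast_pa, actual_pa):
    """Scale a counting-stat forecast to the plate appearances actually received."""
    return forecast_stat * actual_pa / forecast_pa

def bias_adjusted(projected, actual):
    """Rescale all projections by the ratio of actual to projected totals."""
    ratio = sum(actual) / sum(projected)
    return [p * ratio for p in projected]

# Hypothetical hitters: (forecast HR, forecast PA, actual PA, actual HR).
hitters = [(22.0, 600, 550, 18), (11.0, 400, 440, 10), (33.0, 650, 650, 32)]
projected = [prorate(hr, fpa, apa) for hr, fpa, apa, _ in hitters]
actual = [row[3] for row in hitters]

raw_error = rmse(projected, actual)                                # first table's style
adjusted_error = rmse(bias_adjusted(projected, actual), actual)    # second table's style
```

In this made-up sample the forecasts run high overall, so removing the systemic bias lowers the error; after the adjustment the projected and actual totals agree by construction.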

dpease

3/14

If you really think that we're trying to pull the wool over your eyes by producing test results that we've massaged, or made 2009-specific changes only for the purpose of passing tests on 2009 data, please send me an email and I will refund your money.

We've had serious execution and communication problems this year, and I must again apologize for those, but I'd sooner go out of business than resort to something like that.

markpadden

3/15

It's not a question of "resorting to something like that." It's a much less sinister question of how best to test a new prediction algorithm. I (and others) have suggested that optimizing parameters on a data set (all past seasons), and then testing the algorithm's accuracy on a subset of that data, is not a valid approach.

It's the same as developing a stock trading system using all data through 2009, and then testing it on 2009 data. And then comparing to a system that was developed using only data through 2008. Ideally, you want to test the system on data that has never been used in the development process.

*No one* is suggesting that the new PECOTA somehow has access to future results as it makes its projections. Rather, there is simply some interest in using more rigorous methods when trying to assess accuracy. In this case, to compare the old vs. new PECOTA, it would only make sense to look at past years that neither system had access to during *development*.

tbwhite

3/15

If for some reason you feel you MUST use 2009 data in training, at least randomly split your data in half or 60/40. Train on one piece and measure accuracy on the remaining group. But an accuracy test based on data you trained on is pretty close to worthless.
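
A minimal sketch of the random 60/40 split suggested here, with parameters tuned on one piece and accuracy measured only on the other. The function name, the fixed seed, and the ten placeholder player-seasons are hypothetical.

```python
import random

def split_holdout(items, train_fraction=0.6, seed=0):
    """Shuffle a copy of the items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical player-seasons to divide between tuning and testing.
player_seasons = [f"player_{i}" for i in range(10)]
train, test = split_holdout(player_seasons)
```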

dpease

3/15

The parent comment had as its lead 'the comparisons appear designed to show the newer PECOTA is better'. They weren't--they were what we had to publish. More coming.

nosybrian

3/15

I don't buy this. I think the main effort this year with PECOTA has been to translate the system to a new platform. PECOTA is very complex, probably more so than any other "systematic" forecaster that's out there.

I'm going to lay out my impressions about what's been going on, reading between the lines of the reports we've received.

Getting the various parts of the PECOTA system working on a new platform is very complicated. One could probably expect some differences in precision by integrating the estimates on one platform. Nate did some analysis (specifically the identification of comparable players) using a stats package and imported results from that to his Excel spreadsheets. So there could be some loss of information from using such a "disintegrated" system of equations and data management.

However, contrary to the remarks by previous commenters here, there was no need to do split-half comparisons or find new data in order to determine whether the new PECOTA platform is achieving what the old PECOTA platform achieved. Instead, making sure that the new and the old are consistent with one another, using past years (e.g., 2007, 2008, 2009) and the same input data would be perfectly sufficient -- the best way to go in fact. (The purpose of this work was NOT to prove whether PECOTA did better than other forecasting systems -- but rather to get the existing PECOTA working on the new platform.)

That said, as I mentioned in response to a query earlier, Nate made changes to PECOTA pretty much every year except perhaps 2009 (e.g., using GB/FB ratios, league adjustments, platoon splits, etc. -- none of which were in the original 2003 version). So replicating PECOTA estimates for earlier years could be a challenge -- requiring the programmers to REMOVE features that were in the 2009 version. That probably wouldn't be a reasonable expenditure of time. And I don't imagine that they actually tried to do this.

For that practical reason I think it was perfectly reasonable to focus on first trying to replicate the 2009 Excel-generated PECOTA's on the new platform, and if that replication proved successful (as Dave's and Colin's entries suggest it was), then seeing whether the "2009 PECOTA" algorithm on the new platform worked well retroactively on the 2007 and 2008 estimates.

Leaving aside the peculiar problems with the PFM -- which may not be integral to evaluating PECOTA (or any other systematic projections), then once Clay and others were convinced that their translated code on the new platform closely matched those that Nate obtained in 2009 and did AT LEAST as well for 2008 and 2007 (when Nate's formulas may have been different from what they were in 2009) they could address the PFM issues.

These, unfortunately, proved to be more complicated than was anticipated. And introducing some changes to the system, such as in the multiyear forecasts, added further complications to producing the PECOTA cards and getting the PFM to work right. In addition, as Clay has noted here, unlike the PECOTA's generated in December (and published in the annual book), the later ones get refined by taking more information into account about lineup changes, batting order, and playing time -- reflected in the depth charts.

Users of PECOTA expect such information to be taken into account as it comes available up til opening day; and in its depth charts BP also tries to anticipate playing time for the entire season (including players not on the 25-man ML roster on opening day). BP subscribers know from experience that the post-January PECOTAs are therefore subject to change as the depth charts and batting orders are refined.

Like everyone else -- including Clay, Dave, and the BP team -- I wish this would have been resolved before now. But I am happy to see them working through these issues as well as that checks of the system (which are at the top of this article) are favorable.

Reply to nosybrian

markpadden

3/13

It's unclear from this post whether 10-year projections for hitters have been fully "fixed" yet, but it doesn't look like it. Basically, most players are projected to remain stable or improve from mid/late-20s through age 35 or so. This is not a normal aging curve, and differs significantly from past long-term PECOTA forecasts. E.g., Dustin Pedroia's TAv taken from his latest 10-year forecast:

Age 26: .305
27: .294
28: .292
29: .290
30: .291
31: .293
32: .292
33: .291
34: .301
35: .292

Other examples (pulled quickly by spot checking random players) include Sizemore, Miguel Cabrera, Adam Lind. Basically almost every good player is projected to experience no decline between age 30 and age 35. This does not correspond to observed reality (http://www.baseballprospectus.com/article.php?articleid=4464 and http://www.baseballprospectus.com/article.php?articleid=9933), or to the way past PECOTA projections looked.
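As a quick worked check of the "no decline between age 30 and age 35" observation, here is a small sketch using only the Pedroia TAv figures quoted above. The function names are made up for illustration.

```python
# TAv by age, copied from the Pedroia 10-year forecast quoted above.
tav_by_age = {
    26: .305, 27: .294, 28: .292, 29: .290, 30: .291,
    31: .293, 32: .292, 33: .291, 34: .301, 35: .292,
}

def net_change(curve, start_age, end_age):
    """Change in TAv between two ages, rounded to three decimals."""
    return round(curve[end_age] - curve[start_age], 3)

def yearly_changes(curve):
    """Year-over-year change in TAv, keyed by the later age."""
    ages = sorted(curve)
    return {later: round(curve[later] - curve[earlier], 3)
            for earlier, later in zip(ages, ages[1:])}

change_30_to_35 = net_change(tav_by_age, 30, 35)   # 0.001
changes = yearly_changes(tav_by_age)
```

On these numbers the net change from age 30 to age 35 is +.001, and the largest single-year move after age 27 is the +.010 jump at age 34, which is the kind of flat late-career shape being questioned here.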

Reply to markpadden

nosybrian

3/13

To illustrate why a relatively flat aging curve might not be wrong in many cases, but also why it is complicated to make multiyear projections, take a look at this article by Nate Silver: http://baseballprospectus.com/article.php?articleid=7189.

Reply to nosybrian

markpadden

3/13

Valid point. I would still maintain that the age 30-to-35 curves look suspiciously flat this year compared to previous PECOTA iterations. The fact that there are 10-year projections instead of 7-year projections is probably also a factor. I have not done any systematic checks, just looked at 15-20 players.

Reply to markpadden

stlpdx

3/13

This 'update' is a whole lot of nothing. I am so tired of this shit.

Reply to stlpdx

blcartwright

3/13

First, I had a hard time finding the 2010 projections. The 'Find a Player' box still takes you to 2009.

Jesus Montero's weighted mean OB of 315 is only 20 pts above his BA of 295, and lower than any of his percentile forecasts. Clearly a miscalculation. But the final nine years of the ten year forecast are built on that number.

Reply to blcartwright

Junts1

3/13

It seems to me people are missing the implication here that Nate is not doing the PECOTA adjustments and tweaks as he has always done, probably because he's no longer really an active part of BP. Every year Nate has, in the mid-winter, posted his various PECOTA adjustments and new projects for the year. However, it's a lot easier for the designer to work on a project than it is for people, however brilliant, to pick up that project, learn it and improve it without breaking some things.

Clearly, they have broken some things during the learning process.

Reply to Junts1

BMoreGreen

3/13

~However, it's a lot easier for the designer to work on a project than it is for people, however brilliant, to pick up that project, learn it and improve it without breaking some things.~

This is why I cannot fathom the thought process that resulted in BP's strategy to implement the changes in the manner they chose. PECOTA is, after all, the foundation for all projection analysis done on the site and the signature product for the company. At the least, BP should have run one year of the new system completely behind the scenes while evaluating the performance. As someone relatively risk averse fiscally, there is no way I would have put the brand at risk to this degree.

Reply to BMoreGreen

deepblue64

3/13

The old system didn't work - they had major problems last year, that wasn't an option.

Reply to deepblue64

Junts1

3/13

At least through 2008, PECOTA was a series of excel spreadsheets on Nate's computer. If Nate isn't in the picture, there's no way to make that transition pretty.

I've written some fairly extensive excel calculators before, though nothing on the scale of PECOTA (which used to take -days- to actually do all the calculating!), and when I've needed to change or update them, it's a task that takes hours even knowing where you put every equation. For people who didn't write the spreadsheets to come in and learn how everything is arranged and where all the math is done and adjust it is something that can only lead to many mistakes and a lot of wasted time.

That's what happens when something that was intended as a personal toy turns into a major financial asset. PECOTA wasn't created to do what it does for BP: I'm sure it was far from optimally coded in the spreadsheets, and to update it and possibly migrate it to a better calculating framework has got to be an incredibly complex task requiring thousands of man-hours.

Reply to Junts1

dianagramr

3/13

"All we are saying ... is give Pease a chance."

(we now return you to the PECOTA/BP-flogging, already in progress)

Reply to dianagramr

krissbeth

3/13

To the torch-wielding mob: Do Rotoworld's projections work better than PECOTA right now? I enjoy reading BP, but I want the best projections possible for my draft in two weeks. Which would you choose?

Reply to krissbeth

BMoreGreen

3/13

Geez, even as frustrated as I am with BP right now, it makes me feel a little dirty to consider posting more reliable sources ... but being raised Catholic, I guess I can just head to confession for absolution at a later date - if warranted.

Unless BP transparently addresses and fixes the PECOTA meltdown, none of the data they have published this year will inform my fantasy drafts for 2010. Just impossible to determine if any of the BP iterations is better or more accurate than the others.

HQ, CHONE, hell even mlb/cbs/espn projections are providing more value to me this year. Particularly the no-charge ones.

Reply to BMoreGreen

zstine1

3/13

do any other sites have anything like PFM that is so customizable?

Reply to zstine1

BMoreGreen

3/13

I have not found one customizable to the extent of PFM.

Reply to BMoreGreen

nickojohnson

3/13

Last Player Picked

Reply to nickojohnson

yadenr

3/14

Baseballnotebook.com is also a strong projection site. I've used them paired with PECOTA now with great success. Every system has its flaws and hangups. Get to know at least a couple of them and make your own call based on their biases. I don't mean to advert for a competitor, but it seems like you should have at least one more system that you are using in combination with BP. BN costs money but I've found it to be really valuable.

Reply to yadenr

luftmich

3/16

Baseball Monster has something similar. Not sure I'd trust it, but it's free.

Reply to luftmich

BurrRutledge

3/13

My fantasy 2010 analysis will still be 100% PECOTA-based, with a sprinkling of Marc Normandin insights to inform some of the more difficult drafting scenarios I'm likely to encounter.

Reply to BurrRutledge

bflaff1

3/13

Krissbeth,

As someone who has used PECOTA data to draft for several years with varying levels of success, I do not see anything in the current iteration that renders that data worthless. Even during the Silver years, PECOTA managed to produce individual projections that, in retrospect, whiffed badly. I believe that the data have always needed to be interpreted based on whether you make the same assumptions about a player that PECOTA does. I can name several players from past seasons for which PECOTA has seemed to have a multi-year blind spot (Chris B. Young, Jay Bruce, and Kelly Johnson off the top of my head) and after some bad blunders, I learned to adjust accordingly. The problems with internal consistency in this year's numbers are frustrating, but I don't see PECOTA churning out a steady stream of outlier projections. While some may believe that CHONE has a better read on (say) Kendry Morales simply because its team standings add up correctly, I'd argue that this alone does not inherently make it more accurate. I still feel that there is plenty of accuracy left in PECOTA, and that if it is less useful this year than in the past, the differences are not as significant as is sometimes maintained. (Of course, I draft in a traditional 5x5 league, so problems with UPSIDE or 10-year projections do not affect much of what I do.) As I see it, BP's biggest mistake this year (and last, admittedly) appears to be that they did not recognize the size of the problem back when there was sufficient time to deal with it. However the "problem" to me has more to do with getting PECOTA from 'good' to 'the best it can be', instead of from 'useless' to 'OK'. Your mileage may vary, but I still say the numbers can be used, and used successfully. Good luck with your draft.

Reply to bflaff1

krissbeth

3/15

Just FYI, I've re-upped with BP, so don't worry about costing BP a customer.

Reply to krissbeth

Stars0ftheL1d

3/13

as someone who's not involved in fantasy baseball, or uses the PECOTA projections for anything more than personal reference...i have to echo others' sentiments about how disappointing this roll-out has been. i'm mystified by the fact that this transition didn't fully undergo thorough QA testing before replacing the existing methodology to produce the data that is the foundation to the core product.

beyond that, y'all could've communicated to your audience a little more regularly, and broadcast in a different way than has been done. if i were reliant on this information for my draft, i'd be furious.

that being said, i'm extremely sympathetic to the BP staff for the heat they're taking, especially considering the years of quality that have gone before 2010. i'm hoping the staff is learning from their experience to ensure history does not repeat itself.

Reply to Stars0ftheL1d

danlbfaks

3/13

I'm not a programmer and have generated "software" in Excel for Z-scores, regressions, projections, et cetera that could not have possibly amounted to 1/1000th the complexity of what Nate created and tweaked over the years. When I've handed my "software" to a programmer to translate it, they've often laughed at the silly complexity involved in using Excel to kludge the calculations. I imagine successfully converting Nate's work lies somewhere between flapping your arms to fly and blowing someone's head off with a thought.

We're stuck with the name PECOTA for branding reasons, but the product is no longer the same. It isn't clear whether it should be exactly the same or not--certainly asking that question is the correct thing to do. PECOTA was *not* a fantasy baseball tool at first. My impression from the beginning is that the focus is on identifying prospects before they identify themselves or prognosticating breakouts and collapses before they happen. It's too bad that building a successful business in baseball analysis requires turning to the fantasy world. PECOTA wasn't and isn't well built for fantasy--it's an aggregate tool being shoehorned into an individual game.

Reply to danlbfaks

swarmee

3/13

But whether or not the fantasy players want to use it, if the underlying projections are out-of-whack, it's still not completing its original purpose. If the data made sense (like 10% < 25% < 50% < 75% < 90% projections were all proper), then trying to adjust the weighting scales to better approximate the 2009 season makes sense. When the answers are still being ironed out 5 months after the season ended, it's obvious that the work isn't being checked in house and solved in house.
Basically, 2010 PECOTA == Microsoft: Pay us, then identify/fix our bugs for us!
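The ordering check mentioned above (10% < 25% < 50% < 75% < 90%) is easy to automate. This is a minimal sketch with hypothetical forecast values and function names, not BP's actual QA process. It allows ties, on the assumption that rounded values can legitimately be equal.

```python
PERCENTILES = (10, 25, 50, 75, 90)

def out_of_order(forecast):
    """Return adjacent percentile pairs where the lower percentile has the higher value."""
    return [(lo, hi)
            for lo, hi in zip(PERCENTILES, PERCENTILES[1:])
            if forecast[lo] > forecast[hi]]

# Hypothetical percentile forecasts of a rate stat for two players.
sane = {10: .240, 25: .260, 50: .275, 75: .290, 90: .310}
broken = {10: .240, 25: .260, 50: .275, 75: .300, 90: .295}

sane_problems = out_of_order(sane)       # []
broken_problems = out_of_order(broken)   # [(75, 90)]
```

Running a check like this over every player card before release would surface any forecast whose percentile lines cross.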

Reply to swarmee

Clonod

3/13

I buy BP for the articles, but if I bought a fantasy subscription this year, I'd be seriously irked. Drafts are already taking place, and this stuff is still buggy?

Reply to Clonod

leites

3/13

Let me just say I would buy a BP subscription just to read Kevin Goldstein's prospect analysis -- for me, PECOTA is just a bonus.

That said, I continue to be dumbfounded by this year's comparables. Is Everth Cabrera really most comparable to the young Hanley Ramirez and Jimmy Rollins, and also very similar to the young Derek Jeter? Can you comment on whether the comps perhaps need to be tweaked along with the 10-year forecasts?

Reply to leites

Olinkapo

3/13

Age 22 seasons
--------------
Everth Cabrera, 2009: .276 EQA
Derek Jeter, 1996: .279 EQA
Jimmy Rollins, 2001: .266 EQA
Hanley Ramirez, 2006: .291 EQA

This is not a defense or robust analysis or anything quite like that. Rather, I found it pretty cool. :)

Reply to Olinkapo

leites

3/13

Another comp example: Matt Wieters' top comps are John Christensen, Jarrod Saltalamacchia, Sid Bream, and Steve Decker. Looking at these, and Everth Cabrera's, I'm no longer sure the comp listings have any value whatsoever. (Of course, if Cabrera turns out to be a Hall of Fame player . . .)

Reply to leites

granbergt

3/13

*Gnashing of teeth*

Reply to granbergt

mhixpgh

3/13

I was kinda disappointed that PECOTA wasn't up to snuff when I had my draft last weekend. AND I think BP should have been more upfront about things A LOT earlier. I expected BP people to be more forthright about all this PECOTA stuff. Why keep pretending?

Articles keep getting posted, analysis keeps getting churned out, readers keep reading.... But upon what are the articles, analysis and readers depending? Are we really to expect some point of clarity which allows us all to move forward with reliable data, predictions and analysis?

I can pick my own fantasy baseball team on my own just fine. That's on me. But PECOTA.... That's on you guys, BP. I wish you the best and truly hope that you get it all sorted out.

Reply to mhixpgh

coonscrape

3/13

We have an opening in our 5x5 roto auction fantasy league to anyone who will pay $40.00 for Jose Reyes.

Reply to coonscrape

AAG455

3/13

Allow me to join the angry mob here. I'll reiterate what's been said above - the BP brand has been severely, if not permanently, impaired.

Presumably it was clear during last season that a switch over needed to be made from the Nate Silver cobbled-together sheets into something more dynamic (and by the way - I've never understood the Silver hagiography that goes on here, nor been impressed by the next site he's gone on to run, but this makes me wonder if he isn't in fact the only person with a clue around these parts).

With that in mind, why not work on the program throughout the 2009 season and have a 2010 projection ready to roll out around October 1st? You guys shot yourself in the foot by waiting until February to get this going.

And while I've been a defender of the site in the past, you guys reap what you sow when you tell people that "Everything they know about baseball is wrong" and that your forecasts are "deadly accurate" ... and then you deliver a stinker like this.

The reality is, the product you are offering for a price is inferior to those being offered elsewhere for free.

Not a great path to prosperity in my view.

Reply to AAG455

bflaff1

3/13

"The reality is, the product you are offering for a price is inferior to those being offered elsewhere for free."

The following are a random sampling of projections. One comes from the current Depth Charts at BP. The other comes from a popular free system. Without identifying which is which, can it really be said that one is demonstrably inferior? Or that based on these numbers, that one should wonder whether anyone at BP has a clue?

Grady Sizemore
Brand X: .274/.375/.491
Brand Y: .272/.369/.484

Pablo Sandoval
Brand X: .310/.366/.504
Brand Y: .325/.368/.526

Elvis Andrus:
Brand X: .264/.327/.362
Brand Y: .266/.328/.367

I can't tell which of these brands is 'delivering a stinker,' or 'permanently impaired', and I suspect that most of the commenters on this article can't tell either. You will get no disagreement from me that the roll-out for PECOTA this year has been flawed at best. Certainly the site has taken a hit for that, and they've acknowledged their mistakes. However, instead of simply massaging the numbers to make it work out right, at least the site has been transparent about its efforts to correct the programming, and apologetic for the delays. That does not excuse their errors, or make it OK that PECOTA is still not 'locked,' but I think the scorn being heaped upon Dave and the rest of the BP team in this comment and others of its ilk is unwarranted. I also believe that the errors are not 'fundamental' ones, and that the fact that Adam Jones' 80th and 90th percentile projections don't add up doesn't make BP's .283/.344/.459 inherently inferior to CHONE's .294/.349/.497. It doesn't mean that Christina Kahrl can't say anything intelligent about the Blalock signing. It doesn't mean that Will Carroll can't explain an arm injury, and it doesn't mean that Kevin Goldstein doesn't know how to evaluate a prospect anymore. It is an annoying logical error, but that on its own does not signify to me that BP no longer knows anything about baseball, or that I can get better analysis/information from Yahoo, Fangraphs, or MLB.com.

Reply to bflaff1

dpease

3/14

To follow up on this--I'm not much of a natural marketer, but the reason we haven't been more forthcoming about serious issues in the 2010 projections is that we're not aware of any serious issues in the 2010 projections. Seriously.

If you are in a non-keeper league, as far as I know there's no reason you shouldn't be able to make use of the projections in the book, or the ones released on the site in late January, to inform your draft strategy.

The long-term projections (and, by extension, the upside), the comps, the weighted means, the percentiles--we've had problems with all of those to some degree, and we're still working on fixes. But if what you'd generally use is the output from the PFM, none of those are relevant.

Reply to dpease

redspid

3/14

This is key. Pecota has some drawbacks, but if you know them from previous years, you can adjust. The 2010 projections should be fine.

Reply to redspid

dianagramr

3/14

I have $5 that says Adam Jones outperforms Nate McLouth in every category this season, contrary to PECOTA's projection.

Seriously, I know there's been a discussion of McLouth's projection, but the one for Jones looks low across the board.

Reply to dianagramr

dpease

3/14

I have never, ever agreed with 100% of the projections from any projection system I've ever seen. Sometimes I'm right, sometimes I'm wrong. As they say, that's why they play the games.

Reply to dpease

dianagramr

3/14

Understood, and appreciated.

Reply to dianagramr

markpadden

3/15

What is the point of pulling three player projections and saying all is well because they generally resemble another firm's projections? That's the most absurd notion I have heard in this entire thread.

Reply to markpadden

rosssheingold

3/13

At this point I'd like a refund so I can take my $20 back and pay for Bloomberg Sports' draft kit. Is it too late for that?

Reply to rosssheingold

NathanJM

3/14

I wouldn't. I've been pretty unimpressed with the Bloomberg.

Reply to NathanJM

dianagramr

3/14

I haven't found any way to print ANYTHING I've seen on the Bloomberg site ...

it LOOKS cool ... but,

Reply to dianagramr

rosssheingold

3/21

As follow-up to this, I tried out Bloomberg Sports and was very unhappy with the product. They focused way too much on a fancy design, and it wasn't as customizable as I would have hoped for it to be.

My main frustration with BP.com this year is the lack of PECOTA cards linked from the PFM. Believe it or not, that really affects my draft prep.

I guess I'll just have to work with what I've got...

Reply to rosssheingold

zstine1

3/13

i think toyota has taken less flak than BP has this spring.

Reply to zstine1

whoami1219

3/13

Well, let's remember the countless number of deaths caused by an inaccurate Pecota. Shame on you BP! SHAME!

Reply to whoami1219

vtadave

3/14

...and Toyota DESERVES far less flak than BP. This coming from the owner of both a Toyota Sienna and a BP subscription.

Reply to vtadave

mrharrier

3/13

I've asked this question previously in ways that haven't generated positive responses -- so now I'll ask it in a positive way. Could anyone please provide some insight into the characteristics of Nate McLouth that would lead him to be ranked by BP as one of the top players for the coming year, though other prognosticators differ on this point? Is there something about his BABIP data, his expected walk rates, an age-fueled increase in AVG, or anything else that is making PECOTA particularly optimistic about his expected production?

Reply to mrharrier

mbodell

3/14

I think this idea in general is a good one for an article. Take a bunch of other projection systems (CHONE, Marcel, Zips, Bill James, fans, etc.) and look for players where PECOTA is way above or way below the average of the other group (maybe where OPS is >5% different?) and try to explain what is making PECOTA feel this about a player and why the BP author thinks PECOTA is right (or wrong!). I think that would make for a great and interesting read.
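The screening step suggested above could look roughly like this. It is a sketch only: the players, systems, and OPS values are placeholders, and the 5% threshold is the one floated in the comment, not an established cutoff.

```python
from statistics import mean

def flag_outliers(target, others, threshold=0.05):
    """Find players where one system's OPS differs from the peer average by more than threshold.

    target: {player: ops} for the system being examined.
    others: {system_name: {player: ops}} for the comparison systems.
    Returns {player: relative difference}, positive when target is more optimistic.
    """
    flagged = {}
    for player, ops in target.items():
        peer_values = [proj[player] for proj in others.values() if player in proj]
        if not peer_values:
            continue  # no other system projects this player
        consensus = mean(peer_values)
        diff = (ops - consensus) / consensus
        if abs(diff) > threshold:
            flagged[player] = round(diff, 3)
    return flagged

# Placeholder numbers, not real projections.
target = {"Player A": 0.900, "Player B": 0.750}
others = {
    "System 1": {"Player A": 0.800, "Player B": 0.745},
    "System 2": {"Player A": 0.820, "Player B": 0.760},
}

flagged = flag_outliers(target, others)   # {"Player A": 0.111}
```

The flagged list would only be a starting point for the article; explaining why the system disagrees with the consensus is the interesting part.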

Reply to mbodell

rbross

3/14

Nate Silver used to do this with prospects. He'd compare the Baseball America prospects with those hyped by PECOTA, exploring particularly the players that one system, but not the other, predicted would be good. It was very interesting and he wasn't afraid to concede that PECOTA has its limitations.

Reply to rbross

bflaff1

3/14

mrharrier,

Although the McLouth projection screams 'proceed with caution' to me, I'll take a stab at answering your question. First off, McLouth's Depth Chart projection lists his 5x5 counting stats as 109/25/91/23/.267. This matches well with the 113/26/94/23/.276 he put up in 2008, so if the system simply sees this as a bounce-back year, then the numbers line up. Although McLouth's flyball rate dropped significantly last year and he struggled more against LHPs than in years past, his walk rate also increased and he joined an Atlanta team that should be far superior to the one he left in Pittsburgh. He missed time with a hammy injury that should be healed now, and rates a 'green' on Will's THR. If you look at your other prognostications, I think you will find that PECOTA's triple slash projections for McLouth are not far off these other systems at all. The big difference in counting numbers comes because PECOTA sees 713 PAs for McLouth, while others I've seen have him well below 600. As I said before, I plan to proceed with caution, but I don't know that I'd discount the projection completely.
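To show how much of the counting-stat gap can come from playing time alone, here is a rough sketch that rescales the Depth Chart line quoted above from 713 PA to 600 PA. It assumes counting stats scale linearly with plate appearances, which is a simplification, and 600 is just the round figure mentioned above, not any particular system's projection.

```python
def rescale_counting_stats(stats, from_pa, to_pa):
    """Scale counting stats linearly from one plate-appearance total to another."""
    factor = to_pa / from_pa
    return {name: round(value * factor) for name, value in stats.items()}

# McLouth's Depth Chart counting stats as quoted above (R/HR/RBI/SB at 713 PA).
mclouth = {"R": 109, "HR": 25, "RBI": 91, "SB": 23}

rescaled = rescale_counting_stats(mclouth, from_pa=713, to_pa=600)
# {"R": 92, "HR": 21, "RBI": 77, "SB": 19}
```

Under that assumption, the same rate stats at 600 PA give roughly 92/21/77/19, so a large share of the difference between systems can be a playing-time disagreement rather than a talent disagreement.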

Reply to bflaff1

braden23

3/13

Aren't good fantasy owners supposed to look at projections, add in their own conclusions and make their calls accordingly? Before all this noise, I had my PFM set to my league rules, and I draft according to the projections and my assessment on which ones I am on board with and which ones I am not.

The fact that PECOTA has some kinks is not ideal, but who here is drafting blindly with these projections? Marc's guidance, Will's THR's and damn near every article on this site should help all of us form opinions and make the calls.

The book is great, the site is great, the BP team is being upfront about the issues. Let's roll with it and move on.

Reply to braden23

rawagman

3/14

Amen. One of the most commendable points about BP, is that it is for the thinking fan. If we accept projections blindly, we're not really thinking, are we?
PECOTA is data, not decision.

Reply to rawagman

veganalyst

3/14

I'll respond to this comment for $40.

Reply to veganalyst

Clonod

3/14

If people didn't buy fantasy subscriptions to this site, I'd be with you. BP makes money on PECOTA having a reputation as more reliable and thorough than other projection systems. That's just a fact.

And a lot of that goes beyond PFM. I look to PECOTA to get an idea on upside, as I'm in a long term keeper league. And I don't have any faith in the long-term projections this year. They simply still do not even pass the smell test.

Is that enough to make me cancel my subscription? No. BP still has great content all over the place. It's just kind of irksome that I spent all winter waiting for this stuff to come out, and when it did, it didn't have any more reliable data than MARCEL or some other such one-year projection. Luckily my main draft isn't until April. I'm just hoping the kinks are worked out by then.

Reply to Clonod

rsambrook

3/14

I couldn't agree more.

Reply to rsambrook

Richie

3/14

I will make one suggestion. Either:

A), get those projected records down somehow; or,

B), put up a message box stating that you know they don't add up and you're working on that.

It just looks awful, putting such an obvious error out there for all to see.

Reply to Richie

tbwhite

3/14

So, how exactly are the "2010 Projections" done that are in the box at the top of the player cards page? I see that it's the Median rate stats, but they are applied to more PAs? Where does the PA number come from?

As for the 10-year projections, they still don't pass the sniff test. Nate McLouth is not going to show any meaningful decline as a player until age 35? Give me a piece of that action. Meanwhile, Adam Jones, who arrived in the majors at a much younger age than McLouth, shows a drop-off at age 33. Doesn't this fly in the face of what we know about career arcs? Typically, it's the later arrivers who fade first, right?

Reply to tbwhite

dpease

3/14

If a player has a depth chart playing time estimate, we use that. If not, we use the PECOTA projected playing time. We thought this would be the best solution, but it is confusing and we haven't figured out how to make it less so.

Reply to dpease

tbwhite

3/14

How about using the PECOTA percentile projection that matches your depth chart playing time? PECOTA says the better a guy plays, the more he plays. So, if you are projecting that a guy who has never started regularly before will get 600 PAs, it seems logical that you are also assuming that he will play well. Few players could play at their 10th percentile level while simultaneously establishing themselves as regulars.
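A minimal sketch of that idea, assuming each percentile line carries its own projected PA (the data layout, values, and function name here are hypothetical): pick the percentile whose playing time is closest to the depth chart estimate and use its rate stats.

```python
def matching_percentile(percentile_lines, depth_chart_pa):
    """Return the percentile whose projected PA is closest to the depth chart PA."""
    return min(percentile_lines,
               key=lambda p: abs(percentile_lines[p]["pa"] - depth_chart_pa))

# Hypothetical percentile lines for one player: projected PA and a rate stat.
lines = {
    10: {"pa": 250, "tav": .240},
    25: {"pa": 350, "tav": .255},
    50: {"pa": 450, "tav": .268},
    75: {"pa": 550, "tav": .280},
    90: {"pa": 620, "tav": .295},
}

chosen = matching_percentile(lines, depth_chart_pa=600)   # 90
rates_to_use = lines[chosen]
```

In this made-up case a 600 PA depth chart estimate maps to the 90th percentile line, so the card would show that line's rate stats rather than the median's. Interpolating between adjacent percentiles would be a natural refinement.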

Reply to tbwhite

dpease

3/14

That's an interesting idea!

Reply to dpease

BCulhane

3/14

Dave,
Having the PFM based on the depth chart is logical and not confusing. It does, however, make the accuracy of the projections dependent upon the accuracy and timeliness of the depth chart. We can't expect PECOTA to determine who plays and who doesn't. I suspect every fan has suffered through a manager who keeps a deserving player on the bench because of (insert frustrating excuse here).

I don't know who currently updates the depth chart, but I would love to have it assigned to John Perrotto. The notes at the bottom of his On the Beat article often have real insight into who is likely to get playing time and who isn't. NL and AL only fantasy leagues are usually won by late round picks, and playing time has a huge role in determining which players should be chosen in the later rounds. Having the chart updated more frequently (at least once/week) would be wonderful.

Reply to BCulhane

dpease

3/14

Clay Davenport will be updating the playing time projections every other day through the start of the season. Keep in mind he's doing it based on his guesstimates of how much playing time each player will get throughout the course of the season--that means he might be allocating considerable playing time to a player who isn't slated to start, if he thinks the guy who is starting the season at that position isn't up to the job and won't last long.

We do have internal discussions about playing time that become part of Clay's updates, and John provided a bunch of info earlier this month on that front.

Reply to dpease

vtadave

3/15

Regarding the issues with PECOTA this year, I clearly overreacted in this thread and for that, I offer my humble apologies.

The bottom line is that the frustration of myself and others relates directly to what we're used to seeing from BP - innovative thinking and great writing. We don't always get everything we want, but the fact that BP provides a forum for their customers to comment/complain/offer feedback is appreciated. Then taking the time to listen and address said feedback is even more appreciated.

I have all the confidence in the world that when you put together this many talented minds, that issues (like PECOTA) will be both addressed and resolved.

Reply to vtadave

mhixpgh

3/15

I agree with vtadave. But you guys sure could have saved yourselves a lot of trouble by getting out in front of this thing. Too much silence and for too long.

Reply to mhixpgh

mhixpgh

3/15

I do need to say this: I will not be renewing for a number of reasons: No reliable PECOTA (the silence surrounding the PECOTA issue puzzles and disappoints), no Sheehan, poor customer service, and articles and analysis which seem to me to be strewn with grammatical, stylistic and most probably mathematical/statistical errors.

Reply to mhixpgh

sfhubbard

3/15

I tend to agree. BP is behaving like an organization that is past its prime. I see marketing outreach (with them cited on MLB network and the like), but no customer service/support to back it up.

Merely solving the PECOTA issues will not be enough. BP also has to get PECOTA released in time to be useful, and they have flirted with that timeline for years. That it's the most accurate system out there is useless if it's not ready in time.

Reply to sfhubbard

irablum

3/15

On Friday, I opened up the latest update to the Depth Charts and was immediately very upset. The more I look at it, the more upset I get, too. I'm not much of a fantasy player, so it's not just the consistency aspect to it. No, what bothers me is that the Texas Rangers managed to magically lose 50 runs of offense during spring training. They lost this by losing a big chunk of their power, and this came from decreased power output from their main power hitters.

Indeed the Rangers, who have been consistently projected as having the highest slugging average in the majors, are now predicted to be 6th, tied with Toronto. This is also the first projection which shows them under 800 runs scored, and it's significantly below that level. That doesn't pass the sniff test.

Reply to irablum

D1Johnson

3/15

Regarding Jose Reyes, I just noticed that the 2010 projections appearing at the top of the beta card do not match the weighted mean under the 2010 Forecast further down the sheet, nor is it close.

My guess is that the projections at the top are pre-thyroid diagnosis (e.g., 97 runs) and the forecasted numbers in the 2010 Forecast section post-thyroid diagnosis (e.g., 79 runs at the 90% level).

This also raises the question as to which of these numbers, if any, PFM uses, which upon examination appears to be the projections at the top of the page. Unfortunately, these appear to be sorely out of date.

Reply to D1Johnson

makewayhomer

3/16

Re: PFM and 50th percentile projections - I still don't see a perfect fit.

In PFM, McLouth is hitting .267. I don't see this AVG in any of his PECOTA percentiles. I do see it in the Weighted Means Spreadsheet, though.

Reply to makewayhomer

Bodhizefa

3/17

No mention of the massive changes in projection to players like Alex Rodriguez, Mark Teixeira, Chris Davis and others this go-round?

Guys, I love BP to death and have been a long and ardent follower, but if this is transparency, then I'd hate to see the "wool pulled over our eyes" by you guys.

How does this article explain why just a couple of weeks ago Alex Rodriguez had a .700 SLG% in his 90th percentile and now has a .630? That's not a playing time correction, and it sure as heck isn't defined in any of the changes you outlined above.

PECOTA was worthless weeks ago, and for us -- the subscribers -- to have any sense of it having been reconciled, it would be nice if you guys would actually fess up to what in the world you did to screw these numbers up so badly and what you did to fix them. Sweeping it under the proverbial rug would not be a good first step, I would think.

I think it's telling that in 2003, when I was first introduced to the site, Baseball Prospectus told me everything I needed to know about statistics and value and the great world of sabermetrics in baseball. Now? I only come here for write-ups on prospects who are evaluated almost solely by scouts. I think the message has been lost somewhere, gentlemen.

Reply to Bodhizefa

jrmayne

3/17

I made some comments in the depth charts. It's a little difficult to suss out why there are differences in the projections from various PECOTA sources (two spreadsheets differ, which differ from listed projection, which differ from 50th percentile.) Integer rounding explains a lot of it, but I don't think all of it. Weighted means appear to still be miscalced. Big differences (like Andy Marte's slugging) between projection and 50th percentile are undesirable.
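To put a rough size on the rounding effect, here is a minimal sketch. The stat line is made up for illustration and is not taken from any PECOTA card or spreadsheet; `slash_line` is a hypothetical helper, not anything BP uses.

```python
def slash_line(ab, h, doubles, triples, hr):
    """Return (AVG, SLG) computed from counting stats."""
    # Total bases: every hit counts once, plus the extra bases on XBH.
    total_bases = h + doubles + 2 * triples + 3 * hr
    return h / ab, total_bases / ab

# Hypothetical unrounded projection (assumed numbers, for illustration only).
raw = dict(ab=531.4, h=141.6, doubles=28.4, triples=3.6, hr=19.4)

# The same line after each counting stat is rounded to a whole number,
# as it would be if a page displays integers and derives rates from them.
rounded = {name: round(value) for name, value in raw.items()}

raw_avg, raw_slg = slash_line(**raw)
rnd_avg, rnd_slg = slash_line(**rounded)

print(f"unrounded: AVG {raw_avg:.3f}  SLG {raw_slg:.3f}")
print(f"rounded:   AVG {rnd_avg:.3f}  SLG {rnd_slg:.3f}")
```

With a full season of at-bats, rounding each counting stat moves AVG and SLG by only a point or so, which is why a large gap between a listed projection and a 50th percentile would need some other explanation.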

I'd note that any comparison to past iterations needs to include slash stats or (preferably and) total value stats, like virtually every other forecasting comparison ever done, including by BP. Failing to include that basic comparison renders such an exercise of little value.

--JRM

Reply to jrmayne

luftmich

3/17

There is a pretty interesting discussion regarding this year's PECOTAs over at Tom Tango's site, Inside The Book. The discussion can be found in the comments section of the March 3rd "Weiters II" blog post. (I won't provide the link, but it should be easy enough to find.)

Colin has reported some of his unpublished findings regarding this year's PECOTA.

Reply to luftmich

luftmich

3/17

Oh, and I don't know if this is by design, but this post no longer shows up in the Unfiltered Archives. For some reason the most recent posts that appear are from March 3rd and earlier.

Reply to luftmich

dpease

3/18

Sorry about that--we switched to a new back-end on March 3. We've changed the front-page links--now you can click 'more' to get to previous posts in the new format, and there's a separate link for stuff written on March 3 or before.

Reply to dpease

Jquinton82

3/20

Not to be THAT GUY, bc I know you guys are working hard trying to fix this... but when can we expect the cards for pitchers? (relatively soon, 2 months from now, not for quite a while?)... I ask bc this is my first time as a subscriber and genuinely have no clue what to expect in terms of time frame on all this.... honestly just curious.

Reply to Jquinton82

dpease

3/22

hi--the beta squad is looking at them now. we should roll them out early this week.

Reply to dpease


Dave Pease
