Against long odds, the final week of the 2009 regular season wound up producing down-to-the-wire excitement in both leagues, though for the most part, that excitement had nothing to do with stellar play. The Dodgers used a season-high five-game losing streak to keep the suspense regarding the NL West flag and home-field advantage building for an entire week, with the Phillies and Cardinals failing to capitalize, and the Rockies falling just short of overcoming a lackluster two-week stretch prior to their final sprint. Meanwhile, the AL Central has produced its second consecutive Game 163 play-in, this time due to a mad rush by the Twins and a collapse by the Tigers that may yet prove to be historic.
Against this backdrop, viewers have been treated to writers, broadcasters, and in-studio pundits admonishing such slumping teams to pull themselves together as they pontificated on the importance of heading into the playoffs with momentum. The oft-cited example remains the 2007 Rockies, who won 13 of their final 14 regularly scheduled games, then a play-in, and ultimately the NL pennant. Forget the fact that just one year prior, the Cardinals dropped nine of their final 12 before becoming the team with the lowest victory total ever to win the World Series; these experts certainly did. The obvious question is whether there's any truth to the conventional wisdom that late-season performance carries over into the playoffs. The answer is a fairly resounding no.
With the help of Eric Seidman, I pulled late-season records for every playoff team of the Wild Card era from 1995 through 2008, 112 teams in all. For each team, we recorded their record over the final seven, 14, and 21 games, as well as for September and whatever fragment of October remained. The results of Game 163 play-ins initially weren't included in either the "week" records (which didn't always coincide with calendar weeks, but which were somewhat easier to gather) or the "month" records; including them didn't change the results substantially. Here are the correlations between each interval's winning percentage and first-round success, with (Corr163) and without (Corr162) the play-in results included:
Interval      Corr162   Corr163
Final 7         .019      .016
Final 14       -.020     -.021
Final 21       -.042     -.043
Final Month    -.028     -.028
That, folks, is a whole lot of nothing, an essentially random relationship between recent performance and first-round success. None of the correlations even reached .05 in either direction, and six of the eight were actually negative.
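For anyone inclined to replicate the arithmetic, here's a minimal sketch of the calculation: the Pearson correlation is taken between each team's interval winning percentage and a won/lost indicator for the first round. The values below are hypothetical stand-ins, not the actual 112-team dataset.

from statistics import correlation  # Pearson's r; Python 3.10+

# Hypothetical stand-ins for the real 112-team dataset: each team's
# winning percentage over its final seven games, paired with a 1/0
# indicator for whether it won its first-round series.
final_seven_wpct = [0.714, 0.571, 0.429, 0.857, 0.571, 0.286]
won_first_round = [1, 0, 0, 1, 0, 1]

# Values near zero indicate essentially no linear relationship.
r = correlation(final_seven_wpct, won_first_round)
print(f"r = {r:.3f}")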
Okay, so those few-week intervals don't tell us much about the outcome of those five-game series. They tell us only slightly more about the entire postseason. Here are the correlations between those winning percentages and overall playoff success as measured by the number of series won:
Interval      Corr162   Corr163
Final 7        -.043     -.049
Final 14       -.097     -.101
Final 21       -.119     -.121
Final Month    -.112     -.115
That’s still nothing to write home about, and the slate is now uniformly negative, suggesting that, if anything, there’s an ever-so-slight inverse relationship between success in the final weeks and in the postseason. Perhaps that’s because some of these playoff-bound teams are resting their regulars more often, or simply regressing to the mean after a summer of beating up on opponents. Even if we create a points system, doubling the value of winning the League Championship Series and quadrupling that of the World Series such that the same number of points are awarded per round, the magnitude of the largest correlation (for the final month, 163-game version) still doesn’t get any bigger than .137, and it’s negative at that. It’s still essentially nothing.
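That points system is simple enough to sketch: one point per Division Series win, two per LCS win, and four for the World Series, so each round hands out the same four points in aggregate (4x1, 2x2, 1x4). Again, the data here are hypothetical stand-ins, not the actual 1995-2008 records.

from statistics import correlation

# Points per series won: each round awards four points in total.
ROUND_POINTS = {"DS": 1, "LCS": 2, "WS": 4}

def playoff_points(rounds_won):
    """rounds_won: the rounds a team won, e.g. ["DS", "LCS"]."""
    return sum(ROUND_POINTS[r] for r in rounds_won)

# Hypothetical stand-ins, not the actual 1995-2008 dataset.
final_month_wpct = [0.650, 0.520, 0.600, 0.710, 0.480, 0.580]
rounds_won = [["DS", "LCS", "WS"], [], ["DS"], [], ["DS", "LCS"], []]

points = [playoff_points(r) for r in rounds_won]
print(f"r = {correlation(final_month_wpct, points):.3f}")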
By and large, these teams that made the playoffs did well over the various intervals in question, winning at a .595 to .601 clip and serving as a reminder that there’s a selection bias at work here: the teams that did very poorly likely missed the postseason, relegating themselves to the dustbin of history. Indeed, just 13 of these 112 teams put up sub-.500 records from September 1 onward, and only two of them, the 1998 Padres and 2008 Brewers, were more than five games below .500 during that stretch run. Even so, six of those 13 teams won their first-round matchups, all six of them won their respective League Championship Series, and three of them won the World Series (the 1997 Marlins, 2000 Yankees, and 2006 Cardinals). Recall that those 2000 Yanks lost 15 of their final 18 games prior to the postseason while being outscored 148-59, exhuming the memory of the 1899 Cleveland Spiders in the process, before flipping the switch and trampling everything in their path to a third straight World Championship.
As well as those teams did over these short stretches, it’s noteworthy that the recent records of the teams that won in the Division Series and the teams that lost are virtually indistinguishable. Over the seven-game split, the two sides’ aggregate records differ by one win across a sample of 784 games (112 teams times seven games), and over the month-long split, the difference is a net of four games. The split between the two grows as the pool of teams decreases in the Championship Series and World Series rounds, but not in the direction we’d expect:
Interval      DS W   DS L   CS W   CS L   WS W   WS L
Final 7       .594   .592   .571   .621   .551   .592
Final 14      .597   .600   .581   .612   .541   .622
Final 21      .598   .604   .578   .617   .558   .599
Final Month   .599   .601   .579   .619   .570   .589
Every single split but the seven-game/Division Series one (the only one from among the first two tables with a positive correlation) shows that the teams that lost the series had a better aggregate record over the recent intervals than the teams that won, again suggesting that there might be some effect of resting the regulars or otherwise regressing down the stretch.
On a team level, recent performance as measured by wins and losses simply isn’t predictive. For further evidence of this, consider a quick-and-dirty study I did in the service of the Hit List this summer, in response to the suggestion of making recent performance a stronger factor in the rankings, conforming to some readers’ perception that the hottest teams at the moment were also the strongest teams overall.
Using the 2008 Hit List, I broke the season up into four-week chunks (“months,” for the purposes of this study) and tested each team’s “monthly” actual, first-order, second-order, and third-order winning percentages, as well as its Hit List Factor (the average of those four percentages), against the following “month’s” actual record. I used these four-week splits because they were easily created from my master Hit List spreadsheet, as I only save the adjusted standings for the days I use to compile the list.
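For what it’s worth, here’s a minimal sketch of that month-over-month test; the team names and winning percentages are made up, standing in for the real Hit List spreadsheet data.

from statistics import correlation

# Hypothetical four-week ("monthly") winning percentages for a few
# teams, standing in for the real Hit List spreadsheet data.
monthly_wpct = {
    "TeamA": [0.600, 0.550, 0.640, 0.580, 0.620],
    "TeamB": [0.450, 0.500, 0.480, 0.520, 0.470],
    "TeamC": [0.560, 0.610, 0.530, 0.590, 0.600],
}

# Pair each "month" with the following "month's" actual record,
# pooling across teams, then correlate the pairs.
this_month, next_month = [], []
for months in monthly_wpct.values():
    for current, following in zip(months, months[1:]):
        this_month.append(current)
        next_month.append(following)

print(f"month-to-next-month r = {correlation(this_month, next_month):.3f}")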
The correlations for “monthly” ____ winning percentage to next “month’s” actual winning percentage:
Indicator         Corr
Actual             .21
First-order        .24
Second-order       .18
Third-order        .17
Hit List Factor    .22
Not much to hang onto there. I then tested the correlation between the various year-to-date winning percentages from those increments and the next month’s actual winning percentage.
Indicator         Corr
Actual            .304
First-order       .289
Second-order      .298
Third-order       .296
Hit List Factor   .312
Though it’s hardly a robust effect, this admittedly slapdash study does support the none-too-controversial idea that a larger sample, such as year-to-date performance, is more useful than recent incremental performance in predicting wins and losses going forward. Even so, as I found last year, when it comes to the playoffs, actual records over the full season aren’t as helpful as Pythagorean ones, which are based on underlying performances that tend to even out across even larger sample sizes.
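For reference, here’s a minimal sketch of the classic Pythagorean expectation in its simplest exponent-2 form; the first- through third-order records mentioned above use more refined inputs and exponents, so treat this as an illustration only.

def pythagorean_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from run differential rather
    than from actual wins and losses."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)

# Example: a team that scores 800 runs and allows 700 projects
# to roughly a .566 winning percentage.
print(f"{pythagorean_wpct(800, 700):.3f}")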
As the postseason unfolds over the next few weeks, you’re going to hear a lot about momentum and its importance to a ballclub, and while it’s undoubtedly a good idea to bear Earl Weaver’s famous maxim in mind (momentum is the next day’s starting pitcher), the take-home message is that the conventional wisdom that a team’s recent performance foreshadows its playoff fate is generally wrong. The fact that there is no shortage of pundits who elevate the 2007 Rockies as their evidence while forgetting the 2006 Cardinals underscores either how little attention some talking heads pay to actual results, or how short their attention spans are.
There may well be reasons to predict that one team has the upper hand in a given series due to the strengths and weaknesses of the various matchups; things like the Phillies’ closer situation and the Dodgers’ rotation jumble will have a very real impact on who gets to play, and on what might arise from that. Even so, the differences between any two teams who make it to the October crapshoot are small enough that the range of outcomes in a short series is almost unlimited, and the effect of recent performance shouldn’t be overemphasized.
By and large, all these are going to be winning teams, and it's pointed out that most of the teams did have winning records in September. But once you get to the post-season, it's a competition between good teams with good records, yet somebody has to lose, by definition. That fact all but ensures a near-zero correlation, mathematically.
On top of that, the measure that late-season record is correlated against ("first-round success"?) is not defined exactly, but if it's just a binary (won/not won) variable, that doesn't give you much variance to work with, which will also hold down correlations. Or, said differently, it can make even a very small correlation meaningful (significant). What looks like "a whole lot of nothing" might actually be something.
Your argument about the correlation being biased towards zero is definitely true, and is good to point out. But I still think it wouldn't really explain why the correlation consistently comes up on the negative side more often than the positive side; that adds further evidence to the argument, which seems to be a very strong one.
Of course somebody has to lose. My primary point is that recent hotness or coldness won't tell you who that is, and my secondary point is that going by won-loss record probably won't tell you either.
"On top of that..."
You're right in that I used first-round success as a binary (won/not won) variable, which decreases the variance; in the rapid-fire rush to get this out, I didn't have postseason series win-loss records easily accessible. Elsewhere in the piece I did use series won, and then a version of "playoff success points" which gave increasing credit for later rounds, and the correlations were still pretty small.
Now I've gone back and gotten those series won-loss records (Seidman with another assist), and can report that the highest correlation between any of those short-interval records and Division Series winning percentage is just .06, and the highest correlation between those intervals and postseason winning percentage or postseason win totals is just .09.
1. We've got another play-in game #163 tomorrow, with a lot of speculation about whether that's going to wear down the ultimate winner to the benefit of the Yankees. What does the data tell us about play-in games? Do teams that make it to the playoffs this way have worse or better records? If worse in the first round (as one would suspect), do they revert to form if they make it through to the CS? And does the NYY choice of a Wed or Thu start make any difference?
2. You hinted at the possible effect of the urgency of the teams' last week or month (e.g., "resting their regulars"). I wonder if you would get different results if you broke your sample into those that had to win in those last games to make it in vs. those that could pretty well coast in.
2. Subject for further research at a later date.
What I am most curious about is the related theory that teams who clinch the pennant "early" or don't play "meaningful games" are at a disadvantage compared to the teams who have a "tough" pennant race. I think this is more what people mean by "momentum" as opposed to recent winning percentage; teams who "have" to win games (and coincidentally do so) are somehow better equipped to handle the "pressure" of October than those who "coasted" to the post-season.
I'd be curious as to how much a team's winning percentage in "meaningless" games (where a team has either clinched or been eliminated) varies from "meaningful" games; or whether teams who play "more" meaningful games have a better record in the post-season than those who coast. The 2007 Rockies (and the present edition) are being bandied about as examples of this theory -- although for some reason no one talks about the fact that winning 21 of 22 "meaningful" and "pressure-filled games" meant absolutely zero in the World Series where the team with "all the momentum" couldn't even manage a single victory.
I agree that it's mostly something for talking heads and lead-with-their-heart fans to talk about -- and I do believe that there is a psychological factor involved, where recent success tends to relax/focus a team. But to suggest that "I'd rather be the Twins, who have had to play for their season, than the Angels, who coasted to the pennant" is ridiculous.
If you think there are any great revelations to be had based upon a sample of seven games between any two teams... well, you haven't been paying very close attention, have you?
I was already seeing, just from eyeballing the numbers, that there was not likely to be much of a correlation. Thanks for saving me some work. Next time maybe I'll just shoot you an email and ask the question.