Perhaps it’s because I’m a fan of (and occasional analyst of) college basketball, but I have paid a lot more attention to schedule strength over the past few seasons. Because of the varying strength of conferences, and the elective nature of non-conference scheduling, you have to take into account the caliber of a team’s opponents in that sport in a way that you don’t in others.
Consider Jay Jaffe’s look at the projected schedule strength of MLB teams. The range was fairly small, just 30 points of winning percentage, from the Marlins at the top to the Cubs at the bottom. That’s enough to make a difference over 162 games (you’d certainly have to feel some sympathy for the task facing the Orioles and Blue Jays), but not nearly the spread you see in the NCAA, NFL, or NBA. The primary impact is on the wild-card races, where the effects of an unbalanced schedule and interleague play can leave contenders for both Wild Cards facing disparate levels of competition. It is a fairness issue when, say, the Marlins (.519) and Diamondbacks (.492) are nominally in the hunt for the same ticket. If a schedule’s strength is worth even a single game in the standings, that’s a problem, and that we’ve accepted this unfairness implicitly as leagues have expanded their size and playoff systems doesn’t make it any more right.
That’s actually not my point today. No, the point today is that while the range of schedule strengths over a period of six months is fairly small, and yet still unfair, the range of schedule strengths over a period of weeks is very wide. While not unfair, it is a critical factor in evaluating short-term performance, especially at the start of the season, one that tends to be ignored in the rush to canonize teams with gaudy records in the early going.
I mentioned the Marlins twice above. They’re projected to play the toughest schedule in baseball this season. However, they’ve started the year with six games against the Nationals and three against the Pirates, along with three each against the Braves and Mets. That’s a weak slate relative to average, and extremely weak relative to their projection. That they began the season 11-4 has as much to do with playing the Nati(o)nals six times as it does their own qualities. There are reasons to be excited about their future, and even their present, but the overreaction to their “hot” start failed to consider just how much of an effect scheduling had on it. (That they lost three straight to the Pirates doesn’t change this analysis at all.)
There’s a similar case in the AL, where the Blue Jays have opened the season 11-5. Projected to play the third-toughest schedule in the majors, they’ve yet to play a division rival, having to date taken on the Tigers, Indians, Twins, A’s, and Rangers. That’s not quite like playing the Nationals 40 percent of the time, but for a team that will get heavy doses of the three best teams in the league, it’s not representative of the challenges to follow. The Blue Jays are 11-5 in no small part because of who they’ve played. Is Ricky Romero this good, or is it that 67 percent of his outings have come against the Mauerless Twins and the powerless A’s? You can’t make any kind of evaluation yet. We have to be willing to acknowledge the effects of opposition strength this early in the season.
Actually, we have to acknowledge it for longer than that. Over two or three weeks, everyone’s schedule is more or less balanced for home and away, for travel purposes, but not for quality of opposition. That doesn’t always balance over a month or even two. It was a year ago that the Cardinals and Marlins got off to so-called “hot” starts that were mostly the function of impossibly weak early-season slates. When the schedules leveled off, so did the two teams’ performances, and they both finished well off of the pace in their respective divisions and in the wild-card race. The Brewers, just to name one example in the other direction, were under .500 when I wrote the piece above, but had played the third-toughest schedule in the game. They played as well from that point to the CC Sabathia trade (29-21, .580) as they did after acquiring the big lefty (41-32, .562).
Schedule strength trumps just about everything in the early going. The Pirates are 9-6, but it’s possible that they haven’t played a single team that will make the postseason, and it’s conceivable that they haven’t played a team that will finish above .500. The Rockies are 6-9, and it’s possible that their entire schedule has consisted of NL playoff teams: the Diamondbacks, Dodgers, Cubs, and Phillies. Even if that’s an exaggeration, this isn’t: the Pirates have yet to play a team that’s clearly better than any team the Rockies have played. So how can we evaluate either team?
Look, I get that I’m being a killjoy on this topic, and if you think I haven’t considered spending two months away from my keyboard just to avoid being this guy, you’re crazy. The reality is, and always will be, that you can’t draw conclusions from four starts, 50 at-bats, a dozen games, or two weeks of baseball. Not only is the game harder than that, but the variability of competition over any small slice of the schedule corrupts all attempts to divine meaning from small samples.
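To put a rough number on how little a two-week record says, here is a minimal sketch that treats every game as an independent coin flip for a true .500 team. That is an assumption for illustration only: it ignores opposition strength, home field, and pitching matchups, and the function name `prob_at_least` is made up for this example.

```python
from math import comb


def prob_at_least(wins: int, games: int, p: float = 0.5) -> float:
    """Binomial tail: chance of winning at least `wins` of `games`
    independent games when each game is won with probability p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# Chance that a true .500 team opens 11-4 or better over 15 games.
print(round(prob_at_least(11, 15), 4))  # 0.0592
```

Under those assumptions, roughly one perfectly average team in 17 would open 11-4 or better on luck alone, before schedule strength is even considered.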
This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.
What I find ironic is how those big rivalries (Reds/Indians around here) always sell out; they even raise ticket prices for that "big" series. Fans who think they are "supporting" their team are actually shooting the team in the foot, by encouraging with their dollars the perpetuation of the behavior that can keep their own team out of the post-season.
But think about what happens next. If you're a top-flight free agent, where do you want to play? In the "large" league, where the money is, or the "small" league? Any self-respecting superstar doesn't want to play in the "lower" league, where the media lights are not as bright. Over time, that sort of dynamic only intensifies the differences in the caliber of play between the leagues, and pretty soon you've just established a de facto minor league, somewhere above AAA, but with the players in the league still aspiring to move up to the "higher" league.
As to the first response above, no, I don't think it's over the top. Which part do you doubt, that 6 games against tougher competition could swing a division? Or that the reason the schedule has persisted that way is because the "rivalries" are so lucrative?
My own preference on all-star game voting is similar to Joe's, but my criterion is "which player would I rather have on my team going forward?"
Q: Have we learned anything yet that would justify modifying our pre-season predictions about how good various players and teams will be this season?
A: No.
Some of the apparent surprises will pan out; some will go away. We don't have any way (yet) of telling which are which, and more data is the only cure for that.
Which isn't to say that I can't *hope*, as a fan, that Josh Johnson and Nelson Cruz are for real, while Erick Aybar's cold start is just random noise...
To respond to the Victor Martinez comment...what we can say is something like "he's played well for a couple of weeks, and combined with his late-season, post-injury play, it appears that he's returned to his established level." We're folding in additional information. It's not just his stat line.
The A's and possibly even the Mariners have at least as good a shot at the AL West as the Angels do.
It really wouldn't surprise me to see some team from the West with 85 wins win the division. It's wiiide open.
If they open the season with the good teams all playing each other, and the bad teams all playing each other, it will look like parity deep into the season, with a greater chance of big surprises and easier storylines.
I'd like to change that over time. To some extent, that involves repeating myself. Not for nothing, but understanding the problems with small sample size is a huge hole in the game of media, fans and many people within the game. It's a fairly important issue.
I mean, let's face it, small sample size is really goddamn boring.
Also, I do believe that you can tell things from watching players in a small number of at-bats. If aging players are late on 89 MPH fastballs, it suggests something that may or may not already be showing up in the stat line. If Scott Baker gives up 7 home runs in 8.2 innings so far this year when he only gave up 20 in 170+ last year, it suggests something might be wrong.
The Marlins' Win% might be meaningless, but some things become apparent quickly.
You've got a point.
Brett Cecil will be up and show you why he is better than any of the young pop-guns the Yankees have trotted out over the past two years.
Another team that might be considered to have overperformed thus far due to schedule structure would be the New York Yankees. I haven't seen it mentioned much, but here's what the Bronx Bombers have faced in 2009:
The Baltimore Orioles
The Kansas City Royals
The last-place Tampa Bay Rays
The last-place Cleveland Indians
The last-place Oakland Athletics
Despite this easy schedule, they're only tied for second place in the AL East by luck. They're 2.6 wins over Pythag, and at a +1.9 win difference with respect to third-order runs scored and allowed, in only 15 games played. Their pitching has allowed 6.47 runs per game, 28th-best of the 30 MLB teams. By actual, second-order, or third-order runs scored and allowed, they'd be fourth-best of the five AL East teams.
Certainly the double-digit win totals of the Marlins and the Blue Jays were good examples to choose for teams that have capitalized on a weak schedule. Neither was projected to do well in 2009. I'd offer that the Yankees, with weak opponents and the second-highest win total in the AL, might be almost as good an example, even if many forecasters and fans had them bound for the playoffs, not for the cellar.
On small sample size, 22-4 will skew the Pythag pretty badly until sometime in May.
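For anyone following the Pythag back-and-forth, here is a minimal sketch of the basic Pythagorean expectation with the classic exponent of 2. The runs-allowed total of 97 is roughly 6.47 per game over 15 games, as quoted above. The runs-scored total of 80 is a made-up placeholder, and the sketch assumes the 22-4 game was a loss in which the team scored 4 and allowed 22.

```python
def pythag_win_pct(runs_scored: float, runs_allowed: float,
                   exponent: float = 2.0) -> float:
    """Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical 15-game totals: 80 scored (placeholder), 97 allowed.
with_blowout = pythag_win_pct(80, 97)

# Same totals with a single 4-22 loss removed (14 games remain).
without_blowout = pythag_win_pct(80 - 4, 97 - 22)

print(f"with the blowout:    {with_blowout:.3f}")
print(f"without the blowout: {without_blowout:.3f}")
```

With these illustrative totals, one game moves the expected winning percentage by about a hundred points, which is the skew being described.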
One could reasonably remove the 15 games these teams have played against the Yankees, reducing the aggregate W-L record to 25-33. That's a .431 record in 58 games. That's still a full eight games below .500 in a bit more than a third of an MLB season. That's still a significantly bad record.
Any of the five teams the Yankees have faced might get better: I'd expect the Indians and Rays, in particular, to do that. Any of them might get worse, too: most folks consider the Royals and the Orioles to have overperformed significantly thus far in 2009. I see it as a tossup. Others' mileage may vary.
I guess that one could challenge the premise of considering any team to be good or bad at this point of the season. Certainly some others as well as you have challenged my doing so. I'd suggest, though, that actual W-L performance over a given stretch of games may be a more accurate metric of team quality--during that particular stretch--than pre-season predictions for 162 games would be. Many well-respected pre-season forecasters miss teams' ultimate performances by eight or more wins on average, an average error equating to the difference between a contender and an also-ran. We can do better: we can use historical records to determine strength of competition, even if we then use that as a tool in estimating a team's true talent, in this case the Yankees' true talent.
If you check the Quality of Pitchers Faced and Quality of Batters Faced, it looks as if Sabathia and, especially, Pettitte have faced easy opposition, while Burnett has faced tough hitters. It looks as if most Yankees position players have faced substandard pitching.
But if it's too early, in your estimation, to consider opposing teams' talent as truly known under any metric, I'll only add that the article about which I was commenting used as a premise that the strength of schedule could be used to judge the relevance of a given team's record, and that I was doing exactly that.
Phrases like "The last-place Tampa Bay Rays" are purposely intended to be as misleading as possible in order to prove your point, and you know that. They are in last place now but (as you can determine by reading Joe's article) that means almost nothing. The only clearly weak teams the Yanks have played are the Orioles and Royals.
Joe's point wasn't: "Teams off to hot starts always regress because of strength of schedule." It was more along the lines of: "DON'T TRY TO FIGURE ANYTHING OUT JUST QUITE YET."
Honestly, citing Pythag records this early in the season?? That just shows as much gross ignorance as the media types concluding that the Marlins are the "real deal" despite the fact that you managed to sneak some disguised statistical analysis in there.
The Yanks have one pitcher, Chien-Ming Wang, who is responsible for 23% of their runs allowed. This is obviously not a trend that will continue and is almost exclusively the reason for the ugly Pyth record. It doesn't at all show their ability to prevent runs as an organization.
Playing your game, I could say that the Yanks have played 9 road games and 6 home games so far, have the game's best player on their DL, and can shave off over a run a game by replacing Wang with Hughes!! But again, it's too early to worry about any of these things.
Replacement Level Yankees Weblog has a post up comparing the Yanks' performance so far to their log5 expected record. Result: they're slightly ahead of expectations, but so little that basically they're where they should be.
You told us that the A's were the 6th best team in baseball, so I'm not sure how that helps support your argument that the Blue Jays have played an easy schedule.
You don't define what constitutes a "small part" (or a large part, or whatever), but it's hard to build an analytical argument that it's much more than a win. I used team strength from BP's depth charts and a little bit of log5, replaced the Tigers, Twins, and Rangers with the Yankees, Red Sox, and Rays respectively, and that's the expectation difference that I came up with: a little more than a win.
The Jays are running hot. That's it. SOS is pretty overblown, especially in this small a sample.
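The log5 approach mentioned above can be sketched as follows. This is not the commenter's actual calculation: the opponent strengths and game counts are made-up placeholders, not BP depth-chart figures, and `expected_wins` is a hypothetical helper.

```python
def log5(a: float, b: float) -> float:
    """Bill James's log5: chance that a team with true winning
    percentage a beats a team with true winning percentage b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def expected_wins(team_pct: float, schedule: list[tuple[float, int]]) -> float:
    """Sum of per-game win probabilities over (opponent_pct, games) pairs."""
    return sum(games * log5(team_pct, opp) for opp, games in schedule)


# Placeholder strengths: three weaker opponents swapped for three stronger ones.
easier = [(0.48, 3), (0.47, 3), (0.46, 3)]
harder = [(0.58, 3), (0.56, 3), (0.55, 3)]

team = 0.500  # assumed true talent of the team being evaluated
swing = expected_wins(team, easier) - expected_wins(team, harder)
print(f"expected-win swing over 9 games: {swing:.2f}")
```

Plugging in real projected strengths and the actual number of games against each opponent would give the sort of "about a win" comparison described in the comment.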
I love the Giants, but: HUH?
Imagine going into a meeting in the Giants Front Office and saying "Hey, marketing guys, please don't try and sell our players as being good because it's only a small sample size. Let's wait until we have a lot more data so we can be honest with our fans and tell them our players suck."
Pshaw!
Hey, that IS cool.
Whether it's strength of schedule or sample size, I think the main problem is a robust sports media system that has to write SOMETHING for the season's first month. And this is also fed by the fans' desire to see success (or failure) in the slightest of details. The fans are entitled to unrealistic expectations, but surely those in the media (who have presumably experienced April baseball before) should do a better job.
Although, as the comments above show, reaching a consensus isn't as easy as it sounds.
This year the April 23 record will probably be less predictive. The reason I say that is that the correlation with last year's records is .04. Got to be an anomaly.
http://www.baseball-reference.com/games/standings.cgi?select=2008&year=2008&month=4&day=24&submit=Find+Games
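A correlation like the one quoted above is typically a Pearson coefficient between two lists of team winning percentages. Here is a minimal sketch. The six-team sample is invented purely to show the mechanics and is not taken from the linked standings.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented winning percentages for six hypothetical teams.
april_pct = [0.733, 0.600, 0.533, 0.467, 0.400, 0.333]
final_pct = [0.519, 0.556, 0.488, 0.531, 0.457, 0.475]

print(f"r = {pearson_r(april_pct, final_pct):.2f}")
```

With only 30 teams and a few weeks of games, a coefficient like this will bounce around a lot from year to year, so a single low value is weak evidence either way.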
Would you rather get the Angels right now while the rotation is in limbo and Vladdy is on the DL or perhaps in August when things may be better for them?
How about a team that's on a roll with their bats all cranking up at the same time such as the Red Sox are doing right now? It's a long season with a lot of ebb and flow.
This is a good point. How well Team A is playing at the time Team B plays them matters more than how Team A plays on average over the entire 162 games.
It also matters where Team A is in its starting rotation when you meet them, and where your rotation is as well. Going up against Team A's #1-#3 starters while you run out your #3-#5 starters will be tougher than going up against their #3-#5 starters while you roll out your #1-#3 starters.
It's like a football team with two QBs: the team playing with Tom Brady might be tougher to beat than the same team with Cassel at QB.
Teams facing the Mariners, Blue Jays and the Royals may find the going tough now when they are playing well, but those teams who face them for the first time in July or August may find them easier to handle. They may meet these teams the same number of times, but one team meets them when they are hot, and another team meets them when they are cold or are playing at the level one would expect.
It should even out over the long run, but maybe not over a 162-game schedule. That's worth studying.