Lies, Damned Lies: Running down SOB

Reader Mike Mitchell writes:

Watching Jose Reyes in Fox’s Mets/Cubs Flat Earth Society Game of the Week, I couldn’t help but wonder if modern sabermetrics does a fair job accounting for low-OBP, high-speed players like Reyes, who score in a higher percentage of their times on base than slower players….

Is it possible the next five years could bring a new statistic, call it Speed-adjusted OB% (fittingly, SOB), that would take Jose Reyes’s .304 OBP, factor in his ability to turn his baserunning speed/savvy (other than mere base stealing success) into additional runs for the Mets offense, and come up with a speed-adjusted .329 SOB, meaning he contributes the same run-scoring ability to the Mets offense as an average baserunner with a .329 OBP?

A lot of times you’ll hear the case made that OBP undervalues a player like Jose Reyes or Carl Crawford because it doesn’t account for their baserunning ability. This is a perfectly reasonable argument. Getting on base, as Mike intimates, is not the goal. Rather, getting on base is a means to an end, that end being scoring runs. But running the bases well is also a means to that end. If Bill Mueller gets on base five percent more often than Scott Podsednik, but Podsednik scores 10 percent more often than Mueller those times that he does reach base, which player is the more valuable run-generator?

Actually, this question isn’t as straightforward as it might seem. Much of the value in reaching base is really in avoiding outs. Stealing second base once you’ve reached first, or scoring from second on a double when a slower runner might have held up: these are valuable skills. But they aren’t as important as reaching base in the first place, which not only puts a runner on base for the team to work with, but also preserves one of its irreplaceable 27 outs.

What would be useful, as Mike suggests, is a way to quantify baserunning contributions in terms of OBP. Why OBP and not, say, slugging percentage? Because OBP is essentially a measure of setting oneself up to score runs, whereas SLG is a measure of driving runs in. Compare, say, a single and a walk. Each of these outcomes makes it just as likely that you’re going to score yourself, but the former does more to advance other runners along; it makes sense that a single improves a player’s slugging percentage while a walk does not. Similarly, compare a single and a subsequent stolen base to a double. Either way, the batter winds up on second base, but the double does more to drive other runners in. The notion that baserunning ability belongs in the same category as on-base ability is a sensible one.

The way to attack this problem is to think in terms of times reaching base (TRB). Is a stolen base 80% as valuable as a TRB? Is it 40% as valuable? For purposes of this column, by the way, we’re going to think of TRB as specifically referring to reaching first base with the bases empty. This allows us to sidestep a nasty little ambiguity of the OBP/SLG system: a leadoff double does more to facilitate scoring runs than a leadoff single, even if there are no other runners on base to advance. There’s an argument that, if OBP is a measure of the runner’s ability to set himself up to score, then a double should be given more credit than a single in terms of OBP. But at least the double is reflected in the SLG side of the OBP/SLG dyad, whereas a stolen base disappears into the ether. Similarly, with a runner on second base as opposed to the bases empty, a single becomes proportionately more valuable than a steal. But this is presumably what’s reflected in the SLG side of the equation–driving other players toward a run. It might be worthwhile, at some point in the future, to try and redefine OBP and SLG entirely, but that effort is beyond the scope of this article, and we’ll keep things simple by defining a TRB in the narrow way that I have above.

With that caveat in mind, let’s take a look at some key figures from the 2004 Expected Runs Matrix. These are the major league average run scoring figures that resulted in 2004 given various numbers of outs, and various numbers of runners on base:


    [A]: Zero outs, bases empty: .538 runs/inning
    [B]: One out, bases empty: .287 runs/inning.
    [C]: Zero outs, runner on first: .926 runs/inning.
    [D]: Zero outs, runner on second: 1.160 runs/inning.

When the leadoff hitter comes up to start the inning, his team will score an average of .926 runs if he reaches first base [C], but .287 runs if he doesn’t [B], for a difference of .639 runs. If he reaches first base and subsequently manages to steal second, his team’s run scoring expectation increases from .926 runs [C] to 1.160 runs [D], a difference of .234 runs. We can define the value of a stolen base in terms of TRB as:


    [D] - [C]     1.160 - .926     .234
    ---------  =  ------------  =  ----  =  .366
    [C] - [B]      .926 - .287     .639

In other words, for the leadoff hitter, a stolen base is worth about 36.6% of a TRB; twenty steals are about as valuable as seven walks. Continuing to use figures from the 2004 Expected Runs Matrix, if we change the situation slightly, and assume that the leadoff hitter makes an out, a TRB for the #2 hitter is worth .436 runs, while a stolen base is worth .160 runs, or 36.7% as much. With two outs and nobody on, a TRB is worth .246 runs, and a stolen base .090 runs, or 36.6% as much. It’s probably just a coincidence that these ratios come out as close to one another as they do–if we looked at the 2005 Expected Runs Matrix instead, they wouldn’t work out quite so evenly–but we’ll define a stolen base as equal to .366 of a TRB.
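The ratio above is easy to reproduce. What follows is only a minimal sketch in Python: the function and variable names are my own for illustration, and the only inputs are the 2004 run expectancy figures and run differences quoted in this column.

```python
# 2004 run expectancy figures quoted in the text (runs per inning).
RE_ONE_OUT_EMPTY = 0.287     # [B]
RE_ZERO_OUT_FIRST = 0.926    # [C]
RE_ZERO_OUT_SECOND = 1.160   # [D]


def steal_value_in_trb(trb_runs, sb_runs):
    """Express the run value of a steal as a fraction of a TRB's run value."""
    return sb_runs / trb_runs


# Leadoff hitter: derive both run values from the matrix entries.
leadoff_trb_runs = RE_ZERO_OUT_FIRST - RE_ONE_OUT_EMPTY    # [C] - [B]
leadoff_sb_runs = RE_ZERO_OUT_SECOND - RE_ZERO_OUT_FIRST   # [D] - [C]
leadoff_ratio = steal_value_in_trb(leadoff_trb_runs, leadoff_sb_runs)

# One-out and two-out cases, using the run differences stated in the text.
one_out_ratio = steal_value_in_trb(0.436, 0.160)
two_out_ratio = steal_value_in_trb(0.246, 0.090)

for label, ratio in [("0 out", leadoff_ratio),
                     ("1 out", one_out_ratio),
                     ("2 out", two_out_ratio)]:
    print(f"{label}: a steal is worth {ratio:.3f} TRB")
```

All three ratios land within a thousandth or so of one another, which is why a single .366 weight is used for every situation.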

What about a caught stealing? Given the way that we’ve defined a TRB, this question is easy to answer: a caught stealing exactly nullifies the value of a TRB, and so it is worth exactly -1 TRB. This implies, by the way, a breakeven stolen base success rate of about 73%. This is materially higher than the .67 (two-thirds) figure that is commonly cited. Without getting into too much of a tangent, there are a couple of reasons to think the higher figure is valid:

Where the two-thirds number is cited, it has generally been developed from an evaluation of a long cross-section of baseball history, during most of which the run-scoring environment was not as favorable as it is now. When it is harder to score runs, attempting to steal a base becomes a more worthwhile risk. On the other hand, you need to be more certain of success when there are several players behind you who could hit a double or a home run, scoring you from first.
The two-thirds figure has also generally been developed from regression analysis, which looks at various independent variables and tries to determine their relation to scoring runs. The problem is that, since there are few conventional measures of baserunning ability apart from stealing bases, baserunning is not adequately represented in the independent variables. Because the players who steal a lot of bases are also likely to be proficient baserunners, the regression analysis will pick up some residual baserunning ability in the SB column. This might actually be a favorable result if we couldn’t otherwise quantify the value of baserunning, but now that we’ve developed some measures that can quantify baserunning–we’ll get to this in just a second–it’s best to keep stolen bases and baserunning separate.
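For what it's worth, the 73% breakeven figure falls straight out of the two weights defined above (+.366 TRB for a steal, -1 TRB for a caught stealing). Here is a minimal Python sketch; the function name is mine, and it assumes a basestealer breaks even when the expected TRB value of an attempt is exactly zero.

```python
SB_VALUE_TRB = 0.366   # value of a successful steal, in TRB (from the text)
CS_VALUE_TRB = -1.0    # value of a caught stealing, in TRB (from the text)


def breakeven_success_rate(sb_value, cs_value):
    """Solve p * sb_value + (1 - p) * cs_value = 0 for the success rate p."""
    return -cs_value / (sb_value - cs_value)


rate = breakeven_success_rate(SB_VALUE_TRB, CS_VALUE_TRB)
print(f"Breakeven success rate: {rate:.1%}")
```

Plugging in a steal weight of .5 instead would give the familiar two-thirds figure, so the gap between 67% and 73% comes down entirely to how much a steal is judged to be worth relative to the out it risks.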

So, for example, we might look at Endy Chavez, who reached base 170 times in 535 opportunities last year (plate appearances less sacrifice hits), for a paltry .318 OBP. Chavez also stole 32 bases last year, equivalent to 11.7 TRB, while being caught 7 times (-7 TRB). His stolen base-adjusted OBP (SBOBP) can be defined as:


    H + BB + HBP + (SB * .366) - CS     139 + 30 + 1 + (32 * .366) - 7
    -------------------------------  =  ------------------------------
                PA - SH                            547 - 12

This works out to .327, not quite ten points better than Chavez’ conventional OBP, and certainly not enough to make him an adequate hitter.
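The Chavez calculation can be checked with a few lines of Python. This is just a sketch of the formula as written above; the function name and argument names are my own, and the inputs are the Chavez totals given in the text.

```python
def sb_adjusted_obp(h, bb, hbp, sb, cs, pa, sh, sb_weight=0.366):
    """Stolen base-adjusted OBP: credit each steal with sb_weight TRB
    and charge each caught stealing a full TRB."""
    times_on_base = h + bb + hbp
    return (times_on_base + sb * sb_weight - cs) / (pa - sh)


def conventional_obp(h, bb, hbp, pa, sh):
    """OBP as defined in this column: times on base over PA less SH."""
    return (h + bb + hbp) / (pa - sh)


chavez_obp = conventional_obp(h=139, bb=30, hbp=1, pa=547, sh=12)
chavez_sbobp = sb_adjusted_obp(h=139, bb=30, hbp=1, sb=32, cs=7, pa=547, sh=12)
print(f"OBP {chavez_obp:.3f}, SBOBP {chavez_sbobp:.3f}")
```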

As I’ve alluded to, however, there is another part of the equation to consider: baserunning ability above and beyond stealing bases. Developing an effective measure of baserunning ability is a complicated matter–you need to consider the base-out situations very precisely, while also accounting for things like the arm strength of the outfielders. Fortunately, James Click has done all of the heavy lifting in his baserunning article for Baseball Prospectus 2005, which takes all of these things into account, and boils them down into Equivalent Baserunning Runs (EqBR). (We hope to have regularly updated EqBR reports up on the site very soon.)

EqBR measures everything in terms of runs relative to league average. In order to come up with our SOB formula (I like Mike’s suggestion for the name of this statistic), we need to translate back into times reaching base. Returning to the Expected Runs Matrix, we find that a TRB is worth .639 runs with nobody out and nobody on, .436 runs with one out and nobody on, and .246 runs with two outs and nobody on. The average of these numbers is .440, or slightly less than half a run. To be more precise, a marginal baserunning run is worth 2.27 times (1/.440) as much as a TRB.
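The runs-to-TRB conversion is a simple average and a reciprocal. A quick Python sketch, using only the three TRB run values quoted above (the variable names are mine):

```python
# Run value of a TRB with the bases empty and 0, 1, and 2 outs (from the text).
trb_run_values = [0.639, 0.436, 0.246]

# Unweighted average across the three out states, as in the text.
avg_trb_runs = sum(trb_run_values) / len(trb_run_values)

# How many TRB one run is worth.
runs_to_trb = 1 / avg_trb_runs

print(f"Average TRB value: {avg_trb_runs:.3f} runs")
print(f"One run is worth about {runs_to_trb:.2f} TRB")
```

Note that this is a straight average of the three out states, which is how the figure is derived here; weighting each state by how often it actually occurs would be a different (and arguably finer) assumption.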

That gives us our formula for SOB, or Speed-Adjusted OBP.


            H + BB + HBP + (SB * .366) - CS + (EqBR * 2.27)
  SOB =     -----------------------------------------------
                              PA - SH

Here, for example, is Tony Womack’s figure; Womack had 26 steals last year, was caught 5 times, and produced 3.04 EqBR.


            170 + 36 + 3 + (26 * .366) - 5 + (3.04 * 2.27)
  SOB =     ---------------------------------------------- = .369
                              606 - 8

Womack’s SOB is .369, which compares favorably to his conventional OBP of .349.
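To see where Womack's extra points come from, it helps to split the adjustment into its stolen base and EqBR pieces. The Python below is a sketch of the SOB formula as given above; the function name, the returned dictionary, and its keys are my own illustration, and the inputs are the Womack totals from the text.

```python
def sob_breakdown(h, bb, hbp, sb, cs, eqbr, pa, sh,
                  sb_weight=0.366, runs_to_trb=2.27):
    """Return conventional OBP, the two baserunning adjustments, and SOB."""
    opportunities = pa - sh
    obp = (h + bb + hbp) / opportunities
    steal_adj = (sb * sb_weight - cs) / opportunities   # net steals, in OBP points
    eqbr_adj = (eqbr * runs_to_trb) / opportunities     # other baserunning
    return {
        "obp": obp,
        "steal_adj": steal_adj,
        "eqbr_adj": eqbr_adj,
        "sob": obp + steal_adj + eqbr_adj,
    }


womack = sob_breakdown(h=170, bb=36, hbp=3, sb=26, cs=5, eqbr=3.04,
                       pa=606, sh=8)
for key, value in womack.items():
    print(f"{key:>9}: {value:+.3f}")
```

For Womack, roughly eight points come from net base stealing and about twelve from the rest of his baserunning, which is a reminder that the EqBR term can matter more than the steals themselves.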

Let’s take a look at a couple of leaderboards for the 2004 season. First, here are the fifteen major league regulars (minimum 500 PA) who benefitted the most from the baserunning adjustments:


Best Baserunners, 2004

Player          SB  CS  EqBR     OBP     SOB      +/-
Carlos Beltran  42  3   +3.9    .367    .398    +.030
Rafael Furcal   29  6   +4.9    .344    .370    +.025
Scott Podsednik 70  13  +2.2    .313    .338    +.025
Vernon Wells    9   2   +4.5    .337    .357    +.019
Tony Womack     26  5   +3.0    .349    .369    +.019
Ryan Freel      37  10  +3.4    .375    .394    +.019
Aaron Rowand    17  5   +3.9    .361    .380    +.019
Bobby Abreu     40  5   +1.5    .428    .446    +.018
Carl Crawford   59  15  +2.5    .331    .349    +.018
Endy Chavez     32  7   +2.1    .318    .335    +.018
Eric Byrnes     17  1   +2.3    .347    .363    +.017
Omar Vizquel    19  6   +3.7    .353    .368    +.015
Scott Rolen     4   3   +4.5    .409    .424    +.015
Mike Cameron    22  6   +2.7    .319    .334    +.015
Royce Clayton   10  5   +4.4    .338    .351    +.014

For the most part, these are exactly the names that you’d expect to see, though there are a couple of players like Scott Rolen who aren’t prodigious base stealers, but are very efficient baserunners. A pretty good baserunner might be worth 15 or 20 points of OBP more than his conventional stats would indicate; a truly elite baserunner like Podsednik or Carlos Beltran could be worth some 25 or 30 points of OBP more.

The very best season I could find using this method was Rickey Henderson’s 1985; Henderson stole 80 bases that year, was caught 10 times, and produced 8.10 EqBR. That gives him an SOB of .476, fully 57 points better than his already impressive .419 OBP. That underscores just how great a player Henderson was, but his season also appears to be something of an outlier; there were few other seasons that crept above the 30 point threshold, and nobody was in the neighborhood of Rickey’s 57-point gain.

Conversely, here are the players whose baserunning hurt them the most in 2004:


Worst Baserunners, 2004

Player          SB  CS  EqBR     OBP     SOB      +/-
A.J. Pierzynski 0   1   -4.4    .319    .297    -.022
Jorge Posada    1   3   -3.7    .400    .380    -.020
Mike Piazza     0   0   -4.7    .362    .342    -.020
Jim Thome       0   2   -3.6    .396    .380    -.017
Sammy Sosa      0   0   -3.4    .332    .318    -.014
Jacque Jones    13  10  -1.5    .315    .301    -.014
Carlos Delgado  0   1   -3.0    .372    .358    -.014
Rafael Palmeiro 2   1   -3.8    .359    .346    -.014
Mike Lieberthal 1   1   -2.6    .335    .323    -.012
Lance Berkman   9   7   -2.0    .450    .438    -.012
Manny Ramirez   2   4   -2.0    .397    .385    -.012
Jose Guillen    5   4   -2.1    .352    .340    -.011
Jose Cruz Jr.   11  6   -2.3    .333    .322    -.011
Phil Nevin      0   0   -3.0    .368    .357    -.011
Derrek Lee      12  5   -2.7    .356    .346    -.010

The bad baserunners don’t appear to hurt quite as much as the good baserunners help. In any event, this list is chock full of aging catchers and first basemen, as we’d anticipate. Brian Sabean’s unceremonious dumping of A.J. Pierzynski looks a little bit better now. As for Derrek Lee–who is a big first baseman but has a reputation as a good baserunner–well, these figures do not compensate for the Wendell Kim factor.


A comparison of the 2004 OBP and SOB leaderboards:

                     OBP                         SOB
Barry Bonds         .609    Barry Bonds         .614
Todd Helton         .469    Todd Helton         .477
Lance Berkman       .450    J.D. Drew           .448
J.D. Drew           .436    Bobby Abreu         .446
Bobby Abreu         .428    Lance Berkman       .438
Melvin Mora         .419    Scott Rolen         .424
Jim Edmonds         .418    Ichiro Suzuki       .418
Albert Pujols       .415    Albert Pujols       .417
Ichiro Suzuki       .414    Melvin Mora         .416
Travis Hafner       .410    Travis Hafner       .416
Scott Rolen         .409    Jim Edmonds         .413
Jorge Posada        .400    Eric Chavez         .407
Jason Kendall       .399    Vladimir Guerrero   .401
Eric Chavez         .397    Jason Kendall       .398
Manny Ramirez       .397    Carlos Beltran      .397
Jim Thome           .396    Hideki Matsui       .397
Erubiel Durazo      .396    Adam Dunn           .396
Gary Sheffield      .393    Ryan Freel          .394
Mark Loretta        .391    Gary Sheffield      .393
Vladimir Guerrero   .391    Johnny Damon        .390

Baserunning should have its place in sabermetric analysis, and it’s nice to see someone like Ryan Freel, who also ranks very well in the preliminary 2005 SOB figures that I’ve prepared (see below), get his due. In a few marginal cases, like that of Podsednik, baserunning might even be enough to warrant placing a player in the lineup when he otherwise should not be. For the most part, however, baserunning and base stealing ability just don’t make a large enough difference to change our conclusions about who the good offensive players are and who the bad ones are.

Finally, presented without further comment, the best and worst baserunners of the 2005 season to date (minimum 300 PA):


Best Baserunners, 2005

Player          SB  CS  EqBR    OBP     SOB       +/-
Carl Crawford   34  5   +2.7    .317    .343    +.026
Nook Logan      20  5   +2.6    .299    .325    +.026
Mike Cameron    13  1   +2.2    .341    .367    +.026
Ryan Freel      29  7   +1.7    .381    .405    +.024
J.J. Hardy      0   0   +2.7    .303    .323    +.020
Coco Crisp      12  5   +4.3    .341    .361    +.020
Bobby Kielty    3   2   +3.5    .356    .375    +.019
Bill Hall       13  1   +1.4    .311    .329    +.018
Edgar Renteria  8   3   +3.9    .342    .360    +.018
Marcus Giles    13  3   +2.8    .383    .400    +.017
Julio Lugo      30  5   +1.2    .354    .371    +.017
Ruben Gotay     2   2   +2.9    .284    .301    +.017
Jose Reyes      41  10  +1.6    .296    .312    +.016
Alfonso Soriano 18  2   +1.4    .327    .343    +.016
Mark Teahen     5   1   +1.8    .302    .317    +.015


Worst Baserunners, 2005

Player              SB  CS  EqBR    OBP     SOB       +/-
Matt Lawton         16  9   -4.3    .380    .351    -.029
Phil Nevin          1   0   -4.0    .301    .273    -.028
Chris Snyder        0   1   -2.9    .297    .272    -.025
Mike Cuddyer        3   3   -2.7    .341    .317    -.024
Sammy Sosa          1   1   -3.8    .303    .280    -.023
Brad Ausmus         3   2   -2.7    .328    .305    -.023
Freddy Sanchez      0   1   -2.6    .330    .307    -.023
Cesar Izturis       7   8   -1.6    .306    .286    -.020
Frank Catalanotto   0   2   -1.8    .353    .335    -.018
Jason Phillips      0   1   -2.4    .291    .273    -.018
Melvin Mora         7   3   -3.4    .343    .325    -.018
Hank Blalock        1   0   -4.2    .338    .321    -.017
Daryle Ward         0   2   -2.1    .328    .311    -.017
Omar Vizquel        19  9   -2.6    .334    .317    -.017
Kevin Millar        0   1   -2.5    .357    .341    -.016

Nate Silver