Crooked Numbers: More on the Lineup

Last week’s column about lineup order optimization generated a greater
response than I anticipated, especially for one with such loose
conclusions, so I’m going to dig a little deeper into the topic. So
far, the only application of the lineup program has been checking
various basic ideas–like sorting by descending or ascending AVG, OBP,
and SLG and bunching of the better hitters–but there’s a bit more that
can be done before adding some more enhancements to the program to see
if we can attempt to adjust for baserunning, steals, and platoons.

One of the more interesting questions left unanswered last week was
just how important sorting by OBP or SLG is. By using two lineups for
each metric–one in ascending order and one in descending order–it
was clear that players with higher OBP and SLG should be near the top of
the order. Sorting in exactly the wrong direction changed the lineup
output by only 26 runs at the OBP mean and 13 at the SLG mean. Considering
the sample size and the standard deviations, the results were close to
statistically significant, but the confidence was not high. Thus, we
could only loosely conclude that OBP is more important than SLG when
determining a lineup order when all other factors are equal.

What was not addressed was the fact that teams often have to make the
choice between the two. It’s easy to choose to bat a player with a
.260/.330/.500 line earlier than a player with a .260/.330/.400 line,
but things become a little muddled when comparing something like
.260/.310/.500 to .260/.360/.380.

To begin to take a look at that question, I put together a new team,
but to keep things simple, this team only has three players. First up is
Wily Mo Pena, the resident high-SLG, low-OBP sample
point with a 2004 line of .259/.316/.527. Pena was the only player last
year who slugged at least .500 with an OBP lower than .320 in at
least 300 PAs. Congratulations, Wily. Next is Luis
Castillo, selected for his .291/.373/.348 performance last
year. Castillo’s OBP outpaced his SLG by one of the largest differences
in the league, thus making him the perfect candidate for the high-OBP,
low-SLG slot. Finally, we’ll plug the last hole with Morgan
Ensberg, who comes in at an impressively league-average
.275/.330/.411. Though he is a little shy in the power department,
Ensberg makes a nice “this porridge is just right” player between Pena
and Castillo.

Each of these players was given three spots in the lineup and then all
possible lineup combinations of these three players were run through the
program (which runs each lineup through 1,000 seasons), giving us a
sample size of well over a million seasons by the time things are all
finished. The program outputs a minimum, mean, and maximum for each
lineup. I also saved the full results for the first 50 lineups to
check standard deviations, all of which were between 39 and 41 runs. Of
all the lineups, the highest mean runs scored was 834; the lowest mean
was 816. Despite testing every possible combination with these three
players, the range of means over the entire sample was 18 runs. There’s
just not that much difference.
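
As a quick check on the "well over a million seasons" figure, here is a minimal sketch of the arithmetic. It assumes that "all possible lineup combinations" means every distinct batting order of nine slots filled by three copies each of three players; the lineup program itself is not shown in this column, so `distinct_lineups` is a hypothetical helper, not part of that program.

```python
from math import factorial


def distinct_lineups(copies_per_player):
    """Count distinct batting orders when some hitters are identical.

    This is the multinomial coefficient n! / (k1! * k2! * ...), where
    n is the number of lineup slots and each k is how many slots a
    given player occupies. (Hypothetical helper for illustration.)
    """
    total_slots = sum(copies_per_player)
    count = factorial(total_slots)
    for copies in copies_per_player:
        count //= factorial(copies)
    return count


# Three players, three lineup spots each (from the column).
lineups = distinct_lineups([3, 3, 3])

# The column states each lineup is run through 1,000 seasons.
seasons_per_lineup = 1000
total_seasons = lineups * seasons_per_lineup

print(lineups)        # 1680
print(total_seasons)  # 1680000
```

Under that assumption there are 1,680 distinct lineups and 1.68 million simulated seasons, which squares with the figure quoted above.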

Still, 18 extra runs can be hard to come by when shopping for
players, so it’s still worth looking into a little more deeply. For each
player, I’ve averaged how many runs the team scored when they were in a
given lineup spot. Here’s what we’ve got:

While the range above is very small, the sample is large
enough to draw a few conclusions. First, notice how Pena
and Castillo are extremely divergent in the #1, #3, and #4 spots in the
order, but are almost equal in the #2 and #5 spots. Having a high-OBP
player in the top spot maximizes run scoring, but the advantage of OBP
is quickly lost to SLG, perhaps as early as the second spot in the
lineup. On-base percentage comes back with a vengeance in the bottom
four spots. Ensberg–the average player data point–appears to outpace
both high-OBP and high-SLG towards the bottom of the lineup, but I
wonder how much of that is simply the fact that he’s not as good a
hitter as the other two; the apparent boost in run scoring when he’s at
the bottom of the order may simply be a result of Pena and Castillo
getting more plate appearances when they’re at the top of the order.
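
The per-slot averaging described above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not the author's code: `mean_runs_by_slot` is a hypothetical name, and the input here is just the five best and worst lineups (with their 834 and 816 means) that the column lists further down, not the full set of lineups, so the numbers it produces are not the column's per-slot averages.

```python
from collections import defaultdict

C, P, E = "Castillo", "Pena", "Ensberg"

# Only the five lineups whose means the column reports explicitly.
reported = {
    (C, C, P, P, P, C, E, E, E): 834,  # Lineup 1
    (C, C, P, P, P, E, C, E, E): 834,  # Lineup 2
    (C, C, P, P, P, E, E, E, C): 834,  # Lineup 3
    (P, C, C, C, E, E, E, P, P): 816,  # Lineup 4
    (E, P, C, C, E, E, P, P, C): 816,  # Lineup 5
}


def mean_runs_by_slot(results):
    """Average team runs over every lineup with a player in a slot.

    `results` maps a 9-tuple of player names to that lineup's mean
    runs. Returns {(player, slot): average}, with slots numbered 1-9.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for lineup, runs in results.items():
        for slot, player in enumerate(lineup, start=1):
            totals[(player, slot)] += runs
            counts[(player, slot)] += 1
    return {key: totals[key] / counts[key] for key in totals}


averages = mean_runs_by_slot(reported)
print(averages[(C, 1)])  # Castillo leading off
print(averages[(P, 1)])  # Pena leading off
```

Fed the complete set of simulated lineups instead of these five, the same function would yield one average per player per lineup spot.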

Looking at the best and worst performing lineups confirms a little of
this. Here are the three lineups that mustered the maximum 834-run mean:


Pos Lineup 1    Lineup 2    Lineup 3
------------------------------------
#1: Castillo    Castillo    Castillo
#2: Castillo    Castillo    Castillo
#3: Pena        Pena        Pena
#4: Pena        Pena        Pena
#5: Pena        Pena        Pena
#6: Castillo    Ensberg     Ensberg
#7: Ensberg     Castillo    Ensberg
#8: Ensberg     Ensberg     Ensberg
#9: Ensberg     Ensberg     Castillo

And the two that notched the minimum 816:


Pos Lineup 4    Lineup 5
------------------------
#1: Pena        Ensberg
#2: Castillo    Pena
#3: Castillo    Castillo
#4: Castillo    Castillo
#5: Ensberg     Ensberg
#6: Ensberg     Ensberg
#7: Ensberg     Pena
#8: Pena        Pena
#9: Pena        Castillo

From this small sample, Pena’s power in the fifth spot looks to
slightly outweigh his value in the second spot. Castillo still finds his
way towards the top of the lineup in the first of the worst lineups, but
the biggest difference between the worst and best lineups is the
presence of a couple of Wily Mos at the bottom of the order. One other
interesting point to note is that Lineup 3 and Lineup 4 both have the
same bunching; they just happen to start at different points. In this
example, bunching of high-SLG or high-OBP hitters does not appear to
have a significant effect on run scoring.
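
The bunching comparison between Lineup 3 and Lineup 4 can be made concrete with a small sketch. It treats the batting order as a loop (the ninth hitter is followed by the first) and reports consecutive runs of the same player. `cyclic_runs` is a hypothetical helper written for this illustration, and this is only one reasonable definition of "bunching."

```python
C, P, E = "Castillo", "Pena", "Ensberg"

lineup_3 = [C, C, P, P, P, E, E, E, C]
lineup_4 = [P, C, C, C, E, E, E, P, P]


def cyclic_runs(lineup):
    """Return (player, run_length) pairs, treating the order as a loop.

    The lineup is first rotated to begin where one player's run ends
    and another's begins, so a run that wraps from the #9 spot to the
    #1 spot is counted as a single bunch.
    """
    n = len(lineup)
    # lineup[i - 1] with i == 0 wraps around to the last hitter.
    start = next(
        (i for i in range(n) if lineup[i] != lineup[i - 1]), None
    )
    if start is None:  # every slot holds the same player
        return [(lineup[0], n)]
    rotated = lineup[start:] + lineup[:start]
    runs = []
    for player in rotated:
        if runs and runs[-1][0] == player:
            runs[-1][1] += 1
        else:
            runs.append([player, 1])
    return [(player, length) for player, length in runs]


runs_3 = cyclic_runs(lineup_3)
runs_4 = cyclic_runs(lineup_4)
print(runs_3)
print(runs_4)
```

In both lineups each player occupies a single block of three consecutive spots once the wrap-around is counted. The blocks start at different spots and follow one another in a different order, yet the two lineups land at opposite ends of the run-scoring range.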

It’s rare for a team to have three Penas or Castillos, so in another
effort to see where their particular talents are best suited, I ran
nine lineups for each player: eight average hitters plus Pena or
Castillo, batting in each of the nine positions in turn. Here’s how they
shook out:

Pena shows a great deal more range in his results than Castillo,
peaking in the three and four spots, as expected from the previous
results. Interestingly, this result appears even without the typical
poor hitters at the bottom of a lineup. Most of the criticism of putting
a slugger towards the top of the lineup centers around the reduced
number of runners on base in front of him, but the results
here seem to indicate that the advantage is something else, perhaps the
right combination of leading off the first inning with a better OBP, but
still getting the slugger the maximum number of plate appearances. While
Castillo’s top production is in the first spot, he shows far less change
as he moves down the lineup.

So where does this leave us? Remember that we’re dealing with a very
small range of possible outcomes, meaning that many of the conclusions
drawn from these results cannot be considered statistically significant.
That said, when teams have a choice between a high-SLG, low-OBP player
like Pena and a high-OBP, low-SLG player like Castillo, the traditional
lineup structure with Castillo towards the top and Pena in the 3-5 spots
yields near maximum run scoring. Though it may be ideal to bat
baseball’s best hitters–those who are among the league leaders in both
OBP and SLG–towards the top of the lineup, teams that are forced to
choose between high OBP and SLG appear to already be following a
near-optimal model for maximizing run scoring.

James Click
