I have been spending quite a bit of time lately reading The Book by Tom Tango, and I would like to share some of what I have learned about lineup optimization. In particular, I'd like to look at how it applies to the Toronto Blue Jays. Before I do that, though, I will provide some background.
Please keep in mind that all of the numbers below were taken from The Book, and since it was published in 2007, some of the data may no longer be completely relevant. I have chosen to focus solely on the American League, since with the Blue Jays there is no need to deal with the complications of pitchers batting anyway.
24 Base/Out States
Lineup optimization relies heavily on the 24 base/out states in baseball - all the different combinations of base runners and outs that are possible in any given inning. There are eight possible arrangements of runners on the three bases and three possible out counts (0, 1 or 2), which gives 8 x 3 = 24 states. Nobody on base with 0 outs and the bases loaded with 2 outs are two examples.
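As a quick illustration of where the number 24 comes from, here is a minimal Python sketch that enumerates every base/out state. The helper name `describe` and the wording of its labels are my own choices for illustration, not anything taken from The Book.

```python
from itertools import product

# Each of the three bases is either empty (False) or occupied (True),
# which gives 2**3 = 8 runner arrangements.
base_configs = list(product([False, True], repeat=3))
out_counts = [0, 1, 2]

# A base/out state is one runner arrangement paired with one out count.
states = [(bases, outs) for bases in base_configs for outs in out_counts]


def describe(bases, outs):
    """Return a readable label for a state (hypothetical helper)."""
    occupied = [name for name, on in zip(("1st", "2nd", "3rd"), bases) if on]
    runners = "bases empty" if not occupied else "runner(s) on " + ", ".join(occupied)
    return f"{runners}, {outs} out"


print(len(states))  # 24
for bases, outs in states:
    print(describe(bases, outs))
```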
The Importance of Context
First, as context to our context, let's look at plate appearances (PA) per game, by batting order.
Batting Order Position (BOP) | PA/Game |
1 | 4.83 |
2 | 4.72 |
3 | 4.61 |
4 | 4.49 |
5 | 4.39 |
6 | 4.26 |
7 | 4.14 |
8 | 4.02 |
9 | 3.90 |
In this data, we can see that each spot in the batting order has about .11 more PA per game than the next. This equates to approximately 2.5% more PA per game. This will become important later.
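To see how that gap plays out slot by slot, here is a small Python sketch that works directly from the PA/Game column. The variable names are mine, and I am assuming the percentage is measured against the next slot down. The exact figure shifts a little depending on which slot you use as the base, which is why it is best read as a rough 2.5%.

```python
# PA per game for batting order positions 1 through 9 (AL, from the table).
pa_per_game = [4.83, 4.72, 4.61, 4.49, 4.39, 4.26, 4.14, 4.02, 3.90]

# Gap between each slot and the slot directly below it.
gaps = [upper - lower for upper, lower in zip(pa_per_game, pa_per_game[1:])]
avg_gap = sum(gaps) / len(gaps)

# The same gap expressed as a percentage of the lower slot's PA.
pct_gaps = [100 * gap / lower for gap, lower in zip(gaps, pa_per_game[1:])]

for slot, (gap, pct) in enumerate(zip(gaps, pct_gaps), start=1):
    print(f"slot {slot} vs slot {slot + 1}: +{gap:.2f} PA/game ({pct:.1f}%)")

print(f"average gap: {avg_gap:.3f} PA/game")
```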
In this context, without considering any outside factors, it would be apparent that your best hitter should bat in the 1 slot. However, this is obviously not the only factor, so let's look at more context.
BOP | 0 out (%) | 1 out (%) | 2 out (%) |
1 | 48 | 26 | 26 |
2 | 33 | 41 | 26 |
3 | 28 | 35 | 37 |
4 | 34 | 31 | 35 |
5 | 35 | 33 | 33 |
6 | 33 | 34 | 33 |
7 | 33 | 33 | 34 |
8 | 34 | 33 | 33 |
9 | 34 | 33 | 33 |
Here we look at how often each batting order spot comes up with 0 out, 1 out, and 2 out. As you can see, the 1 slot comes up to bat with 0 out 48% of the time, largely because it leads off the first inning. In this context, again, the 1 hitter should be your best hitter, specifically your best hitter at getting on base.
As previously stated, batting order relies on different base/out states and the frequency at which these states occur.
In this table (again, AL-only) we look at frequency of base/out states. How often does a batting order spot have men on base? How many men on average does the spot have on base when it comes to the plate?
BOP | PA empty | PA men on | % PA with men on | # of runners on per game |
1 | 3.11 | 1.72 | 36 | 2.39 |
2 | 2.63 | 2.09 | 44 | 2.77 |
3 | 2.38 | 2.23 | 48 | 3.00 |
4 | 2.19 | 2.31 | 51 | 3.20 |
5 | 2.28 | 2.11 | 48 | 3.10 |
6 | 2.29 | 1.97 | 46 | 2.84 |
7 | 2.20 | 1.94 | 47 | 2.74 |
8 | 2.17 | 1.85 | 46 | 2.61 |
9 | 2.13 | 1.77 | 45 | 2.48 |
In this context, the 4 hitter has the most plate appearances with men on, the greatest percentage of plate appearances with men on, and the most runners on per game, on average. In this context, clearly our best hitter hits 4th in the lineup. Our previously assumed spot for our best hitter (leadoff) actually has some of the worst numbers here. Only 1.72 of his plate appearances per game take place with men on base - a mere 36% of his total plate appearances - and he has only 2.39 runners on per game. This is a direct result of the hitters in front of him in the batting order, the bottom of the lineup, being average to below-average. He is also guaranteed one PA per game with no men on base, because he leads off the game 100% of the time. This is the importance of context when discussing lineup optimization.
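The percentage column in the table above follows directly from the two PA columns, and a short Python sketch makes that relationship explicit. The dictionary layout and variable names are my own, and the numbers are simply the ones from the table.

```python
# (PA with bases empty, PA with men on) per game, keyed by batting slot.
pa_split = {
    1: (3.11, 1.72),
    2: (2.63, 2.09),
    3: (2.38, 2.23),
    4: (2.19, 2.31),
    5: (2.28, 2.11),
    6: (2.29, 1.97),
    7: (2.20, 1.94),
    8: (2.17, 1.85),
    9: (2.13, 1.77),
}

# Share of each slot's plate appearances that come with at least one runner on.
pct_men_on = {
    slot: round(100 * men_on / (empty + men_on))
    for slot, (empty, men_on) in pa_split.items()
}

# Slot that sees the most plate appearances with men on base.
busiest_slot = max(pa_split, key=lambda slot: pa_split[slot][1])

for slot, pct in pct_men_on.items():
    print(f"slot {slot}: {pct}% of PA with men on")
print(f"most PA with men on: slot {busiest_slot}")
```

Run against the table, this reproduces the 36% figure for the leadoff spot and picks out the 4 slot as the one with the most plate appearances with men on.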
There are many more contexts involving many more base/out states that I could go over, but I believe that these three charts are enough background to the main chart.
In this next chart, Tom Tango looks at the run values of each possible offensive event for each batting slot. It can be found on page 128 of The Book for those who are looking for it. Do you remember when I mentioned that each batting slot has 2.5% more plate appearances per game than the next? This chart is where that stat comes in handy. Here we modify the run values by that 2.5% for each slot. Tango used the fifth batting slot as a reference and normalized it to 0% to make this chart for PA factor (this is not the big chart...):
BOP | PA Factor (%) |
1 | +10 |
2 | +7.5 |
3 | +5.0 |
4 | +2.5 |
5 | 0 |
6 | -2.5 |
7 | -5.0 |
8 | -7.5 |
9 | -10 |
Another way to think about this is that each batting slot has 2.5% more plate appearances per game than the slot below, and therefore has 2.5% more of an effect on the outcome of the game than the following slot.
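The PA factor chart is just that 2.5% step applied once per slot away from the fifth spot. Here is a minimal Python sketch of the idea. The function name `pa_factor` and its parameters are hypothetical, and the flat 2.5% step is the rounded figure used in this article rather than an exact value.

```python
def pa_factor(slot, reference_slot=5, step_pct=2.5):
    """Percent more (or fewer) PA than the reference slot.

    Assumes a flat step_pct difference between neighbouring slots,
    which is the approximation used in the article.
    """
    return (reference_slot - slot) * step_pct


factors = {slot: pa_factor(slot) for slot in range(1, 10)}

for slot, factor in factors.items():
    print(f"slot {slot}: {factor:+.1f}%")
```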
The Big Chart
Here is the big chart. These numbers are known as Run Expectancy (RE). RE defines an event by the expected runs caused by the event. So, a single by the 1 hitter has a .515 RE value, as it on average would result in .515 runs for the batting team.
Note: NIBB stands for 'Non-Intentional Base on Balls' (an unintentional walk), RBOE stands for 'Reached Base on Error', K is a strikeout and OUT is any other out.
BOP | 1B | 2B | 3B | HR | NIBB | HBP | RBOE | K | OUT |
1 | .515 | .806 | 1.121 | 1.421 | .385 | .411 | .542 | -.329 | -.328 |
2 | .515 | .799 | 1.100 | 1.450 | .366 | .396 | .536 | -.322 | -.324 |
3 | .493 | .779 | 1.064 | 1.452 | .335 | .369 | .514 | -.317 | -.315 |
4 | .517 | .822 | 1.117 | 1.472 | .345 | .377 | .541 | -.332 | -.327 |
5 | .513 | .809 | 1.106 | 1.438 | .346 | .381 | .530 | -.324 | -.323 |
6 | .482 | .763 | 1.050 | 1.376 | .336 | .368 | .504 | -.306 | -.306 |
7 | .464 | .738 | 1.014 | 1.336 | .323 | .353 | .486 | -.296 | -.296 |
8 | .451 | .714 | .980 | 1.293 | .312 | .340 | .470 | -.287 | -.286 |
9 | .436 | .689 | .948 | 1.249 | .302 | .329 | .454 | -.278 | -.277 |
Again, run expectancy of each base/out/batting slot state was based on the previous contextual real-life numbers from the American League. The above numbers were all calculated from the first four tables in this article.
Making Sense of It All
I'd like to highlight some noticeable comparisons from this chart in terms of run values for base/out states between batting slots.
#4 and #5 Hitters:
The 4 hitter has a slight advantage in nearly every category, and a relatively large advantage in HR (0.034 RE). The 4 hitter should be a better power hitter than the 5 hitter.
#4 and #2 Hitters:
The 4 hitter has a 0.02 RE advantage in extra-base hits, but the 2 hitter has an identical 0.02 RE advantage in NIBB and HBP. From this, the overall quality of the #4 and #2 hitters should be roughly the same, with a slight advantage to the #4 hitter in SLG and a slight advantage to the #2 hitter in OBP.
Remember that the 2 hitter will have 5% more plate appearances than the #4 hitter, but the #4 hitter will, on average, have more men on base throughout the game.
#2, #3 and #4 Hitters:
Run values of nearly each event favour the 2 hitter over the 3 hitter by 0.02 to 0.03 runs. Run values also favour the 4 hitter over the 3 hitter overall. This means that the 3 hitter should be a worse hitter than the 2 and 4 hitters, who again are roughly equal in terms of overall offensive skill.
#3 and #5 Hitters:
The 5 hitter has an RE advantage of roughly 0.01 to 0.04 on 1B, 2B, 3B and NIBB, whereas the 3 hitter has an advantage (about 0.01 RE) only on the HR. All outs for the 5 hitter are more costly than the same outs from the 3 hitter. From this we can see that the 5 hitter should be better than the 3 hitter, but the better home run hitter should be slotted third. This is mostly due to the fact that the 3 hitter comes up to bat with 2 outs more frequently, so he has less of a chance to do damage, unless by home run.
#6 through #9 Hitters:
The 6 hitter has lower RE values than the five slots above him almost across the board. The same is true of the 7 hitter relative to every slot above him, and this continues down to the 9 hitter. Therefore the 6 through 9 hitters should decrease in overall offensive skill as they progress down the order.
#1 Hitter:
The 1 hitter is closest to the 2 and 5 hitters in terms of RE values; however the 1 hitter holds a large advantage in walks and a large disadvantage in home runs. From this, we gather that the 1 hitter should have the ability to draw walks.
#1 and #2 Hitters:
The 1 hitter gets a 0.02 RE advantage on 3B and BB (and a smaller edge, under 0.01, on 2B); the 2 hitter gets a 0.03 RE advantage on HRs only. Therefore the quality of the 1 hitter should actually be roughly the same as the 2 hitter in terms of offensive skill. The 1 hitter should have more discipline and the 2 hitter should have more power.
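For anyone who wants to check the slot-by-slot comparisons above, or try pairings I did not cover, here is a short Python sketch that subtracts one slot's run values from another's using the big chart. The function name `advantage` and the data layout are my own. Only the numbers come from the chart.

```python
EVENTS = ["1B", "2B", "3B", "HR", "NIBB", "HBP", "RBOE", "K", "OUT"]

# Run values by batting slot, copied from the big chart.
RUN_VALUES = {
    1: [0.515, 0.806, 1.121, 1.421, 0.385, 0.411, 0.542, -0.329, -0.328],
    2: [0.515, 0.799, 1.100, 1.450, 0.366, 0.396, 0.536, -0.322, -0.324],
    3: [0.493, 0.779, 1.064, 1.452, 0.335, 0.369, 0.514, -0.317, -0.315],
    4: [0.517, 0.822, 1.117, 1.472, 0.345, 0.377, 0.541, -0.332, -0.327],
    5: [0.513, 0.809, 1.106, 1.438, 0.346, 0.381, 0.530, -0.324, -0.323],
    6: [0.482, 0.763, 1.050, 1.376, 0.336, 0.368, 0.504, -0.306, -0.306],
    7: [0.464, 0.738, 1.014, 1.336, 0.323, 0.353, 0.486, -0.296, -0.296],
    8: [0.451, 0.714, 0.980, 1.293, 0.312, 0.340, 0.470, -0.287, -0.286],
    9: [0.436, 0.689, 0.948, 1.249, 0.302, 0.329, 0.454, -0.278, -0.277],
}


def advantage(slot_a, slot_b):
    """Run value of each event in slot_a minus the same event in slot_b."""
    return {
        event: round(a - b, 3)
        for event, a, b in zip(EVENTS, RUN_VALUES[slot_a], RUN_VALUES[slot_b])
    }


# Example: how much more is each event worth from the 4 slot than the 5 slot?
for event, diff in advantage(4, 5).items():
    print(f"{event:>4}: {diff:+.3f}")
```

A positive number means the event is worth more in the first slot. For K and OUT, a more negative number means the out is more costly in that slot.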
Fair warning:
These differences are for the most part approximately 0.02 runs per event at most, and any single type of event makes up only a fraction of a hitter's roughly 700 PA in a season. A 0.02-run edge on 70 such events, for example, equates to 1.4 runs gained. Over the course of a season, with proper lineup optimization, a team has the chance to gain approximately 10 to 15 runs.
This passage is from page 132 of 'The Book' by Tom Tango. I could paraphrase it, but I believe The Book says it best:
"Your three best hitters should bat somewhere in the #1, #2, and #4 slots. Your fourth- and fifth-best hitters should occupy the #3 and #5 slots. The #1 and #2 slots will have players with more walks than those in the #4 and #5 slots. From slot #6 through #9, put the players in descending order of quality."
Here is Tom Tango's ideal lineup from The Book. I feel as though constructing a lineup as talented as this one would be a challenge, but it is still interesting to see what Tango thinks a real optimized lineup should look like.
BOP | AVG | OBP | SLG | wOBA |
1 | .273 | .439 | .439 | .407 |
2 | .298 | .390 | .529 | .405 |
3 | .275 | .326 | .466 | .352 |
4 | .298 | .362 | .578 | .407 |
5 | .260 | .364 | .396 | .351 |
6 | .250 | .316 | .402 | .326 |
7 | .242 | .307 | .389 | .316 |
8 | .234 | .298 | .377 | .307 |
9 | .227 | .290 | .366 | .298 |
Again, as stated above, with proper lineup optimization, a lineup can gain approximately 10 to 15 runs per 162-game season. By the common rule of thumb of roughly 10 runs per win, these 10 to 15 runs could be the difference between winning 90 games and winning 91 or 92. This may not seem like a lot, but it may have been useful to a certain AL West manager. Of course, it could also do nothing. Lineup optimization is just a tool that can be useful when used in the right situation, and it's something that has always been fascinating to me.
I want to thank you for reading this far, if you've actually made it, and I look forward to writing for you again. Next time I'll look at optimizing the Blue Jays lineup.