## The Mixture is All of Us and We're Still Mixing: Do Pitchers with More Diverse Arsenals Outperform their Peripherals?

It's no secret that we are interested in determining what factors allow some pitchers to sustain lower ERAs than their strikeout, walk, and groundball rates would suggest. Hence this, another installment in our endless series on that question. This time, we'll be looking at whether pitchers with more diverse arsenals are able to keep hitters off-balance, allowing them to induce weaker contact. I have long considered this a possible reason that Shaun Marcum, who is notable for his array of pitches and his willingness to use any pitch in any count, has been able to outperform his peripherals for so long.

Before we can start looking at possible effects of arsenal diversity, we need to quantify the diversity of each pitcher's arsenal. For this, I chose the Shannon-Wiener index, a measure commonly used to estimate biodiversity in ecological communities. I took a sample of all pitchers who pitched 100+ innings in 2011 (a total of 145 pitchers), exported their pitch type data from FanGraphs, and used the vegan package in R to calculate Shannon-Wiener diversity. Essentially, pitchers were analogous to communities and pitch types were analogous to species. The index takes into account both the number of different pitch types a pitcher throws and the evenness of his usage of those pitches. The distribution of pitch usage is an important factor here: a "see-me" changeup thrown two or three times a start should raise the index less than a changeup thrown five or six times as frequently. The index scales from zero (a pitcher who uses the same pitch 100% of the time) to the natural logarithm of the number of different pitches a pitcher throws. As an example, a realistic maximum might be a pitcher who throws seven different pitches and uses them all equally. His arsenal would have a diversity index of ln(7) ≈ 1.946, so we can say that the index (for pitchers) scales from roughly 0 to 2.
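For readers who want to see the arithmetic, here is a minimal Python sketch of the Shannon-Wiener calculation, H = -Σ p·ln(p), where p is the share of each pitch type. It is not the vegan code used for this post, and the function name `shannon_diversity` and the pitch mixes below are made up for illustration rather than taken from FanGraphs.

```python
import math


def shannon_diversity(usage):
    """Shannon-Wiener index H = -sum(p * ln(p)) over pitch types.

    `usage` maps a pitch label to a count or percentage; values are
    normalized to proportions, and unused pitches (zero) are skipped.
    """
    total = sum(usage.values())
    if total <= 0:
        raise ValueError("usage must contain at least one positive value")
    h = 0.0
    for amount in usage.values():
        if amount > 0:
            p = amount / total
            h -= p * math.log(p)
    return h


# Hypothetical pitch mixes (percent of pitches thrown), not real data.
one_pitch = {"FB": 100.0}
even_seven = {name: 1.0 for name in ["FB", "SI", "CT", "SL", "CB", "CH", "SF"]}
see_me_changeup = {"FB": 70.0, "SL": 27.0, "CH": 3.0}
real_changeup = {"FB": 60.0, "SL": 25.0, "CH": 15.0}

for label, mix in [
    ("one pitch", one_pitch),
    ("seven pitches, even usage", even_seven),
    ("see-me changeup", see_me_changeup),
    ("regular changeup", real_changeup),
]:
    print(f"{label}: {shannon_diversity(mix):.3f}")
```

The one-pitch arsenal comes out at zero, the evenly used seven-pitch arsenal comes out at ln(7), and the arsenal with the rarely used changeup scores lower than the one where the changeup is a regular offering.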

Ever wonder which pitchers have the most diverse arsenals? Well, at the top of the list is actually our old friend, Shaun Marcum, at 1.525 (mean diversity = 1.084; see the end of the article for the entire list of pitchers). Remember that these values are on a log scale, so Marcum's arsenal is much more diverse than the average pitcher's, more so than the raw gap between the two numbers suggests. At the bottom of the list, as you might have guessed, you'll find extreme one-pitch specialists like Tim Wakefield and Justin Masterson. Mariano Rivera fell short of the innings threshold and is not included, but his diversity score is 0.407. If you were unconvinced about the method before, I hope seeing Marcum at the top and these other pitchers near the bottom has assuaged your fears.
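One way to make the log scale concrete is to exponentiate the index. In ecology, exp(H) is often read as the "effective number of species", that is, the number of equally used pitch types that would produce the same index. That reading is my addition, not part of the original analysis, and the helper name `effective_pitch_count` is made up. A quick sketch using the three values quoted above:

```python
import math


def effective_pitch_count(diversity):
    """Number of equally used pitch types that would yield this index."""
    return math.exp(diversity)


# Index values quoted in the text above.
marcum = effective_pitch_count(1.525)
league_mean = effective_pitch_count(1.084)
rivera = effective_pitch_count(0.407)

print(f"Marcum: {marcum:.2f} effective pitches")
print(f"Pitcher at the mean index: {league_mean:.2f} effective pitches")
print(f"Rivera: {rivera:.2f} effective pitches")
```

By this reading, Marcum works like a pitcher with about four and a half equally used pitches, while a pitcher at the mean index works like one with about three.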

Now that we have figured out a way to estimate a pitcher's arsenal diversity, we need a way to evaluate how much that pitcher outperforms his peripherals. I chose to create an "Unluckiness Index", which is simply the pitcher's ERA minus his xFIP (ERA - xFIP), so a negative value means he beat his peripherals. If our original hypothesis (that pitchers with more diverse arsenals are better equipped to outperform their peripherals) were correct, we should see a negative correlation between the Unluckiness Index and the Diversity Index. Unfortunately, we don't. Although the best-fit line does have a negative slope, the relationship is not significant (R² = 0.005, df = 143, p = 0.579), so I chose not to include it.
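The test described above is an ordinary least-squares fit of the Unluckiness Index on the Diversity Index. Here is a rough Python sketch of that calculation, assuming a plain one-variable regression. The function names and the five data rows are invented for illustration and are not the 2011 sample. The p-value is left out because it needs a t-distribution, which the standard library does not provide.

```python
def unluckiness_index(era, xfip):
    """ERA minus xFIP; negative means the pitcher beat his peripherals."""
    return era - xfip


def linear_fit(xs, ys):
    """Least-squares fit of ys on xs: returns (slope, intercept, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Made-up (diversity, ERA, xFIP) rows, purely for illustration.
sample = [
    (0.5, 4.10, 3.90),
    (0.9, 3.60, 3.80),
    (1.1, 4.00, 3.70),
    (1.3, 3.20, 3.50),
    (1.5, 3.90, 4.00),
]

diversity = [row[0] for row in sample]
unlucky = [unluckiness_index(era, xfip) for _, era, xfip in sample]

slope, intercept, r_squared = linear_fit(diversity, unlucky)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
print(f"residual df = {len(sample) - 2}")
```

The residual degrees of freedom are the sample size minus two, which is where df = 143 comes from with 145 pitchers.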

Nonetheless, these results do not necessarily mean that there is nothing to our hypothesis. I may rerun these analyses using a population of pitchers with 750 innings over the past four seasons (or something to that effect), which should weed out much of the actual luck, good or bad, in a single season's numbers. Any other suggestions? Should I use a different index of diversity? Should I use a different index to estimate outperformance of peripherals? Is the hypothesis just completely off-base?

Thanks to Woody Guthrie for today's post title, from "She Came Along to Me", a song later set to music and recorded by Billy Bragg and Wilco.

## Appendix

### Pitcher Arsenal Diversities

| Pitcher | Diversity |
| --- | --- |
| Shaun Marcum | 1.524638 |
| Freddy Garcia | 1.475727 |
| Mike Leake | 1.445573 |
| Bronson Arroyo | 1.422195 |
| Josh Tomlin | 1.364977 |
| Mark Buehrle | 1.359794 |
| Jake Peavy | 1.352624 |
| James Shields | 1.337039 |
| Kevin Correia | 1.336132 |
| C.J. Wilson | 1.328347 |
| Tim Hudson | 1.326451 |
| Luke Hochevar | 1.314261 |
| Roy Halladay | 1.28896 |
| Philip Humber | 1.280785 |
| Carlos Zambrano | 1.271841 |
| Johnny Cueto | 1.268851 |
| Carlos Villanueva | 1.268442 |
| Cole Hamels | 1.259831 |
| Jeff Karstens | 1.259466 |
| Bruce Chen | 1.256228 |
| Dustin Moseley | 1.244729 |
| Alfredo Aceves | 1.243739 |
| Randy Wolf | 1.232083 |
| Josh Beckett | 1.228558 |
| Randy Wells | 1.223861 |
| Brett Myers | 1.219821 |
| Livan Hernandez | 1.219617 |
| Paul Maholm | 1.215167 |
| Travis Wood | 1.205938 |
| Derek Lowe | 1.203075 |
| Jason Marquis | 1.200518 |
| Jason Vargas | 1.195635 |
| Ricky Nolasco | 1.192767 |
| Brandon McCarthy | 1.191827 |
| Jon Lester | 1.189865 |
| Jaime Garcia | 1.187693 |
| Matt Cain | 1.186668 |
| Gavin Floyd | 1.182965 |
| John Lackey | 1.180726 |
| Dillon Gee | 1.180078 |
| Felix Hernandez | 1.178936 |
| Tim Stauffer | 1.17887 |
| Cliff Lee | 1.177074 |
| Matt Garza | 1.176225 |
| Alfredo Simon | 1.172879 |
| Ted Lilly | 1.171718 |
| Anibal Sanchez | 1.169553 |
| John Danks | 1.168387 |
| Ryan Vogelsong | 1.159864 |
| Tim Lincecum | 1.158355 |
| Brett Cecil | 1.157331 |
| Brad Penny | 1.157256 |
| Zack Greinke | 1.156668 |
| Jered Weaver | 1.15494 |
| Chris Narveson | 1.154833 |
| Nick Blackburn | 1.154105 |
| Doug Fister | 1.149009 |
| Kyle McClellan | 1.13723 |
| Carlos Carrasco | 1.133773 |
| Dan Haren | 1.133473 |
| Homer Bailey | 1.130088 |
| Justin Verlander | 1.12565 |
| Chad Billingsley | 1.125239 |
| Mat Latos | 1.118583 |
| Jonathon Niese | 1.117855 |
| Ian Kennedy | 1.115248 |
| Ricky Romero | 1.10289 |
| Tommy Hanson | 1.101529 |
| John Lannan | 1.100827 |
| Mike Pelfrey | 1.10011 |
| Chris Volstad | 1.099029 |
| Jhoulys Chacin | 1.095093 |
| Trevor Cahill | 1.094889 |
| Chris Carpenter | 1.094771 |
| Roy Oswalt | 1.089276 |
| Kyle Lohse | 1.08886 |
| Jake Arrieta | 1.087089 |
| Kyle Kendrick | 1.085234 |
| Brandon Beachy | 1.083144 |
| Felipe Paulino | 1.080192 |
| Javier Vazquez | 1.079287 |
| Jason Hammel | 1.076629 |
| CC Sabathia | 1.073862 |
| Brian Duensing | 1.072921 |
| Madison Bumgarner | 1.07266 |
| Aaron Harang | 1.072585 |
| Colby Lewis | 1.071658 |
| Chris Capuano | 1.065693 |
| Ubaldo Jimenez | 1.059019 |
| Jeremy Guthrie | 1.058724 |
| Joel Pineiro | 1.055729 |
| Hiroki Kuroda | 1.053811 |
| Jake Westbrook | 1.049538 |
| Jeff Niemann | 1.046477 |
| Matt Harrison | 1.042823 |
| Ivan Nova | 1.038063 |
| Joe Saunders | 1.0343 |
| Erik Bedard | 1.028707 |
| Francisco Liriano | 1.020755 |
| Edwin Jackson | 1.017393 |
| Edinson Volquez | 1.014461 |
| Derek Holland | 1.012528 |
| Bud Norris | 1.011684 |
| Guillermo Moscoso | 1.003701 |
| Yovani Gallardo | 1.00113 |
| Phil Coke | 0.996801 |
| J.A. Happ | 0.99581 |
| Danny Duffy | 0.988132 |
| Brandon Morrow | 0.987685 |
| Jordan Zimmermann | 0.987498 |
| Jair Jurrjens | 0.980321 |
| Scott Baker | 0.978198 |
| Jeremy Hellickson | 0.978169 |
| Jeff Francis | 0.967133 |
| Anthony Swarzak | 0.963225 |
| Rick Porcello | 0.961525 |
| Wade Davis | 0.94805 |
| Carl Pavano | 0.937912 |
| Ryan Dempster | 0.937906 |
| Wandy Rodriguez | 0.934617 |
| Charlie Morton | 0.934566 |
| David Price | 0.933655 |
| A.J. Burnett | 0.933357 |
| Max Scherzer | 0.931437 |
| Jonathan Sanchez | 0.926934 |
| Vance Worley | 0.926869 |
| Daniel Hudson | 0.922703 |
| Tom Gorzelanny | 0.915285 |
| Clayton Kershaw | 0.90879 |
| James McDonald | 0.907508 |
| Brad Bergesen | 0.889349 |
| Jo-Jo Reyes | 0.886127 |
| Cory Luebke | 0.879189 |
| Gio Gonzalez | 0.840146 |
| Fausto Carmona | 0.834539 |
| Michael Pineda | 0.829806 |
| Zach Britton | 0.798832 |
| Ervin Santana | 0.786738 |
| Josh Collmenter | 0.781294 |
| Alexi Ogando | 0.774537 |
| Tyler Chatwood | 0.763681 |
| R.A. Dickey | 0.623394 |
| Bartolo Colon | 0.583892 |
| Justin Masterson | 0.477077 |
| Tim Wakefield | 0.398671 |
