The Mixture is All of Us and We're Still Mixing: Do Pitchers with More Diverse Arsenals Outperform their Peripherals?

It's no secret that we are interested in what allows some pitchers to sustain lower ERAs than their strikeout, walk, and groundball rates would suggest. Hence this, another installment in our endless series on the factors that let pitchers outperform their peripherals. This time, we'll look at whether pitchers with more diverse arsenals are able to keep hitters off-balance and thereby induce weaker contact. I have long considered this a possible reason that Shaun Marcum, who is notable for his array of pitches and his willingness to use any pitch in any count, has been able to outperform his peripherals for so long.

Before we can look for effects of arsenal diversity, we need to quantify the diversity of each pitcher's arsenal. For that, I chose the Shannon-Wiener index, a measure commonly used to estimate biodiversity in ecological communities. I took all pitchers who threw 100+ innings in 2011 (a total of 145 pitchers), exported their pitch type data from FanGraphs, and used the vegan package in R to calculate Shannon-Wiener diversity. Essentially, pitchers were analogous to communities and pitch types were analogous to species. The index takes into account both the number of different pitch types a pitcher throws and how evenly he uses them. The evenness part matters here: a "see-me" changeup thrown two or three times a start should raise the index less than a changeup thrown five or six times as often. The index scales from zero (a pitcher who throws the same pitch 100% of the time) to the natural logarithm of the number of different pitches a pitcher throws. As an example, a realistic maximum might be a pitcher who throws seven different pitches and uses them all equally. His arsenal would have a diversity index of ln(7) = 1.946, so we can say that the index (for pitchers) scales from roughly 0 to 2.
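For readers who want to see the arithmetic, here is a minimal Python sketch of the Shannon-Wiener calculation. It is not the R/vegan code used for this post. The function name and the pitch mixes are made up for illustration, and it assumes the natural logarithm, which is consistent with the ln(7) = 1.946 ceiling described above.

```python
import math

def shannon_diversity(usage):
    """Shannon-Wiener index H = -sum(p_i * ln p_i) over pitch-usage shares."""
    total = sum(usage.values())
    shares = [v / total for v in usage.values() if v > 0]
    return sum(-p * math.log(p) for p in shares)

# Hypothetical pitch mixes (relative usage), NOT real FanGraphs data.
one_pitch = {"FB": 100}
even_seven = {name: 1 for name in ["FB", "SI", "CT", "SL", "CB", "CH", "SF"]}
see_me_change = {"FB": 60, "SL": 37, "CH": 3}
real_change = {"FB": 60, "SL": 22, "CH": 18}

print(shannon_diversity(one_pitch))    # zero: one pitch, no diversity
print(shannon_diversity(even_seven))   # ln(7), about 1.946
# A rarely used changeup adds less diversity than a regularly used one.
print(shannon_diversity(see_me_change) < shannon_diversity(real_change))
```

The two three-pitch mixes contain the same pitch types. Only the evenness of usage differs, and that is what separates their scores.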

Ever wonder which pitchers have the most diverse arsenals? Well, at the top of the list is actually our old friend, Shaun Marcum, at 1.525 (mean diversity = 1.084; see the end of the article for the entire list of pitchers). Remember that these values are calculated on a log scale, so Shaun Marcum has a much more diverse arsenal than the average pitcher. At the bottom of the list, as you might have guessed, you'll find extreme one-pitch specialists like Tim Wakefield and Justin Masterson. Mariano Rivera falls short of the innings cutoff and is not included, but his diversity score is 0.407. If you were unconvinced about the method before, I hope seeing Marcum at the top and these other pitchers near the bottom has assuaged your fears.
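One way to read the log scale is to exponentiate the index. For a Shannon index computed with natural logs, exp(H) is the number of equally used pitches that would produce the same score. The sketch below applies that conversion to the three values quoted above; the function name is my own, and the conversion is an interpretive aid rather than part of the original analysis.

```python
import math

def effective_pitches(h):
    """Convert a natural-log Shannon index to an equivalent number of equally used pitches."""
    return math.exp(h)

# Diversity values quoted in the article.
quoted = {
    "Shaun Marcum": 1.525,
    "sample mean": 1.084,
    "Mariano Rivera": 0.407,
}
for name, h in quoted.items():
    print(f"{name}: H = {h:.3f}, effective pitches = {effective_pitches(h):.1f}")
```

On that reading, Marcum's mix behaves like roughly four and a half equally used pitches, against about three for the average pitcher in the sample and about one and a half for Rivera.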

Now that we have a way to estimate a pitcher's arsenal diversity, we need a way to measure how much that pitcher outperforms his peripherals. I chose to create an "Unluckiness Index," which is simply the pitcher's ERA minus his xFIP (ERA - xFIP). If our original hypothesis (that pitchers with more diverse arsenals are better equipped to outperform their peripherals) is correct, we should see a negative correlation between the Unluckiness Index and the Diversity Index. Unfortunately, we don't. Although the best-fit line does have a negative slope, the relationship is not significant (R² = 0.005, df = 143, p = 0.579), so I chose not to draw the line on the plot:



[Figure: scatter plot of the Unluckiness Index (ERA - xFIP) against the Diversity Index]
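The test described above is a simple linear regression of the Unluckiness Index on the Diversity Index. The Python sketch below shows how the slope and R² come out of that calculation. The rows are made-up numbers used only to exercise the functions, not the 2011 data, and the helper names are hypothetical.

```python
def unluckiness(era, xfip):
    """Unluckiness Index as defined in the article: ERA minus xFIP."""
    return era - xfip

def slope_and_r_squared(xs, ys):
    """Least-squares slope and R^2 of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Made-up rows (diversity, ERA, xFIP) for illustration only;
# these are NOT the values behind the article's result.
rows = [
    (1.50, 3.40, 3.80),
    (1.20, 3.90, 3.70),
    (1.05, 4.10, 4.30),
    (0.90, 3.60, 3.50),
    (0.60, 4.50, 4.20),
]
diversity = [d for d, _, _ in rows]
unlucky = [unluckiness(era, xfip) for _, era, xfip in rows]

slope, r2 = slope_and_r_squared(diversity, unlucky)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
```

A negative slope with R² near zero, as reported for the real sample, means the fitted line tilts the way the hypothesis predicts but explains almost none of the spread in ERA - xFIP.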

Nonetheless, these results do not necessarily mean that there is nothing to our hypothesis. I may rerun these analyses using pitchers with 750 innings over the past four seasons (or something to that effect), which should weed out much of the genuine good and bad luck present in a single season. Any other suggestions? Should I use a different index of diversity? Should I use a different index to estimate outperformance of peripherals? Is the hypothesis just completely off-base?

Thanks to Woody Guthrie for today's post title, from "She Came Along to Me," a lyric later set to music and recorded by Billy Bragg and Wilco.

Appendix

Pitcher Arsenal Diversities

name diversity
Shaun Marcum 1.524638
Freddy Garcia 1.475727
Mike Leake 1.445573
Bronson Arroyo 1.422195
Josh Tomlin 1.364977
Mark Buehrle 1.359794
Jake Peavy 1.352624
James Shields 1.337039
Kevin Correia 1.336132
C.J. Wilson 1.328347
Tim Hudson 1.326451
Luke Hochevar 1.314261
Roy Halladay 1.28896
Philip Humber 1.280785
Carlos Zambrano 1.271841
Johnny Cueto 1.268851
Carlos Villanueva 1.268442
Cole Hamels 1.259831
Jeff Karstens 1.259466
Bruce Chen 1.256228
Dustin Moseley 1.244729
Alfredo Aceves 1.243739
Randy Wolf 1.232083
Josh Beckett 1.228558
Randy Wells 1.223861
Brett Myers 1.219821
Livan Hernandez 1.219617
Paul Maholm 1.215167
Travis Wood 1.205938
Derek Lowe 1.203075
Jason Marquis 1.200518
Jason Vargas 1.195635
Ricky Nolasco 1.192767
Brandon McCarthy 1.191827
Jon Lester 1.189865
Jaime Garcia 1.187693
Matt Cain 1.186668
Gavin Floyd 1.182965
John Lackey 1.180726
Dillon Gee 1.180078
Felix Hernandez 1.178936
Tim Stauffer 1.17887
Cliff Lee 1.177074
Matt Garza 1.176225
Alfredo Simon 1.172879
Ted Lilly 1.171718
Anibal Sanchez 1.169553
John Danks 1.168387
Ryan Vogelsong 1.159864
Tim Lincecum 1.158355
Brett Cecil 1.157331
Brad Penny 1.157256
Zack Greinke 1.156668
Jered Weaver 1.15494
Chris Narveson 1.154833
Nick Blackburn 1.154105
Doug Fister 1.149009
Kyle McClellan 1.13723
Carlos Carrasco 1.133773
Dan Haren 1.133473
Homer Bailey 1.130088
Justin Verlander 1.12565
Chad Billingsley 1.125239
Mat Latos 1.118583
Jonathon Niese 1.117855
Ian Kennedy 1.115248
Ricky Romero 1.10289
Tommy Hanson 1.101529
John Lannan 1.100827
Mike Pelfrey 1.10011
Chris Volstad 1.099029
Jhoulys Chacin 1.095093
Trevor Cahill 1.094889
Chris Carpenter 1.094771
Roy Oswalt 1.089276
Kyle Lohse 1.08886
Jake Arrieta 1.087089
Kyle Kendrick 1.085234
Brandon Beachy 1.083144
Felipe Paulino 1.080192
Javier Vazquez 1.079287
Jason Hammel 1.076629
CC Sabathia 1.073862
Brian Duensing 1.072921
Madison Bumgarner 1.07266
Aaron Harang 1.072585
Colby Lewis 1.071658
Chris Capuano 1.065693
Ubaldo Jimenez 1.059019
Jeremy Guthrie 1.058724
Joel Pineiro 1.055729
Hiroki Kuroda 1.053811
Jake Westbrook 1.049538
Jeff Niemann 1.046477
Matt Harrison 1.042823
Ivan Nova 1.038063
Joe Saunders 1.0343
Erik Bedard 1.028707
Francisco Liriano 1.020755
Edwin Jackson 1.017393
Edinson Volquez 1.014461
Derek Holland 1.012528
Bud Norris 1.011684
Guillermo Moscoso 1.003701
Yovani Gallardo 1.00113
Phil Coke 0.996801
J.A. Happ 0.99581
Danny Duffy 0.988132
Brandon Morrow 0.987685
Jordan Zimmermann 0.987498
Jair Jurrjens 0.980321
Scott Baker 0.978198
Jeremy Hellickson 0.978169
Jeff Francis 0.967133
Anthony Swarzak 0.963225
Rick Porcello 0.961525
Wade Davis 0.94805
Carl Pavano 0.937912
Ryan Dempster 0.937906
Wandy Rodriguez 0.934617
Charlie Morton 0.934566
David Price 0.933655
A.J. Burnett 0.933357
Max Scherzer 0.931437
Jonathan Sanchez 0.926934
Vance Worley 0.926869
Daniel Hudson 0.922703
Tom Gorzelanny 0.915285
Clayton Kershaw 0.90879
James McDonald 0.907508
Brad Bergesen 0.889349
Jo-Jo Reyes 0.886127
Cory Luebke 0.879189
Gio Gonzalez 0.840146
Fausto Carmona 0.834539
Michael Pineda 0.829806
Zach Britton 0.798832
Ervin Santana 0.786738
Josh Collmenter 0.781294
Alexi Ogando 0.774537
Tyler Chatwood 0.763681
R.A. Dickey 0.623394
Bartolo Colon 0.583892
Justin Masterson 0.477077
Tim Wakefield 0.398671