The Mixture is All of Us and We're Still Mixing: Do Pitchers with More Diverse Arsenals Outperform their Peripherals?
It's no secret that we are interested in determining what factors allow some pitchers to sustain lower ERAs than their strikeout, walk, and groundball rates would suggest. Hence this, another installment in our endless series on what allows pitchers to outperform their peripherals. This time, we'll be looking at whether pitchers with more diverse arsenals are able to keep hitters off-balance, allowing them to induce weaker contact. I have long considered this a possible reason that Shaun Marcum, who is notable for his array of pitches and his willingness to use any pitch in any count, has been able to outperform his peripherals for so long.
Before we can start looking at possible effects of arsenal diversity, we need to quantify the diversity of each pitcher's arsenal. To do so, I chose the Shannon-Wiener index, a measure commonly used to estimate biodiversity in ecological communities. I took a sample of all pitchers who pitched 100+ innings in 2011 (a total of 145 pitchers), exported their pitch type data from FanGraphs, and used the vegan package in R to calculate Shannon-Wiener diversity. Essentially, pitchers were analogous to communities and pitch types were analogous to species. The index takes into account both the number of different pitch types a pitcher throws and the evenness of his usage of those pitches. The pitch distribution is an important factor here -- a "see-me" changeup used two or three times a start should add less to the index than a changeup that another pitcher throws five or six times as frequently. The index scales from zero (which would be a pitcher who uses the same pitch 100% of the time) to the natural logarithm of the number of different pitches a pitcher throws. As an example, a realistic maximum might be a pitcher who throws seven different pitches and uses them all equally. His arsenal would have a diversity index of ln(7) = 1.946, so we can say that the index (for pitchers) scales from roughly 0 to 2.
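For readers who want to see the arithmetic, here is a minimal sketch of the index in Python. It is not the code used for this article (that was vegan's diversity function in R, whose default Shannon index also uses the natural log); the function name, pitch abbreviations, and usage percentages below are all made up for illustration.

```python
import math


def shannon_diversity(usage):
    """Shannon-Wiener index H = -sum(p_i * ln(p_i)) over pitch-usage shares.

    `usage` maps a pitch type to its usage (counts or percentages).
    Values are normalized to proportions, so either scale works.
    """
    total = sum(usage.values())
    if total <= 0:
        raise ValueError("usage must contain at least one positive value")
    # Pitch types with zero usage contribute nothing to the index.
    shares = [v / total for v in usage.values() if v > 0]
    return sum(-p * math.log(p) for p in shares)


# Hypothetical arsenals (illustrative numbers, not real pitchers).
one_pitch = {"FB": 100.0}
seven_even = {p: 1.0 for p in ["FB", "SI", "CT", "SL", "CB", "CH", "SF"]}
see_me_changeup = {"FB": 70.0, "SL": 28.0, "CH": 2.0}
regular_changeup = {"FB": 60.0, "SL": 28.0, "CH": 12.0}

for name, arsenal in [
    ("one pitch", one_pitch),
    ("seven pitches, even usage", seven_even),
    ("see-me changeup", see_me_changeup),
    ("regular changeup", regular_changeup),
]:
    print(f"{name}: {shannon_diversity(arsenal):.3f}")
```

The one-pitch arsenal sits at the floor of zero, the evenly used seven-pitch arsenal sits at the ln(7) ceiling, and the rarely thrown changeup adds less to the index than the regularly thrown one.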
Ever wonder which pitchers have the most diverse arsenals? Well, at the top of the list is actually our old friend, Shaun Marcum, at 1.525 (mean diversity = 1.084; see the end of the article for the entire list of pitchers). Remember that these values are calculated on a log scale, so Shaun Marcum has a much more diverse arsenal than the average pitcher. At the bottom of the list, as you might have guessed, you'll find extreme one-pitch specialists, like Tim Wakefield and Justin Masterson. Due to the innings cutoff, Mariano Rivera is not included, but his diversity score is 0.407. If you were unconvinced about the method before, I hope seeing Marcum at the top and these other pitchers near the bottom has assuaged your fears.
Now that we have figured out a way to estimate a pitcher's arsenal diversity, we need a way to evaluate how much that pitcher outperforms his peripherals. I chose to create an "Unluckiness Index," which is simply that pitcher's xFIP subtracted from his ERA (ERA - xFIP). If our original hypothesis (that pitchers with more diverse arsenals would be better equipped to outperform their peripherals) were correct, we should see a negative correlation between the Unluckiness Index and the Diversity Index. Unfortunately, we don't. Although the best-fit line does have a negative slope, the relationship is not significant (R² = 0.005, df = 143, p = 0.579), so I chose not to include it.
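To make the test concrete, here is a rough sketch in Python of the Unluckiness Index and an ordinary least-squares fit against the Diversity Index. It is not the analysis code behind the numbers above: the function names are my own, the five data points are invented, and it reports slope, R², and degrees of freedom but not a p-value. For scale, an R² of 0.005 with a negative slope corresponds to a correlation of only about -0.07.

```python
def unluckiness(era, xfip):
    """Unluckiness Index as defined in the article: ERA minus xFIP."""
    return era - xfip


def simple_regression(x, y):
    """Least-squares fit of y on x.

    Returns (slope, intercept, r_squared, degrees_of_freedom).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared, n - 2


# Made-up example values, NOT the 2011 data used in the article.
diversity = [0.62, 0.95, 1.08, 1.21, 1.50]
era = [3.90, 4.40, 3.20, 4.10, 3.60]
xfip = [3.70, 4.00, 3.60, 3.90, 3.80]

unlucky = [unluckiness(e, x) for e, x in zip(era, xfip)]
slope, intercept, r2, df = simple_regression(diversity, unlucky)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}, df = {df}")
```

With the real 145-pitcher sample the same function would return df = 143, matching the figure reported above.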
Nonetheless, these results do not necessarily mean that there is nothing to our hypothesis. I may rerun these analyses using a population of pitchers with 750 innings over the past 4 seasons (or something to that effect), which should weed out much of the actual luckiness or unluckiness present. Any other suggestions? Should I use a different index of diversity? Should I use a different index to estimate outperformance of peripherals? Is the hypothesis just completely off-base?
Thanks to Woody Guthrie for today's post title, from "She Came Along to Me," a song later set to music and recorded by Billy Bragg and Wilco.
Appendix
Pitcher Arsenal Diversities
Comments
Could you try running it again with pitch classification from Texas Leaguers and SIERA-ERA instead of xFIP-ERA?
though I don’t think there will be a big difference in the findings
His 2011 wRC+ is 26
Why SIERA instead of xFIP?
SIERA is only a very small improvement on xFIP, so substituting one for the other won’t move the needle towards significance.
Okay
My understanding is, it’s a very marginal improvement, and when you have results that show such a high level of statistical insignificance, it’s not going to move the needle (if there is indeed an effect; it may well be the case there is none).
The correlation coefficient between the two is 0.97, so they’re really close.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
It matters heavily in analysis like this
While the overall difference between SIERA and xFIP is not that significant, pitchers with diverse arsenals tend to have different batted ball profiles, which SIERA accounts for and xFIP does not. As a whole it doesn’t really matter, but it would be a good thing to add in for analysis like this.
If you want to say dumb things, you can't get mad when I call you dumb.
by dudedudedude on Jan 22, 2012 2:22 AM EST
Interesting
Great work!
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
A couple thoughts
First, interesting topic to look at, and interesting way to approach it. I definitely enjoyed reading the article.
I’m unfamiliar with this test of diversity, and measures of diversity generally, so I have nothing to suggest here. Looking over your list, it passes the eye test in terms of the rankings, so if there is an effect we’re not seeing, I wouldn’t think this is the problem.
I think the biggest suggestion I would make is to change the criteria for inclusion. In individual seasons, even at 100+ IP, there’s going to be enough noise in xFIP-ERA (or SIERA-ERA, etc.) that it overwhelms the actual signal. You suggested 750 innings over the last 4 years. I think that might be overly restrictive: you would lose Marcum, for example, from that sample, and just checking Fangraphs, for 2008-2011 you get only 33 pitchers. Indeed, it might bias the sample, because guys with better results will tend to pitch those types of innings, and these guys are more likely to have positive xFIP-ERA. That said, the mean might be different, but I can’t think of a reason this sample would be heteroskedastic, so maybe it’s not a problem.
Anyway, I’d probably look into xFIP-ERA gaps a little more first, and find a level where the split-sample correlation is strong enough that we can detect signal. My guess is something like 400 IP over the last 3 years might be a good level.
Finally, my last thought would be that if there is an effect, we might not see it in a one-factor model. Looking over that list, at the top we see mostly guys who don’t have a reputation for raw stuff. Applying a little intuition, we might say that guys who lack the stuff need the wider arsenal to keep batters off balance, and without adjusting for that we won’t tease out the effect we think might be there. My suggestion would be to take average fastball velocity and toss it in, and see if that improves the explanatory power. A related way might be to try to measure the relative difference in Pitch Values for each pitch, since guys with good stuff should have larger differences.
Very cool idea.
I think you should trademark the term ‘unluckiness index’. It is perfect.
After having a conversation about changeups, I wonder if that pitch could cause pitchers to be luckier than average.
I blog, therefore I am.
by Tom Dakers on Jan 21, 2012 10:13 PM EST
we already wrote about changeups and BABIP
I thought there was a correlation, Jesse didn’t. The Hardball Times did, at least somewhat.
Funny thing is that all the articles were published at the end of September. I hadn’t read the Hardball Times article, and had been working on mine before it was published, but they unfortunately did beat me to it, so now it might look like I copied their idea.
Blogging about the Toronto Blue Jays at Bluebird Banter
Dammit I hate getting scooped.
Follow me @Minor_Leaguer
by Minor Leaguer on Jan 22, 2012 11:38 AM EST
well done sir
very interesting read and results
One thing, Jesse
This is very cool, but I have just one nitpick: Arroyo and Marcum also possibly throw various combinations of what Fangraphs recognizes as the same pitch. So two sliders or two changeups could well make a pitcher more unpredictable, but it won’t show up in the results.
Blogging about the Toronto Blue Jays at Bluebird Banter
right
and some pitchers throw different speeds of changeups and fastballs, etc., which don’t show up, either. correct pitch classification is a huge assumption that the study makes and could be obscuring any findings. nonetheless, the results were so far from being significant that i’m a little skeptical the idea has much merit. i do plan to rerun the analyses using the suggestions from some of the folks around here. it’s possible, for example, that the texasleaguers classifications are more reliable
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Off topic
Interesting that the Red Sox traded Scutaro to the Rockies……allows for an Oswalt signing?
by Keith72 on Jan 22, 2012 12:17 PM EST
