Hi again, everyone. As a follow-up to last week's piece on the effects of pitcher arsenal diversity on pitcher luckiness or unluckiness, I decided to do two things which might reduce some of the error. The first was to limit the sample to pitchers with 200+ innings in 2011, and the second was to evaluate the effects of arsenal diversity on BABIP instead of on the difference between an ERA estimator (such as xFIP or SIERA) and the pitcher's actual ERA. There are drawbacks to both of these changes, of course. A reduction in sample size of roughly 100 pitchers greatly reduces the power of any test, and focusing only on BABIP means that we can't evaluate any effects that diversity may have on sequencing.
As per the comments, I used pitch types from texasleaguers instead of fangraphs (quite a pain, actually, since it seems like you can't export all the texasleaguers data at once) and built each pitcher's mean fastball velocity into the model. Given how small the sample is and how volatile BABIP can be, it should come as no surprise that neither the effect of diversity (p = 0.961) nor the effect of velocity (p = 0.899) comes out looking significant.
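For anyone who wants to play along at home, here is a rough sketch of one way to turn a pitch mix into a single diversity number. It assumes diversity is measured as the Shannon entropy of the pitch-type shares, which is my stand-in for illustration and not necessarily the exact measure used in the model above. The function name `arsenal_diversity` and the two pitch mixes are made up for the example and are not real pitchers.

```python
import math


def arsenal_diversity(pitch_shares):
    """Shannon entropy (in nats) of a pitch-type mix.

    pitch_shares maps pitch type -> share of pitches thrown.
    Shares are renormalized, so raw pitch counts work too.
    ASSUMPTION: entropy is used here as the diversity measure.
    """
    total = sum(pitch_shares.values())
    if total <= 0:
        raise ValueError("pitch_shares needs at least one positive value")
    entropy = 0.0
    for share in pitch_shares.values():
        p = share / total
        if p > 0:  # a pitch that is never thrown contributes nothing
            entropy -= p * math.log(p)
    return entropy


# Made-up pitch mixes, purely for illustration.
fastball_heavy = {"FF": 0.80, "SL": 0.15, "CH": 0.05}
balanced = {"FF": 0.25, "SL": 0.25, "CH": 0.25, "CU": 0.25}

print(arsenal_diversity(fastball_heavy))
print(arsenal_diversity(balanced))
```

A one-pitch pitcher scores 0 and an evenly balanced arsenal scores the log of the number of pitch types, so the balanced four-pitch mix comes out higher than the fastball-heavy one.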
I also ran a model to determine the effects of each pitch type on BABIP. Although most pitch types did not seem to have an effect, there may have been an effect of increasing slider frequency (p = 0.0595), and there did seem to be one when two-seamers and sinkers were pooled (p = 0.0276); both corresponded with increasing BABIP. I expected these effects to be due mainly to what these pitches do to batted balls -- essentially, pitchers who throw a lot of sliders or a lot of two-seamers are likely to generate a lot of groundballs and, consequently, have a higher BABIP. Unsurprisingly, this was the case.
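If you'd like to check a single pitch type against BABIP yourself, here is a minimal sketch. It is not the model I ran: it fits a one-predictor least-squares slope and gets a p-value from a permutation test, which I've swapped in because it needs nothing beyond the Python standard library. The helper names (`ols_slope`, `permutation_p_value`) and the numbers are made up for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the slope of y on x.

    Shuffles y to break any link with x, then counts how often the
    shuffled slope is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ols_slope(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up numbers, purely for illustration (not real pitchers).
slider_share = [0.05, 0.10, 0.12, 0.18, 0.22, 0.25, 0.30, 0.35]
babip = [0.281, 0.290, 0.284, 0.295, 0.301, 0.297, 0.306, 0.310]

print(ols_slope(slider_share, babip))
print(permutation_p_value(slider_share, babip))
```

A positive slope means BABIP rises with the pitch's share. With a sample this small, the p-value will bounce around quite a bit from one set of pitchers to the next.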
Thanks to Bright Eyes for today's post title. Sorry, but I can't give you back the last five minutes of your life.