Earlier today (or yesterday, depending on which timezone you're in and when you're reading this), woodman663 posted a really interesting article demonstrating that changeup specialists may have a predilection for sustaining low babip. Many of the pitchers he looked at (for example, Ted Lilly) were extreme flyball pitchers. Since flyballs are more likely to turn into outs than grounders, a low babip for these pitchers could come from either factor, so we need to disentangle the two.

Woodman and I have talked back and forth on the piece a bit and I suggested that we run some statistical analyses so that we could tease out whether the changeup effect was truly meaningful or if it was just an artifact of the flyball effect. I said much of what is in this post in the comments section, but here it is full-blown and with the output (which is important, in case I'm making mistakes here -- please let me know if you notice any).

I included all starting pitchers with 300+ innings since 2009 and used R v2.12.1 to fit a linear model for babip to fixed effects of flyball-rate, strikeout-rate, changeup frequency, total value by linear weights of all changeups, and value by linear weights per changeup. At Woodman's suggestion (and as justified in the body of the post), I included splitters as changeups.

Keep in mind that the p-values refer to whether the evidence suggests that a factor is significant (the lower the p-value, the more confident we can be that the effect is real) and the R-squared values refer to how well the model describes the variance (the higher the R-squared value, the better the description).
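For readers who want to try this kind of fit themselves, here is a minimal sketch of the model described above and of where the p-values and R-squared values live in the result. The data are simulated stand-ins, not the pitcher data used in this post; the `chcount` column, the seed, and the effect sizes are all assumptions made up for illustration. One simulated pitcher is given zero changeups so that his per-changeup value is 0/0, which shows how a row can end up dropped for missingness.

```r
# Hypothetical stand-in data: the post's real pitcher data is not reproduced here.
set.seed(1)
n <- 126
babip <- data.frame(
  fly    = runif(n, 0.25, 0.50),  # flyball rate
  K      = runif(n, 0.10, 0.28),  # strikeout rate
  chfreq = runif(n, 0.02, 0.30),  # share of pitches that are changeups/splitters
  chtot  = rnorm(n, 0, 8)         # total linear-weights value of changeups
)

# Assumed count of changeups thrown; the first pitcher has none on record.
babip$chcount <- round(babip$chfreq * 5000)
babip$chcount[1] <- 0
babip$chtot[1] <- 0

# Value per changeup; 0/0 yields NaN for the pitcher without changeups.
babip$chperc <- babip$chtot / babip$chcount

# Simulated response with an arbitrary flyball effect plus noise.
babip$BABIP <- 0.33 - 0.12 * babip$fly + rnorm(n, 0, 0.013)

# With `data = babip`, columns can be named directly instead of babip$fly.
fit <- lm(BABIP ~ fly + K + chfreq + chtot + chperc, data = babip)
s <- summary(fit)

p_values <- s$coefficients[, "Pr(>|t|)"]
print(round(p_values, 4))
print(c(r2 = s$r.squared, adj_r2 = s$adj.r.squared))

# lm() drops rows with a missing predictor under the default na.action.
print(nobs(fit))
```

Because the data are simulated, the printed coefficients and p-values will not match the output below; only the structure of the call and the summary carries over.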

    > summary(fit1)

    Call:
    lm(formula = babip$BABIP ~ babip$fly + babip$K + babip$chfreq +
        babip$chtot + babip$chperc, data = babip)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.040995 -0.007808 -0.000659  0.008768  0.032989

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)   3.322e-01  8.910e-03  37.283  < 2e-16 ***
    babip$fly    -1.213e-01  2.342e-02  -5.179  9.2e-07 ***
    babip$K       1.609e-02  3.379e-02   0.476   0.6348
    babip$chfreq  9.532e-03  2.125e-02   0.448   0.6546
    babip$chtot  -5.992e-05  1.570e-04  -0.382   0.7034
    babip$chperc -2.969e-03  1.544e-03  -1.924   0.0568 .
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.01358 on 119 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared: 0.2564,  Adjusted R-squared: 0.2251
    F-statistic: 8.206 on 5 and 119 DF,  p-value: 1.100e-06

The "1 observation deleted due to missingness" (what a great word, by the way) was Tommy Hanson, who has zero changeups and splitters on record. Anyway, what we find is that the effect of flyball-rate is highly significant (p = 9.2 × 10^-7). The effect of value *per changeup* is marginally significant (p = 0.0568). None of the other effects (including K%!) were significant. A model including only those two factors actually fit the data *slightly better* by adjusted R-squared (0.2418 vs. 0.2251) than the initial model, which also included K-rate, changeup frequency and total changeup value. Here is the output for that model:

    > summary(fit2)

    Call:
    lm(formula = babip$BABIP ~ babip$fly + babip$chperc, data = babip)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.040973 -0.008040 -0.001156  0.008629  0.032851

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)   0.3340417  0.0078308  42.657  < 2e-16 ***
    babip$fly    -0.1159427  0.0213364  -5.434 2.87e-07 ***
    babip$chperc -0.0031551  0.0008792  -3.589  0.00048 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.01344 on 122 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared: 0.254,  Adjusted R-squared: 0.2418
    F-statistic: 20.77 on 2 and 122 DF,  p-value: 1.726e-08
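The comparison between the two models rests on adjusted R-squared, which penalizes extra predictors. As a rough check, the standard formula can be applied to the rounded numbers printed in the two summaries. The helper name `adj_r2` is made up for this sketch.

```r
# Adjusted R-squared from R-squared, sample size n, and number of predictors p.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n <- 125  # observations used in both fits (one dropped for missingness)

full    <- adj_r2(0.2564, n, p = 5)  # five-predictor model
reduced <- adj_r2(0.2540, n, p = 2)  # flyball rate + value per changeup

print(round(c(full = full, reduced = reduced), 4))
```

The results land close to the 0.2251 and 0.2418 that R reports, with small differences because the inputs here are already rounded. The raw R-squared barely drops when three predictors are removed, so the adjusted figure goes up.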

Next, I used the Lindeman, Merenda and Gold (1980) (lmg) method to describe the relative importance of each factor. Here is the output for the first model:

    > calc.relimp(fit1,type=c("lmg","last","first","pratt"), rela=TRUE)
    Response variable: babip$BABIP
    Total response variance: 0.0002381301
    Analysis based on 125 observations

    5 Regressors:
    babip$fly babip$K babip$chfreq babip$chtot babip$chperc
    Proportion of variance explained by model: 25.64%
    Metrics are normalized to sum to 100% (rela=TRUE).

    Relative importance metrics:

                        lmg        last      first       pratt
    babip$fly    0.66176163 0.862561629 0.48939293  0.72622533
    babip$K      0.02597355 0.007290960 0.04968005 -0.02154839
    babip$chfreq 0.05245325 0.006468146 0.10981283 -0.03433426
    babip$chtot  0.08818158 0.004685193 0.14603247  0.05044396
    babip$chperc 0.17162998 0.118994071 0.20508171  0.27921336

    Average coefficients for different model sizes:

                            1X           2Xs           3Xs           4Xs
    babip$fly    -0.1141989299 -0.1126788278 -0.1150599466 -1.190098e-01
    babip$K      -0.0518129769 -0.0316319422 -0.0157599046  1.072460e-03
    babip$chfreq -0.0425860063 -0.0276718376 -0.0142778771 -1.016283e-03
    babip$chtot  -0.0002423128 -0.0001693865 -0.0001124105 -7.509733e-05
    babip$chperc -0.0030463088 -0.0028538654 -0.0028884462 -3.005977e-03
                           5Xs
    babip$fly    -0.1213197877
    babip$K       0.0160889358
    babip$chfreq  0.0095322944
    babip$chtot  -0.0000599228
    babip$chperc -0.0029691976

In terms of relative importance, flyball-rate was most important but the changeup inputs made important contributions to the model as well. K-rate made the least important contribution (under 3% relative importance by lmg). We can either combine the relative contributions of the changeup terms here or use this method to calculate relative importance for our second model. Here is the output for the second model:

    > calc.relimp(fit2,type=c("lmg","last","first","pratt"), rela=TRUE)
    Response variable: babip$BABIP
    Total response variance: 0.0002381301
    Analysis based on 125 observations

    2 Regressors:
    babip$fly babip$chperc
    Proportion of variance explained by model: 25.4%
    Metrics are normalized to sum to 100% (rela=TRUE).

    Relative importance metrics:

                       lmg      last     first     pratt
    babip$fly    0.7004244 0.6963282 0.7046952 0.7005284
    babip$chperc 0.2995756 0.3036718 0.2953048 0.2994716

    Average coefficients for different model sizes:

                           1X          2Xs
    babip$fly    -0.114198930 -0.115942720
    babip$chperc -0.003046309 -0.003155121

Basically, this tells us that flyball-rate accounts for about 70% of the usefulness of the model and changeups account for about 30% of its usefulness.
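To give a feel for what the lmg column measures, here is a minimal base-R sketch for the two-regressor case. With two predictors there are only two possible orders of entry, so each predictor's lmg contribution is the average of its R-squared when entered first and its R-squared gain when entered last. The data and the variable names `x1`, `x2` and `y` are simulated and hypothetical; this is an illustration of the idea, not the code behind `calc.relimp`.

```r
# Hypothetical data with two correlated predictors (not the post's data).
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 0.8 * x1 + 0.4 * x2 + rnorm(n)

r2 <- function(formula) summary(lm(formula))$r.squared

r2_x1   <- r2(y ~ x1)       # x1 entered first
r2_x2   <- r2(y ~ x2)       # x2 entered first
r2_both <- r2(y ~ x1 + x2)  # full model

# lmg: average each predictor's R-squared gain over both entry orders.
lmg_x1 <- mean(c(r2_x1, r2_both - r2_x2))
lmg_x2 <- mean(c(r2_x2, r2_both - r2_x1))

# The two contributions add up to the full-model R-squared; dividing by it
# gives shares that sum to 1, analogous to what rela=TRUE reports.
shares <- c(x1 = lmg_x1, x2 = lmg_x2) / r2_both
print(round(shares, 3))
```

With more than two regressors the same averaging runs over every possible ordering, which is why a dedicated package is the practical choice for the five-factor model.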

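Overall, *according to the methods and models described above*, flyball-rate accounts for about 17.8% of pitcher babip variability (its roughly 70% share of the second model's 25.4% R-squared). By the same calculation, per-pitch changeup value accounts for about 7.6% of pitcher babip variability. In the first model, the combined contributions of per-pitch changeup value, total changeup value and changeup frequency come to about 8.0%.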
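The percentages above appear to come from multiplying each factor's normalized lmg share by the model's R-squared. That reading is inferred from the printed numbers rather than stated outright, and the helper name `variance_share` is made up for this sketch.

```r
# Share of total variance = normalized lmg share * model R-squared.
variance_share <- function(lmg_share, r_squared) lmg_share * r_squared

r2_fit2 <- 0.254  # "Multiple R-squared" of the two-factor model
shares  <- c(fly = 0.7004244, chperc = 0.2995756)  # lmg column, rela=TRUE

pct <- 100 * variance_share(shares, r2_fit2)
print(round(pct, 1))
```

This gives 17.8 for flyball-rate and 7.6 for per-changeup value. The same function can be pointed at the lmg shares and R-squared of any of the other models in this post.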

As a more conservative check, I also fit a model with only flyball-rate and changeup frequency, leaving out the linear-weights terms. Here are its summary and its relative-importance output:

    > summary(fit2)

    Call:
    lm(formula = babip$BABIP ~ babip$fly + babip$chfreq, data = babip)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.039630 -0.009275 -0.000403  0.009383  0.032987

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)   0.333697   0.008176  40.813  < 2e-16 ***
    babip$fly    -0.107534   0.022798  -4.717 6.44e-06 ***
    babip$chfreq -0.024325   0.017948  -1.355    0.178
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.01402 on 122 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared: 0.1875,  Adjusted R-squared: 0.1742
    F-statistic: 14.08 on 2 and 122 DF,  p-value: 3.158e-06

    Proportion of variance explained by model: 18.75%
    Metrics are normalized to sum to 100% (rela=TRUE).

    Relative importance metrics:

                       lmg       last     first     pratt
    babip$fly    0.8625042 0.92373356 0.8167360 0.8801962
    babip$chfreq 0.1374958 0.07626644 0.1832640 0.1198038

    Average coefficients for different model sizes:

                          1X         2Xs
    babip$fly    -0.11419893 -0.10753360
    babip$chfreq -0.04258601 -0.02432455

So this more conservative approach, which excludes linear weights, does not demonstrate a significant relationship between changeup frequency and babip (p = 0.178). Taken at face value, the non-significant estimate suggests that changeup frequency may account for about 2.6% of pitcher babip variance (a 13.7% share of the model's 18.75% R-squared).

Overall, K-rate is extremely unlikely to be a significant factor and, even if it were, it would be an extremely unimportant one, accounting for less than 1% of pitcher babip variance (about 0.7% by its lmg share in the first model). As a side-note, this also serves as further evidence that there are serious flaws in the calculation of SIERA. I propose that SIERA should be reconstructed so as to include the effects of flyball-rate, NOT K-rate, on babip. Essentially, the only reason SIERA works slightly better than xFIP or FIP is that it uses K-rate as a *proxy* for flyball-rate. Since flyball-rate is easily measured and batted ball data are readily available, there's no reason to use a proxy for it.

So what do you all think? What are some other factors we can test for effects on babip?

Thanks to David Bowie and Woodman663!