Baby, There's No Guidance When Random Rules: Autocorrelation in Pitcher BABIP from 2010 to 2011

Our collective interest in the nature of BABIP is no secret around these parts. A few months ago, we quantified a fairly weak but highly significant link between BABIP and flyball-rate. As a long-delayed follow-up, I wanted to look at how well a pitcher's BABIP in one year correlates with his BABIP in the next.

I constructed a simple linear model attempting to fit 2011 pitcher BABIP to 2010 pitcher BABIP. I excluded pitchers with fewer than 170 IP in either season, which left a total of 57 pitchers in the sample. The model did not incorporate batted-ball profiles, K-rates, or anything else that might be correlated with pitcher BABIP.
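For anyone who wants to try this at home, here is a minimal sketch of a one-predictor least-squares fit in plain Python. The function name `fit_simple_ols` and the eight BABIP pairs are made up for illustration; they are not the 57-pitcher sample, and this is not necessarily the software or exact procedure used for the post.

```python
# Minimal ordinary-least-squares sketch for "next-year BABIP ~ this-year BABIP".
# The BABIP values below are made-up placeholders, NOT the sample from the post.

def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x and return the usual summary numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 is the share of the variance in y that the fitted line explains.
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_resid / syy

    # With one predictor and an intercept, the residual df is n - 2.
    df_resid = n - 2
    f_stat = r_squared / (1.0 - r_squared) * df_resid

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "f_stat": f_stat,
        "df_resid": df_resid,
    }


# Hypothetical pitchers: (2010 BABIP, 2011 BABIP)
babip_2010 = [0.275, 0.301, 0.288, 0.310, 0.265, 0.295, 0.282, 0.299]
babip_2011 = [0.290, 0.284, 0.279, 0.296, 0.288, 0.301, 0.271, 0.285]

fit = fit_simple_ols(babip_2010, babip_2011)
print(fit)
```

With 57 pitchers, the same calculation would give a residual df of 55.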

The model did not find a significant relationship (F = 1.085, df = 55, p = 0.302, R² = 0.02). With larger samples, I bet we would see a significant relationship, but I don't think the correlation would be any stronger (R² = 0.02 is extremely weak).

[Figure: 2011 pitcher BABIP vs. 2010 pitcher BABIP]

You'll likely notice that the "perfect correlation" line is slightly off. That's because of a very slight decrease in BABIP leaguewide in 2011, relative to 2010. The "no correlation" line is a horizontal line at league-average BABIP in 2011 (0.28815). Essentially, a strong correlation would be much more closely aligned with the "perfect correlation" line than with the "no correlation" line. Because (a) the points are not clustered around the actual correlation line and (b) the actual correlation line is quite similar to the horizontal, the correlation between a pitcher's BABIP in 2011 and his BABIP in 2010 is extremely weak.
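The two reference lines can be sketched as simple functions. This assumes the "perfect correlation" line is each pitcher's 2010 BABIP shifted by the leaguewide change, which is one reading of the description rather than a confirmed detail of the plot. The function names are invented, and the 2010 league average passed in at the bottom is a placeholder, since the post only gives the 2011 figure.

```python
# Reference lines for a year-over-year BABIP scatter plot (a sketch).
LEAGUE_BABIP_2011 = 0.28815  # league-average 2011 BABIP quoted in the post


def no_correlation_line(babip_2010):
    """If 2010 tells us nothing, every pitcher's best guess is the 2011 league average."""
    return LEAGUE_BABIP_2011


def perfect_correlation_line(babip_2010, league_babip_2010):
    """Assumed form: the pitcher repeats his 2010 BABIP, shifted by the league change."""
    return babip_2010 + (LEAGUE_BABIP_2011 - league_babip_2010)


# 0.291 is a hypothetical 2010 league average, used only for illustration.
print(perfect_correlation_line(0.300, league_babip_2010=0.291))
print(no_correlation_line(0.300))
```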

So what does this mean for predicting BABIP in 2012? Personally, I think we can probably throw a pitcher's 2011 BABIP out the window and concentrate on his flyball-rate instead. In fact, after incorporating a pitcher's 2010 GB-rate into the model, the R² value increased to 0.08 and the relationship became statistically significant (F = 3.447, df = 54, p = 0.039). Unsurprisingly, the relative importance of 2010 GB-rate (92%) in fitting the model was far greater than the relative importance of 2010 BABIP (8%). It is commonly held that the longer a pitcher has been in the league, the better read we have on his hit-suppressing tendencies. So the next topic I want to look at is whether a pitcher's single-season batted-ball profile is a better predictor of his next season's BABIP than his career BABIP.
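As a quick sanity check on the reported numbers, R² can be backed out of an F statistic and its degrees of freedom. The sketch below assumes both models include an intercept and that the reported df is the residual df; the function names are invented for illustration. Under those assumptions, the BABIP-only model comes out at roughly 0.02, while the two-predictor model comes out at roughly 0.11 unadjusted and roughly 0.08 adjusted. One consistent reading is that the 0.08 is an adjusted R², though that is an inference and not something the post states.

```python
# Back out R^2 (and adjusted R^2) from a reported F statistic.
# Assumes an intercept in the model and that df_resid is the residual df.

def r_squared_from_f(f_stat, n_predictors, df_resid):
    """R^2 = k*F / (k*F + df_resid), from F = (R^2 / k) / ((1 - R^2) / df_resid)."""
    return n_predictors * f_stat / (n_predictors * f_stat + df_resid)


def adjusted_r_squared(r_squared, n_predictors, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / df_resid, where n - 1 = df_resid + k."""
    n_minus_1 = df_resid + n_predictors
    return 1.0 - (1.0 - r_squared) * n_minus_1 / df_resid


# BABIP-only model: F = 1.085, one predictor, df = 55
r2_one = r_squared_from_f(1.085, n_predictors=1, df_resid=55)

# BABIP + GB-rate model: F = 3.447, two predictors, df = 54
r2_two = r_squared_from_f(3.447, n_predictors=2, df_resid=54)
adj_two = adjusted_r_squared(r2_two, n_predictors=2, df_resid=54)

print(round(r2_one, 3), round(r2_two, 3), round(adj_two, 3))
```

Either way, the takeaway is the same: even with GB-rate included, the model explains only a small share of the variation in next-season BABIP.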

What do you all think would likely be the better predictor?

Thanks, by the way, to the Silver Jews for today's post title.