Baby, There's No Guidance When Random Rules: Autocorrelation in Pitcher BABIP from 2010 to 2011

Our collective interest in the nature of BABIP is no secret around these parts. A few months ago, we quantified a fairly weak but highly significant link between BABIP and flyball-rate. As a long-delayed follow-up, I wanted to look at the actual correlation of a pitcher's BABIP from one year to the next.

I constructed a simple linear model fitting 2011 pitcher BABIP to 2010 pitcher BABIP. I excluded pitchers with fewer than 170 IP in either season, which left 57 pitchers in the sample. The model did not incorporate batted ball profiles, K-rates, or anything else that might be correlated with pitcher BABIP.

A significant relationship was not established (F = 1.085, df = 55, p = 0.302, R^2 = 0.02). With larger samples, I bet we would see a significant relationship, but I don't think the correlation would be any stronger (R^2 = 0.02 is extremely weak).
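For anyone who wants to try this kind of fit themselves, here is a minimal sketch in Python. It is not the code behind the numbers above: the data are randomly generated stand-ins rather than real pitcher BABIPs, and the function name `simple_ols` is made up for illustration.

```python
import random

def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns slope, intercept, R^2, the F statistic, and the residual
    degrees of freedom (n - 2 for a one-predictor model).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    df_resid = n - 2
    # For one predictor, F = R^2 / (1 - R^2) * (n - 2).
    f_stat = r_squared / (1.0 - r_squared) * df_resid
    return slope, intercept, r_squared, f_stat, df_resid

# Synthetic stand-in data (assumption): 57 "pitchers" whose two seasons
# are drawn independently, so any fitted relationship is pure noise.
rng = random.Random(0)
babip_2010 = [rng.gauss(0.290, 0.020) for _ in range(57)]
babip_2011 = [rng.gauss(0.288, 0.020) for _ in range(57)]

slope, intercept, r2, f_stat, df = simple_ols(babip_2010, babip_2011)
print(f"slope={slope:.3f} intercept={intercept:.3f} "
      f"R^2={r2:.3f} F={f_stat:.3f} df={df}")
```

With 57 pitchers the residual degrees of freedom come out to 55, matching the df reported above. A p-value needs the F distribution; if SciPy is available, `scipy.stats.f.sf(f_stat, 1, df)` is one way to get it.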

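[Figure: 2011 pitcher BABIP plotted against 2010 pitcher BABIP, showing the "perfect correlation" line, the "no correlation" line, and the actual correlation line]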

You'll likely notice that the "perfect correlation" line is slightly off. That's because of a very slight decrease in BABIP leaguewide in 2011, relative to 2010. The "no correlation" line is a horizontal line at league-average BABIP in 2011 (0.28815). Essentially, a strong correlation would be much more closely aligned with the "perfect correlation" line than with the "no correlation" line. Because (a) the points are not clustered around the actual correlation line and (b) the actual correlation line is quite similar to the horizontal, the correlation between a pitcher's BABIP in 2011 and his BABIP in 2010 is extremely weak.
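A short worked relation may help connect the slope of the fitted line to the strength of the correlation. For a one-predictor least-squares fit, the slope b equals the correlation coefficient r scaled by the ratio of the two standard deviations. The symbols s_x and s_y (the spread of 2010 and 2011 BABIP in the sample) are introduced here for illustration, and the last step assumes those spreads are roughly equal, which the numbers above do not confirm.

```latex
b = r\,\frac{s_y}{s_x}, \qquad R^2 = r^2
\quad\Longrightarrow\quad
|r| = \sqrt{0.02} \approx 0.14, \qquad
|b| \approx 0.14 \ \text{ if } s_y \approx s_x .
```

Under that assumption, a slope close to zero and a weak correlation go together, but the slope alone does not fix r: the scatter of the points around the line matters as well.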

So what does this mean for predicting BABIP in 2012? Personally, I think we can probably throw a pitcher's 2011 BABIP out the window and concentrate on his flyball-rate instead. In fact, after incorporating a pitcher's 2010 GB-rate into the model, the R^2 value increased to 0.08 and the relationship became significant (F = 3.447, df = 54, p = 0.039). Unsurprisingly, the relative importance of 2010 GB-rate (92%) in fitting the model was far greater than the relative importance of 2010 BABIP (8%). It is commonly held that the longer a pitcher has pitched in his career, the better read we have on his hit-suppressing tendencies. The next topic I want to look at, then, is whether a pitcher's single-season batted ball profile is a better predictor of his next season's BABIP than his career BABIP is.
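As a rough illustration of what adding a second predictor looks like, here is a minimal two-predictor least-squares sketch in Python. The data are synthetic stand-ins, the function names are hypothetical, and the relative-importance percentages quoted above are not reproduced, since the method behind them is not stated. The two helper functions only redo arithmetic on the reported F statistic: assuming two predictors and 54 residual degrees of freedom, F = 3.447 corresponds to an R^2 of about 0.11 and an adjusted R^2 of about 0.08.

```python
import random

def r2_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from an overall F statistic and its degrees of freedom."""
    return f_stat * df_model / (f_stat * df_model + df_resid)

def adjusted_r2(r2, n, n_predictors):
    """Adjusted R^2 for a model with an intercept and n_predictors terms."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    syy = sum(a * a for a in dy)
    det = s11 * s22 - s12 * s12          # zero only if x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    r2 = (b1 * s1y + b2 * s2y) / syy     # explained / total sum of squares
    df_resid = n - 3
    f_stat = (r2 / 2) / ((1.0 - r2) / df_resid)
    return {"b0": b0, "b1": b1, "b2": b2, "r2": r2,
            "adj_r2": adjusted_r2(r2, n, 2), "f": f_stat, "df": df_resid}

# Synthetic stand-in data (assumption): 57 pitchers, values drawn at random.
rng = random.Random(1)
babip_2010 = [rng.gauss(0.290, 0.020) for _ in range(57)]
gb_2010 = [rng.gauss(0.44, 0.05) for _ in range(57)]
babip_2011 = [rng.gauss(0.288, 0.020) for _ in range(57)]

fit = two_predictor_ols(babip_2010, gb_2010, babip_2011)
print({k: round(v, 4) for k, v in fit.items()})

# Arithmetic on the reported statistic only (no real data involved).
implied_r2 = r2_from_f(3.447, 2, 54)
print(round(implied_r2, 3), round(adjusted_r2(implied_r2, 57, 2), 3))
```

With 57 pitchers and two predictors the residual degrees of freedom are 54, which matches the df reported for the second model.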

What do you all think would likely be the better predictor?

Thanks, by the way, to the Silver Jews for today's post title.

Comments

As usual, very interesting article

I find the results not surprising at all.

I would guess, a priori, that the single season batted ball profile would be more predictive for the next season, but I’m thinking there could be serious correlation problems here. Career batted ball profile should line up with career BABIP pretty well, and single season batted ball profile should, in aggregate, line up with career batted ball profile. Which means I would think career BABIP and single season batted ball profile should be fairly correlated. Anyway, I’ll be interested in seeing the results.

BTW – that graph is really helpful, having the three lines that you included. The lack of correlation is really clear when you can explicitly compare perfect to no correlation.

by MjwW on Dec 4, 2011 2:38 PM EST

single season batted ball profile would be more predictive for the next season BABIP

by MjwW on Dec 4, 2011 2:39 PM EST

Right, there's definitely the problem that

pitchers change over time and the pitchers that we’d be looking at have actually had the most time to change. As such, previous year’s batted ball profile could likely be more indicative of the pitcher’s ability the next season than his career in aggregate.

On the other hand, I’d think that information would be really meaningful, wouldn’t it? Think about how much ink could be spilled trying to determine how good a pitcher might be in his next season based on how good he’s looked over his career. If simply looking at his recent batted-ball profile does the same job just as well, I’d think we’d be saving a lot of time and headaches

Thanks for reading — and for the kind words, by the way

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Dec 4, 2011 3:48 PM EST

Great analysis

you should really submit these to a more widely read SABR-type blog (BtB, HBT, BP, etc) rather than only having us Jays fans read them.

by SuckaMD on Dec 4, 2011 3:03 PM EST, 1 rec

thanks

while a lot of this work is more applicable in a league-wide context (and probably might appeal to a broader fanbase), I kind of prefer the feedback and constructive criticism from the readers here. For all of its faults, SBNation does have an excellent interface and I really do think we have built something of a community.

by jessef on Dec 4, 2011 4:10 PM EST

A chart is fine, too...

"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr

by Frag on Dec 4, 2011 3:27 PM EST

Excellent analysis

by Frag on Dec 4, 2011 3:29 PM EST

Just out of curiosity

I did a regression analysis between the 2010-2011 BABIPs of pitchers (Min. IP: 350; N = 60 pitchers) and their career BABIPs using Excel. The R^2 value I got was 0.68, and it was statistically significant (F = 123.7; p = 0.016). Some pitchers with young careers may have skewed the data (e.g., Ricky Romero).

Glove tap to Fangraphs for the numbers.

by Frag on Dec 4, 2011 4:21 PM EST

using Excel and MyStat

Fix’d

by Frag on Dec 4, 2011 4:22 PM EST

Yeah, I'd think that the young pitchers would throw that off quite a bit

I’m doing one now where I’m randomly selecting a year and looking at how well it correlates with that pitcher’s career BABIP for pitchers with 1500+ IP since 2002 (since that’s when FanGraphs batted-ball data start)

by jessef on Dec 4, 2011 4:32 PM EST

Yeah

That’s probably a better model than the one I put together.

by Frag on Dec 4, 2011 4:43 PM EST

This is interesting

Min 350 IP between the two seasons combined, regardless of a minimum in each season?

You have n = 60, jessef has n = 57. Given his criteria meant a minimum of 340 IP between two seasons, if I understand your criteria correctly, I find it almost impossible that a pitcher from his sample didn’t make it into yours. Which would mean just 3 more pitchers in yours, which shouldn’t mean such radically different results.

Am I missing something?

by MjwW on Dec 4, 2011 4:34 PM EST

That’s just the number of pitchers I got from Fangraphs.

by Frag on Dec 4, 2011 4:35 PM EST

When I stated the conditions (340 IP; 2010-2011), it got me 60 starting pitchers.

by Frag on Dec 4, 2011 4:35 PM EST

*350 IP

by Frag on Dec 4, 2011 4:38 PM EST

Oh, I think I understand what you're talking about now

I’m not using the same variables as what jessef used. I’m looking at the relationship (if that’s the best term to use) between a pitcher’s career BABIP to total BABIP between the 2010 and 2011 seasons.

by Frag on Dec 4, 2011 4:38 PM EST

Ah

Okay, thanks for the clarification. Complete reading comprehension fail on my part.

by MjwW on Dec 4, 2011 4:41 PM EST

Now that I'm straightened out

Yeah, it makes sense that young pitchers could really bias that. Out of curiosity, would it be easy to add a screen to only include pitchers who have career IP / 2010-11 IP > 2 (this is arbitrary, you could go higher, I wouldn’t go lower), and then re-run the correlation analysis with only those pitchers?

by MjwW on Dec 4, 2011 4:46 PM EST

I would

But I need to study for my final exams. =P

by Frag on Dec 4, 2011 4:50 PM EST

What you’re calling a “perfect correlation” is actually a slope of 1 – it’s possible to have a best-fit line with that slope and have any magnitude of correlation. The entirety of the graph is wholly misleading, in fact – the slope does not determine the magnitude of the correlation.

by cwyers on Dec 5, 2011 4:32 PM EST, 1 rec

right, it isn't perfect in the sense that it describes all of the variance

The figure is not “wholly misleading”. If you read the article, I make it a point to explain that the slope is only part of what is valuable information; it is obviously the clustering around the line that determines how strong the correlation is.

And, while the slope does not determine the “magnitude of the correlation,” it is what determines how strongly the values are autocorrelated in any meaningful sense. That the slope approximates zero suggests that one season’s worth of BABIP data is essentially useless in predicting the next season’s BABIP. The farther that slope is away from 1, the less autocorrelation there is in the data from one season to the next.

by jessef on Dec 5, 2011 7:29 PM EST, 1 rec
