He Never Complains When It's Hot: How Much Does Groundball-Rate Affect BABIP?

Over the past couple weeks we've been talking a lot around here about what factors affect pitcher batting average on balls in play (BABIP).  So far, the only factor we've found that I feel sufficiently comfortable saying is both under a pitcher's control (to a large degree) and has a large impact on pitcher BABIP is batted ball type, since flyballs in play are more likely to become outs than groundballs.

There are a number of reasons for this.  First, flyballs that go over the fence in fair or foul territory don't count as "balls in play."  So homeruns, which would often be hits even if they didn't go over the fence (there are occasionally the ones that just clear and would have been caught on the track, but plenty of homeruns would be gappers), are removed from the BABIP calculation.  This is not an insignificant effect, as somewhere around 10% of fair flyballs leave the park each year.  Second, the hardest hit balls in the air are often scored as linedrives, which don't technically count into the FanGraphs flyball-rate (though, as Woodman pointed out the other day, they do count into StatCorner's Balls in Air).  Third, it is much easier for fielders to get to flyballs.  They have more time to get underneath them and do not have to throw a baserunner out once they've made the play.  Finally, and probably least importantly, a flyball in foul territory can count as an out but cannot count as a hit.  Compare this to a groundball in foul territory, which can count as neither, and flyball BABIP is again suppressed relative to groundball BABIP.  Combine these factors and it is easy to see why it is so much easier to get outs on flyballs.  In fact, even though misplays on flyballs typically do not get scored as errors (increasing BABIP) and misplays on groundballs typically do get scored as errors (reducing BABIP), flyballs still result in lower BABIP.  Also, this was by no means an exhaustive list, so it's possible that I've missed some other reasons flyballs become outs more frequently.
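
For reference, the bookkeeping described above lines up with the conventional BABIP formula, (H - HR) / (AB - K - HR + SF).  The post doesn't spell the formula out, so treat the sketch below as the standard definition rather than the exact calculation behind the numbers quoted here; the function and argument names are made up for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Conventional BABIP: (H - HR) / (AB - K - HR + SF)."""
    # Homeruns come out of both the numerator and the denominator,
    # because a ball that leaves the park is not "in play".
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play
```

A hypothetical line of 150 H, 20 HR, 600 AB, 100 K and 5 SF gives 130 / 485, or about .268.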

However, although we've established that pitcher groundball-rate can inflate pitcher BABIP, we haven't actually looked at the degree to which BABIP is affected.  In 2011, BABIP amongst qualified AL starters ranged from Jeremy Hellickson's .223 to CC Sabathia's .318.  It should come as no surprise that Hellickson is an extreme flyball pitcher.  So was Hellickson lucky, or was his BABIP simply the result of him inducing groundballs just 35% of the time and the Rays having an exceptional defence?  Was Sabathia unlucky, or was his AL-worst BABIP simply the result of an above-average groundball-inducing tendency?

I used a sample of all pitchers MLB-wide who pitched at least 300 innings over the past three seasons (since 2009) to test the relationship between groundball-rate and BABIP.  Here are the results:

[Figure: BABIP vs. groundball-rate]

The effect is significant (p < 0.01, R-sq = 0.14) and the blue line represents the best-fit linear model.  The grey shaded area represents a 95% confidence interval for the mean (not for individual observations).  There is obviously a lot of variability here, but there is also a general trend, with BABIP increasing as groundball-rate increases.
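
For anyone who wants to run this kind of fit on their own sample, here is a minimal sketch of an ordinary least-squares line and its R-squared in plain Python.  It is not the code behind the plot (the post doesn't say what tool produced it), and `fit_line` and its argument names are hypothetical.

```python
from statistics import fmean


def fit_line(gb_rates, babips):
    """Least-squares fit of BABIP on groundball-rate.

    Returns (slope, intercept, r_squared).
    """
    if len(gb_rates) != len(babips) or len(gb_rates) < 2:
        raise ValueError("need two equal-length samples of at least 2 pitchers")
    mean_x = fmean(gb_rates)
    mean_y = fmean(babips)
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in gb_rates)
    syy = sum((y - mean_y) ** 2 for y in babips)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gb_rates, babips))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples need some variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

An R-sq of 0.14 from a fit like this means the line accounts for about 14% of the pitcher-to-pitcher variance in BABIP within the sample.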

Also, we see that, setting aside other factors such as defence and luck, even the most extreme flyball-rates do not account for a long-term BABIP below .275 or so, and even extreme groundballers wouldn't generally be expected to have a BABIP greater than .310.  Defence is, of course, an important factor in BABIP, but since there are 30 teams whose defences can fluctuate wildly from year to year (or even within a year) and only a handful of starters per team in any given season, it is difficult to isolate its effects.  There are, of course, likely to be other "hidden" factors within a pitcher's control, but teasing out the nature of those factors has proven quite difficult.  Until we can figure out some way to test these predicted effects (movement, etc.), this is one way to estimate what a pitcher's BABIP "should" be.  Obviously, take all this with a grain of salt, but let's see what it means for our three Blue Jays starters who made 20 starts this season and will likely be back next year.

Ricky Romero: GB-rate: 54.7%, BABIP: .242 -- Some regression to the mean is in line for Ricky next year.  That doesn't mean he won't be good but for him to maintain a BABIP so low in spite of a groundball-rate so high is quite anomalous.  On the other hand, a disproportionate number of flyballs left the yard so that should mitigate the effects of his increasing BABIP somewhat next year.  Expect him to be great, but, unless he continues to improve his peripherals (certainly possible), not quite as great.

Brandon Morrow: GB-rate: 36.0%, BABIP: .299 -- Brandon's an excellent example of why this work is meaningful.  You might think he hasn't had bad luck on balls in play, but it's actually quite likely he has.  Given his extreme flyball-rate, his BABIP should likely have been quite a bit lower, as much as 15 points.  If his BABIP comes down like I think it should and his sequencing returns to normal, causing his strand-rate to drop with it, he could be in for a remarkable season next year.  There is, of course, the possibility that his sequencing is a result of his inability to pitch from the stretch (as has been discussed and analyzed here), but his inability to do so has not been completely established.  Also, we'll see if adding in a cutter (as he has apparently done the past few starts) will result in more groundballs next season.

Brett Cecil: GB-rate: 38.2%, BABIP: .267 -- In this case, we'd normally think Cecil has been extremely lucky on balls in play (though quite unlucky on flyballs becoming homeruns) this season.  While that is true, given his flyball-rate, he hasn't been nearly as lucky as the raw number suggests, and I'd think that his poor luck on flyballs leaving the yard likely cancelled out any good luck he had on balls that stayed in it.  It certainly wasn't a good season from Cecil, but it was not nearly so bad as it might look at a glance.

So what do you all think?  Does it make sense to you that groundball-rate should affect BABIP?  Does it make you feel better about Morrow's prospects as an extreme flyball pitcher?  Was this just a whole bunch of wasted words?

Thanks to Pavement's "Grounded" for today's title.

Comments

nice work

Sad, Drunk, And Poorly

My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL

Twit Twat.

by Pikachu on Oct 8, 2011 10:41 AM EDT

Good work

Though, I’d caution against reading too much into an R^2 value of 0.14 and projecting future outcomes from it.

Nonetheless, it’s a good starting point into looking at many of the possible variables that influence BABIP.

"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr

by Frag on Oct 8, 2011 11:28 AM EDT

You're definitely right about predicting it based on fb-rate considering how little influence it has

As you implied, the problem with predicting future BABIP is, of course, that there are so few demonstrably important factors. I certainly wouldn’t even attempt to predict any single pitcher’s BABIP next year with anything less than +/- 20 points of error. That said, I’d bet we could predict an aggregated BABIP – GB% relationship for the entire league with a lot better accuracy, which suggests that, as you said, it’s a starting point for discussion. By no means was this supposed to be the be-all and end-all.

Also, thanks for commenting!

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Oct 8, 2011 1:04 PM EDT

R^2 of 0.14

Doesn’t this mean that 14% of the variation in BABIP can be attributed to GB%? That’s significant, and interesting. Do we have any idea what the relationship would be over a longer time-frame, or with a higher minimum IP threshold?

by gabrielsyme on Oct 8, 2011 9:32 PM EDT

Just did a quick study

2002 to 2011, min. 1000 IP. Resulted in an r^2 of .104, and an expected BABIP range from .287 at a 35% GB rate to .304 at a 60% GB rate. Pretty much the same range, but a lower r^2.

I think the take-away here is that being a fly-ball pitcher helps your BABIP a little bit, but that luck or other skills have a much bigger impact.
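
The two endpoints quoted above pin down a straight line, so a rough sketch of that 2002 to 2011 fit can be rebuilt from them. This assumes the relationship is exactly linear between the endpoints and says nothing about the article's own 2009 to 2011 model, whose coefficients aren't given; `expected_babip` and the constant names are made up.

```python
# Endpoints quoted in the comment: .287 at a 35% GB rate, .304 at 60%.
GB_LOW, BABIP_LOW = 0.35, 0.287
GB_HIGH, BABIP_HIGH = 0.60, 0.304


def expected_babip(gb_rate):
    """Linear interpolation between the two quoted endpoints.

    gb_rate is a fraction (0.45 means a 45% groundball-rate).
    """
    slope = (BABIP_HIGH - BABIP_LOW) / (GB_HIGH - GB_LOW)
    return BABIP_LOW + slope * (gb_rate - GB_LOW)
```

The implied slope is about 0.7 points of BABIP per percentage point of groundball-rate.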

by gabrielsyme on Oct 8, 2011 10:11 PM EDT

It's a good method but

the problem with going back to 2002 is that the game has changed some since then. Part of why you see a lower R^2 value is because, I think, BABIP in general is lower now, and that can’t be explained away by flyball%

by jessef on Oct 10, 2011 12:12 PM EDT

good article

but I think flyballs/groundballs are not the only thing to look at. I still believe that the type of pitches (and the quality of those) a pitcher throws can have an influence on BABIP.

Think about this: what types of pitches does a pitcher need to throw, even if they’re not good? First and foremost it’s the fastball. Lots of pitchers throw low quality fastballs because they simply have to. Brett Cecil is likely one of them (perhaps getting his velocity back could help him). Second is the changeup. Although some pitchers survive completely “changeless”, there’s still a lot of pitchers throwing mediocre changeups (that get hit hard if recognized) because they want to take the batter off the fastball.

Then there’s the sinker and the slider. Both are known to be much easier to hit by opposite handed batters. This could increase the BABIP of pitchers who rely a lot on these pitches.

by Woodman663 on Oct 8, 2011 11:42 AM EDT

Groundball pitchers (>48%) who have low BABIPs (<.290) over at least 500 innings, 2000-2011

Brandon Webb
Tim Hudson
Ricky Romero (mostly because of 2011)
Trevor Cahill
John Lannan
Greg Maddux
C.J. Wilson
Jason Marquis
Ubaldo Jimenez
Matt Clement
Adam Wainwright

by Woodman663 on Oct 8, 2011 12:13 PM EDT

I should add

That I’m not trying to “disprove” your theory, but rather listing some exceptions for further examination.

by Woodman663 on Oct 8, 2011 12:29 PM EDT

oh, no worries, glad you read it and are thinking about it

I don’t really see what these exceptions are supposed to prove, though.

In the first place, BABIP is variable enough that, simply by luck, you’re going to find guys who lie outside the range in small samples. (500 IP is large, but think about how big the variance in one season’s worth of BABIP is; adding another season helps, but not enough to cancel out the influence of luck.) Also, look at the model: a .290 BABIP with a 48% gb-rate is practically within the 95% confidence interval of the mean. I’d think that if you’re looking for true exceptions to the model, you’d have to get much farther away than that. Maybe I should have shown the 95% confidence interval for individual observations, which might better demonstrate what would constitute an actual exception than the 95% confidence interval for the mean. There are some guys who are certainly exceptions (Maddux, Hudson, Lannan), but the list doesn’t differentiate between them and guys who just kind of sneak in. Do we really think Matt Clement (.289 BABIP, 49% gb-rate) is an exception to the relationship?

And, don’t forget:
How important is the difference between a .287 BABIP and a .292 BABIP? Five hits per 1000 balls in play . . . Chris Carpenter led the majors with 996 batters faced this season. He struck out 191 batters, walked 55, and yielded 16 hr, so there were 734 balls in play, so on a per-season basis, a 5 point difference in BABIP is roughly equivalent to 3 2/3 hits. Is that really meaningful enough to even talk about? If we’re talking about exceptions, I think we need to limit the discussion to guys who suppress BABIP to a more meaningful extent than that.
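
The arithmetic above can be checked with a short sketch. It uses the same simplification as the numbers quoted (balls in play are batters faced minus strikeouts, walks and homeruns, ignoring hit batters and the like), and the function names are made up for illustration.

```python
def balls_in_play(batters_faced, strikeouts, walks, home_runs):
    """Batters faced who put the ball in play, under the simplification above."""
    return batters_faced - strikeouts - walks - home_runs


def hits_from_babip_gap(bip, babip_low, babip_high):
    """Extra hits implied by a BABIP gap over a given number of balls in play."""
    return bip * (babip_high - babip_low)


carpenter_bip = balls_in_play(996, 191, 55, 16)             # 734
gap_in_hits = hits_from_babip_gap(carpenter_bip, 0.287, 0.292)  # about 3.67
```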

Also, what do these pitchers that you’ve listed have in common? More of them are righties, but more pitchers are righties, so I don’t think there’s necessarily anything there. Some of them throw (or threw) very hard (Webb, Jimenez) but some are more soft-tossers (Lannan, Marquis). Regarding pitch-types, sure, most of them use a lot of two-seamers, but I think that is just an artifact of the fact that you’re sampling for guys who get grounders and most guys who get grounders use two-seamers.

by jessef on Oct 8, 2011 12:56 PM EDT

well

I didn’t know about the “confidence interval of the mean”, and now I kind of do. So I learned something. Kind of.

If we assume that flyballs/groundballs were the only thing affecting BABIP, how many pitchers would you expect to be below a .285 BABIP just because of random variance?

Brandon Webb, btw, didn’t throw hard. In fact, his fastball velocity never even averaged 89 mph, his career average is exactly 88 mph. That’s part of what makes him so cool.

I do think Matt Clement kind of snuck in, he shouldn’t be there. Wilson, Romero and Cahill will be interesting to watch in the coming years to see if their BABIPs hold. Of course, we were already interested in watching Romero pitch.

by Woodman663 on Oct 8, 2011 2:50 PM EDT

given that the league average is around .290 this past year

my guess is you’d expect about 45% of pitchers to be below .285 due to pure luck (ie if everyone’s true talent was actually .290)
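
One rough way to put a number on this: treat every ball in play as an independent coin flip with a .290 chance of falling in, and use the normal approximation to the binomial. Both of those are simplifying assumptions, `share_below` is a made-up helper, and the answer moves a lot with how many balls in play you assume, so this is a sketch rather than a settled figure.

```python
from math import sqrt
from statistics import NormalDist


def share_below(threshold, true_babip, balls_in_play):
    """Normal approximation to P(observed BABIP < threshold).

    Assumes every pitcher has the same true BABIP and that balls
    in play are independent of one another.
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    # Standard deviation of an observed rate over n independent trials.
    sd = sqrt(true_babip * (1 - true_babip) / balls_in_play)
    return NormalDist(mu=true_babip, sigma=sd).cdf(threshold)
```

Because .285 sits below the assumed true talent, the share shrinks as the number of balls in play grows: luck evens out over bigger samples.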

by Jono411 on Oct 9, 2011 4:07 PM EDT

I am an idiot

for some reason I was thinking of Chien-Ming Wang when I said Webb

by jessef on Oct 10, 2011 12:14 PM EDT

further options for study

open questions: how much of BABIP variance is player skill and how much luck? For example, Tom Glavine’s career BABIP over 4400 innings was .280; Maddux was .281 over 5000 innings. Are those true skill levels? How much regression to the mean should we attribute to Glavine and Maddux after such a huge sample size? You could do a study looking at the variation for individual pitchers in odd v. even numbered years: I think that would tell us how much “luck” is involved in large sample sizes. Others who know more about statistics could doubtless tell you what kind of sample size you need before the observed BABIP for pitchers becomes reliable.

by gabrielsyme on Oct 8, 2011 11:16 PM EDT

it would be very hard to separate the two IMO

unless you can think of another variable we can regress on. I think the interesting thing about BABIP studies (the early ones anyway) is how there really is so much noise involved. another thing to remember is that even as N (number of innings) increases to a very large sample, there will be pitchers – even good ones (remember this is based on previous studies but could change) – that will be “unlucky” even over the huge samples

by benk on Oct 9, 2011 11:36 AM EDT

I don't think you need another variable

You’d just have to see what the predictive ability is over different sample sizes.

by gabrielsyme on Oct 9, 2011 5:50 PM EDT

i think it's around 1000 innings when you reach the R = .5 level

ie, after 1000 innings, you should regress a pitcher’s observed BABIP 50% to the mean to get the best estimate of his true talent. so for someone like maddux at .281 over 5000 innings, you’re regressing 1/6 of the way to the league average (probably around .300 when he played). so our best guess for his true talent is .284
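
As a sketch, the rule of thumb above is a weighted average in which the league mean gets a fixed 1000 innings of weight. The 1000-inning figure and the .300 league average are the guesses from the comment, not established constants, and `regress_to_mean` is a hypothetical name.

```python
def regress_to_mean(observed, league_mean, innings, half_weight_innings=1000.0):
    """Shrink an observed BABIP toward the league mean.

    half_weight_innings is the sample size at which the observed value
    and the league mean get equal weight (assumed to be 1000 innings).
    """
    if innings < 0:
        raise ValueError("innings must be non-negative")
    # Share of the estimate that comes from the league mean.
    weight_on_mean = half_weight_innings / (innings + half_weight_innings)
    return observed + weight_on_mean * (league_mean - observed)


# 5000 innings means regressing 1000 / 6000 = 1/6 of the way to .300.
maddux_estimate = regress_to_mean(0.281, 0.300, 5000)  # about .284
```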

by Jono411 on Oct 9, 2011 4:09 PM EDT

from reading the book blog a lot

if you just search for BABIP there you’ll probably find a bunch of posts talking about it.

http://www.insidethebook.com/ee/

by Jono411 on Oct 9, 2011 6:47 PM EDT

Also

If you can track it down in the Fangraphs archives, once upon a time Pizza Cutter went through just about every stat available at the time and figured out the sample sizes required for .5.

If you have to use the sarcasm font, you're doing it wrong.

by Gerse on Oct 9, 2011 10:10 PM EDT

And here it (sort of) is

http://www.fangraphs.com/blogs/index.php/when-samples-become-reliable/

It seems the site housing Pizza Cutter’s research no longer exists, but that FanGraphs article has the approximate numbers for a whole bunch of stats.

by Gerse on Oct 9, 2011 10:18 PM EDT
