## FIP and xFIP: a primer

Alrighty everybody, time for Chapter 2 in the SABR primer series. You can catch the first post (which is pretty much required reading for this post) HERE. In this post we're going to talk about two of the most popular DIPS (Defense Independent Pitching Statistics), FIP and xFIP. Again, much of this information is from the FanGraphs SABR Glossary.

FIP stands for Fielding Independent Pitching. For the nerdiest of us, the formula for calculating FIP is

((13*HR)+(3*(BB+HBP-IBB))-(2*K))/IP +c

where c is a scaling constant that puts FIP on a similar scale to ERA. It also means it's rather difficult to calculate a player's FIP by hand, since one generally doesn't know offhand what c is for a given season.
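For anyone who wants to plug in numbers, here is a minimal Python sketch of the formula above. The function names, the sample stat line, and the value of c are all made up for illustration. The helper for c assumes the constant is chosen so that league-wide FIP equals league-wide ERA, which is how it is usually described, but check the glossary for the exact definition in use.

```python
def fip(hr, bb, hbp, ibb, k, ip, c):
    """FIP per the formula in this post.

    ip must be true decimal innings (200 and 1/3 innings is 200.333...,
    not the box-score shorthand 200.1). c is the league scaling constant.
    """
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + c


def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_ibb, lg_k, lg_ip):
    """Assumed derivation of c: pick it so league FIP lands on league ERA."""
    raw = (13 * lg_hr + 3 * (lg_bb + lg_hbp - lg_ibb) - 2 * lg_k) / lg_ip
    return lg_era - raw


# Hypothetical stat line and a placeholder constant, not real data.
example = fip(hr=20, bb=50, hbp=5, ibb=2, k=200, ip=200.0, c=3.10)
print(round(example, 3))
```

The only fiddly part in practice is the innings: convert ".1" and ".2" to thirds before dividing.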

FIP is useful because it, like other DIPS, eliminates several large sources of statistical noise in a pitcher's runs allowed: the fielders' ability to turn batted balls into outs (or not), the official scorer's judgment about what is a hit and what is an error, and the effects of sequencing*. Ultimately, that significantly reduces the effects of luck on a pitcher's stats. FIP is affected only by a pitcher's ability to keep the ball in the park, to get strikeouts, and to limit walks and HBPs.

*Sequencing is an effect that can negatively or positively affect a pitcher's (E)RA but has almost nothing to do with a pitcher's true talent. A pitcher who gives up a single, a walk, and a home run, in that order, and then gets three outs, gives up 3 (earned) runs. A pitcher who gives up a home run, then gets two outs, gives up a single, then a walk, and gets another out probably only gives up one run. In the long run, it doesn't really matter what order the events happened in. What matters for predictive DIPS analysis is that this pitcher gave up a home run and a walk, and his DIPS will probably be adjusted negatively as a result. Sequencing can obviously play a huge role in small samples.
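To make the sequencing example concrete, here is a toy Python sketch that scores the two innings described above. The baserunning model is a deliberately crude assumption of mine, not how real innings work: a single moves every runner up exactly one base, a walk only advances forced runners, and outs never advance anyone.

```python
def runs_in_inning(events):
    """Count runs for one inning under a crude, assumed baserunning model."""
    bases = [False, False, False]  # first, second, third
    outs = 0
    runs = 0
    for event in events:
        if event == "out":
            outs += 1
            if outs == 3:
                break  # inning over; later events never happen
        elif event == "HR":
            runs += 1 + sum(bases)  # batter plus everyone on base
            bases = [False, False, False]
        elif event == "1B":
            runs += int(bases[2])  # runner on third scores
            bases = [True, bases[0], bases[1]]  # everyone moves up one
        elif event == "BB":
            if all(bases):
                runs += 1  # bases loaded: walk forces in a run
            elif bases[0] and bases[1]:
                bases[2] = True  # runner on second forced to third
            elif bases[0]:
                bases[1] = True  # runner on first forced to second
            bases[0] = True  # batter takes first
        else:
            raise ValueError(f"unknown event: {event}")
    return runs


same_events_a = ["1B", "BB", "HR", "out", "out", "out"]
same_events_b = ["HR", "out", "out", "1B", "BB", "out"]
print(runs_in_inning(same_events_a), runs_in_inning(same_events_b))  # 3 1
```

Same six events, same pitcher, two very different lines in the box score. FIP sees one home run and one walk in both cases.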

Basically, FIP is helpful because it much more reliably predicts a pitcher's future performance than ERA does. Of course, a pitcher who has, say, 7 seasons with a stellar ERA is likely to put up more good campaigns, but FIP has a higher year-to-year correlation coefficient than ERA and thus has better predictive value. As a result, FIP is good for telling you how good a pitcher would be on an average defensive team.

Since we know that pitchers have almost no control over their BABIP, they have to do as much as they can to get strikeouts (which are outs over 99% of the time) and to limit walks and HBP (which can't ever be turned into outs) and homers (which fielders can't turn into outs either; a ball caught at the wall is an out, not a homer). These BABIP fluctuations are a large reason why ERA is not highly correlated from year to year.

A downside of FIP is - of course - that it has nothing to do, really, with how many runs a pitcher gave up. Of course, it's a pitcher's job to give up as few runs as possible. However, we know that a lot of what happens to a pitcher is out of his control, so again, we use FIP for predictive value. That said, FIP isn't very useful for measuring a player's performance in small samples, or for describing how well someone pitched previously.

The other very popular DIPS is xFIP. As is often the case in statistics, the small x stands for "expected", so xFIP is Expected Fielding Independent Pitching. xFIP is identical to FIP except for one small difference - HR:FB ratios are normalized to league average.
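In code, the only change from FIP is swapping actual home runs for an expected count based on fly balls. This is a minimal sketch following the FIP formula from earlier in this post. The function name, the sample stat line, the value of c, and the 10% league HR:FB rate are placeholders I made up for illustration, not real league figures.

```python
def xfip(fb, bb, hbp, ibb, k, ip, c, lg_hr_per_fb):
    """xFIP: FIP with home runs replaced by fly balls times league HR:FB.

    fb is fly balls allowed; lg_hr_per_fb is the league-average HR:FB
    rate as a fraction (0.10 means 10%). ip is true decimal innings.
    """
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + c


# Hypothetical pitcher: 180 fly balls, assumed 10% league HR:FB rate.
example = xfip(fb=180, bb=50, hbp=5, ibb=2, k=200, ip=200.0,
               c=3.10, lg_hr_per_fb=0.10)
print(round(example, 3))
```

A pitcher whose actual homers exceed fb * lg_hr_per_fb will have an xFIP below his FIP, and vice versa.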

Analysis has shown that pitchers have very little control over how many fly balls leave the park. And since pitchers don't usually give up all that many fly balls in a season (in the grand scheme of things), random variation - and park factors - can play a huge role here too.

Normalizing the HR:FB ratio has a few effects. First, it largely eliminates park factors from the equation in a roundabout way; instead of applying park factors to a pitcher (which is what happens for + stats like OPS+), it just assumes a standard share of fly balls leave the park, which accounts for park effects. Second, by adjusting for the HR:FB ratio it achieves a very high correlation coefficient with a player's future performance. xFIP has one of the best (maybe the best) correlations with future performance of all DIPS.

There are, of course, a couple of issues with xFIP as well. The first is that a few pitchers actually do have some ability to affect their HR:FB ratio. Examples are CC Sabathia, who hasn't ever posted an HR:FB ratio higher than league average, and Matt Cain (along with other Giants pitchers), who has shown the same ability, though he's also aided by AT&T Park. That said, even these pitchers are only a couple of percentage points off league average, so it's almost certainly not fair to say that a guy who gives up HRs on 5% of his FBs has just "figured it out." Another issue is that groundballers like Ricky Romero tend to have higher HR:FB ratios than flyballers (you may recall that groundballers have lower BABIPs on ground balls than do flyballers, though). Finally, xFIP is scaled a little higher than FIP, so a 4.1 xFIP is a little bit better than a 4.1 FIP.

To conclude: ERA has its uses. However, there is far too much statistical noise in it for it to have good predictive value. DIPS are better, and FIP and xFIP are two very useful ones. You can find them on FanGraphs, on a player page.

Further reading, for the super nerdy: Tom Tango Deconstructing FIP

