I Went to the Market to Realize My Soul 'Cause, What I Need, I Just Don't Have: Anatomy of an Empty Batting Average

Just a fun look at how well batting average describes overall offensive production

What does it mean when someone says that a player has an empty batting average? Many of us have used the phrase, and even if you've never heard it before, there's a good chance you can figure it out. Basically, even though batting average was long seen as the most important indicator of offensive ability, it doesn't tell the whole story. Good hitters sometimes have bad batting averages and, occasionally, a weak hitter will post a good one, because batting average counts every hit the same (a single is worth as much as a home run) and ignores walks entirely. So singles hitters are likely to show off empty batting averages, while "Three True Outcomes Hitters" are likely to provide a lot more offense than their batting averages would suggest.

I was interested in quantifying the idea of an empty batting average. My first question was simple and has been answered many times before but I figured I'd look at 2014 data up to now. How well does batting average describe offensive output? I looked at all qualified players per FanGraphs (data through 17 Aug 2014) and constructed a linear model for weighted runs created (wRC) per plate appearance based on batting average. Turns out that around 45% of variance in wRC per plate appearance can be ascribed to batting average (see figure below).
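For anyone who wants to try this at home, here's a minimal sketch of that kind of fit in plain Python. It is not the code behind the numbers above: the function names and the three data points are made up purely to show the mechanics, which are ordinary least squares with a single predictor followed by R-squared (the share of variance explained).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Placeholder values, NOT real player data: batting average and wRC per PA.
avg = [0.250, 0.280, 0.310]
wrc_per_pa = [0.105, 0.125, 0.130]

slope, intercept = fit_line(avg, wrc_per_pa)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, "
      f"R^2={r_squared(avg, wrc_per_pa, slope, intercept):.2f}")
```

With a real leaderboard export you would swap the two placeholder lists for the batting average and wRC/PA columns of every qualified hitter.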


So there's a clear trend where higher average is correlated with better hitting but, as expected, there's a lot of spread (i.e., batting average accounts for less than half of the overall variance in wRC per plate appearance). Based on this model, I decided to identify which players were the strongest outliers (i.e., which players had the emptiest or the "fullest" batting averages) by finding the difference between their actual wRC and the wRC expected based on their batting averages.
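The arithmetic behind that difference is simple enough to sketch. The snippet below assumes you already have a slope and intercept from a line relating batting average to wRC per plate appearance; the values used here are placeholders, not the fitted ones, and the function name and the "Hitter A/B" rows are hypothetical. The gap is scaled to 650 plate appearances to put it in per-season terms.

```python
def runs_vs_expected(avg, wrc, pa, slope, intercept, season_pa=650):
    """Actual minus expected wRC, scaled to a season_pa-sized season.

    Negative means an "empty" average; positive means a "full" one.
    """
    expected_per_pa = intercept + slope * avg
    actual_per_pa = wrc / pa
    return (actual_per_pa - expected_per_pa) * season_pa


# Placeholder line, NOT the fitted model from the article.
SLOPE, INTERCEPT = 0.5, 0.0

# Hypothetical hitters: (name, batting average, wRC, plate appearances).
hitters = [
    ("Hitter A", 0.300, 60, 500),  # high average, little else
    ("Hitter B", 0.250, 90, 600),  # modest average, lots of walks and power
]

# Rank from emptiest to fullest batting average.
ranked = sorted(
    hitters,
    key=lambda h: runs_vs_expected(h[1], h[2], h[3], SLOPE, INTERCEPT),
)
for name, ba, wrc, pa in ranked:
    diff = runs_vs_expected(ba, wrc, pa, SLOPE, INTERCEPT)
    print(f"{name}: {diff:+.1f} runs per 650 PA")
```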

The results shouldn't surprise anyone: players like Ben Revere (2.2% bb-rate, 0.057 ISO) and Jean Segura (3.6% bb-rate, 0.084 ISO) are among the players with the emptiest averages, underproducing their batting averages by 30 runs or so per 650 plate appearances (roughly a full season). Players like Carlos Santana (17.4% bb-rate, 0.203 ISO) and Mike Trout (11.9% bb-rate, 0.271 ISO) have the fullest batting averages, overproducing their averages by around 30 runs per season.

As far as the Blue Jays are concerned, their hitters run the gamut from Edwin Encarnacion (+35 runs per season, though he did not qualify for the initial sample) and Jose Bautista (+23) to Ryan Goins (-26) and Munenori Kawasaki (-21). So the next time someone tells you that Ryan Goins is an acceptable everyday second baseman at this point in his career, you can tell them his bat isn't nearly as good as that .197 batting average would indicate.