
Can You Improve this Place with the Data that You Gather?: Why I Like Sabrmetrics

"Hey, man of science, with your perfect rules of measure, can you improve this place with the data that you gather?"
-- Bad Religion, "I Want to Conquer the World"

Many of us around here like to joke that objective analysis through sabrmetrics reduces baseball to binary code.  Why would we watch games when we're likely to glean just as much pleasure (and, more importantly, knowledge!) from simply glancing at a series of strings of ones and zeroes?  Like all great jokes, it's funny 'cause it's true.  Or, if we're being 100% honest with ourselves and each other, because there's at least a kernel of truth to it.  As a general rule, of course, given the option to spend three hours watching a ballgame (four hours, if the Yankees or Red Sox are playing!) or spend that time poring over boxscores and statistics, I choose watching firsthand over stats pretty much every time.  Still, note the qualifiers.

In any event, first and foremost, I'd like to stress that, although some might have you believe otherwise, there's nothing normative about sabrmetrics.  Whether one defines oneself as sabrmetrically inclined or not is neither intrinsically good nor inherently damning.  In fact, whether you know Yunel Escobar's aggregate defensive rating or you think a relief pitcher's performance should be judged on how many saves he has, in a large-scale reductionistic sense, you're using an objectively determined statistic to aid you in your characterization, which, like it or not, makes you a sabrmetrician.  Now, naturally, most of us take a bit more of a holistic viewpoint on this, defining "sabr stats" as the "meaningful" ones (basically, the ones that analyses have shown are best correlated with what the statistic is purported to characterize).  As a baseball fan, my personal goal is to find simple solutions to complex questions.  And, make no mistake, most questions that plague baseball fans are complex.  The criteria that define the best player in baseball, for instance, can be interpreted many ways.  Just three things we might want to consider when we make our decision include: 1) whether the player must play only one position capably or whether he receives bonus points (and how many?) for varying degrees of versatility; 2) how we judge defensive skills; and, though it is infrequently discussed, 3) whether we're filling our team with 25 doppelgangers of that player, because, if so, there must be a premium on players who can both hit and pitch (or who at least have strong arms).

So this seemingly simple question now has answers that could range from Jose Bautista to Micah Owings (no kidding!).  Now, Owings clearly doesn't seem to pass the smell test for being considered the best player in baseball.  There's a good reason for this: while he provides a degree of versatility that few players could even hope to match (Shaun Marcum's grand slam notwithstanding), Major League Baseball rosters have 25 slots, so there is a premium on specialization at the elite level.  Owings does not provide the kind of specialization that would make him the best player at that level, but he could very well be the most useful MLB ringer on a college team – he'd hit the cover off the ball and be an exceptionally good pitcher.  The reason Micah Owings does not come to most of our minds first (or at all) when we consider the best players in baseball is that we are not interested in who would be the best ringer on a college team; we're interested in who would be the best player on an MLB team.  In a sense, the answer to every question depends on what specific criteria we use to answer that question, so, before we can find our answers, we have to define our questions.

And that's the rub.  Defining our questions forces us to simplify them in scope but increase them in number, complicating matters by requiring us to find lots of answers.  As humans, we are predisposed to biases when we attempt to simplify those questions.  These biases can distort our initial questions (and, in turn, distort our ultimate answers).  Objective analyses (essentially, the basis of sabrmetrics) attempt to remove (or, at least, account for) those biases.  At the same time, objective analyses are only as reliable as the data on which they are based and the ways that the analyses are designed and interpreted.  In many cases, those data are far from perfect and those studies are poorly designed and incorrectly interpreted.  This can be truly frustrating when "bad data" and poorly conceived research do not merely compromise our capacity to reach meaningful conclusions, but actually cause us to draw incorrect ones.

However, as a group, we can identify where we've made errors and correct them, synergistically answering questions we could never answer alone.  In my opinion, what's truly fascinating about sabrmetrics is not simply minimizing distortion in condensing baseball to statistics, it's minimizing distortion in amplifying statistics to baseball.  Next time it seems like sabrmetricians are trying to reduce the game to binary code, remember that the real endgame is turning those ones and zeroes you see today back into frozen ropes and dying quails tomorrow.