Fun with Arbitrary Statistical Manipulation

So after the Jays loss last night, I tuned into the CoCo Cordero Show on Rogers Media Powered by Fans Sportsnet 590 the Fan. After a couple of (how to say it politely) less than brilliant calls, the last caller (starting at the 19:45 mark) called Mike Wilner out for a particular way he defended Cordero last week. Wilner had argued that since the start of May, Cordero had a 1.74 ERA (as of that point; entering last night's game it was apparently 3.00), and that a couple of outings badly skew his results. The caller rightly pointed out that everything counts, to which Wilner said:

If you're looking at it - reasonably, rationally, and critically - you can look and see that yes, this is what the overall stats are, and yes, they're very badly skewed by a couple of poor outings, the majority of the outings have been fine. If you want to say that's the overall stats and that's the only thing that matters, it's a 9.60 ERA. If you take out 3 of the 14 or whatever and the ERA goes down to 2.20, oooh.

What this essentially boils down to is cherry-picking numbers to fit an argument: take out the worst outings, and he looks good! And it's done a lot, though not usually so overtly (usually in the form of arbitrary endpoints, which at least still capture everything in between, and that inevitably includes both good and bad).

This is a particularly bad way to analyse baseball results, since baseball is not a symmetric game. Even the worst pitchers get outs more often than they don't and have scoreless innings more often than they don't, and even the best hitters make outs more often than they get on (though Ted Williams comes pretty close, with a career .482 OBP). The reality is, if you cut a tail off of any distribution, the average is going to move significantly, but this is especially the case for baseball outcomes.
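To see how lopsided this is, here is a minimal Python sketch. The outings are invented for illustration (they are not Cordero's actual line), and the helper names `era`, `drop_worst` and `drop_best` are mine. It compares what happens to a reliever's ERA when you throw out his three worst outings versus his three best.

```python
def era(outings):
    """ERA = 9 * earned runs / innings, over (innings, earned_runs) pairs."""
    innings = sum(ip for ip, _ in outings)
    runs = sum(er for _, er in outings)
    return 9 * runs / innings


def drop_worst(outings, k):
    """Remove the k outings with the most earned runs."""
    ranked = sorted(outings, key=lambda o: o[1])
    return ranked[:len(ranked) - k]


def drop_best(outings, k):
    """Remove the k outings with the fewest earned runs."""
    ranked = sorted(outings, key=lambda o: o[1])
    return ranked[k:]


# Hypothetical reliever: 14 one-inning outings, mostly clean, three blow-ups.
outings = [(1, 0)] * 9 + [(1, 1)] * 2 + [(1, 3), (1, 4), (1, 4)]

full_era = era(outings)
without_worst = era(drop_worst(outings, 3))
without_best = era(drop_best(outings, 3))

print(f"all 14 outings:    {full_era:.2f}")
print(f"minus the 3 worst: {without_worst:.2f}")
print(f"minus the 3 best:  {without_best:.2f}")
```

With these made-up numbers, dropping the three worst outings cuts the ERA by more than six and a half runs, while dropping the three best only raises it by a bit over two. Most outings are clean, so nearly all the damage lives in one tail.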

So, in the spirit of thinking "reasonably, rationally, and critically", my goal is to present the true talent of some Blue Jays fan favourites. I will limit myself to excluding 10% of a sample (that would amount to 4 of Cordero's 39 appearances, so it seems to be in the same spirit). Behold:

  • Bautista is hitting .243/.356/.541 with 27 HR and a .381 wOBA in 83 games. Sounds great, right? But if you get rid of his 8 best games (by PA-weighted wOBA), he loses 10 HR, and his line falls by 45 points of BA, 36 points of OBP and 138 points of SLG. His wOBA of .322 would be slightly above average. So, really, Bautista is a league average hitter.
  • Let's go back a little further on Bautista. He's played in 393 games since the beginning of 2010. But that's skewed by a few good games, so let's take out the top 39. His wOBA falls from .420 (3rd best in baseball) to .349. Which is good, but we're talking Rickie Weeks/Dan Uggla good. Also, those world-leading 124 HR? Yeah, lose 52 of them, leaving 72, which, again, is fine, but not nearly as exciting. So if we're thinking "critically" about it, Bautista is a good hitter who just has a few great games every week or so. In fact, if we think "rationally" about it, he was lucky to have finished 4th and 3rd in the MVP voting the last couple of years.
  • If we get rid of the 7 games which skew Arencibia's season, his wOBA falls from .297 to a putrid .224 with only 6 HR. So, really, he's been irredeemably awful. Though, on the flip side, I guess we can also take out all those passed balls since they skew his ball handling results disproportionately, so he's a lot better defensively.
  • Last year, the much maligned Jo-Jo Reyes pitched 110 innings. This took a while, but I found his worst 11 innings and removed them. That got rid of 32 ER of the 66 he gave up. That leaves an ERA of 3.10! Why did we dump this guy? Roy who?
  • Ryota Igarashi threw 52 pitches for the Jays before being DFA'd. Sure they weren't great, but if you get rid of the 5 worst pitches that skew his results since they went for hits, he was alright!
  • Last year, EE had 74 chances at 3B. If we get rid of the 7 worst, he only had one error. So, if we're being truly "reasonable", EE was really a Gold Glove at 3B. Also, most of the time, fans up the 1B line had nothing to worry about in terms of errant throws!
  • Conversely, sure, Lawrie makes a few nice plays at 3B, but other than that and getting credit for shifts, he's nothing special.
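For anyone who wants to check the Jo-Jo Reyes math, here is a rough Python sketch using only the totals quoted above (110 innings, 66 earned runs, with 11 innings and 32 earned runs removed). The function name `era_from_totals` is mine, and I'm treating the innings as whole numbers, which is an assumption.

```python
def era_from_totals(earned_runs, innings):
    """ERA = 9 * earned runs / innings pitched."""
    return 9 * earned_runs / innings


full_season = era_from_totals(66, 110)
trimmed = era_from_totals(66 - 32, 110 - 11)

# How much of each total the "worst 10%" accounts for.
share_of_innings = 11 / 110
share_of_runs = 32 / 66

print(f"full season ERA:   {full_season:.2f}")
print(f"trimmed ERA:       {trimmed:.2f}")
print(f"innings removed:   {share_of_innings:.0%}")
print(f"earned runs gone:  {share_of_runs:.0%}")
```

On whole-inning totals the trimmed figure comes out a hair under the 3.10 quoted above, presumably down to rounding or partial innings. Either way, a tenth of the innings carried nearly half the earned runs, which is exactly why removing them is so flattering.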

Feel free to add any more below. The takeaway from this is that when somebody says "so-and-so was really good/bad if you take out such-and-such", ignore them. The good comes with the bad, and life is not symmetrical.

Editor's Note: This is a FanPost written by a reader and member of Bluebird Banter. It was not commissioned by the editors and is not necessarily reflective of the opinions of Bluebird Banter or SB Nation.