Welcome to the first of two parts of a sure-to-be-recurring series of posts on the StrikeTracker and the oodles of data that I can pull from my ever-ballooning spreadsheet! Not all of the data will be useful, but it should at least all be interesting. This week we'll be looking at the Jays pitchers, and next time we'll look at the opposing pitchers / Jays hitters.
First, a refresher on the Slightly Less Unscientific StrikeTracker, or SLUST. If you want the full explanation, click here for my original post on the matter. If you are rightly terrified of all that text or are easily bored by reading methodology, then just know that I use magic and plagiarism to determine the approximate likelihood that a pitch in any given location will be called a strike, and then sum the difference between expected called strikes and actual called strikes and break it down by team and by game.
While the game-by-game breakdown is interesting (and depressing! so very depressing!) in its own right, we can harness the powers of Excel to tell us so much more and to try to identify some of the factors contributing to the Jays' most unfortunate called strike differential. What follows is a lot of charts and numbers, but simply bear in mind that the methodology is the same as the regular StrikeTracker; the results are just broken down by factors other than game date. I've managed to finagle the following breakdowns out of the BrooksBaseball PITCHf/x outputs: by pitcher, pitch type, pitcher and pitch type, inning, handedness, and count.
A note about the charts that follow: Expected is the number of strikes expected by the StrikeTracker for a given breakdown, Gain/Loss is, unsurprisingly, the number of strikes gained (positive) or lost (negative), and # is the number of pitches to date that meet the criterion in question. The data below is accurate as of 6:00pm on Friday May 9, 2014.
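For anyone who would rather see the three columns as code than as spreadsheet formulas, here is a minimal sketch of how a breakdown could be computed. It is not my actual Excel workbook: the field names (`strike_prob`, `called_strike`) and the sample pitches are made up for illustration, and it assumes each called pitch already has a strike probability attached.

```python
from collections import defaultdict

def breakdown(pitches, key):
    """Group called pitches by `key` and return Expected, Gain/Loss and #."""
    totals = defaultdict(lambda: {"expected": 0.0, "actual": 0, "n": 0})
    for p in pitches:
        row = totals[p[key]]
        row["expected"] += p["strike_prob"]           # expected called strikes
        row["actual"] += 1 if p["called_strike"] else 0
        row["n"] += 1                                 # the "#" column
    result = {}
    for group, row in totals.items():
        gain_loss = row["actual"] - row["expected"]   # positive = strikes gained
        result[group] = {
            "expected": row["expected"],
            "gain_loss": gain_loss,
            "n": row["n"],
            "per_pitch": gain_loss / row["n"],
        }
    return result

# Hypothetical pitches, not real StrikeTracker data.
sample = [
    {"pitcher": "A", "pitch_type": "FF", "strike_prob": 0.9, "called_strike": True},
    {"pitcher": "A", "pitch_type": "SL", "strike_prob": 0.6, "called_strike": False},
    {"pitcher": "B", "pitch_type": "FF", "strike_prob": 0.2, "called_strike": False},
]
by_pitcher = breakdown(sample, "pitcher")
by_pitch_type = breakdown(sample, "pitch_type")
```

Swapping the `key` argument is all it takes to move from one breakdown to another, which is the same idea as re-pivoting the spreadsheet.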
By Pitcher
As we've been able to tell just from watching, R.A. Dickey and Brandon Morrow have been getting killed by the umpires, to the tune of more than 13 and 10 lost strikes respectively, with Morrow actually coming out worse on a per-pitch basis. Somewhat less expected is that Dustin McGowan and J.A. Happ have been similarly punished. While Happ's net strikes per pitch is by far the worst on the team, he has only made four appearances so far (and his one start accounts for 50% of his batters faced), so it is probable that his results are skewed by a particularly unhelpful umpire or two. At the other end of the scale, only Neil Wagner, Marcus Stroman (SSS) and Chad Jenkins (super SSS) have had positive strike zone luck, and even then, only marginally so.
By Pitch Type
A note on MLBAM pitch classifications: The algorithm does not know the difference between 2-seamers and sinkers, and between curveballs and knucklecurves, so those are combined in the chart. Also, it thinks that one of every ~45 Dickey knuckleballs is an eephus. Other than Dickey's knuckleball, which I'll deal with later on, nothing really stands out here as being particularly noteworthy relative to our prior knowledge that the Jays pitchers have lost ~51 strikes to date. The slider has been the most unlucky pitch for non-Dickey Jays hurlers so far, which jibes with the collective knowledge that umpires have a harder time calling bendy stuff than they do fastballs. This is further illustrated by the chart below which lumps the pitch types together by speed and break.
Fastballs have resulted in .00257 more strikes per pitch than have off-speed pitches, which, themselves, have resulted in .01182(!) more strikes per pitch than have knuckleballs. While these numbers don't seem particularly large - and they're not - they do add up over the course of a 150-pitch game.
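To put a rough number on "add up", here is the back-of-the-envelope multiplication. It simply scales the per-pitch gaps above by the 150-pitch figure, which overstates things a bit since it pretends every one of those pitches is of a single type and gets called by the umpire.

```python
PITCHES_PER_GAME = 150  # the round figure used in the text

# Per-pitch gaps quoted above.
fastball_vs_offspeed = 0.00257
offspeed_vs_knuckleball = 0.01182

# Strikes per game if all 150 pitches carried that per-pitch gap.
fastball_edge = fastball_vs_offspeed * PITCHES_PER_GAME
knuckleball_gap = offspeed_vs_knuckleball * PITCHES_PER_GAME

print(f"Fastball vs off-speed: {fastball_edge:.2f} strikes per game")
print(f"Off-speed vs knuckleball: {knuckleball_gap:.2f} strikes per game")
```

That works out to a bit under 0.4 strikes per game for the fastball edge and nearly 1.8 for the knuckleball gap.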
By Pitcher And Pitch Type
Now the good (read: sad) stuff. Below is every pitcher-pitch type combination that has resulted in a difference of -1.5 strikes or worse this season.
Umpires do not like R.A. Dickey. That, or they're really bad at following knuckleballs. Either way, that sounds like a pretty good argument for the use of our benevolent metal friends. Dickey has lost 15.3 strikes on 684 knuckleballs this season, which has cost him roughly 2.2 runs. If the umpires continue to squeeze him at that pace, he'll end up losing an entire win worth of strikes by the end of his usual 200+ innings!
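The Dickey arithmetic can be checked with a few lines. The strike, pitch and run totals come straight from the paragraph above; the 10 runs per win conversion is the usual rule of thumb and is my assumption here, not something the StrikeTracker produces.

```python
# Figures quoted in the text.
lost_strikes = 15.3
knuckleballs = 684
runs_lost = 2.2

# Assumption: the common rule of thumb of roughly 10 runs per win.
RUNS_PER_WIN = 10.0

per_pitch = -lost_strikes / knuckleballs      # net strikes per knuckleball
runs_per_strike = runs_lost / lost_strikes    # implied run value of one lost strike

# Knuckleballs needed, at the same pace, for the loss to reach one win.
pitches_for_one_win = knuckleballs * RUNS_PER_WIN / runs_lost

print(f"Net strikes per knuckleball: {per_pitch:.4f}")
print(f"Implied runs per lost strike: {runs_per_strike:.3f}")
print(f"Knuckleballs to lose one win: {pitches_for_one_win:.0f}")
```

In other words, about -.022 strikes per knuckleball, roughly .14 runs per lost strike, and a full win lost after a little more than 3,100 knuckleballs at the current pace.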
Another interesting note here is that of all combinations with a sample of at least 25, six of the seven worst per-pitch differentials have come on fastballs. This is quite odd as fastballs tend to be easiest to call correctly, and, indeed, the Jays as a team have a better per-pitch differential on fastballs than they do on off-speed pitches despite the individual laggards being mostly fastballs. The pitchers and pitches in question are listed below for your viewing displeasure.
On the other end of the scale, we have only six combinations that have accumulated more than 1.5 additional strikes this season. Mark Buehrle, control master, leads the way with 2.9 strikes gained on his two-seamer, and the Dickey "fast"ball comes in behind that at +2.4. On a per-pitch basis, Aaron Loup laps the field with a +.049 on his change-up.
By Inning
Not a whole lot to say here. I plan to integrate the run value information with MjwW's post on the inning-by-inning RS/RA numbers once I work out the same breakdowns for opposing pitchers. The 2nd and 4th innings have been the two unluckiest in both absolute and per-pitch terms so far; however, neither inning was particularly poor for the Jays' pitchers per MjwW's post.
By Pitcher, Batter, And Matchup Handedness
As you can see, Blue Jays pitchers have been equally unlucky whether righty or witch who must be burned at the stake, with per-pitch numbers that are identical to the 5th decimal place. Looking at batter handedness, however, we see that opposing lefty hitters have been nearly twice as lucky as their righty counterparts. Below that are the breakdowns by pitcher-batter matchup, which don't seem to tell us a whole lot of anything but are mildly interesting nonetheless.
By Count
As you can see, it appears as though the Jays have been getting absolutely slaughtered on 0-1 counts, to the tune of 22 lost strikes and -.034 strikes per pitch, while having more ordinary levels of bad luck on most other counts. However, work by Matthew Carruth shows us that the size of the 50-50 zone changes with the count, shrinking by as much as 29% on an 0-2 count and growing by as much as 11% on a 3-0 count. Accounting for these changes** by multiplying the number of expected strikes above by the size of the zone relative to the overall strike zone, we get the chart below:
Now, it becomes clear that the biggest issue for the Jays pitchers is not stingy umpires bringing the count to 1-1, but rather having first pitch strikes stolen away and being forced into 1-0 counts, in which the umpires have still been stingy with the Jays. With this method, Jays pitchers have had equal luck on 0-0 and 0-1 counts, with a per-pitch net of -.0178. Also noteworthy is that even after accounting for the smaller zone with a full count, the Jays have lost four should-be strikeouts to free passes. Every other count except for 1-0 and the 3-x (smallish samples) counts had per-pitch differentials below .001.
**While this method incorrectly assumes that the strike zone shrinks and grows uniformly in all directions, it should still give us a reasonably close approximation of the effects of the changing size of the zone.
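For the curious, the count adjustment amounts to something like the sketch below. Only the two zone factors quoted above (0-2 and 3-0, both "as much as" figures) are filled in; every other count falls back to a placeholder factor of 1.0 because I haven't listed the rest here, and the expected and actual totals in the example are invented.

```python
# Relative size of the zone by count. Only the two extremes cited in the
# text are included; other counts default to 1.0 purely as a placeholder.
ZONE_FACTOR = {
    "0-2": 1 - 0.29,  # zone shrinks by as much as 29%
    "3-0": 1 + 0.11,  # zone grows by as much as 11%
}

def adjusted_gain_loss(expected, actual, count, factors=ZONE_FACTOR):
    """Scale expected strikes by the count's zone factor, then net against actual."""
    factor = factors.get(count, 1.0)
    adjusted_expected = expected * factor
    return actual - adjusted_expected

# Hypothetical totals: 100 expected strikes on 0-2, 75 actually called.
raw = 75 - 100                                  # -25 before adjusting
adjusted = adjusted_gain_loss(100, 75, "0-2")   # about +4 after adjusting
print(f"Raw: {raw:+.1f}, adjusted: {adjusted:+.1f}")
```

The made-up example shows why the adjustment matters: a count that looks like a 25-strike loss against the overall zone can come out slightly positive once the smaller 0-2 zone is taken into account.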
It's important to remember that the StrikeTracker only considers pitches called balls or strikes, and excludes swings and misses. It is possible, if not likely, that, beyond the umpire impact on the count and pitch counts, the smaller zone the Jays pitchers have been getting forces them to throw pitches slightly closer to the middle of the zone, compounding the negative effects of the lost strikes by also increasing the chance that opposing hitters will make solid contact. This effect would not be seen in the StrikeTracker, so if someone brave would like to try to figure out the impact, I'm happy to send them my file.
As I said at the top of the article, without doing a good deal more work I'm not terribly sure what use this data has from a decision-making standpoint, but it definitely is interesting to look at and be sad about. If any of you in commentland have ideas to that effect, or have any ideas for other breakdowns I can include in future installments, shout 'em out.