Behold, the StrikeTracker.
For those of you who don't venture into the comments, I've been tracking the number of missed ball and strike calls for and against the Blue Jays this year.
To date, my methodology has been about as unscientific as possible: Go to BrooksBaseball.net, eyeball the strikezone map for each game, and count the number of missed calls based on the extended zone (the dashed line), which is Mike Fast's approximation of where an umpire will generally call a pitch a strike. There are a number of obvious problems with this method, the most glaring being that I've been forced to treat every location in the zone as if it results in a strike 100% of the time and every location outside the zone as if it produces a ball call 100% of the time. While a pitch in a given location should be either a ball or a strike 100% of the time, intuitively and empirically we know that's not how it actually works. There's definitely value in determining how many calls the umpires get wrong relative to the zone as it should be called, but I'm more interested in seeing how the Jays and their opponents have their pitches called relative to the average strikezone for the MLB as a whole.
The second major issue with the current naïve method comes in determining how to deal with pitches on the edge of the extended zone. So far the strategy has been to only count pitches that are at most 25% on the line, so as not to risk being inconsistent in my eyeballing. This obviously leads to two more issues: 1) I'm ignoring a whole slew of pitches which are not actually 50-50 propositions, but rather range from somewhere around 60-40 to 25-75, and 2) I introduce imprecision in trying to manually determine if a pitch marker is only, say, 23% on the line instead of 28%.
There must be a better way!
Behold, the new and improved Slightly Less Unscientific StrikeTracker!
Since it's impossible to find the cleaned-up data that has gone into the various studies on the size of the strike zone, I've resorted to manually plotting the maps by Matthew Carruth below in Excel in matrix form. While the maps are not as granular as I would like, they are the most detailed ones I could find that covered all score differentials, counts, and base-out states for both RHB and LHB.
The first step was to determine the total height and width of the zone presented, as, while the bins are presented as squares, the range of the axes is not equal. Since there is a little bit of territory in each direction beyond the major axis lines and the bins in Carruth's maps do not divide evenly into one foot, we first need to determine the number of pixels in 1 foot. Zooming in really close (SCIENCE!) indicates that each horizontal foot is 95 pixels and each vertical foot is 112 pixels. The portion of the image that is actually covered by the strike zone map is 390 pixels wide by 346 tall, which leaves 5 pixels in each direction beyond the major markers. Some simple mathemagic tells us that the total width of the zone in the image is 4.126316 feet, and the total height is 3.178571. Since the map is 30 bins wide by 30 bins tall, it follows that each bin is 0.137544 feet wide and 0.105952 feet tall. In the interest of being able to work with the data on a more granular level, I divided each square into 4 by halving the size of each bin both horizontally and vertically. This gives us 62 bins in each direction: (30 x 2) + 2 for the bins that represent everything beyond the upper and lower boundaries of the graph. From there it's just a matter of setting up the range for each bin (easily done with Excel magic) and manually inputting the data.
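For anyone who would rather check the bin arithmetic outside of Excel, here is a rough Python sketch. It takes the total width and height in feet quoted above as given, and the bin_index helper is purely illustrative: it assumes positions are measured in feet from the low edge of the map, which is a convention I have made up for the example rather than anything in the spreadsheet.

```python
# Totals quoted in the post (feet) and the 30 x 30 grid of the source maps.
TOTAL_WIDTH_FT = 4.126316
TOTAL_HEIGHT_FT = 3.178571
ORIGINAL_BINS = 30
SPLIT = 2  # each original bin is halved in each direction

bin_width_ft = TOTAL_WIDTH_FT / ORIGINAL_BINS
bin_height_ft = TOTAL_HEIGHT_FT / ORIGINAL_BINS
fine_width_ft = bin_width_ft / SPLIT
fine_height_ft = bin_height_ft / SPLIT

# 60 bins on the map plus one overflow bin beyond each boundary.
bins_per_axis = ORIGINAL_BINS * SPLIT + 2


def bin_index(offset_ft, fine_size_ft, inner_bins=ORIGINAL_BINS * SPLIT):
    """Map a distance from the map's low edge to a bin number.

    Hypothetical helper: 0 is the overflow bin below the map,
    1..inner_bins cover the map itself, and inner_bins + 1 is the
    overflow bin above it.
    """
    if offset_ft < 0:
        return 0
    index = int(offset_ft // fine_size_ft) + 1
    return min(index, inner_bins + 1)


print(round(bin_width_ft, 6), round(bin_height_ft, 6), bins_per_axis)
```

The same helper works for either axis: pass fine_width_ft for horizontal position and fine_height_ft for vertical position.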
Since there are 10 colours on the maps, I've inferred that each one represents a 10% jump in probability of a pitch being called a strike, and have treated the value of every colour except dark blue and deep red as being entirely in the midpoint of its range (i.e. 15%, 25%, 65%, etc.). For the dark blue, I cross-referenced with this chart, also by Matthew Carruth, which gives a more granular breakdown of the probabilities (but unfortunately only exists for RHB). I assigned a value of 0 to much of the darkest blue area, and scaled the edges of the non-zero areas up from 0 to 5. Similarly, for the dark red I determined the middle of the zone as it is called, assigned that bin a value of 99, and then scaled the surrounding bins down to 95.
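The midpoint rule is simple enough to write down as a small Python sketch. The numbering of the colours (0 for dark blue up to 9 for deep red) is my own assumption for illustration, and the function only covers the plain midpoint case: the two extreme colours were adjusted by hand as described above, so their midpoints of 5 and 95 are starting points rather than the values actually entered.

```python
N_COLOURS = 10
BAND_WIDTH = 100 / N_COLOURS  # each colour spans 10 percentage points


def band_midpoint(colour_rank):
    """Return the midpoint (in percent) of a colour's probability band.

    colour_rank is a hypothetical index: 0 = dark blue ... 9 = deep red.
    """
    if not 0 <= colour_rank < N_COLOURS:
        raise ValueError("colour_rank must be between 0 and 9")
    return colour_rank * BAND_WIDTH + BAND_WIDTH / 2


# Midpoints for the eight colours that were not hand-adjusted.
print([band_midpoint(rank) for rank in range(1, 9)])
```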
The last step in our marginally scientific endeavour is to deal with the transition from bin to bin. A pitch in the leftmost dark red bin does not have the same probability of being a strike as one in the middle dark red bin, and a ball in the light red immediately beside the outermost dark red is not exactly 10% less likely to be called a strike than is a ball in the neighbouring bin.
To smooth the transitions from bin to bin, I've taken an average of the 49 bins surrounding each bin in the original map on the left (which is to say, current bin +/- 3 rows and columns), which produces the image above on the right. That seems excessively large at first blush, but since I've quadrupled the number of bins, averaging 49 of them is the same as averaging 12.25 of the surrounding bins in the original maps (current bin +/- 1.25 rows/columns). I considered using 25 and 81 bins instead of 49, but using 49 matched up most accurately with the aforementioned granular chart.
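The smoothing step amounts to a 7 x 7 moving average over the grid. A minimal Python sketch of that idea follows. One thing the description above does not settle is what happens near the border of the grid, where fewer than 49 neighbours exist; the sketch simply averages whatever cells fall inside the window, which is an assumption on my part and may not match the spreadsheet.

```python
def smooth(grid, radius=3):
    """Replace each cell with the mean of the cells within `radius`
    rows and columns of it (a 7 x 7 window when radius is 3).

    Near the border the window is clipped to the grid, so fewer
    cells are averaged there (an assumed edge rule).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            window = [grid[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out[r][c] = sum(window) / len(window)
    return out


# Toy 7 x 7 grid (not real data): a single spike of 49 in the middle.
toy = [[0.0] * 7 for _ in range(7)]
toy[3][3] = 49.0
print(smooth(toy)[3][3])  # the centre window covers all 49 cells
```

Trying radius=2 or radius=4 gives the 25-bin and 81-bin alternatives mentioned above.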
With the matrix set up, the next step is importing the pitchf/x data from Brooks*, recording called balls as 0 and called strikes as 1, and having Excel look up the probability that a pitch in a given location, to a batter of a given handedness, is called a strike. The ball/strike call minus the probability of the strike call gives us the number of strikes gained or lost for each pitch. SUMIF those up by day, and you have the finished product as seen in the Google Doc. An aside for anyone who spends even a little bit of time using the excellent VLOOKUP and HLOOKUP functions in Excel: I very strongly recommend learning the INDEX-MATCH (for one dimension) and INDEX-MATCH-MATCH (for matrices) methods. The formulae are nominally more complicated to remember (though equally easy to learn), but are much more flexible and lightweight than the standard lookups.
*The Brooks data is unfortunately divided by pitcher and by game. If any of you know of somewhere that tracks the data by game or by team by game and has it available for download, please let me know below.
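For those who think better in code than in SUMIFs, the per-pitch calculation and the daily totals can be sketched in Python along these lines. The record layout (the "date", "call" and "p_strike" keys) is hypothetical, the probabilities in the example are made up for illustration, and the sketch says nothing about which team a given pitch should be credited to; it only shows the call-minus-probability step and the grouping by day.

```python
from collections import defaultdict


def strikes_gained_by_day(pitches):
    """Sum (call - strike probability) per day.

    Each pitch is a dict with hypothetical keys:
      "date"     - any hashable label for the game day
      "call"     - "strike" or "ball" (called pitches only)
      "p_strike" - looked-up probability of a strike call, 0 to 1
    """
    totals = defaultdict(float)
    for pitch in pitches:
        called = 1 if pitch["call"] == "strike" else 0
        totals[pitch["date"]] += called - pitch["p_strike"]
    return dict(totals)


# Made-up pitches, not real data.
example = [
    {"date": "day 1", "call": "strike", "p_strike": 0.25},  # +0.75
    {"date": "day 1", "call": "ball", "p_strike": 0.75},    # -0.75
    {"date": "day 2", "call": "ball", "p_strike": 0.5},     # -0.5
]
print(strikes_gained_by_day(example))
```

A called strike on a pitch that is rarely a strike counts as most of a strike gained, while a called ball on a pitch that is usually a strike counts as most of a strike lost.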
While I have not yet had time to consider the data with my analyst pants on, here are some easily digestible tidbits for those of you who don't feel like clicking on a link: To date, the SLUST has the Jays at -51.4 strikes, while the naïve model has them at -43 strikes. The most favourable game for the Jays netted them +2.46 strikes (Apr 15), while the least favourable cost them -9.25 (Apr 8), and the Jays have had as many positive differential games of any magnitude as they've had games of -6.5 or worse. The greatest discrepancies between the two methods have come on April 9 (SLUST 7.86 more favourable than naïve), April 12 (6.25 less favourable), and the awful game last night (5.40 less favourable), with only one other game coming in more than 3 net strikes away from the models agreeing.
I do recognize that there are literally dozens of ways in which I could be more rigorous about this project - more scientifically sound smoothing methods, controlling for score differential, inning, count, base-out state, pitch type, race of the pitcher, etc. However, I lack the data and the technical know-how to do so in a reasonable amount of time and effort for what is a personal/blog-interest project rather than an advancement of knowledge. In addition to it being too much work, using a different zone for each count, pitch type, and base-out state would implicitly accept the fact that umpires have different zones depending on the situation, when they should not, and would require my figuring out how to apportion credit and blame to each of the pitcher, catcher, and umpire. While those are all massively relevant considerations for a broader analysis, the focus of this project is simply to determine how well the umpires apply the average definition of the strikezone to both teams (we'd ideally use the prescribed strikezone, but that's clearly unreasonable since we don't yet have robots, so the average zone as it is called will have to do).
In future episodes of The Adventures Of The Slightly Less Unscientific StrikeTracker, I'll be adding breakdowns by Jays pitcher, pitcher handedness, batter handedness, opponent, whether a pitch is inside or outside the 50% line, the worst blown calls, and whatever else the commentariat comes up with that I'm able to extract and track from the pitchf/x data. In the meantime, give me your suggestions for making the model more Slightly Less Unscientific in the comments.
Last second edit: Turns out Jeff Sullivan is banging a similar drum this year over at FanGraphs.