In light of some recent discussions around these parts about pitcher hit and walk sequencing (the tendency of pitchers to give up more or fewer hits depending on whether runners are on base), I wanted to look at how repeatable sequencing is. The topic came up, as I'm sure you know, because of Brandon Morrow's seemingly very low strand-rate. It was first covered by our very own hugo and was more recently treated by Steve Slowinski over at FanGraphs, with hugo concluding that Morrow needs to mix in his offspeed pitches more against righties and Slowinski concluding that Morrow needs to develop a new pitch.
I tend to agree with hugo on this one more than Slowinski, though I do think that Morrow would benefit greatly from developing a new pitch (which he started to do with the cutter towards the end of last season anyway). As I've said before, I don't think that working in new pitches is necessarily going to help him strand runners. My guess is that Morrow's strand-rate last season had more to do with bad luck than anything else. Nonetheless, I think major league hitters will adjust to starters who don't mix up their offerings, so Brandon's days as an above-average starter who can afford to rely so heavily on just two pitches are numbered. In any event, this piece is not meant to treat the Brandon Morrow subject so much as it is to look at the effects of sequencing throughout baseball.
In order to look at the effects of sequencing, I first needed to quantify it. I decided to call a pitcher's sequencing the difference between his estimated strand-rate and his actual strand-rate. To estimate strand-rate, I used a population of all pitchers in 2011 who pitched 150 or more innings (n = 105). Given recent research, I used Analysis of Variance testing to verify the significance of the effects of BABIP (p < 0.01), k-rate (p < 0.01), and gb-rate (p = 0.07). Strand-rate is positively correlated (increases) with increasing k-rate and gb-rate and negatively correlated (decreases) with increasing BABIP. The effects of the interactions between these factors were not significant. Thus, I estimated strand-rate with a linear model fit to a pitcher's BABIP, strikeout-rate (as a percentage of at-bats, not per 9 innings), and groundball-rate (as a percentage of batted balls). The line was:
Estimated Strand Rate = 0.9632 - 1.034*BABIP + 0.334*k-rate + 0.046*gb-rate
Note how much stronger the relationship between strand-rate and BABIP is than the relationship between strand-rate and k-rate. Now note how incredibly weak the gb-rate effect is, likely in large part because it inflates BABIP.
Next, I expressed the pitcher's "negative sequencing" as the difference between his estimated strand-rate and his actual strand-rate (Estimated - Actual). Essentially, if a pitcher's "Negative Sequencing score" is positive, he had poor sequencing (stranded fewer runners than predicted); if his score is negative, he had good sequencing (stranded more runners than predicted); and if his score is zero, he had neither good nor bad sequencing. The score is basically an estimate of how many more baserunners scored last season than should have, expressed as a fraction of the total baserunners used to calculate that pitcher's strand-rate.
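For anyone who wants to play along at home, here is a minimal sketch of the two calculations above. The coefficients come straight from the fitted line, but the function names are my own and the first example pitcher is made up purely for illustration. The second example uses the estimated and actual strand-rates quoted for Morrow later in this post.

```python
# Coefficients copied from the fitted line in the post (2011, 150+ IP).
INTERCEPT = 0.9632
BABIP_COEF = -1.034
K_RATE_COEF = 0.334
GB_RATE_COEF = 0.046


def estimated_strand_rate(babip, k_rate, gb_rate):
    """Estimated strand-rate. All inputs are fractions (0.300, not 30)."""
    return (INTERCEPT
            + BABIP_COEF * babip
            + K_RATE_COEF * k_rate
            + GB_RATE_COEF * gb_rate)


def negative_sequencing(estimated, actual):
    """Estimated minus actual: positive means fewer runners stranded than predicted."""
    return estimated - actual


# Hypothetical pitcher, NOT real data: .300 BABIP, 20% strikeouts, 45% groundballs.
hypothetical_estimate = estimated_strand_rate(0.300, 0.20, 0.45)
print(round(hypothetical_estimate, 4))

# Morrow's rates as quoted later in the post: 74.5% estimated, 65.5% actual.
morrow_score = negative_sequencing(0.745, 0.655)
print(round(morrow_score, 2))
```

Keeping everything as fractions rather than percentages matches how the coefficients were fit, so the score comes out on the same scale as the numbers quoted for individual pitchers.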
As points of reference, Jair Jurrjens had the best sequencing in baseball last season (-0.08) and our friend Brandon Morrow had the worst (+0.09). Amazingly, Justin Verlander, who stranded 80% of his baserunners, did not rely on sequencing at all (0.00). Since the formula for pitcher strand-rate is quite cumbersome [(H+BB+HBP-R)/(H+BB+HBP-(1.4*HR))], converting this to actual runs scored is a bit of a bear, because we first need to calculate how many baserunners were put into the strand-rate equation in the first place. I have calculated it for both Jurrjens and Morrow.
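Written out as code, the strand-rate formula in the brackets above looks something like this. It is only a sketch: the function name and the stat line in the example are hypothetical, not any real pitcher's numbers.

```python
def strand_rate(h, bb, hbp, r, hr):
    """Strand-rate as given in the post: (H+BB+HBP-R) / (H+BB+HBP-1.4*HR)."""
    # The denominator is the pool of baserunners who could have been stranded.
    opportunities = h + bb + hbp - 1.4 * hr
    if opportunities <= 0:
        raise ValueError("no baserunners to strand")
    stranded = h + bb + hbp - r
    return stranded / opportunities


# Hypothetical stat line, NOT real data: 150 H, 60 BB, 5 HBP, 80 R, 20 HR.
example_rate = strand_rate(h=150, bb=60, hbp=5, r=80, hr=20)
print(round(example_rate, 3))
```

The denominator is the "total baserunners to strand" figure that the run conversion needs, which is why it gets its own variable here.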
According to my calculations, Morrow had about 214 total baserunners to strand and allowed about 76 to score. If he had stranded runners at the rate predicted by the equation above (74.5% of runners, instead of 65.5%), he would have stranded about 20 more runners, a difference of about one run per nine innings! Jurrjens had fewer opportunities to strand baserunners (170), so the difference between Jurrjens's predicted and actual strand totals is slightly smaller (14 runs) and comes to about 0.8 runs per nine innings.
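Here is a rough sketch of that conversion. The baserunner totals, rates, and sequencing scores are the ones quoted in this post. The innings totals are NOT given in the post; the 180 and 152 below are approximate values I am assuming so that the per-nine-innings step has something to divide by.

```python
def extra_runners(opportunities, sequencing_score):
    """Runners who scored beyond (or short of) the model's expectation."""
    return opportunities * sequencing_score


def per_nine(runs, innings):
    """Convert a season run total to a per-nine-innings rate."""
    return 9 * runs / innings


# Morrow: 214 opportunities, 74.5% estimated vs. 65.5% actual strand-rate.
morrow_runs = extra_runners(214, 0.745 - 0.655)
# Jurrjens: 170 opportunities, sequencing score of -0.08 (sign flipped to runs saved).
jurrjens_runs = extra_runners(170, 0.08)

# Innings are ASSUMED approximations, not taken from the post.
MORROW_INNINGS = 180
JURRJENS_INNINGS = 152

print(round(morrow_runs, 1), round(per_nine(morrow_runs, MORROW_INNINGS), 2))
print(round(jurrjens_runs, 1), round(per_nine(jurrjens_runs, JURRJENS_INNINGS), 2))
```

With those assumed innings, Morrow comes out a shade under one run per nine and Jurrjens at about 0.8, in line with the figures above.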
But does this mean that Morrow was unlucky and Jurrjens was lucky? Maybe, as Slowinski conjectured, there's something to Morrow's terrible sequencing after all. Using the same formula, I predicted all pitchers' strand-rates from 2010 (again, minimum of 150 IP). I first compared the pitchers' actual and predicted strand-rates in 2010 (p < 0.01, R^2 = 0.44) to verify that predicted and actual strand-rates were still linked. Next, I compared each pitcher's sequencing score in 2010 with his sequencing score in 2011 and found no evidence that they were linked (p = 0.95). Finally, I compared the actual and predicted strand-rates from 2010 with each pitcher's actual strand-rate in 2011. I found that a pitcher's 2010 predicted strand-rate had a significant effect on his actual 2011 strand-rate (p < 0.01, R^2 = 0.08), whereas a pitcher's actual 2010 strand-rate was not significant (p = 0.17, R^2 = 0.01). Scatterplots below!
Note how poorly both models predict a pitcher's strand-rate in 2011: the better model accounts for only 8% of the variance. This is because strand-rate is so variable.
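If you want to try this kind of year-to-year comparison yourself, the R^2 of a one-variable linear fit is just the squared correlation between the two columns. Below is a minimal sketch using only the standard library. The function name is mine and the numbers are made up for illustration; they are not the 2010 and 2011 data used in this post.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson correlation)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    return sxy * sxy / (sxx * syy)


# Made-up strand-rates for five imaginary pitchers, NOT real data.
predicted_2010 = [0.70, 0.72, 0.74, 0.76, 0.78]
actual_2011 = [0.75, 0.69, 0.73, 0.78, 0.71]

example_r2 = r_squared(predicted_2010, actual_2011)
print(round(example_r2, 3))
```

A value near zero means last year's number tells you almost nothing about this year's, which is the situation the sequencing scores are in.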
So does this mean that Brandon Morrow will be fine? I wouldn't say so, but it does provide some evidence (at least for me) that we're reading way too much into samples that are way too small to be predictive. Undoubtedly, some pitchers are better from the stretch than others, but it seems like a large part of a starter's sequencing comes down to blind luck, as well as to how well relief pitchers strand the runners they inherit. And I don't know about you but, like Bag End, that provides me some comfort.
Thanks to Frag for linking the FanGraphs piece, MjwW, benk, and pikachu for their comments, and J.R.R. Tolkien via the Legends' "There and Back Again" for today's post title.