I've Been There and Back to the Place that We Used to Call Home: Strand-Rate and Sequencing Revisited
In light of some recent discussions around these parts about pitcher hit and walk sequencing (the tendency of pitchers to give up more or fewer hits with runners on base than with the bases empty), I wanted to look at how repeatable sequencing is. The topic came up, as I'm sure you probably know, from Brandon Morrow's seemingly very low strand-rate. The topic was first covered by our very own hugo and was more recently treated by Steve Slowinski over at Fangraphs, with hugo concluding that Morrow needs to mix in his offspeed pitches more against righties and Slowinski concluding that Morrow needs to develop a new pitch.
I tend to agree with hugo on this one more than Slowinski, though I do think that Morrow would benefit greatly from developing a new pitch (which he started to do with the cutter towards the end of last season anyway). As I've said before, I don't think that working in new pitches is necessarily going to help him strand runners. My guess is that Morrow's strand-rate last season had more to do with bad luck than anything else. Nonetheless, I think major league hitters will adjust to starters who don't mix up their offerings, so Brandon's days as an above-average starter who can afford to rely so heavily on just two pitches are numbered. In any event, this piece is not meant to treat the Brandon Morrow subject so much as it is to look at the effects of sequencing throughout baseball.
In order to look at the effects of sequencing, I first needed to quantify it. I decided to call a pitcher's sequencing the difference between his estimated strand-rate and his actual strand-rate. To estimate strand-rate, I used a population of all pitchers in 2011 who pitched 150 or more innings (n = 105). Given recent research, I used Analysis of Variance testing to verify the significance of the effects of BABIP (p < 0.01), k-rate (p < 0.01), and gb-rate (p = 0.07). Strand-rate is positively correlated (increases) with increasing k-rate and gb-rate and negatively correlated (decreases) with increasing BABIP. The effects of the interactions between these factors were not significant. Thus, I estimated strand-rate with a linear model fit to a pitcher's BABIP, strikeout-rate (as a percentage of at-bats, not per 9 innings), and groundball-rate (as a percentage of batted balls). The line was:
Estimated Strand Rate = 0.9632 - 1.034*BABIP + 0.334*k-rate + 0.046*gb-rate
Note how much stronger the relationship between strand-rate and BABIP is than the relationship between strand-rate and k-rate. Now note how incredibly weak the gb-rate effect is, likely in large part because it inflates BABIP.
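To make the relative sizes of those coefficients concrete, here is a minimal Python sketch of the fitted line. The function name and the example pitcher are hypothetical, and it assumes every rate is entered as a fraction (0.300 rather than 30.0), which is the only scale on which the intercept of 0.9632 yields a plausible strand-rate.

```python
def estimated_strand_rate(babip, k_rate, gb_rate):
    """Fitted line from the post; inputs are assumed to be fractions, not percentages."""
    return 0.9632 - 1.034 * babip + 0.334 * k_rate + 0.046 * gb_rate


# Hypothetical pitcher, not a real stat line.
baseline = estimated_strand_rate(babip=0.300, k_rate=0.20, gb_rate=0.45)

# Move each input by the same 0.010 while holding the other two fixed.
babip_shift = estimated_strand_rate(0.310, 0.20, 0.45) - baseline
k_shift = estimated_strand_rate(0.300, 0.21, 0.45) - baseline
gb_shift = estimated_strand_rate(0.300, 0.20, 0.46) - baseline

print(f"baseline estimate: {baseline:.4f}")
print(f"+0.010 BABIP:   {babip_shift:+.5f}")
print(f"+0.010 k-rate:  {k_shift:+.5f}")
print(f"+0.010 gb-rate: {gb_shift:+.5f}")
```

An equal-sized bump in BABIP moves the estimate roughly three times as far as the same bump in k-rate, and in the opposite direction, while the gb-rate bump barely registers.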
Next, I expressed the pitcher's "negative sequencing" as the difference between his estimated strand-rate and his actual strand-rate (Estimated - Actual). Essentially, if a pitcher's "Negative Sequencing score" is positive, he had poor sequencing (stranded fewer runners than predicted); if his score is negative, he had good sequencing (stranded more runners than predicted); and if his score is zero, he had neither good nor bad sequencing. The score is basically an estimate of how many more baserunners (expressed as a percentage of that pitcher's total baserunners used to calculate his strand-rate) scored last season than should have.
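As a small sketch of that definition, the score and its sign convention can be written out in Python. The function names and the `tol` rounding tolerance are hypothetical choices for illustration; the two rates are the Morrow figures (74.5% predicted, 65.5% actual) quoted elsewhere in this post.

```python
def negative_sequencing(estimated, actual):
    """Estimated minus actual strand-rate, both as fractions."""
    return estimated - actual


def describe(score, tol=0.005):
    """Label a score; tol is an assumed rounding tolerance, not from the post."""
    if score > tol:
        return "poor sequencing"   # stranded fewer runners than predicted
    if score < -tol:
        return "good sequencing"   # stranded more runners than predicted
    return "neutral"


morrow_score = negative_sequencing(estimated=0.745, actual=0.655)
print(f"{morrow_score:+.2f} -> {describe(morrow_score)}")
```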
As points of reference, Jair Jurrjens had the best sequencing in baseball last season (-0.08) and our friend Brandon Morrow had the worst (+0.09). Amazingly, Justin Verlander, who stranded 80% of his baserunners, did not rely on sequencing at all (0.00). Since the formula for pitcher strand-rate is quite cumbersome [(H+BB+HBP-R)/(H+BB+HBP-(1.4*HR))], converting this to actual runs scored is a bit of a bear, because we need to first calculate how many baserunners were put into the strand-rate equation in the first place. I have calculated it for both Jurrjens and Morrow.
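For anyone who wants to repeat that calculation for another pitcher, here is a hedged Python sketch of the bracketed strand-rate formula. The function names are made up for illustration, and the stat line is hypothetical rather than any real pitcher's season.

```python
def strandable_baserunners(h, bb, hbp, hr):
    """Denominator of the strand-rate formula: H + BB + HBP - 1.4*HR."""
    return h + bb + hbp - 1.4 * hr


def strand_rate(h, bb, hbp, r, hr):
    """(H + BB + HBP - R) / (H + BB + HBP - 1.4*HR)."""
    return (h + bb + hbp - r) / strandable_baserunners(h, bb, hbp, hr)


# Hypothetical season line, for illustration only.
opportunities = strandable_baserunners(h=160, bb=60, hbp=5, hr=20)
rate = strand_rate(h=160, bb=60, hbp=5, r=80, hr=20)
print(f"{opportunities:.0f} baserunners to strand, strand-rate {rate:.3f}")
```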
According to my calculations, Morrow had about 214 total baserunners to strand and allowed about 76 to score. If he had stranded runners at the rate predicted by the equation above (74.5% of runners, instead of 65.5%), he would have stranded about 20 more runners, a difference of about one run per nine innings! Jurrjens had fewer opportunities to strand baserunners (170), so the difference between Jurrjens's predicted and actual strand-rates is slightly smaller (14 runs) and comes to about 0.8 runs per nine innings.
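The arithmetic behind those figures can be sketched in a few lines of Python. The baserunner totals and sequencing scores are the ones given in this post; the innings totals are not stated here, so the 180 and 152 below are assumed round workloads used only to show the per-nine conversion. Like the text, the sketch counts each extra runner who scores as one run.

```python
def extra_runners_scored(baserunners, sequencing_score):
    """Runners who scored beyond the model's prediction (negative means fewer)."""
    return baserunners * sequencing_score


def per_nine(runs, innings):
    """Convert a season run total to a per-nine-innings rate."""
    return 9.0 * runs / innings


morrow_extra = extra_runners_scored(214, 0.745 - 0.655)
jurrjens_extra = extra_runners_scored(170, -0.08)

# ASSUMPTION: innings totals are illustrative, not figures from the post.
morrow_per_nine = per_nine(morrow_extra, innings=180)
jurrjens_per_nine = per_nine(jurrjens_extra, innings=152)

print(f"Morrow:   {morrow_extra:+.1f} runs, {morrow_per_nine:+.2f} per nine")
print(f"Jurrjens: {jurrjens_extra:+.1f} runs, {jurrjens_per_nine:+.2f} per nine")
```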
But does this mean that Morrow was unlucky and Jurrjens was lucky? Maybe, as Slowinski conjectured, there's something to Morrow's terrible sequencing after all. Using the same formula to predict a pitcher's strand-rate, I predicted every qualifying pitcher's strand-rate for 2010 (again, minimum of 150 IP). I first compared the pitchers' actual and predicted strand-rates in 2010 (p < 0.01, R² = 0.44) to verify that predicted and actual strand-rates were still linked. Next, I compared a pitcher's sequencing score in 2010 with his sequencing score in 2011 and found that they were almost certainly not linked (p = 0.95). Finally, I compared the actual and predicted strand-rates from 2010 with a pitcher's actual strand-rate in 2011. I found that a pitcher's 2010 predicted strand-rate had a significant effect on his actual 2011 strand-rate (p < 0.01, R² = 0.08), whereas a pitcher's actual 2010 strand-rate was not significant (p = 0.17, R² = 0.01). Scatterplots below!
Note how poorly both models predict a pitcher's strand-rate in 2011: the better model accounts for only 8% of the variance, because strand-rate is so variable.
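For a one-predictor linear fit like these year-to-year comparisons, the "share of variance explained" is simply the squared Pearson correlation between the two columns. Here is a minimal Python sketch of that calculation; the function names are hypothetical and the five-pitcher sample is invented for illustration, not taken from the 2010 or 2011 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of variance in ys explained by a simple linear fit on xs."""
    return pearson_r(xs, ys) ** 2


# Made-up strand-rates for five imaginary pitchers.
predicted_2010 = [0.70, 0.72, 0.74, 0.76, 0.78]
actual_2011 = [0.73, 0.69, 0.75, 0.71, 0.77]

print(f"R^2 = {r_squared(predicted_2010, actual_2011):.2f}")
```

The same two functions work for the sequencing-score comparison: pass each pitcher's 2010 score and 2011 score as the two lists.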
So does this mean that Brandon Morrow will be fine? I wouldn't say so -- but it does provide some evidence (at least for me) that we're reading way too much into samples that are way too small to be predictive. Undoubtedly, some pitchers are better from the stretch than others, but it seems like a large part of a starter's sequencing is related to blind luck and how well relief pitchers strand the runners they inherit as well. And I don't know about you but, like Bag End, that provides me some comfort.
Thanks to Frag for linking the fangraphs piece, MjwW, benk, and pikachu for their comments, and J.R.R. Tolkien via the Legends' "There and Back Again" for today's post title.
Comments
Good piece
And I agree with you; I think people are reading too much into Morrow’s numbers with men on base. Given a larger sample size, I expect him to be much closer to league average or better (since high K%)
His 2011 wRC+ is 26
Excellent!
Very interesting! So basically, we can expect some deviation in strand rate due to skill (Ks and inducing GBs), quite a bit due to BABIP variation (which will largely be due to luck), and then there’s random variation on top of that.
Can you just clarify one thing:
I first compared the pitcher’s actual and predicted strand-rates in 2010 (p < 0.01, R² = 0.44) to verify that predicted and actual strand-rates were still linked.
I think you just mean you were testing the linear model to ensure it fitted the 2010 data?
Coming back to Morrow, his FIP was 1.08 higher than his ERA this year. If we take out the 20 runs you identify (neutralizing the random variation in LOB%), then they come basically in line. Interestingly, this means that in both 2010 and 2011, he significantly underperformed his peripherals, but for completely different reasons – BABIP in 2010 and strand rate in 2011. I was a little worried that maybe he was the guy who will never get the peripherals in line, but I am quite reassured. Though, I still believe the lack of a third pitch is a problem, not just going forward as you do, but I’m digging through the Pitchf/x data.
Also, in that vein, the first thing I was looking at in terms of the Pitchf/x data was whether there was a difference in the pitches themselves with Men on Base vs. Bases Empty (in other words, does Morrow throw poorer pitches out of the stretch?). I was going to hold off posting them until I finished more of the Pitchf/x analysis I wanted to do, but I think it would fit in nicely here because it largely dovetails with the conclusions here. I thought of one extra dimension I wanted to test, and when I’ve done that I’ll post in the comments here.
Thanks for doing this – I never paid much attention to strand rate, since I figured it was highly correlated to BABIP, and there wasn’t much more to it. And as it turns out, BABIP is an important determinant of strand rate, but there is still random variation beyond that, and with this information I can think about it, account for skill factors, and think about the strand rate results and implications.
Excellent work
Btw, I found this article from Hardball Times regarding pitch sequencing. Though it’s from 2009, I thought it may interest you if you haven’t already read it.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
It’s a shame Josh Kalk got snapped up by Tampa. Not only do we not get great analysis like this (or his player cards), but it also means he’s actively working to beat the Jays.
Don't worry
The Jays have Tom Tango. =P
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
Alright, so here's some Pitchf/x analysis of Morrow's pitches
One reason for a lower strand rate could be problems pitching from the stretch, i.e., Morrow just throws poorer pitches which are easier to hit. In hugo’s article linked to above, he alludes to this possibility:
My thought is that Brandon should be working on two things to get ready for next season: locating his pitches when pitching from the stretch…
This thought occurred to me as well, so I decided to see if there’s anything in the Pitchf/x data that would confirm or refute this.
I looked at 6 dimensions on which we can evaluate pitches: initial speed (essentially out of the hand, this is what stadium guns measure), ending speed, horizontal break and vertical break, and horizontal and vertical location (these values may not be very useful, since pitchers can vary location for strategy, but I wanted to see if there was a difference). For each pitch, I tabulated the Men on Base/Bases Empty split, and the batter handedness split separately to see if there was an effect. I calculated the averages, as well as the standard deviations to see if the distribution was different.




In all, I can’t really see much of a difference at all between the performance with men on and with bases empty. The most noteworthy observation I could find was that in 2011, Morrow’s slider velocity and vertical break seemed to have a higher level of variance than in 2010, but that was the same for men on and bases empty. Also, in 2010, there was a small difference in Morrow’s slider break along MOB/BE split, but given the relatively smaller samples in 2010 compared to 2011, I’d hesitate to make too much of it.
From this, I can see nothing to indicate that Morrow actually has problems pitching in the stretch, as has been suggested by some. This would seem to dovetail with the conclusion of the article that Morrow’s strand problems in 2011 were essentially random variation. Anyone see a pattern I’m missing?
by MjwW on Feb 4, 2012 5:32 PM EST reply actions 5 recs
Well done
I was wondering the same thing regarding Morrow pitching from the stretch differently from the wind-up. I just didn’t know where to find such numbers to examine this. Which site did you get those numbers from?
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
Data from Joe Lefkowitz's site
http://www.joelefkowitz.com/index.php. If you search for something using the Pitch F/x tool, when the page loads, there’s a button to download the Excel data relating to the search. I just pulled all 2010 and 2011 data and sorted in Excel, but you can download much more granular data
by MjwW on Feb 4, 2012 6:01 PM EST up reply actions 1 recs
My god
It’s beautiful! I hope they do the one for umpires soon. I would be very interested in that data.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
When I saw that exclamation
My mind immediately went to the "Battle Hymn of the Republic" (http://www.youtube.com/watch?v=xl_rljOB4AE) for whatever reason. And with a few simple modifications, it makes sense:
Mine eyes have seen the glory of the coming of the Datum:
It is trampling out the platitudes that Griffin spews verbatim;
It hath loosed the fateful lightning of the following ultimatum:
The statheads do march on!
(I would have used data rather than datum but the rhyming words don’t fit)
by MjwW on Feb 4, 2012 7:34 PM EST up reply actions 1 recs
Also
I’ve just learned that Brooks Baseball’s Pitchf/x tool (which is a collaboration with The Hardball Times) now has individual player cards, which are totally awesome.
They use their own classifying algorithm, so the results are a little different (for example, they say a lot of the split-fingered pitches that I figured were change-ups are actually fastballs), and you can’t sort for splits like MOB/BE, but you have all sorts of BIP data and what not. It’s really great.
Here’s Brandon Morrow’s player card
when you look at your huge charts
is it like that scene in The Matrix where Cypher says he doesn’t see numbers anymore, he sees pictures?
I agree
1) I was surprised to see that his velocity didn’t dip from the stretch.
2) I don’t see anything in these data that suggest Morrow shouldn’t be as effective with runners on.
3) One thing that may be useful is if you can summarize pitch f/x numbers for someone who didn’t seem to have any trouble with runners on (Romero, for example) and see if the actual norm for pitchers is to ‘reach back for something extra’ with runners on. My guess is that you wouldn’t find that but I don’t know for sure.
Thanks for bringing this up — quite enlightening! One other small note — I ran tests to see if pitchers with more diverse arsenals strand more runners (they don’t) and if pitchers with more better than average pitches strand more runners (they do). This might support the argument that adding a pitch would help Morrow, but I’m not sure it’s really possible to disentangle the effects of more above-average pitches from the effects of being a better pitcher.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Also --
the trouble with the aggregate pitch f/x numbers above is that they don’t quite give us the location information we need to determine whether or not Morrow is locating his pitches from the stretch - for all we know he could just be aiming for the centre of the plate and letting the ball go where it will. If that’s the case, the aggregate location likely wouldn’t change, and neither would the variance if he’s wild. Of course, since his k and bb-rates don’t seem affected, I’d think that we can probably rule out that possibility.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Yep
I don’t quite agree with what you’re saying about aiming for the middle and letting it fly, because the movement of the pitches is similar, but I see the potential for aiming for different places – “being more fine/careful”. I kept the location info for a couple of reasons:
1) The added time to include it was very marginal (or to add any other pitchf/x variable), so if there was something it would be a shame to not see it
2) There are enough data points that if there was something significant in aggregate, it would probably indicate something notwithstanding the cautionary points above. For example, if his fastball was 3 inches higher across hundreds of pitches, I’d think it was something about the delivery that was causing him to elevate the ball, rather than intentionally throwing higher. In any event, an observation like this would be worth looking into further, but we really don’t see any differences so it’s moot.
Aside from the changeup, RR's velocity did increase a bit with runners on vs. bases empty
FT:
Bases empty: 91.2
Men on: 91.9
RISP: 92.1
RISP w/ 2 outs: 92.4
FF:
BE: 91.8
Men on: 92.3
RISP: 92.5
RISP w/ 2 outs: 92.8
CH:
BE: 85.2
Men on: 84.5
RISP: 84.6
RISP w/ 2 out: 84.3
CU:
BE: 76.9
Men on: 77.3
RISP: 77.3
RISP w/ 2 out: 77.1
Now, whether the difference is significant or not I’m not sure. As well, I don’t know if this is more widespread.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
I don't think it's significant
You’d have to run some statistical tests to be sure (which I would, but I’d have to pull out some textbooks and I’m not inclined to do so right now), but Morrow’s also increased by a similar ~0.5 MPH, so I don’t think there’s much to make of it.
I'll run Romero's full numbers
Can’t do it right now, but it’ll only take about 20 minutes to do the same treatment for Romero since I’ll be able to copy over the formula structure used to sort the data, and have the summary structure and formatting already.
Based on the velocity difference Frag found, I doubt there will be anything particularly interesting, but it’s easy to check.
just to clarify
by this:
and if pitchers with more better than average pitches strand more runners (they do)
you meant that pitchers with more better than average pitches strand more runners than the overarching model would have predicted, right? because there’s a correlation between good-ness (high Ks, low BBs) and strand rate in general, and the more above average pitches you have, the better you are pretty much by definition
you're right, better pitchers strand more runners
and pitchers with more above-average pitches are going to be better pitchers, so the point may be moot, hence why it’s pretty weak evidence, really.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
ahh okay
I wasn’t sure if there was some sort of compounding effect, where you’d see pitchers with diverse arsenals stranding more runners than their peripherals would suggest
I re-ran the test
and weighted the number of above-average pitches against the total pitch value. It seems like the number of above-average pitches was just a proxy for a pitcher’s total pitch value, which is really in itself a proxy for babip. I’m not really all that much for the linear weights-based pitch type values, simply because they’re so heavily influenced by BABIP.
It seems like it would be much easier for them to do a defence-neutral version (though it could be problematic for HR/fly rates, since those could be quite correlated with different pitch types).
Anyway, I think any tentative evidence I mentioned in the comment above is essentially debunked.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
I'm curious what causes such a large std. dev. for his fastball: for example, 4.4 in. this year and 5 in. last year. For the 68% bubble between -1 and +1 std. dev., that seems like a lot.
by Seal Clubbing on Feb 4, 2012 10:35 PM EST up reply actions
Keep in mind, that’s horizontal break, measured from the middle of the plate. The plate is 17 inches wide, and pitchers will often use several inches on each corner, so the target area is at least 24 inches. Pitchers want to work the whole plate, and if anything, stay away from the middle of the plate. And remember the baseball is about 3 inches wide, so that’s really not much.
To be honest, of all the measures there, that’s the one I’d pay the least attention to, if any. And particularly the SD. I included it because it took almost no effort, and if there was a difference, I would be interested in knowing what it was and then maybe thinking about it.
Oh OK, I thought I interpreted it wrong, thanks.
by Seal Clubbing on Feb 5, 2012 3:08 PM EST up reply actions
Just out of curiosity, what is your background in math?
by Seal Clubbing on Feb 5, 2012 3:13 PM EST up reply actions
I really think...
As we saw late last year – the cutter is the missing piece. He now has a full quiver of sharp pointy pitches. That means more pain for batters – with or without men on base.
Nice article.
Interesting stuff. With this said, I think the Cartesian wave is fostering a tendency to mistake “difficult to quantify” for randomness, particularly in pitching.
He could be tipping pitches, his breaking ball could be breaking earlier, he may have more difficulty hitting a certain zone consistently, or any number of subtleties that could impact strand rate, any of which could affect outcomes and all of which would be difficult to capture in a quantitative analysis.
I don’t think the sabermetric community has a good grasp of the nature of pitching yet. Great piece nonetheless.
Most arguments are really about context.
Yes, you read this on a baseball blog:
I think the Cartesian wave is fostering a tendency to mistake "difficult to quantify" for randomness
Follow me @Minor_Leaguer
by Minor Leaguer on Feb 4, 2012 10:39 PM EST up reply actions
Honestly, I'm not sure exactly what a Cartesian wave is
But frankly, it’s precisely because I read stuff like that on this blog that I like it.
I know I shouldn’t be reading it but…. here’s Bleacher Report’s sabermetric projection for the 2012 Blue Jays.
Follow me @Minor_Leaguer
by Minor Leaguer on Feb 4, 2012 11:38 PM EST up reply actions
Lol
A sabermetric projection without the sabermetrics (to be fair, there was pythag W-L, but that was for 2011, not part of the projection). But what can you expect from an article with a spelling mistake in the first sentence?
You know what really ticks me off? I rewarded that drivel with 3 more page hits from their sidebar, so I could indulge in drivel like:
I believe an OPS over .740 is generally about average for a major leaguer. Only Escobar, Bautista, Lawrie, Edwin Encarnacion, Eric Thames and Kelly Johnson were over .740
Well, if I have 6 starters out of nine above average, I’m pretty happy (never mind positional adjustments in all this, that’s a bridge too far for them)
My guess is that it refers to a (somewhat-)widespread movement
towards rigorously challenging long-held beliefs
(see http://en.wikipedia.org/wiki/Cartesian_doubt )
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
You're absolutely right
But the likelihood that he’s tipping his pitches over an entire season and no one on the Jays staff realized it?
There are, as you point out, a slew of things that could be acting on Brandon. As you say, just because something is difficult to quantify does not mean that it’s random. On the other hand, just having numbers showing that he has trouble with men on base doesn’t necessarily mean that it’s nonrandom, either.
You may be right, I may tend to be on the “random” side of things too often. I’ll admit, I’m kind of fascinated by the idea that all these things that we spend time considering are just statistical noise inherent in the system.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
