I'm Waiting For My Man: What Factors into a Pitcher's Strand-Rate?
In an excellent piece this past week, hugo looked at some of the difficulties Brandon Morrow's been having this season. He noted that Morrow seems to have had a lot of trouble stranding runners, possibly as a result of yielding more flyballs with runners on, which may or may not be related to problems locating his pitches at the bottom of the strike zone when pitching from the stretch. The article got me thinking -- what factors actually control a pitcher's strand-rate?
First, very little is actually known about strand-rate. To be honest, not only did I not know the method used to calculate it, I didn't even know what it actually purports to measure. Does strand-rate even attempt to account for the effect of homeruns automatically scoring runners who are already on? While the more sabermetrically-inclined of y'all may not need this review, I certainly did. Per Hardball Times, the stat is calculated as: (H+BB+HBP-R)/(H+BB+HBP-(1.4*HR)). In words, this is the number of baserunners a pitcher strands (the number he lets on minus the number he allows to score) divided by the number of baserunners he could potentially strand (total number of baserunners minus the estimated number that score on homeruns). Note that the method does not do a perfect job of describing the probability that a pitcher has stranded any given baserunner, because it does not include hitters who have reached on errors and it uses a general formula to estimate how many of a pitcher's runs were scored on HR (essentially, it assumes each homerun scores 1.4 runners). While this is not perfect in small samples, thanks to good friend to every statistician and half-cousin to Charles Darwin, Sir Francis Galton (to be honest, I'd never actually heard of him), who helped popularize the central limit theorem, in large samples it's certainly good enough to allow us to make inferences.
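For readers who prefer code to algebra, the formula quoted above can be written as a small function. This is only a sketch: the function name, the argument names, and the season line in the example are invented for illustration, and the 1.4 factor is simply the constant from the quoted formula.

```python
def strand_rate(h, bb, hbp, r, hr, hr_runs=1.4):
    """Strand-rate per the formula quoted above.

    hr_runs is the constant the formula applies to each homerun (1.4).
    """
    baserunners = h + bb + hbp
    stranded = baserunners - r               # runners who did not score
    strandable = baserunners - hr_runs * hr  # runners who could have been stranded
    if strandable <= 0:
        raise ValueError("not enough baserunners to compute a strand-rate")
    return stranded / strandable


# Hypothetical season line (invented numbers, not a real pitcher).
example = strand_rate(h=180, bb=45, hbp=5, r=80, hr=21)
print(f"strand-rate: {example:.3f}")
```

With these made-up inputs the pitcher lets 230 men on, 80 score, and the result is 150 / 200.6, a little under 75%.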
To determine how closely strand-rate was correlated with (not affected by!) a number of factors, I performed separate simple linear regression analyses.
Predictions
Peripheral stats and fastball pitching:
I expected that K-rate affects strand-rate (positive correlation) more than any other single factor. I also expected that BB% would be weakly positively correlated with strand-rate. Upon reaching base on a walk, the hitter is at first base. Since non-HR hits can go for doubles and triples (and thus those baserunners are inherently more likely to score), it makes sense that a pitcher who has a higher proportion of his baserunners reach via the walk should have a better strand-rate. Additionally, walks are less likely to drive in runs than hits. It is also possible that pitchers who throw harder would have better strand-rates. As estimates of how hard pitchers throw, I used fastball velocity, fastball % (of pitches thrown), and weighted fastball value (per 100 fastballs) by linear weights.
BABIP, ERA, and ERA Estimators:
Strong negative correlations (as x increases, y decreases) should exist with BABIP (BABIP should drive strand-rate) and ERA (strand-rate should drive ERA). As BABIP is likely driven by LD-rate, I expected strand-rate should also be correlated with LD-rate. Since defence-independent (DIPS) ERA estimators/predictors take K% into account, I also predicted weaker, but still significant, negative correlations with DIPS metrics, such as FIP, xFIP, SIERA, and tRA.
Batted Ball Types:
Since linedrives are much more likely to become hits, I expected a negative correlation between linedrive-rate (LD-rate) and strand-rate. I did not expect to see a correlation with batted ball types besides linedrive-rate because, although groundballers are more likely to induce double plays and less likely to allow extra base hits, they are also more likely to allow singles. Over extremely large samples, there must be an effect of pitcher batted-ball splits, but -- outside of LD-rate -- I expected that effect to be very small and obscured in one-year samples.
Results
For those of you unfamiliar with what p- and R-squared values mean, here is an extremely quick and relatively painless explanation. The p-value is the probability of seeing a relationship at least this strong by chance alone if the two variables were actually unrelated. A general rule of thumb is that a p-value below 0.05 is treated as evidence of a significant relationship. A p-value greater than 0.05 means the evidence does not strongly support there being a correlation. The R-squared value refers to how strong the relationship between the two variables is. An R-squared value of 0.5 means that about 50% of the variation in y (say, strand-rate, for instance) is associated with variation in x (say, K%) and the other 50% is down to other factors or random statistical noise.
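To make those two numbers concrete, here is a minimal sketch of a simple linear regression in plain Python. Two things are assumptions rather than anything taken from the analysis itself: the eight data points are invented, and the p-value is estimated with a permutation test (shuffle y, refit, count how often chance alone does as well), which is a stand-in for whatever test the regression software used for the results below.

```python
import random


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Share of random re-pairings whose R-squared is at least the observed one."""
    rng = random.Random(seed)
    observed = linear_fit(x, y)[2]
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if linear_fit(x, shuffled)[2] >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Invented data: K% and strand-rate for eight hypothetical pitchers.
k_pct = [0.14, 0.16, 0.17, 0.19, 0.21, 0.23, 0.25, 0.28]
strand = [0.69, 0.72, 0.70, 0.73, 0.72, 0.76, 0.74, 0.78]

slope, intercept, r2 = linear_fit(k_pct, strand)
p = permutation_p_value(k_pct, strand)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f} p~{p:.4f}")
```

The R-squared here is the share of the variation in the made-up strand-rates that lines up with the made-up K%, and the p-value is how often shuffled (that is, unrelated) pairings match or beat it.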
Peripherals and Fastballs
K%
p-value < 0.001, R-squared = 0.1163
BB-rate
p-value = 0.41, R-squared = 0.007
Fastball velocity
p-value = 0.45, R-squared = 0.0006
Fastball% (of pitches thrown)
p-value = 0.34, R-squared = 0.0095
Weighted fastball value
p-value < 0.001, R-squared = 0.2711
BABIP, ERA, and ERA Estimators
BABIP
p-value < 0.001, R-squared = 0.2943
ERA
p-value < 0.001, R-squared = 0.5707
FIP
p-value = 0.037, R-squared = 0.0450
xFIP
p-value = 0.048, R-squared = 0.0407
SIERA
p-value = 0.019, R-squared = 0.0564
tRA
p-value = 0.005, R-squared = 0.0806
Batted Ball Types
LD-rate
p-value = 0.031, R-squared = 0.0482
Groundball : Flyball-ratio
p-value = 0.098, R-squared = 0.0285
HR/fly
p-value = 0.224, R-squared = 0.0155
Discussion and Conclusions
Peripherals and Fastballs
As expected, pitchers with higher strikeout-rates tend to strand more runners. This makes sense intuitively and has been discussed previously. I expected walk-rate to be positively correlated with strand-rate, but the results show very little evidence that a correlation exists. It is likely that any positive effects of walking more batters on strand-rate are obscured by other factors associated with wildness (such as wildness in the strike zone, an increasing number of wild pitches, or more trouble holding runners on base). Fastball value was strongly correlated with strand-rate, but that does not mean that fastball-pitchers are better at stranding runners. All it means is that pitchers who throw good fastballs strand runners better. In fact, not only does throwing more fastballs not increase pitcher strand-rates (I wouldn't necessarily expect it would), but increases in fastball velocity are not correlated with increases in strand-rate. Unexpectedly, simply throwing harder has no bearing on whether a pitcher will be better or worse at stranding runners than his peers.
BABIP, ERA, and DIPS
Again, expectedly, both BABIP and ERA are strongly negatively correlated with strand-rate. The most likely relationship is that increases in BABIP drive decreases in strand-rate. Corresponding drops in strand-rate are directly associated with increases in ERA. DIPS ERA estimators and predictors are all weakly negatively correlated with strand-rate. The relative strengths of these correlations are likely due in large part to the relative weights of the inputs for each of these ERA estimators. tRA uses actual batted ball data, which include a linedrive-rate component, and is most strongly correlated with strand-rate among DIPS stats. SIERA uses K% as an estimator of BABIP. Although this may cause it to overestimate the influence K% has on ERA, it does cause it to be more tightly coupled to strand-rate than FIP or xFIP.
Batted Ball Types
As predicted, pitcher linedrive-rate is correlated with strand-rate (presumably through BABIP). Somewhat unexpectedly, there is some evidence that groundball pitchers can maintain lower strand-rates than flyball pitchers. On the other hand, both the significance (p-value = 0.098) and strength (R-squared = 0.0285) of this correlation suggest that it is not all that meaningful. At the same time, since groundball-pitchers tend to have higher BABIP than flyball-pitchers, the fact that we see even a slight positive effect of groundballs is surprising and likely stems from groundballers being better able to suppress extra-base hits and induce double plays. There is virtually no correlation between HR/fly-rate and strand-rate.
Thanks to FanGraphs for 2011 pitching statistics, the Velvet Underground for the post title, and hugo and benk for giving me the idea to do this.
Comments
That is some grade A science there
I appreciate that you even followed the scientific article format of introduction, hypothesis, results and discussion.
Very nice
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
by Pikachu on Sep 12, 2011 9:37 AM EDT via mobile
Well done!
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
thank you much
i thought the results — particularly the ground ball stuff — were interesting. It isn’t exactly firm footing by any means, but it does suggest that groundballers could potentially outperform their peripherals more than we expected
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
A nitpick, but I’m a little confused with this statement:
At the same time, since groundball-pitchers tend to have higher BABIP than flyball-pitchers, the fact that we see even a slight positive effect of groundballs is surprising and likely stems from groundballers being better able to suppress extra-base hits and induce double plays.
Going by the graph you provided, if the GB:FB ratio goes up while strand-rate goes down, doesn’t that indicate a slight negative correlation not a positive one? The result wasn’t statistically significant anyway, so it’s not a big deal, but I was just curious if I missed something or if that’s a typo.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
You're right
To clarify, what I meant by “positive effect” was a positive effect for pitchers (i.e., negatively correlated with strand-rate). You definitely didn’t miss anything, I should have clarified that in the article (will do so now). Thanks for reading so closely!
I am always a little uneasy with an alpha-value of 0.1, so, yeah, definitely take it with a grain of salt (if at all). I do kind of wonder what would happen if we expand our samples to include two or three years of data instead of just the 2011 season.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Well four things;
1.) I finally have become accustomed to articles on sabermetrics and was able to almost-completely follow your article and understood your key points and almost all terms used. Thanks for writing a very well written article.
2.) The Velvet Underground are great, and the song you referenced is amazing
3.) I’m glad you posted when you did, especially a longer article, because I am pretty bored at the University library and don’t really want to start summarizing chapters
4.) Just an all around great read and good job. It was interesting, your style of writing encourages others to make their own predictions instead of mindlessly following the words across a page.
"Weaseling out of things is important to learn. It's what separates us from the animals... except the weasel."
glad you enjoyed it!
Yep, the Velvet Underground are great for sure. And you’re right -- reading it critically is the way to go. While I occasionally (at least!) disagree with many of the folks on the site, there are lots of smart folks out here (and these include the ones who disagree with me as often as not) that make it really fun (and, truth be told, worthwhile) to write. Gerse and Siver (I think) mentioned it a few months ago -- the sabr community is great because even the ones who don’t publish essentially do peer review for fun.
By the way, don’t think I’ve mentioned it before but I love that signature!
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Well, while we are on the topic of signatures...
I have always appreciated yours, the Simpsons are great and yours inspired me to find my own piece of beloved American comedy.
"Weaseling out of things is important to learn. It's what separates us from the animals... except the weasel."
by dannyofbosnia on Sep 12, 2011 11:34 AM EDT
Yay, a shoutout!
I think it was actually Sivvi, not Siver (unless they’re the same person).
Once again, great stuff.
correlation
doesn’t have to add up to 1.0 if events aren’t mutually exclusive, right?
I find the correlations surprisingly low. Perhaps I just want the smoking gun, but I thought there’d be stronger correlation with xFIP and LD Rate, for example, than just 5%.
The weak correlation with LD-rate is intuitive, no?
Both strand rate and LD rate are largely uncontrollable by the pitcher, and all the results (with a meaningful n) of each metric are clustered very close to each other in a seemingly-random fashion (with a sample size of 1 year, at least, as jesse mentions).
And yes, you’re correct about the 1.0 thing, not only because they aren’t mutually exclusive (and some in fact are intertwined (like BABIP —> Strand rate —> ERA)), but because we lack the capacity to isolate and test every possible variable that exists. Even then, the correlations would be so weak with most things that they would just be labelled as statistical noise.
Scratch that
I’ve gone and confused myself a bit.
There very well could have been a strong correlation between the two even though both move seemingly randomly year-to-year when looked at on their own.
Which is to say
Without having seen the data it would not have been valid to assume a weak correlation just because each stat shows little correlation with itself year-over-year or because they seem to not be very skill-based.
/end reply spam
Right
what I think suppresses the correlation is the number of factors between LD% and strand%.
Consider it this way, LD-rate is not that strongly correlated with BABIP (even though linedrives are way more likely to go for hits, they are such a small percentage of batted balls that the correlation is weakened). Then, the correlation between BABIP and strand-rate, while strong, weakens the overall correlation between LD-rate and strand-rate.
Does that make sense?
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
In this case, yeah, that makes sense
But I’m not sure you could say as a rule that correlations are transitive (not that you are at all, I’m more just musing as I type).
In other words, given:
1) a weak correlation between variables A and B
2) a strong correlation between B and C
3) A and B are not mutually exclusive
you can’t know for sure that A-C is weak, unless you determine that B-C is very close to 1
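The point about transitivity can be checked numerically. For any three variables, the A-C correlation is only constrained to lie within r_AB * r_BC plus or minus sqrt((1 - r_AB^2)(1 - r_BC^2)), which is what you get from requiring the three-way correlation matrix to be valid. A rough sketch, with the function name and the input values picked purely for illustration:

```python
import math


def correlation_bounds(r_ab, r_bc):
    """Range of r_AC consistent with the given r_AB and r_BC.

    Follows from requiring the 3x3 correlation matrix to be positive semidefinite.
    """
    center = r_ab * r_bc
    half_width = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    return center - half_width, center + half_width


# Illustrative values only: a weak A-B link and increasingly strong B-C links.
for r_bc in (0.5, 0.9, 0.99):
    low, high = correlation_bounds(0.3, r_bc)
    print(f"r_AB=0.30 r_BC={r_bc:.2f} -> r_AC between {low:+.3f} and {high:+.3f}")
```

The allowed range only collapses toward r_AB as the B-C correlation approaches 1, which is the condition stated above.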
Right
I was operating under the assumption that the only reason LD-rate is correlated with strand-rate (A to C in this analogy) is because it increases BABIP. I would think that if you removed the LD x BABIP interaction, the effect of LD-rate would become non-significant.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
I don't entirely follow
BABIP doesn’t impact LD rate, the causation goes the other direction, what with LD rate being a component of BABIP.
How would you remove LD-BABIP interaction? Hold LD constant? You’d end up trying to find a correlation between a point (well, 300 points with the same value) and a distribution.
Neither of these conclusions I’ve inferred make any sense, so the error is clearly on my end here — can you re-explain?
I think you've got it
basically, try and find a large sample of pitchers who have the same BABIP but varying linedrive-rates. I wouldn’t think you would see any effect of variation in the LD-rate in strand-rate.
Don’t know if I worded that correctly, but, basically the way I interpret the connection between LD-rate and strand-rate is:
LD slightly affects BABIP
BABIP affects strand
If you break the correlation between LD-rate and BABIP, I would think you’d also break the correlation between LD-rate and strand.
Does that make sense?
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Okay, makes way more sense now
I was interpreting it as “Hold LD constant and I think it’ll remove the correlation between LD and strand rate,” which seemed silly since that’s tautological.
I get it now though, you’re saying “Control for BABIP and it’ll eliminate the correlation between LD and strand rate”
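"Control for BABIP" has a standard back-of-envelope form, the first-order partial correlation. The sketch below plugs in the R-squared values from the article (LD-rate vs strand-rate, BABIP vs strand-rate) and the LD%-BABIP R-squared of 0.14 quoted further down this thread. The signs are assumptions (LD-rate and BABIP taken as negatively related to strand-rate and positively related to each other), and the three figures may not come from exactly the same sample, so treat the output as a plausibility check and nothing more.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# R-squared values quoted in the article and thread; the signs are assumed.
r_ld_strand = -math.sqrt(0.0482)     # LD-rate vs strand-rate (negative)
r_babip_strand = -math.sqrt(0.2943)  # BABIP vs strand-rate (negative)
r_ld_babip = math.sqrt(0.14)         # LD-rate vs BABIP (assumed positive)

r_partial = partial_correlation(r_ld_strand, r_ld_babip, r_babip_strand)
print(f"LD-rate vs strand-rate, controlling for BABIP: r = {r_partial:+.3f}")
```

Under those assumptions the partial correlation comes out close to zero (roughly -0.02), which is at least consistent with the idea that LD-rate matters to strand-rate mainly through BABIP.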
Great question
I guess I sort of left this implicit in the article but your comment is drawing it out. The fact that most of the factors are weakly correlated with one another shows just how much noise there is in strand-rate.
The strong correlation between strand-rate and ERA basically just underlines how much ERA can be affected by something so random.
Also, don’t forget that being able to chalk up something like 10% of variation to K-rate is nothing to be sneezed at. Consider the difference between a strand-rate of 70% and a strand-rate of 75%. Over a 200-inning season where a pitcher allows 230 baserunners and 21 homeruns, the difference in Runs / 9 IP between a pitcher with a 70% strand-rate (3.11) and a pitcher with a 75% strand-rate (2.59) is half a run. That’s pretty important!
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Also, by the way,
those “runs / 9 ip” don’t include runs scored on HR (projected as about 29.4 runs). That would increase our theoretical pitchers’ “actual RAs” to 4.43 (strand-rate of 70%) and 3.91 (strand-rate of 75%). If you’re interested in scaling RA to ERA, you can cut about half a run (so it’s the difference between a 3.93 ERA pitcher and a 3.41 ERA pitcher).
Still a difference of half a run, though, which, considering we’re looking only at the elements of strand-rate that should be controlled by the pitcher, would be nontrivial.
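The arithmetic in these two comments can be reproduced in a few lines. This sketch follows the shortcut used here, not the full strand-rate formula: it treats every non-stranded baserunner as a run and then adds 1.4 runs per homerun on top. The function and parameter names are made up for the example.

```python
def runs_per_nine(strand_rate, baserunners=230, homeruns=21, innings=200, hr_runs=1.4):
    """Back-of-envelope RA/9 as worked in the comments above.

    Returns (non-HR runs per nine, total runs per nine).
    """
    unstranded = (1 - strand_rate) * baserunners  # runners who come around to score
    hr_runs_total = hr_runs * homeruns            # about 29.4 runs on homeruns
    per_nine = 9 / innings
    return unstranded * per_nine, (unstranded + hr_runs_total) * per_nine


for rate in (0.70, 0.75):
    non_hr, total = runs_per_nine(rate)
    print(f"strand-rate {rate:.0%}: {non_hr:.4f} non-HR R/9, {total:.4f} total R/9")
```

The gap between the two pitchers is about half a run per nine innings either way, since the homerun runs are the same for both.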
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
although how much of the k% effect is from the actual strikeout ability
and how much is from more strikeouts = better pitcher in general.
I'm not sure what you mean here
Are you saying that better pitchers in general can maintain high strand-rates through means other than the strikeout? I’m guessing the skill you’re referring to is hit-suppression?
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
more like baserunner suppression
basically a pitcher that allows 4 baserunners per inning on average is going to have a very low strand rate. a pitcher that allows .1 baserunners per inning on average is going to have almost a 100% strand rate.
my point is this: higher k% leads to fewer hits. so when you see the correlation between k% and strand rate, it might actually be a correlation between hit rate and strand rate you’re seeing. the interesting question that i think you’re trying to answer is “what’s the difference in strand rate between two pitchers who are identical other than that one gets more strikeouts and one gets more outs on balls in play”, and to tease that out you’d need to do a multiple regression with interaction terms.
I'm not sure I see what you're driving at
If you take as given:
1) there is no correlation between strand-rate and BB% (see above)
2) outside of random noise in BABIP, hits / inning is basically the result of balls in play
Of course more hits will mean fewer runners stranded, but the ultimate reason that hits are being suppressed should be K%.
Also, the correlation between BABIP and strand-rate (which is stronger than the correlation between K% and strand-rate) seems to be what you’re looking for, since it’s basically summing up pitchers who are giving up different numbers of hits per ball in play. No, it does not take the interaction between K% and BABIP into account but, regardless of what SIERA claims, that seems to be a pretty weak correlation.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Also, forgot to say,
if I’m mistaken here, please let me know. Thanks again for reading.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
I realize I'm a little late to be jumping in here, but since line drive rates are brought up a lot around these parts as something which is given assumed significance, I'll throw in my 2 cents.
Line drive data has some pretty big issues with being accurately measured to begin with, and that’s before you even get into issues of how controllable or predictive it is. I don’t like using it to judge either hitters or pitchers, especially in samples of one season or smaller. Preferably I like to avoid it altogether, as its value as a predictive tool for any player is shaky at best. gb/fb rates are much more stable when looking at batted ball data.
Colin Wyers wrote an excellent, very well researched piece hypothesizing why the issue with line drives might exist a few years back if anyone is interested:
http://www.hardballtimes.com/main/article/when-is-a-fly-ball-a-line-drive/
tl;dr version of these.
There’s a good chance that line drives aren’t actually a good measurement of hard hit balls, or predictive of future babip for hitters or pitchers. There are sound statistical reasons to believe the latter, and real world explanations to believe the former to be true.
That's all true
and I’m not trying to disagree with you here. I also haven’t seen any data showing a pitcher’s ability to prevent linedrives (possibly, as you said, due in part to a lack of standardization in scoring linedrives).
However, there is a statistically significant (p < 0.001) and pretty important (R-squared = 0.14) correlation between pitcher linedrive% and BABIP.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Is Morrow a Number 2 SP? - Who has higher trade value: Lind or Thames?
I think we need a Number 2 SP. I believe both Lind and Thames are dispensable. Who can get us that Number 2?
I would say neither.
Follow me @BBBMinorLeaguer | 2011 Jays record while in attendance: 9-11 (.450)
by Minor Leaguer on Sep 12, 2011 12:29 PM EDT
neither Thames nor Lind
could come close to netting us a top-50 or so pitcher
Total Internet points: 10 000
We need a FA pitcher
CJ Wilson?
"I want to set the record straight: I thought the cop was a prostitute."
old and not actually that good
though still likely better than our current options in the very short run. that said, given his financial demands/requirements, not a good idea IMO
Total Internet points: 10 000
Dana Eveland is going to be a FA!
Follow me @BBBMinorLeaguer | 2011 Jays record while in attendance: 9-11 (.450)
by Minor Leaguer on Sep 12, 2011 2:39 PM EDT
there, problem solved!
Hic sunt fortuna dracones
If I ask you guys to answer my sock survey am I going to get booted from the site?
I’m low on ways to get feedback…sad I know.
"I want to set the record straight: I thought the cop was a prostitute."
Yep :-)
"I want to set the record straight: I thought the cop was a prostitute."
Sock it to me
I am ready to answer any and all sock questions.
by FD98 on Sep 12, 2011 1:16 PM EDT
It's 10 questions.
I don't know how to post links
http://www.surveymonkey.com/s/8J68LPL
"I want to set the record straight: I thought the cop was a prostitute."
I just answered all 10 questions
What do I win? :)
A smile!
Thanks!
"I want to set the record straight: I thought the cop was a prostitute."
That sounds fair
A single sock.
"I want to set the record straight: I thought the cop was a prostitute."
I don't know what I was expecting
but I can’t say you didn’t describe it correctly
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
haha
Sorry to place it on your post Jesse. It was a good read and I enjoyed it.
"I want to set the record straight: I thought the cop was a prostitute."
Jessef - seriously - you need...
To drink more scotch (single malt), chill and let the beauty of the game overcome your desire to hoard stats and reproduce charts.
Your post (while well intentioned I’m sure) brought the same glazed over look to my eyes that She Who Must Be Obeyed brings to them when she asks for an accounting of my Liquor Store bill. Life is just too short for some forms of mental gymnastics.
by Mylegacy on Sep 12, 2011 1:37 PM EDT
and besides
life may be short, but it’s the longest thing you’ll ever do
Total Internet points: 10 000
If you haven't figured out Mylegacy by now, go over to Batter's Box and peruse his comments
I like his point of view, especially when it comes to enjoying baseball, prospects and drinking. But then, I’m from BC, so I might be biased.
Hic sunt fortuna dracones
by JaysfanDL on Sep 12, 2011 2:42 PM EDT
Some people enjoy stats
others don’t. If you don’t really like it, you don’t have to read it. :)
by leaflover4ever on Sep 12, 2011 1:42 PM EDT
hahaha
Oh, I drink plenty of single malt. I’ve got a bottle of Laphroaig waiting for me at home, just biding my time ‘till the weather cools a bit. It’s no Ardbeg but it’ll do in a dimple pinch
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
you have to try the Japanese single malt distillers
fantastic taste for your money, not peaty like many scottish single malts, but the Yamazaki has quickly become one of my favs
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
yoichi is also quite good
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
will have to try
I’ll take a look over at the liquor store
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
I sympathize
My eyes glazed over too and I went “Oh look at all the graphs with slopes and everything.” To preserve my sanity, I skipped to the conclusions and nodded sagely at each one.
When I’m presented with this type of article, I find this to be the best strategy.
i love scatter graphs with lines in them, as much as the next guy, but I love skipping to conclusions more.
by ayjackson on Sep 12, 2011 2:28 PM EDT
Oh but c'mon
They also had p-value and R-squared!! How often do you get that?
I have a book recommendation for you
I think you’ll really enjoy it (hint hint this is a stupid and terrible book)
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
my brother can be accused of many things
but not insufficient love of baseball or scotch
fantastic post, jessef.
For anyone who thinks that sabr takes away from the “beauty of the game”, here’s our collective response:
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
you're kidding me
you two are brothers? Or am I taking you too literally?
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
haha
it’s true, we are both metaphysical and biological brothers
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
brain asplode
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
is that really so surprising?
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
yes
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
i thought the intronet was a place where you don't personally know anybody else
you just broke that rule
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
Perhaps, we are the exception
that proves the rule (whatever that actually means, which I don’t think is anything).
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
I think you're actually using that phrase correctly
which is rare
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
we are not the only pair of brothers on Bluebird Banter
at least, I can think of one more
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
Craig and Edwin Encarnacion?
Follow me @BBBMinorLeaguer | 2011 Jays record while in attendance: 9-11 (.450)
by Minor Leaguer on Sep 12, 2011 10:21 PM EDT
honestly?
can you expose?
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
I'll let one of them do it, should they choose to
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
yup
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
second asplosion
SuckaMD? really?
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
Is he in arms length of the sun also?

Not changing my signature until Hechavarria is promoted to the big leagues.
[Funny phrase about how few followers I have on Twitter]
Some might call it that
but I was one of the lead authors on a very unsuccessful baseball blog before I started writing here!
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
and that's classic
Sad, Drunk, And Poorly
My friends, love is better than anger. Hope is better than fear. Optimism is better than despair. So let us be loving, hopeful and optimistic. And we'll change the world. - JL
completely OT
but Frank Catalanotto (not nearly as good as I remembered) had a season with a 27.4 LD% with the Jays
Total Internet points: 10 000
interesting stuff
but in the end all it’s really saying is that good pitchers strand more runners.
also one mathematical note: it’s not true that a 0.05 p-value is enough to conclude statistical significance. that’s only true if you’re running one test with one variable. if you’re running more tests (or more variables), you need to have a lower p-value cut-off to get the same significance. ie getting a 0.05 p-value if you have one variable in your regression/study is the same significance as getting a 0.01 p-value if you have 5 variables in your regression/study.
Thanks for reading and great comment
You’re right, the likelihood that there is a false positive somewhere in there is way higher, the greater the number of tests that you run. In this case, though, I’d consider each of these separate analyses. If they’d been presented as separate studies by separate individuals (for example), the p-values would not have been adjusted, but the same problem would have occurred.
The reason that I gave p-values (instead of just saying “Significant” and “Nonsignificant”) was so that folks could make their own interpretations of the data as well. As long as folks keep in mind, as you obviously have, that each study has its own probability of having a false positive, they’ll realize that the alpha value of our whole study is not 0.05, it’s considerably higher.
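For anyone who wants to see how much higher, here is a rough sketch using the fourteen p-values from the Results section. Two assumptions are baked in: results reported as "< 0.001" are entered as 0.001 (an upper bound), and the familywise figure treats the tests as independent, which they are not quite, since they all share strand-rate as the response. The Bonferroni cut-off (alpha divided by the number of tests) is one common, conservative correction and is not something the article itself applied.

```python
# p-values as reported in the Results section; "< 0.001" is entered as 0.001,
# an upper bound (an assumption made only so the value can be compared).
P_VALUES = {
    "K%": 0.001, "BB-rate": 0.41, "Fastball velocity": 0.45, "Fastball%": 0.34,
    "Weighted fastball value": 0.001, "BABIP": 0.001, "ERA": 0.001,
    "FIP": 0.037, "xFIP": 0.048, "SIERA": 0.019, "tRA": 0.005,
    "LD-rate": 0.031, "GB:FB ratio": 0.098, "HR/fly": 0.224,
}
ALPHA = 0.05


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_significant(p_values, alpha):
    """Names whose p-value clears the Bonferroni-adjusted cut-off alpha / n."""
    cutoff = alpha / len(p_values)
    return [name for name, p in p_values.items() if p <= cutoff]


n = len(P_VALUES)
print(f"{n} tests, familywise error rate ~ {familywise_error_rate(ALPHA, n):.2f}")
print(f"Bonferroni cut-off: {ALPHA / n:.4f}")
print("Still significant:", bonferroni_significant(P_VALUES, ALPHA))
```

With fourteen tests the chance of at least one false positive is roughly a coin flip, and only K%, weighted fastball value, BABIP and ERA clear the stricter cut-off.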
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Also, by the way,
you’re right in that it’s saying that good pitchers strand more runners. The point I was asking here was, what kinds of good pitchers strand runners and to what degree? Not only are there different ways for pitchers to be good (high K-rate, low BB-rate, high gb-rate), I had not seen any studies that actually quantified what “normal” strand-rates should be, given a pitcher’s peripherals.
Using this study, we could potentially develop a model based on a pitcher’s K-rate (and, depending on whether you buy into the ability of pitchers to control ld-rate or the weakly supported correlation between gb-fb-ratio and strand-rate, those factors could potentially be included as well) to estimate a pitcher’s strand-rate.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
Not only are there different ways for pitchers to be good (high K-rate, low BB-rate, high gb-rate), I had not seen any studies that actually quantified what "normal" strand-rates should be, given a pitcher’s peripherals.
That’s the bottom line and why the piece is so useful. In my mind I had no clue at all – I just basically would take league average strand rate and adjust it based on how much better than league average the pitcher is and that’s what I figured his strand rate “should” be. Now we’re getting somewhere.
"Let us go forth awhile, and get better air in our lungs. Let us leave our closed rooms... The game of ball is glorious." - Walt Whitman
any chance
you could make a chart modelling expected strand-rates at, say, every natural number K/9 from 4-11? I don’t think that would be too hard, but it would be very illuminating
Total Internet points: 10 000
sure
the figure above is a simple linear regression on K%
the formula for the linear model with K/9 is:
strandrate = 0.0091 (K/9) + 0.6649
So, a pitcher who doesn’t strike anyone out can still strand 66% of runners. This number seems to make intuitive sense because slightly fewer than 1/3 of all batted balls go for hits, so a pitcher who struck no one out would put baserunners on around 31% of the time, but quite a few of those baserunners would get on with one or two outs already.
The slope of the line suggests that an increase of one K/9 should occur with an increase in strand-rate of slightly less than one percentage point.
Can’t make a table here, but:
K/9   xStrand%
  4   0.7013
  5   0.7104
  6   0.7195
  7   0.7286
  8   0.7377
  9   0.7468
 10   0.7559
 11   0.7650
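The same table can be generated straight from the linear model, which makes it easy to extend to other K/9 values. A minimal sketch, where the constant and function names are mine and the slope and intercept are the ones quoted above:

```python
SLOPE = 0.0091      # strand-rate gained per additional K/9, from the model above
INTERCEPT = 0.6649  # modelled strand-rate at zero strikeouts


def expected_strand_rate(k_per_9):
    """Expected strand-rate from K/9 using the linear model quoted above."""
    return SLOPE * k_per_9 + INTERCEPT


print("K/9  xStrand%")
for k9 in range(4, 12):
    print(f"{k9:>3}  {expected_strand_rate(k9):.4f}")
```

Keep in mind that this is a straight-line fit to one season, so values far outside the 4 to 11 range are extrapolation.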
Also, as noted in the comments above, the difference between stranding 70% of baserunners and stranding 75% is quite large . . . given a WHIP of 1.15 and a HR/9 of 0.95, it amounts to about half a run per nine innings.
Maybe we should revise the article to include the changes from the comments, it’s pretty neat stuff.
"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"
