
How many extra pitches are required for ten defensive runs allowed?

For example, if Travis Snider was 10 runs better than Eric Thames defensively this year what would the effect on the bullpen be? Would the number of extra pitches be negligible or would it be a valid concern if you are choosing one player over the other?

My guess would be that to allow an extra 10 runs a team would have to throw at least 100 extra pitches, or perhaps the equivalent of around 7 extra innings. That is just a guess though, and I’m curious if someone out there can quantify the number more precisely.

Comments

Interesting question

First – a little confusing…you talk mostly about just extra pitches, but there’s a reference to the effect on the bullpen. I’m not sure you could isolate the effect on the pen alone, though I think you’re just interested in pitches more broadly

My thoughts on a method:
Take, say, 2002-2011 (10 years for each team, so 300 team-seasons), and run a regression using total team pitches thrown as the dependent variable (Y) and team defensive WAR (X1) and team pitching WAR (X2) as independent variables. This would control for the quality of the pitching, which one would think is the largest determinant of pitches thrown. This should tell us not only how many extra pitches, but whether there is even a statistically significant effect at all.

I don’t have the ability to run the data right now, but unless someone has a better method, I’ll maybe try to do it later tonight unless someone else wants to.

by MjwW on Feb 22, 2012 5:04 PM EST reply actions  
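A minimal sketch of the regression proposed above, in plain Python with no third-party libraries. The function name `ols` and the toy rows are made up for illustration; the real inputs would be the 300 team-seasons. The toy pitch totals are generated from a known rule purely so the fit can be checked.

```python
def ols(rows, y):
    """Least-squares fit of y on the columns of rows, with an intercept.

    rows: list of [x1, x2, ...] predictor lists; y: list of responses.
    Returns [intercept, b1, b2, ...] by solving the normal equations.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(X[n][i] * X[n][j] for n in range(len(X))) for j in range(k)]
         + [sum(X[n][i] * y[n] for n in range(len(X)))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical team-seasons: [pitching WAR, defensive runs]. NOT real data.
toy_x = [[10, -20], [15, 5], [20, 30], [12, 10], [18, -15], [25, 0]]
# Pitch totals generated from a known rule so the fit can be checked.
toy_y = [24000 - 30 * war - 2 * d for war, d in toy_x]

intercept, b_war, b_def = ols(toy_x, toy_y)
```

The coefficient on the defensive column is the quantity of interest: the change in pitches thrown per defensive run, holding pitching quality fixed.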

I’m assuming any extra pitches are coming from the bullpen because starting pitchers are typically on a pitch count and don’t pitch complete games very often. For example, if there are 150 pitches thrown in a game instead of 140, it wouldn’t usually have any effect on the number of pitches the starter throws.

by JaysSaskatchewan on Feb 22, 2012 5:13 PM EST up reply actions  

Okay

I don’t know that it really follows, but I don’t think it really matters either in terms of strictly trying to find an answer to your question (right?)

by MjwW on Feb 22, 2012 5:24 PM EST up reply actions  

That’s right. The answer is whatever it is and then it can be interpreted in different ways.

by JaysSaskatchewan on Feb 22, 2012 5:27 PM EST up reply actions  

Can the ratio of pitches to runs overall be used or is that too simplistic?

by JaysSaskatchewan on Feb 22, 2012 5:25 PM EST up reply actions  

In what way?

I assume you mean Pitches/Run as the Y variable? And then what as the X? Just defensive runs?

by MjwW on Feb 22, 2012 5:36 PM EST up reply actions  

I don’t know the run or pitch totals, but for example, if there were 20 pitches thrown for every run in MLB generally, would 20 (the ratio) * 10 defensive runs = 200 pitches be a useful estimate? It seems too simplistic though…

by JaysSaskatchewan on Feb 22, 2012 5:46 PM EST up reply actions  

Hmm

That doesn’t sound bad…I feel like there’s something I’m missing when I think about it, but I can’t give you a reason that wouldn’t work. I like my method better from a statistical perspective.

by MjwW on Feb 22, 2012 7:30 PM EST up reply actions  

I think the defensive runs would be scored a bit differently. For example it is difficult for poor range to cause a HR. They would also differ depending on the position. Poor OF range would result in more doubles than poor SS range.

The simple way would give a good “ballpark” number.

by JaysSaskatchewan on Feb 22, 2012 8:07 PM EST up reply actions  

I don’t know where you get total pitches thrown numbers though (for all of mlb).

by JaysSaskatchewan on Feb 22, 2012 8:11 PM EST up reply actions  

Sorry, I missed this earlier

Fangraphs has this…if you go to teams —> pitching 2011 (or any other year) —> batted ball tab, you get summaries by team

Or maybe a link is easier. Likewise, for any player, you can get individual data under the batted ball tab on their page.

by MjwW on Feb 23, 2012 1:02 AM EST up reply actions  

Thanks!

So using this link I got a total of:

683854 pitches
20167 runs

For a ratio of mlb pitches to runs of 33.91

Which gives 339.10 expected pitches for 10 runs. Pretty much the same as Pikachu (340.97).

This isn’t likely to be the exact number because extra hits due to fielding aren’t walks (unless a misplayed foul popup resulted in a walk) and are very infrequently HR’s (go Jose Canseco video). The actual number should be within 100 pitches of 340 though?

by JaysSaskatchewan on Feb 23, 2012 1:38 PM EST up reply actions  
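The arithmetic above, written out as a small check. The two totals are the ones quoted in the comment; the helper `pitches_per_run_via_pa` is a hypothetical name, included only to show why the plate-appearance version of the estimate lands in nearly the same place: the PA totals cancel, leaving pitches per run.

```python
pitches = 683854   # MLB total pitches quoted in the comment above
runs = 20167       # MLB total runs quoted in the comment above

pitches_per_run = pitches / runs
extra_pitches_for_10_runs = 10 * pitches_per_run


def pitches_per_run_via_pa(total_pa, total_pitches, total_runs):
    """(PA / runs) * (pitches / PA): algebraically the same as pitches / runs."""
    return (total_pa / total_runs) * (total_pitches / total_pa)


print(round(pitches_per_run, 2), round(extra_pitches_for_10_runs, 1))
```

Any small gap between the two estimates would then come from the underlying totals or rounding, not from the method.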

Alright, I ran the team numbers for 2002-2011

So there’s a sample of 300 (30 teams x 10 years), using Fangraphs data
Y = number of pitches thrown
X1 = team pitching WAR
X2 = team UZR

Regression results (standard errors in brackets underneath coefficients):
Y = 24,077 – 31.80X1 – 1.52X2
(87.433) (5.280) (0.929)

So basically, the team pitching WAR has a t-stat of -6.02, meaning it is highly significant, which makes sense (better pitching, fewer pitches). The team UZR has a t-stat of -1.64 (p-value of 0.1023), which is only marginally statistically significant (at the traditional 5% level, one would not call it significant). But if we examine the coefficient, 1 more point of UZR (so 1 run prevented) would save about 1.5 pitches, so 10 runs would save about 15 pitches.

by MjwW on Feb 23, 2012 12:30 AM EST up reply actions  

well it would be rejected under alpha of 0.05

but it would be pretty close, no? isn’t t-crit around 1.72 or something?

by benk on Feb 23, 2012 12:31 AM EST up reply actions  

We'd accept under alpha = 0.11 (according to the p-value)

with n = 300, the t distribution will basically approximate the normal distribution (the n-1 restriction is basically meaningless), and the cutoff would be 1.96 (I think I have this right)

by MjwW on Feb 23, 2012 12:43 AM EST up reply actions  

Wouldn't your hypothesis...

…be that better UZR lowers pitch count, making it a one-tailed test. Therefore, the p-value is actually 0.05115 (half of the two-tailed 0.1023), just a hair’s breadth over the 5% threshold we normally use. I’d say you have evidence that UZR works.

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:47 AM EST up reply actions  

I just plugged it into my significance calculator...

…and got:

-1.52/0.929 = -1.6362, significance of t = 0.051432, based on 297 df

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:50 AM EST up reply actions  
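A rough check of the one-tailed versus two-tailed point, using only the standard library. This is a sketch under one stated assumption: with 297 degrees of freedom the standard normal is used as a stand-in for the t distribution, so the values come out slightly smaller than the exact t-based figures quoted in the thread. The function name is illustrative.

```python
import math


def normal_p_values(t):
    """Two- and one-tailed p-values for a t-statistic, using the
    standard normal as a large-sample stand-in for the t distribution."""
    two_tailed = math.erfc(abs(t) / math.sqrt(2))
    return two_tailed, two_tailed / 2


t_uzr = -1.52 / 0.929   # coefficient / standard error from the UZR model
two_tailed, one_tailed = normal_p_values(t_uzr)
```

Halving the two-tailed value is only appropriate because the estimate has the sign the hypothesis predicts (better UZR, fewer pitches).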

Right, I forgot it's one tail

So even at the 5%, it’s basically significant.

(Sorry, non-stats major here!)

by MjwW on Feb 23, 2012 12:52 AM EST up reply actions  

Stats prof...

…here.

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:53 AM EST up reply actions  

And let me add...

…you’re doing fine!

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:54 AM EST up reply actions  

Yeah I got that!

Couple of sloppy errors, mainly from rustiness…the one-tail thing, and in that regression we’d have 297 df not n-1=299 as I said above.

by MjwW on Feb 23, 2012 12:55 AM EST up reply actions  

Errors - except in baseball...

…are important, because we learn from them.

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 1:08 AM EST up reply actions  

I should be doing CFA level 1 In December

And once I get going, all of my former stats knowledge should come back, but would you mind if I run a few questions by you if I have some difficulty?

@VagabondBansal

by Vagabond13 on Feb 23, 2012 10:36 AM EST up reply actions  

So what are we going to do...

…set up a Fanpost or Fanshot titled “Prof. Bluejaysstatsgeek’s stats help for CFA students?” :-)

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:16 PM EST up reply actions  

That sounds about right

Or maybe e-mail correspondence if you’d oblige

@VagabondBansal

by Vagabond13 on Feb 23, 2012 3:56 PM EST up reply actions  

Doesn’t a run have to add at least as many pitches as the average MLB plate appearance? (Since almost every run results in an additional AB.)

by JaysSaskatchewan on Feb 23, 2012 12:35 AM EST up reply actions  

Keep in mind these are aggregate results. And we’re explaining relatively little variance in the number of pitches with these two variables (I forgot to report the R^2 for the UZR regression, but it’s 0.115, so basically we’re explaining 11.5% of the variance in pitches). We may be omitting important variables, in which case the model is not properly specified, so the assumptions of OLS are violated, and basically the results are junk…can we think of anything else that would be important in explaining pitches?

Intuitively, I understand where you’re coming from…still trying to completely think through the implications

by MjwW on Feb 23, 2012 12:50 AM EST up reply actions  

While I was at it, I pulled TZR data as well

Fangraphs only has it for 2005-2011, so the sample is only 210

Y = number of pitches thrown
X1 = team pitching WAR
X2 = team TZR

Regression results (standard errors in brackets underneath coefficients):
Y = 24,142 – 32.04X1 – 3.16X2
_ (110.169) _(6.723) (1.226)

(The brackets are there to line up the standard errors, nothing more). So this one had similar statistical significance on the pitching, but the TZR numbers have a t-stat of -2.58, so that’s statistically significant. I’m pretty sure TZR is on the same scale, 1 point of TZR = 1 run, so this would suggest 31-32 pitches per 10 runs.

One note: this regression, while its parameters are significant, explains relatively little of the variance (R^2 = 0.143, adjusted R^2 = 0.135)…so we’re explaining only about 14% of the variance in pitches with these two variables.

I have DRS data too, will report the results, though I’m unsure as to the interpretation

by MjwW on Feb 23, 2012 12:41 AM EST up reply actions  
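For reference, the adjusted R^2 quoted above follows from the plain R^2, the sample size, and the number of predictors. A short sketch using the standard formula (the function name is illustrative):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# TZR model above: R^2 = 0.143, 210 team-seasons, 2 predictors.
adj = adjusted_r2(0.143, 210, 2)
```

With only two or three predictors and a couple of hundred team-seasons, the adjustment is small, which is why the two figures stay close throughout the thread.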

DRS, 2003-2011

Sample = 270 team seasons

Y = number of pitches thrown
X1 = team pitching WAR
X2 = team DRS

Y = 24,142 – 30.65X1 – 4.69X2
__(89.799) (5.363) (0.858)

Now, I have no idea how to interpret DRS in terms of translating the numbers to runs, but it’s the one with the highest statistical significance, t-stat = -5.44. Anyone? Is it 1 point of DRS = 1 run?

by MjwW on Feb 23, 2012 12:59 AM EST up reply actions  

Also, this explained more of the variance

R^2 = 0.193, so much higher than the other two

by MjwW on Feb 23, 2012 1:14 AM EST up reply actions  
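To put the three defensive metrics side by side, the sketch below recomputes each t-stat and the implied pitches saved per 10 runs from the coefficients and standard errors reported above. It assumes each metric is scaled so that 1 point equals 1 run. Because the inputs are the rounded figures from the comments, the recomputed t-stats can differ slightly from the ones quoted.

```python
# (coefficient on the defensive metric, its standard error), as reported above
models = {
    "UZR": (-1.52, 0.929),
    "TZR": (-3.16, 1.226),
    "DRS": (-4.69, 0.858),
}

for name, (coef, se) in models.items():
    t_stat = coef / se
    pitches_saved_per_10_runs = -10 * coef
    print(f"{name}: t = {t_stat:.2f}, "
          f"about {pitches_saved_per_10_runs:.1f} pitches per 10 runs")
```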

What's the correlation between...

…DRS and Pitching WAR?

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 1:22 AM EST up reply actions  

nevermind...

…it should be zero, or at least any correlation spurious. I was confusing DRS with another stat that could have been correlated.

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 1:27 AM EST up reply actions  

Yeah

I had the spreadsheet open, so I just quickly checked (more to verify the lack of correlation) and r=0.0206

by MjwW on Feb 23, 2012 1:34 AM EST up reply actions  

well, DRS

and baseballreference pitching WAR should be significantly correlated

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 23, 2012 9:03 AM EST up reply actions  

DRS is defensive runs saved

so, yeah, it is scaled to runs

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 23, 2012 9:18 AM EST up reply actions  

good question, good approach to answering it

but can you control for innings pitched? I don’t know that it will necessarily make any difference but it should account for some of the variance.

I’d also be interested to know if there was an interaction between pitching fWAR and team UZR . . . my guess is that, since WAR is calculated from FIP and BABIP affects FIP, there should be a significant effect of their interaction.

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 23, 2012 9:02 AM EST up reply actions  

To answer the second part first

For 2002-11, the correlation between team fWAR and team UZR is -0.0093925. Here’s the chart:

by MjwW on Feb 25, 2012 9:46 PM EST up reply actions  

this is pitching fWAR and batting fWAR combined, right?

What about just team UZR vs. team pitching fWAR?

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 27, 2012 12:12 PM EST up reply actions  

I figured there'd be a trend,

albeit a weak one, but I guess not

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 27, 2012 1:38 PM EST up reply actions  

To answer the second part

I added IP into the DRS model, since it was the best one (min IP = 1411.67, max = 1484.67 – I converted to proper decimals, though it was probably immaterial)

Y = number of pitches thrown
X1 = team pitching WAR
X2 = team DRS
X3 = team IP

Y = 12,708 – 36.63X1 – 5.15X2 + 7.98X3
(3658.9) _(5.612) _(0.859)__(2.55)

Model R^2 = 0.220, adj-R^2 =0.212

So the model explains a little more, the IP variable is significant, and the other variables are slightly more significant, but it really takes a bite out of the intercept’s significance, moving the t-stat from 271 (!) to 3.47.

by MjwW on Feb 25, 2012 9:56 PM EST up reply actions  

However

Thinking about it for a second, I thought there might be interaction between team fWAR and IP…more IP (above replacement), more fWAR…And I was right…r =0.34

I’m going to rerun the above regression, but replace fWAR with fWAR/IP, essentially making it purely a quality stat rather than a counting and quality stat, since the counting component is reflected by IP

by MjwW on Feb 25, 2012 10:10 PM EST up reply actions  

Well that was disappointing

Y = number of pitches thrown
X1 = team pitching WAR*10 / (IP/9) -> essentially runs above replacement/9, easier to understand
X2 = team DRS
X3 = team IP

Y = 13,278 – 587.17X1 – 5.15X2 + 7.585X3
(3633.7) _(90.159) _(0.859)__(2.535)

Model R^2 = 0.220, adj-R^2 =0.212 (not a mistake, same as above)

So that basically didn’t move the needle at all. It really raises the question for me: Can we explain more? Only around 22% of the variation is explained, despite controlling for pitching quality, IP, and defence, all of which are statistically significant.

You made a good point initially – FIP is not strictly DIPS, since BABIP affects the denominator. Maybe if we modified FIP to use Expected IP in the denominator, it might explain more, or give more accurate results in any event?

by MjwW on Feb 25, 2012 10:34 PM EST up reply actions  

That might be helpful

because, as you said, our estimate of “quality of pitching” actually includes the “quality of fielding” in it, too

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 27, 2012 12:16 PM EST up reply actions  

I'll try to do this

Won’t get a chance until tonight at the earliest, and maybe not for a couple of days…manipulating the data isn’t a problem, I just need to think about how to do it, and how things fit together.

If this drops off the front page of FanPosts before I get a chance to do this, I’ll try to leave a message in another post when it’s up

by MjwW on Feb 27, 2012 2:22 PM EST up reply actions  

sounds good

Might not make a difference but might be worth a look

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 27, 2012 4:48 PM EST up reply actions  

Yeah

Even if it doesn’t, I’ve wanted to do some thinking about truly DIPS measurements and possibly look into differentials (out of interest) – so this is a good way in even if it doesn’t show anything.

by MjwW on Feb 27, 2012 5:45 PM EST up reply actions  

Alright, here's the method I used

This took a little more thinking than I initially thought, so I want to put the method out, and if there’s something you disagree with or that I maybe haven’t fully thought through, we can tweak it. We want a fully DIPS stat, meaning one where IP isn’t BABIP dependent. Starting with identities:

BABIP = (H – HR) / (TBF – BB – HBP – HR – K) …this is what Fangraphs uses and seems to be commonly accepted.

TBF = IP*3 + RS + LOB

Fangraphs doesn’t report LOB, and while I think I could have backed it out of LOB%, I decided on a slightly different angle.

I estimated TBF using TBF = 3*IP + H + BB+ HBP. This overestimates TBF, since it doesn’t account for DP, CS, etc, but the correlation with actual TBF is r = 0.95 so it’s a very good proxy.

We can rearrange this into 3*IP + H = TBF – BB – HBP and substitute it into the BABIP formula —> BABIP = (H – HR) / (3*IP + H – HR – K). I then calculated each team’s BABIP by this formula and the league BABIP.

Now, going back to the initial idea, the aim is to get rid of the BABIP effect. That means, for a team with a worse than average BABIP, figuring out how many balls in play should have been outs rather than hits. So I created a formula for adjusted BABIP -> A-BABIP = [(H-x) – HR] / [(3*IP+x) + (H-x) – HR – K], where x is the number of hits on balls in play that should have been outs. x will be positive for teams with a bad BABIP and negative for teams with a good BABIP. I then set this equal to the league average BABIP and solved for all 270 data points using goal seek (and a macro to automate the calculation!).

I took this number, the number of extra outs that should have been recorded, divided by 3 and added it to IP to get DIPS-IP, and used this to recalculate the RAR/9. For reference, the worst team, 2007 Tampa Bay, recorded almost 133(!) fewer outs (allowed that many more hits) than a team with an average BABIP. By contrast, the 2011 Rays were on the opposite end, with 111 fewer hits allowed (20 more than the next best team, the 2002 Angels). So that passes the logic test, given these teams’ defensive reputations.

Results from using this data in the regression below.

by MjwW on Mar 8, 2012 2:22 AM EST up reply actions  
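A sketch of the adjustment described above. If the A-BABIP formula is taken exactly as written, the +x and -x in the denominator cancel, so x has a closed form and no iterative solver is strictly needed. The function names and the sample team line are hypothetical, not real data.

```python
def extra_hits_vs_league(h, hr, k, ip, league_babip):
    """Hits on balls in play above what a league-average BABIP implies.

    Uses the thread's proxy denominator 3*IP + H - HR - K. Moving x hits
    to outs leaves that denominator unchanged, so x has a closed form.
    """
    balls_in_play = 3 * ip + h - hr - k
    return (h - hr) - league_babip * balls_in_play


def dips_ip(h, hr, k, ip, league_babip):
    """Innings pitched after converting the x extra hits into outs."""
    return ip + extra_hits_vs_league(h, hr, k, ip, league_babip) / 3


# Hypothetical team line, NOT real data: 1500 H, 170 HR, 1100 K, 1440 IP.
x = extra_hits_vs_league(1500, 170, 1100, 1440.0, 0.295)
adjusted_ip = dips_ip(1500, 170, 1100, 1440.0, 0.295)
```

A negative x, as in this made-up line, corresponds to a team whose BABIP is better than league average, so its DIPS-IP comes out lower than its actual IP.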

As we kinda suspected, didn't really make a big deal

Y = number of pitches thrown
X1 = team pitching WAR*10 / (DIPS-IP/9) -> essentially runs above replacement/9, easier to understand
X2 = team DRS
X3 = team IP

Y = 13,090 – 592.97X1 – 5.06X2 + 7.585X3
_(3632.7) _(89.943) _(0.857)(2.534)

Model R^2 = 0.223, adj-R^2 =0.214

by MjwW on Mar 8, 2012 2:28 AM EST up reply actions  

One more thought

Maybe DIPS-IP should be used as a variable rather than IP (to try and separate the effect of defence). Not entirely sure if this makes sense, but it improves the model a bit, so I’ll show the results:

Y = number of pitches thrown
X1 = team pitching WAR*10 / (DIPS-IP/9) -> essentially runs above replacement/9, easier to understand
X2 = team DRS
X3 = team DIPS-IP

Y = 10,390 – 560.97X1 – 2.78X2 + 9.55X3
_(3092.6) _(84.476) _(0.923)(2.145)

Model R^2 = 0.252, adj-R^2 =0.243

by MjwW on Mar 8, 2012 2:34 AM EST up reply actions  

While doing the DIPS stuff

I had a thought…why not use TBF rather than IP in the regression? Presumably this is a little more granular. So I used the DIPS-IP calculated RAR/9, but subbed in TBF rather than team IP. The model really improves, but there are some problems, as evident from the coefficients.

Y = number of pitches thrown
X1 = team pitching WAR*10 / (DIPS-IP/9) -aka team RAR/9
X2 = team DRS
X3 = team TBF

Y = 2195 + 23.39X1 – 0.60X2 + 3.43X3
_(2052.3) _(87.551) _(0.809)(0.321)

Model R^2 = 0.438, adj-R^2 =0.432

So the TBF variable is very significant, and the explanatory power roughly doubles, but the other two variables are insignificant and the coefficient for RAR/9 goes in the opposite direction to plain logic. That’s a hallmark of multicollinearity, which makes sense…better pitching, fewer TBF. I checked the interaction and r = -0.52. Also, there’s interaction between DRS and TBF, r = -0.41. This makes sense too: better defence, fewer batters faced. And in fact, it obscures what we’re trying to find, the pitch value of a defensive run, since it lumps it into more TBF.

So, to have any utility, the numbers would have to be adjusted to dump the interaction. I haven’t thought it through enough to know if it’s possible, how to do it, or whether it even makes sense to do – I’m reporting the results just because it really bumped the explanatory power. Maybe you have some thoughts.

by MjwW on Mar 8, 2012 2:49 AM EST up reply actions  
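The pairwise checks mentioned above (r between RAR/9 and TBF, and between DRS and TBF) are ordinary Pearson correlations. A minimal stdlib sketch follows; the five data points are invented for illustration and are not the thread's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, NOT real data: RAR/9 and TBF for five team-seasons.
rar9 = [0.5, 1.0, 1.5, 2.0, 2.5]
tbf = [6400, 6350, 6300, 6150, 6200]
r = pearson_r(rar9, tbf)
```

When two predictors are strongly correlated like this, the regression has trouble attributing the effect to either one, which is consistent with the sign flip described above.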

Right, that makes sense

Instead of using Total Batters Faced, can you use TBF – (K+BB+HR) since that eliminates plays the defence isn’t impacting? What (I think) we’re trying to separate out is the K, BB, and HR outcomes that the defence can’t affect.

this should weaken the model considerably overall but it might help tease out the impact that the defence is having. aside from that, i’ve got nothing

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Mar 8, 2012 9:35 AM EST up reply actions  

Yeah

But it still doesn’t eliminate the interaction with the quality of the pitching, r=-0.406 between DIPS-RAR/9 and [TBF-(K+BB+HR)]. Not sure there’s really anything else that can be done

by MjwW on Mar 8, 2012 2:18 PM EST up reply actions  

Can't you just add in the interaction term for that?

assuming neither of those interacts with quality of defence, you should still be able to draw conclusions about the specific effects of defence, no?

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Mar 8, 2012 3:22 PM EST up reply actions  

Ah I see you already saw this very soon after

Sorry, didn’t check earlier. I’m afraid I’m going to perhaps betray my lack of advanced stats background here. I’m not really sure what you mean by adding in an interaction term. Tried doing some quick googling, but not seeing anything that was really helpful. So do I need to add another variable? If we can control for that interaction, then I would think we could draw conclusions about the defence. Anyway, sorry, and hopefully you can tell me what needs to be done.

by MjwW on Mar 9, 2012 3:13 AM EST up reply actions  

I don't know whether excel allows it or not

but most statistics programs do allow you to add a term to the regression that accounts for interactions.

the model would look something like:

Y = defence quality + pitching quality + TBF-(defence independent outcomes) + pqual*TBF

do you have R? it is pretty easy and, if you’d like, I can send you a quick primer to get you started with the code (same for any and everyone on this site)

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Mar 9, 2012 12:34 PM EST up reply actions  
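One common way to set up a model of the shape sketched above, offered as an illustration rather than as the commenter's exact procedure: in an ordinary least-squares fit with two numeric predictors, the interaction is typically entered as one more column holding their product (in R's formula syntax, `a*b` expands to `a + b + a:b`). The function name and rows below are hypothetical.

```python
def with_interaction(rows):
    """Append a pqual * bip column to rows of [defence, pqual, bip].

    Both inputs are centred on their means first, a common (optional)
    step that keeps the product from being highly correlated with
    its parent columns.
    """
    n = len(rows)
    mean_pqual = sum(r[1] for r in rows) / n
    mean_bip = sum(r[2] for r in rows) / n
    return [r + [(r[1] - mean_pqual) * (r[2] - mean_bip)] for r in rows]


# Hypothetical rows, NOT real data: [DRS, RAR/9, TBF - (K + BB + HR)].
rows = [[10, 1.2, 4400], [-5, 0.8, 4550], [0, 1.0, 4475]]
design = with_interaction(rows)
```

The resulting four-column rows would then go into the same kind of regression as before, and the coefficient on the defence column is read with the interaction held in the model.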

I don't have R

The only statistical software I’ve used other than Excel is some pretty basic work in STATA.

If I understand, it’s basically just creating a new variable where I multiply the two – that’s easy. I guess I’ve never heard of it referred to as an interaction term. I can’t do it right now, but I’ll try to get to it later this afternoon/early evening.

Just to clarify – would the multiplication be pqual*TBF or pqual*[TBF-(def. indep. outcomes)]?

by MjwW on Mar 9, 2012 1:04 PM EST up reply actions  

Not exactly

the * does not necessarily (and generally wouldn’t) refer to the simple product of two independently measured variables but rather the actual interaction between the two. 1*2 is just frequently used nomenclature signifying “the interaction between variable 1 and variable 2”. It is also occasionally referred to as “1 × 2” or “1 + 2”

I have never used STATA. R is quite powerful and free. I’d recommend playing around with it some. It will take a little while to get the hang of it but once you do, not only is it relatively easy to run tests you can’t do with excel but you can generate nicer figures as well.

as I said before, if you’re interested, drop me an e-mail and I can send you a (very basic) primer to get you started that should let you hit the ground running.

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Mar 9, 2012 2:18 PM EST up reply actions  

Ah okay

I didn’t think multiplying them together made much sense intuitively in terms of controlling for interaction. That it’s nomenclature makes more sense, though it’s still something I’m completely unfamiliar with. I’d be curious as to how exactly it’s measured/controlled for, but I can look into that. I don’t think Excel is set up to do that, at least as far as I know.

I’ll send you an email sometime this weekend. I quite like Excel (I’m a change-resistant Luddite), but it’s probably time to graduate to something a little more advanced. I looked into R a little bit and it looks okay. Though I’m definitely skeptical of the nicer charts – with a little formatting, there’s almost nothing an Excel chart can’t be made to do.

by MjwW on Mar 10, 2012 2:00 AM EST up reply actions  

I'm probably the worst person on the board to try and give you an answer but here it goes

First of all, you’re making the assumption that their bats are equally good. If we used last year’s numbers, Snider would lose much more than 10 runs playing full time.

Last 10 years average pitches per plate appearance (excluding 2011, so 10 years before that): 3.77

So every run would cost at LEAST 37.7 pitches, assuming the runner can also steal home (and not be driven in). The question is how many PA are needed on average to score a run.

Personally I think 10 runs = 100 pitches on average, you figure maybe 1 or 2 are HR’s possibly with people on base.

100 pitches over the course of a season isn’t much, 1.62 a game extra. That’s only if it is 100 pitches.

by Mike Andrew on Feb 22, 2012 7:17 PM EST reply actions  

I don’t know how you get from 3.77 pitches/PA to 37.7 pitches/ run…that essentially means 10 PA = 1 run, and there’s no basis I’m aware of for that.

by MjwW on Feb 22, 2012 7:32 PM EST up reply actions  

Yes...

But I’m still not sure how 10 baserunners minimum = 10 runs gets us to 10 PA = 1 run…in terms of a value determined through linear weights which is relevant here.

It’s not a problem, you’re just spitballing, as am I at this point, trying to conceptualize this

by MjwW on Feb 22, 2012 10:50 PM EST up reply actions  

it is actually 0.62/game extra at that number. That is 100/162 not 162/100. It isn’t a huge number but it would still represent an equivalent of about 7 innings pitched and might be something to consider if it is a close decision.

by JaysSaskatchewan on Feb 22, 2012 8:15 PM EST up reply actions  

There's another thing...

…I love about baseball: A season is 100(phi) games!

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:53 AM EST up reply actions  

would this work?

Take a sample of years (5? 10?), take the total # of plate appearances (or total # of batters faced?), divide by total # of runs, multiply by 10 times the average # of pitches per PA during that time span.

Extremely simplistic method, no idea if it’s accurate. For 2011 this method gives you 340.97 more pitches per 10 runs, which is like 2.1 pitches per game.

His 2011 wRC+ is 26

by Pikachu on Feb 22, 2012 9:09 PM EST reply actions  

That’s a lot. It is also the equivalent of about 20-25 innings, assuming 100 pitches is about 7 innings. I also assume that most additional pitches are added to the bullpen because the starting pitchers are on a pitch count and don’t pitch many complete games. 20-25 additional innings is about 1/4 to 1/3 of one reliever’s innings? That would be a lot for the difference in one fielder’s defence.

Btw there aren’t any additional innings of course but translating the pitches to an inning equivalent seems to make the number more useful.

Also, if our overall defense is 50+ runs worse than the Red Sox and Rays then we are pitching 5X that number in extra pitches to account for our defense (in comparison).

by JaysSaskatchewan on Feb 22, 2012 10:36 PM EST up reply actions  

Intuitively

This seems okay…there’s still something about it that bugs me.

I think the basic question being asked is, is there something special about defensive runs in terms of preventing extra pitches. Basically, this method would just take the average number of pitches per run, and use that…I’m not sure that at a conceptual level it’s really answering the question (or maybe my understanding of the question JaysSask is asking is different)

by MjwW on Feb 22, 2012 11:00 PM EST up reply actions  

Basically I would just like to know if the difference between a below average defensive player and an average defensive player could add enough pitches to be relevant. 15 would be pretty irrelevant but 341 seems like enough to be a consideration (That’s almost 3 extra games worth).

by JaysSaskatchewan on Feb 23, 2012 12:42 AM EST up reply actions  

Right

The first step is to come up with a model that depicts reality. My thought was to run a fairly simple regression model with as much data as possible to give us output to analyze. And basically, we get statistically significant results, but ones that say the effect is insignificant in terms of extra pitches. Now, it’s possible the results are junk (bad model), and I’m still not quite sure how to explain the results in terms of why they are what they are (I always like to do this with results, be able to explain the practical implications).

Pikachu presented a different model, one that’s intuitive, and yet there’s something not sitting right with me about it in terms of its mechanisms – it’s an intuition, and I can’t quite put my finger on it. And now it’s bugging me, cause it’s on my mind.

by MjwW on Feb 23, 2012 1:13 AM EST up reply actions  

i think the problem you have with it

is that it doesn’t address fielding, so it can’t really ascribe anything to fielding.

i’d be interested to see your results if you account for IP. last season, the standard deviation of team IP was about 14 innings and the standard deviation for DRS was about 30 runs, which suggests (to me, at least) that the effects of IP (which are almost certainly significant) may be partially obscuring the effects of defence.

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 23, 2012 9:21 AM EST up reply actions  

Aren’t extra innings pitched just related to extra-inning games?

by JaysSaskatchewan on Feb 23, 2012 2:36 PM EST up reply actions  

No

bad teams tend to pitch fewer innings because they don’t have to pitch the 9th inning on the road

The point I’m making is that some teams pitch more innings so those teams are more likely to have to throw more pitches (though that effect may be obscured because bad teams may need more pitches to get through innings)

Also, low-scoring teams should tend to play more extra-inning games, since it is more likely for teams that score fewer runs to be tied after 9 innings than teams that score a lot of runs

"Look at me! I'm Tomokazu Ohka of the Montreal Expos!"

by jessef on Feb 23, 2012 4:33 PM EST up reply actions  

PS

Didn’t respond earlier today, but it’s easy to get that and I’ll try to build it into the model when I have a chance.

by MjwW on Feb 23, 2012 5:00 PM EST up reply actions  

Another aspect of this...

…is the impact that confidence in the outfield has on the pitchers. I read somewhere that Morrow got peeved about the outfield defense last season when we had Copa and Davis taking unnatural routes to the ball and the Garden Gnome standing in the outfield. Since he has demonstrated that he could strike out almost everyone, that became his game plan, with not very good results. It would have an effect not unlike the effect that Detroit’s infield will have on an extreme groundball pitcher: “Are you kidding me?”

My conjecture: Thames will surprise people with his defense in Spring Training. I heard him interviewed after the season, and he knows his defense has to improve for him to stay, and I suspect he will have worked very hard to be a good defender.

Hugo thinks I'm a lazy academic

by bluejaysstatsgeek on Feb 23, 2012 12:42 AM EST reply actions  

Is looking at runs the right approach?

I know DRS and UZR use runs saved as the metric, but shouldn’t we just look at this from the perspective of extra baserunners allowed? That’s the principal result of poor fielding, and basically the only way bad fielding results in extra runs. Each extra baserunner results in about 1.5 additional batters faced (1 + .340 OBP + .34^2 + .34^3 + …). Anyone have the ratio of extra baserunners to extra runs (xBR/R)? Because then P/PA * 1.5 * xBR/R will equal the number of pitches per additional defence-induced run.
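The 1.5 figure is the sum of a geometric series: the out that wasn't made still has to be recorded, and every batter sent up to get it reaches base with probability OBP, so the expected number of extra batters is 1 / (1 - OBP). A minimal sketch of that arithmetic, where the function names are illustrative and the only input is the .340 OBP quoted above:

```python
def extra_batters_per_baserunner(obp: float) -> float:
    """Closed form of 1 + obp + obp**2 + ... , i.e. 1 / (1 - obp)."""
    if not 0.0 <= obp < 1.0:
        raise ValueError("obp must be in [0, 1)")
    return 1.0 / (1.0 - obp)


def partial_sum(obp: float, terms: int) -> float:
    """Add up the first `terms` terms of the series directly."""
    return sum(obp ** n for n in range(terms))


closed_form = extra_batters_per_baserunner(0.340)
by_hand = partial_sum(0.340, 50)

print(f"closed form: {closed_form:.3f}")
print(f"first 50 terms: {by_hand:.3f}")
```

Both routes land at roughly 1.5, so there is no need to remember the calculus; summing a handful of terms gets there.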

by gabrielsyme on Feb 23, 2012 5:21 AM EST

That makes sense too but I don’t know where the numbers are.

by JaysSaskatchewan on Feb 23, 2012 2:37 PM EST

Well, I know DRS and UZR start with outs and scale it to runs, so the information is somewhere. I’ll see if I can find it. I see that somewhere above P/PA has been 3.77 the past few years.

by gabrielsyme on Feb 23, 2012 2:44 PM EST

Okay

Haven’t found a perfect source, but the Fangraphs UZR primer notes that the difference between a batter reaching and an out on a hit to the outfield is .83 runs. Over at BTF, their explanation notes that an infield error is worth .78 runs and a ball just outside of an infielder’s range is worth .76 runs. Let’s take .8 overall for simplicity’s sake. The AL had a .332 OBP with runners on last year.

Since I have completely forgotten my calculus, a quick arithmetic calculation shows that the extra batters faced per extra baserunner will be a little under 1.5 (1/(1 - .332) is about 1.497).

1.5*(1/0.8)*3.77= 7.068 extra pitches per extra defensive run allowed.

That’s probably a bit of an overestimate, since it doesn’t allow for outs made on the bases and double plays (some runners who reach are erased, so to speak). I’m not sure how to deal with that, but I doubt it makes up more than 10% of runners – and additional runners often force extra pitches before being “erased”, so I’m happy to say the final answer is somewhere around 6.5 extra pitches per defence-caused run.
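The whole estimate can be written out as a short calculation using only the numbers quoted in this thread (3.77 P/PA, .8 runs per missed out, .332 OBP with runners on). The `erased_share` knob is a hypothetical simplification, not something from UZR or DRS: it just discounts the total by the fraction of extra runners assumed to be wiped out on the bases.

```python
PITCHES_PER_PA = 3.77        # league P/PA quoted in the thread
RUNS_PER_MISSED_OUT = 0.8    # rounded run value of a batter reaching vs. an out
OBP_RUNNERS_ON = 0.332       # AL OBP with runners on, as quoted above


def pitches_per_defensive_run(
    pitches_per_pa: float,
    runs_per_missed_out: float,
    obp: float,
    erased_share: float = 0.0,
) -> float:
    """Extra pitches thrown for each extra run the defence gives up.

    erased_share is a hypothetical discount for runners removed by
    double plays or outs on the bases.
    """
    extra_batters = 1.0 / (1.0 - obp)              # batters needed to get the missed out
    baserunners_per_run = 1.0 / runs_per_missed_out
    return pitches_per_pa * extra_batters * baserunners_per_run * (1.0 - erased_share)


unadjusted = pitches_per_defensive_run(
    PITCHES_PER_PA, RUNS_PER_MISSED_OUT, OBP_RUNNERS_ON
)
discounted = pitches_per_defensive_run(
    PITCHES_PER_PA, RUNS_PER_MISSED_OUT, OBP_RUNNERS_ON, erased_share=0.10
)

print(f"no discount: {unadjusted:.2f} pitches per defensive run")
print(f"10% of runners erased: {discounted:.2f} pitches per defensive run")
print(f"10-run gap at 6.5 pitches per run: {10 * 6.5:.0f} pitches")
```

On these assumptions a 10-run defensive gap works out to roughly 65 to 70 extra pitches over a season, which is the scale the original question asked about.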

by gabrielsyme on Feb 23, 2012 3:07 PM EST

I think this misses the pitches that occur when there are baserunners due to defense but they don’t score.

by JaysSaskatchewan on Feb 24, 2012 10:29 AM EST

How so?

I fear I may be missing your point. If you mean that in UZR and DRS we are only counting the misplays or non-plays that lead to players scoring, I don’t think that’s the case; they use the average value of a misplay/nonplay.

by gabrielsyme on Feb 24, 2012 12:28 PM EST

I might have misunderstood. In any case, what accounts for the difference between 7.068 pitches per defensive run and the ratio of 34 pitches per run overall?

by JaysSaskatchewan on Feb 24, 2012 1:19 PM EST

Because most of those 34 pitches/run have to be thrown whether or not any runs score, I’m assuming. The 7 P/DR are the extra pitches that have to be thrown.

by gabrielsyme on Feb 24, 2012 4:36 PM EST

This seems wrong to me or it answers a different question.

The question is, “How many extra pitches have to be thrown if a fielder is 10 runs below average?”, so we are concerned about pitches that have to be thrown when no runs score. Or am I misunderstanding?

by JaysSaskatchewan on Feb 25, 2012 11:19 AM EST
