It's Opening Day! Everyone's tied for first, hope springs eternal, and we wonder what the next 6 months of baseball hold for our beloved Blue Jays. Will they contend for a playoff spot, or once again finish a distant 4th in the AL East? Thankfully, we have a lot of projection systems to help us think about this - ZiPS, Marcel, Steamer, OLIVER, CAIRO, various wisdom-of-the-crowd systems, etc. But while they're great, I have a problem with using this data to project players and teams. Almost invariably, the projections are presented as point estimates, that is, the single most likely value. For example, ZiPS projects Bautista to hit .273/.408/.566 with 36 HR. But baseball is filled with random variation around true talent levels - hence the common caveat "small sample size". Any projection carries uncertainty with it, but we often ignore this when thinking about how players will perform. It doesn't have to be that way - it is rather simple to introduce uncertainty into projection models, and that is what I aim to do below. Rather than looking at just one scenario, we can look at the full range of scenarios and the expected production.
When forecasting a position player's production, we have to think about three things that vary from year to year - offensive output, defence, and base-running. On top of that, playing time also varies, and it affects everything else. So there are 4 moving pieces to be accounted for. The way to handle them is a technique called Monte Carlo simulation. The idea is that for every input that can vary, we estimate the expected value (the point estimate referred to above that projection systems typically report) and the expected variance, and use them to build a model of production - in this case, essentially simulating a season. Then we use random numbers to run that simulation over and over and over, generating a distribution of possible outcomes that represents the uncertainty.
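As a rough sketch of what one of these simulations looks like (all the numbers below are illustrative stand-ins, not my actual projections; the league constants, the 20-runs-per-600-PA replacement level, and the standard deviations are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(2012)
n_sims = 1000  # one draw per variable per simulation

# Illustrative point estimates and uncertainties for one hypothetical player.
# The real playing-time distribution is skewed, not normal - see the notes below.
pa   = rng.normal(600, 60, n_sims).clip(0)   # plate appearances
woba = rng.normal(0.350, 0.020, n_sims)      # offensive output
uzr  = rng.normal(2.0, 5.0, n_sims)          # defence, in runs
bsr  = rng.normal(0.5, 2.0, n_sims)          # base-running, in runs

lg_woba, woba_scale = 0.316, 1.26            # approximate 2011-style constants
wraa = (woba - lg_woba) / woba_scale * pa    # batting runs above average
rar = wraa + uzr + bsr + 20 * pa / 600       # assumes ~20 replacement runs per 600 PA
war = rar / 9.5                              # runs-to-wins conversion used in this post

print(war.mean())
print(np.percentile(war, [3, 25, 50, 75, 97]))
```

Each of the 1,000 rows of `war` is one simulated season; the percentiles of that array are the distribution we're after.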
So, for each starting Blue Jays position player, this is exactly what I have done, running 1,000 simulations apiece (with 4 variables per simulation, that's 4,000 random numbers per player). I don't want to crowd this up with too many details about which projections I used for each variable and how I estimated uncertainty, so I've put them in a section at the end. I'd recommend reading it, because these types of models are heavily dependent on their assumptions - Garbage In, Garbage Out. My modelling is far from perfect, but I'm confident it broadly reflects reality and allows for meaningful analysis. Looking over the results, it passed the eye test in terms of logical output, so I'm not concerned it's spewing out complete garbage.
For each starting position player, I show a graph of the distribution of his expected WAR for 2012. For the purposes of position adjustments, I assume the player plays only the one position. Below each graph is a table showing 5 levels of expected outcomes: the 3rd percentile (essentially the worst-case scenario), the 25th percentile (a bad season), the 50th percentile (a median season), the 75th percentile (a very good season) and the 97th percentile (essentially the best-case scenario). For each percentile I show various stats - plate appearances, wOBA (offensive production), wRAA (runs produced above average), wRC (total runs produced), UZR (defence), BsR (baserunning), RAR (runs above replacement) and finally WAR. Note that each component of each simulation is generated independently, and WAR is calculated separately within each simulation, so for a given percentile you can't simply add up the components to tie into the WAR shown for that percentile. I show each component to give a better sense of the variability in each one.
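Pulling those 5 levels out of the simulations is just a percentile calculation over the 1,000 simulated seasons (hypothetical stand-in data here):

```python
import numpy as np

rng = np.random.default_rng(0)
sim_war = rng.normal(3.0, 1.2, 1000)  # stand-in for 1000 simulated WAR values

# The five outcome levels used in the tables below
for pct in (3, 25, 50, 75, 97):
    print(f"{pct}th percentile WAR: {np.percentile(sim_war, pct):.1f}")
```

The same call is repeated for each stat (PA, wOBA, wRAA, wRC, UZR, BsR, RAR, WAR), which is why the components at a given percentile don't sum to that percentile's WAR.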
J.P. ARENCIBIA, C
JOSE BAUTISTA, RF
YUNEL ESCOBAR, SS
KELLY JOHNSON, 2B
BRETT LAWRIE, 3B
ADAM LIND, 1B
COLBY RASMUS, CF
ERIC THAMES, LF
Some Important Notes on Methodology
Playing Time: By far the most difficult part of the projection, especially since it affects everything else. After looking at various projection systems, I went with Steamer for projecting plate appearances, since it best fit my intuition when looking at various players, and it did a pretty good job when I checked last year's projections against actual PA (there's also some confirmation of this at Tango's blog). So that's where I got the expected value, but then there's the problem of the distribution of possible outcomes, which is clearly not normally distributed. I estimated it by taking last year's projections for all players with 450 or more PA (since we're dealing with starters; n=215, which is close to the number of MLB starters) and comparing them to actual 2011 plate appearances, expressed as a ratio. The distribution looked like this:
This was a good start, since it looks like a credible distribution of playing time (injured players end up below their projection, to varying extents, and a smaller number of players exceed it). But further adjustments were necessary, since applying this distribution directly leads to logical breakdowns. For a player with 500 projected PA it works well - the absolute maximum would be around 675 PA, which is about right. But for a player with 650 forecast PA (like Jose Bautista), it would imply a possibility of over 800 PA, which is inconceivable. These players are already expected to be very healthy, so their distribution should be different (essentially flatter). The way to do this is to raise the ratio to an exponent between 0 and 1 - ratios greater than 1 get smaller, ratios less than 1 get bigger, and everything is pulled toward 1. By playing around with the numbers, I devised a sliding scale so that up to 500 projected PA the distribution is unchanged, and from that point on it gets progressively flatter:
For a player with 650 projected PA, this caps the simulation at around 720 PA. That's a high number, but it's also the absolute max. It's not a perfect depiction of reality, but I think it broadly gets things right, and that's the important part. Random numbers were then used to assign PA for each simulation (and the same PA carried through the rest of that simulation).
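One way to implement such a sliding scale looks like the sketch below. The exact exponent schedule here is my own illustrative guess, tuned only so that the 500 PA and 650 PA examples above come out roughly right - the actual scale was hand-tuned by the author:

```python
def adjusted_pa(projected_pa, ratio):
    """Apply the sliding-scale exponent to a draw from the empirical PA ratio.

    ratio: a draw from the distribution of (actual PA / projected PA).
    Raising a ratio to an exponent between 0 and 1 pulls it toward 1,
    compressing the upside for high-PA projections.
    """
    if projected_pa <= 500:
        exponent = 1.0  # distribution unchanged up to 500 projected PA
    else:
        # illustrative schedule: shrinks as the projection rises past 500 PA
        exponent = max(0.3, 1.0 - (projected_pa - 500) / 220)
    return projected_pa * ratio ** exponent

print(adjusted_pa(500, 1.35))  # 675.0 - unchanged at the 500 PA threshold
print(adjusted_pa(650, 1.35))  # roughly 715, near the ~720 cap described above
```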
The other part of playing time is games played, which is needed for things like UZR and BsR. Unfortunately, Steamer doesn't project games, so for each player I used the ZiPS PA-to-games conversion rate, and nothing looked obviously off.
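That conversion is just a pro-rating by the player's ZiPS PA-per-game rate; a minimal sketch (the function name and example numbers are mine, not from the post):

```python
def games_from_pa(sim_pa, zips_pa, zips_games):
    """Convert simulated PA to games using the player's ZiPS PA-per-game rate."""
    return sim_pa * zips_games / zips_pa

# e.g. ZiPS projects 650 PA over 150 games; a simulation draws 600 PA
print(games_from_pa(600, 650, 150))  # ~138 games
```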
Offense (wOBA, wRAA, wRC) - I used the ZiPS projected wOBA and assumed the same league average and wOBA scale as 2011. For the uncertainty, I used the formula given in the appendix to The Book to calculate the standard deviation of wOBA, then used random numbers (standardized to z-scores) to vary it.
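Mechanically, the offensive draw looks like this. The standard deviation below is a stand-in (the real value comes from The Book's appendix formula at the player's PA), and the league constants are approximate 2011-style values, not the exact ones used:

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 1000

proj_woba = 0.350  # ZiPS point estimate (illustrative)
sd_woba = 0.022    # stand-in for The Book's formula at ~600 PA

z = rng.standard_normal(n_sims)     # standardized random numbers (z-scores)
sim_woba = proj_woba + z * sd_woba  # simulated wOBA for each season

lg_woba, woba_scale = 0.316, 1.26   # approximate 2011-style constants
pa = 600
sim_wraa = (sim_woba - lg_woba) / woba_scale * pa  # runs above average
```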
Defence - I estimated true defensive talent using the player's last three years of UZR/150, with a 5/4/3 weighting. Where a player did not have 3 years of data (Brett Lawrie, for example), I assumed league average for the missing year, which regresses the result towards average. I only considered UZR data at the player's projected 2012 position. To estimate the uncertainty, I took all players with at least 100 PA in each of the last three seasons, applied the same 5/4/3 weighting to their UZR/150, and calculated the standard deviation. Standardized random numbers were then applied to this standard deviation to arrive at a simulated UZR/150, which was finally pro-rated by games played (games/150) to get the UZR value. A better method would have been to calculate the standard deviation separately for each position (since the variance differs by position), but that would have been far more difficult for relatively little added value.
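The 5/4/3 weighting with the missing-year fill can be sketched like this (function name and example figures are mine; UZR is defined so that league average is 0):

```python
def weighted_uzr150(uzr150_by_year, league_avg=0.0):
    """5/4/3-weighted UZR/150 over the last three seasons (most recent first).

    Missing seasons (None) are filled with league average, which
    regresses the estimate towards average.
    """
    weights = (5, 4, 3)
    filled = [x if x is not None else league_avg for x in uzr150_by_year]
    return sum(w * x for w, x in zip(weights, filled)) / sum(weights)

print(weighted_uzr150([4.0, 2.0, -1.0]))   # (5*4 + 4*2 + 3*-1) / 12 ~ 2.08
print(weighted_uzr150([6.0, None, None]))  # one season of data -> 2.5
```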
Baserunning - Same method as UZR, though without the positional caveat.
WAR - I assumed the same run environment as last year and used the same runs-to-wins conversion factor (approx. 9.5 runs per win).
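Putting the pieces of one simulation together, the final conversion is a one-liner. Here `replacement_runs` is my stand-in for the replacement-level and positional-adjustment terms that go into RAR (the post doesn't break those out):

```python
def war_from_runs(wraa, uzr, bsr, replacement_runs, runs_per_win=9.5):
    """Convert one simulated season's run components into WAR."""
    return (wraa + uzr + bsr + replacement_runs) / runs_per_win

print(war_from_runs(wraa=18.0, uzr=3.0, bsr=1.0, replacement_runs=20.0))  # ~4.4
```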