This morning in the comments, we discussed the differences in calculating WAR between the baseballprojection (rWAR or Rally WAR) method and the fangraphs (fWAR) method. In gauging position players, the two systems tend to be fairly similar, differing mostly in how they determine the value of a player's fielding contributions -- the Rally system uses Sean Smith's TotalZone (TZ) and fangraphs uses Mitchel Lichtman's Ultimate Zone Rating (UZR). For some players, there is quite a bit of divergence between the fielding metrics but, on the whole, the systems are pretty convergent in terms of position player evaluation. When evaluating the contributions of pitchers, on the other hand, the systems tend to diverge because of a distinct difference in methodology -- Rally determines a pitcher's value (on a per-inning basis) based upon the number of runs he actually yields, whereas fangraphs determines a pitcher's per-inning value based on how many runs he would be expected to yield according to his Fielding Independent Pitching (FIP) statistic, which is based solely on strikeouts, walks, and home runs. There are benefits and drawbacks to each method but I prefer the fangraphs method, as it does not reward pitchers for contributions made by exceptional fielders or fortuitous bounces on balls in play.
However, while I do think the fangraphs (FIP-based) method is preferable to the Rally method, I think it could be significantly improved by modeling WAR on xFIP (which, as discussed earlier this week, normalizes a pitcher's HR/fly-ball rate to league-average) because most pitchers seem to have very little control over whether or not fly balls become home runs. Unfortunately, fangraphs does not allow you to query WAR based on xFIP. After the jump, I will outline a quick-and-dirty method for those of us interested in doing so.
The very first thing to do is determine the relationship between Wins Above Replacement and Runs Above Replacement. While it's usually around 10 runs = 1 win, it depends on how many runs are being scored across the league, and MLB-wide offence is down in 2011. Go to the pitcher leaderboards for value and divide an AL pitcher's Runs Above Replacement by his WAR (e.g., Ricky Romero, 17.4 RAR / 1.9 WAR = 9.16 Runs / Win).
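For anyone who would rather script this than punch it into a calculator, here is a minimal sketch of that division. The function name runs_per_win is my own invention for illustration; the inputs are Romero's figures quoted above.

```python
def runs_per_win(rar: float, war: float) -> float:
    """Return the runs-per-win ratio implied by one pitcher's RAR and WAR."""
    if war == 0:
        # A pitcher sitting at exactly 0 WAR tells us nothing about the ratio.
        raise ValueError("WAR is zero; pick a pitcher with a non-zero WAR")
    return rar / war


# Ricky Romero: 17.4 RAR and 1.9 WAR.
print(round(runs_per_win(17.4, 1.9), 2))  # 9.16
```

Because the leaderboard values are rounded to one decimal, a pitcher with a larger WAR should give a steadier ratio than one hovering near zero.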
The next step is exporting data from fangraphs to your spreadsheet program (most Windows users probably have MS Excel, though -- and I know this makes me sound prickish -- I just have OpenOffice Calc), a fairly easy task. To do this, simply query the fangraphs pitcher WAR leaderboard and click "export to Excel" or "export to CSV." Once you have the data, you can estimate league-adjusted replacement-level FIP based on whichever pitcher is closest to zero RAR (runs above replacement) in the league you want (today, I used Wade Davis's 4.94). Now that you know what replacement-level FIP is, you're halfway there.
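If you went the CSV route, picking out the pitcher closest to zero RAR can be sketched in a few lines of Python. Everything here is an assumption for illustration: the column names (Name, FIP, RAR) may not match the real export, and the three rows are made-up placeholders, not actual leaderboard data.

```python
import csv
import io

# Placeholder stand-in for the exported file; names and numbers are invented.
SAMPLE_CSV = """Name,FIP,RAR
Pitcher A,3.20,21.5
Pitcher B,4.90,0.3
Pitcher C,5.60,-4.1
"""


def replacement_level_fip(rows):
    """Return (name, FIP) for the pitcher whose RAR is closest to zero."""
    closest = min(rows, key=lambda row: abs(float(row["RAR"])))
    return closest["Name"], float(closest["FIP"])


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
name, fip = replacement_level_fip(rows)
print(name, fip)
```

With a real export you would open the file instead of the inline string and filter to one league (and probably to starters) before taking the minimum.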
Divide that FIP by nine to determine FIP replacement-level runs per inning. Now, depending on how you want to deal with this, you can either use this figure as-is as a proxy for xFIP replacement-level or convert it to the xFIP scale. To convert it, determine how much league-average xFIP differs from league-average FIP and multiply your per-inning replacement-level estimate by that ratio. Today, for example, the ratio of xFIP : FIP is 1.01, so multiply by 1.01. I found that:
Replacement-Level = [(4.94) / 9] (1.01) = 0.554 runs per inning
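The same calculation as a small Python sketch, assuming you already have the replacement-level FIP and the xFIP : FIP ratio in hand. The function and parameter names are mine; leaving the ratio at its default of 1.0 is the "just use FIP as a proxy" option.

```python
def replacement_runs_per_inning(repl_fip: float, xfip_to_fip_ratio: float = 1.0) -> float:
    """Convert a replacement-level FIP (runs per nine) to xFIP-scaled runs per inning."""
    return repl_fip / 9 * xfip_to_fip_ratio


# Wade Davis's 4.94 FIP and the 1.01 xFIP : FIP ratio from the text.
print(round(replacement_runs_per_inning(4.94, 1.01), 3))  # 0.554
```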
Now you can take whichever pitcher you're interested in (as an example, we'll use Ricky Romero) and divide his xFIP by nine to estimate how many runs per inning he is likely to give up by xFIP. We find that, by xFIP, Romero is likely to yield:
(3.5) / 9 = 0.389 runs per inning
Now, some very simple arithmetic gets you where you need to be. Subtract Romero's per-inning xFIP from the replacement-level per-inning xFIP estimated during the previous step (0.554 - 0.389 = 0.165). This is the number of runs Romero saves above replacement-level per inning. Finally, multiply the difference by the number of innings Romero has pitched [ (0.165) (118) ] and you have your xFIP-based RAR (19.47). Convert RAR to WAR based on the very first step by dividing xFIP RAR by RAR/WAR (19.47 RAR / 9.16 RAR/WAR = 2.13) and you're done! By xFIP-based WAR, Romero has been worth 2.13 wins -- slightly better than he appears by FIP-based WAR.
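Putting the last few steps together, here is a rough Python sketch of the whole conversion. The function name xfip_rar_and_war is hypothetical, and the inputs are the rounded values from the walkthrough (0.554 replacement-level runs per inning, 9.16 runs per win).

```python
def xfip_rar_and_war(xfip, innings, repl_runs_per_inning, runs_per_win):
    """Return (RAR, WAR) for a pitcher based on his xFIP."""
    # Runs saved per inning relative to a replacement-level pitcher.
    runs_saved_per_inning = repl_runs_per_inning - xfip / 9
    rar = runs_saved_per_inning * innings
    war = rar / runs_per_win
    return rar, war


# Ricky Romero: 3.5 xFIP over 118 innings.
rar, war = xfip_rar_and_war(3.5, 118, 0.554, 9.16)
print(round(war, 2))  # 2.13
```

The RAR that comes out of the script is a hair above the 19.47 shown above because the script does not round the per-inning difference to 0.165 before multiplying; the WAR still rounds to 2.13.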
What do you guys think -- is an xFIP-based method better or worse than an FIP-based method? What about the Rally method? Anyone think we should periodically run the numbers to determine league leaders based on xFIP WAR?
As WAR is park-adjusted, the replacement-level FIP should ideally come from another pitcher on the same team, and it should be a starting pitcher's FIP if you're comparing against a starter.
After exporting all the data, I created the fields necessary to query xFIP-based WAR (xWAR). I estimated replacement-level FIP by league and estimated xFIP by separating out the ratios by league. I estimated RAR/WAR by the values given on player value pages (so, while FIP is not park-adjusted, RAR/WAR is).