Preface: This is meant to be fairly accessible, so I've written a lengthy introduction. If you have a working understanding of linear $/WAR assumptions and implications, you may want to skip the first three paragraphs.
If there is a Holy Grail of Armchair General Managing in baseball, it's how to determine how much a player should be paid for the value he provides on the field. It's an age-old question, but in an age where players can command hundred-million-dollar contracts, it's arguably the single most critical factor in building a successful team. Recently, I've been looking into the value that teams receive from giving out free agent contracts, particularly compared to when they extend their own players. I've spent a lot of time and effort trying to establish a consistent method, but at the heart lies the assumption that on-field production should be valued using a linear $/WAR model, derived from the aggregate of the contracts the market actually gives out.
It's a controversial assumption, and far from settled ground, so I wasn't surprised to see some of my conclusions (very articulately) questioned on the basis that the linear valuation method fails to reflect actual market conditions. Suffice it to say, I would concede that if the linear $/WAR valuation assumption does not generally hold, my study is basically junk. I have to admit, when I first discovered how the linear $/WAR method worked, I thought it was asinine. After all, elite players are far more productive at turning scarce inputs into maximal outputs, and normally there's a direct link between productivity and rates of pay. But as I read more into it, I came to realize that it was a valid way to value on-field production, both from a theoretical perspective and in terms of fitting the actual data. Certainly not perfect, but good enough to rely on as a starting point and to use to study larger trends.
My purpose is to evaluate the linear valuation assumption, but first, an important distinction is in order. The market for baseball players is made up of two distinct groups - those with fewer than 6 years of MLB service time, who remain under the control of their teams; and those with 6 or more years of service time, who are eligible for free agency. Naturally, the first group will be paid less since they have no leverage other than salary arbitration once they have 3 years of service. Teams will prefer to build a team around these players, but you generally have to develop them, so it's not usually possible to round out a playoff calibre team with only controlled players. Therefore, you have to supplement your roster with players who have free agent rights and whose value is based on the open market to the extent possible. Some players will sign contracts that buy out years when they would be free agent eligible, but nonetheless the salaries must be reasonably close or equal to free agent salaries to induce the player to forgo or defer free agency.
The most common counterpoint to using linear valuation is that when bidding on free agents (and retaining your own players beyond control years), teams in larger markets with big payrolls can afford to pay more for on-field production relative to smaller market teams with small payrolls - and when it comes to elite talent, this is particularly the case since by definition, it is much scarcer. And to a certain extent, we should expect this, since larger market teams will generally realize more revenue per win than smaller market teams. Additionally, by virtue of having more resources, these teams are more likely to contend for playoff spots, and be in the 80 to 90 win range where the added revenue of each win is highest. While this is true, I believe there is a more important factor that essentially overrides these and leads to a linear valuation, namely the substitution effect. Essentially, getting two 3 WAR players is just as good as getting one 6 WAR player in terms of the added production to your team, so if the price gets driven up on the 6 WAR player, a team can look to other players to make up that production. Granted, the prices of players are linked together through this effect, so if the price of one rises the downstream price will rise, but the point is, the salary inflation would be broad-based across all players, and would not break the general linearity of $/WAR in the market. What it will do (in the short run) is push some teams out of the market, or prevent them from buying as much production as they would have liked to, given a fixed budget. In the longer run, it could induce teams to reallocate money towards developing players, an option that is now somewhat taken away in the new CBA.
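To make the substitution argument concrete, here is a small Python sketch comparing one 6 WAR player against two 3 WAR players under two price schedules. Both schedules are hypothetical: the $5 million per WAR rate and the size of the elite premium are illustrative assumptions, not figures from this study.

```python
# Hypothetical price schedules in $ millions. The rate and the premium are
# illustrative assumptions, not estimates from the salary data.

def linear_price(war, rate=5.0):
    """Cost of a player when every WAR is priced the same."""
    return rate * war

def convex_price(war, rate=5.0, premium=0.5):
    """Cost of a player when elite production carries an assumed extra premium."""
    return rate * war + premium * war ** 2

for price in (linear_price, convex_price):
    one_star = price(6)          # one 6 WAR player
    two_regulars = 2 * price(3)  # two 3 WAR players, same total production
    print(price.__name__, one_star, two_regulars)
```

Under the linear schedule the two routes to 6 WAR cost the same. Under the convex schedule the pair is cheaper, so teams would shift their bidding towards the 3 WAR players, which is the pressure that should pull prices back towards a common $/WAR.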
So my goal is to take actual salary data and try to determine: do teams with higher payrolls end up paying more per additional win in the market for players with 6 or more years of service time, or do they just buy more wins in that market (at the same cost per win as other teams)? Or is it some combination of these two?
Cot's Baseball Contracts provides a salary breakdown of every team's Opening Day roster, including service time for each player, for the years 2009 to 2011 (for a total of 90 team seasons). Payrolls will rise and fall as moves are made throughout the season, which will not be captured here, but the Opening Day payroll is a reasonable proxy. Also, Cot's generally lacks salary data for young players with minor league options remaining, so that will slightly underestimate the total MLB payroll, but it's a very small amount that will be similar from team to team. There are no similar problems for players with 6 or more years of service time, who are the focus of this exercise.
I pulled each player's fWAR for the individual 2009-11 seasons and matched it up against their salary data, and sorted to get 3 numbers for each team season: the total team salary (A), the total team salary for players with 6 or more years of service time (B), and the total WAR contributed by players with 6 or more years of service time (C). I made a few adjustments to properly allocate salary when salary dumps occurred. For example, when the Jays traded Vernon Wells, they ate $5 million. I counted this against the Jays' total with no associated WAR, since it was dead money for the Jays, and assigned the remaining salary and all production to the Angels, since the acquiring team would presumably only assume the portion of salary approximating fair market value. That gave me the following:
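As a rough sketch of how the three totals could be tallied, the Python below sums A, B and C per team season from player-level rows. The rows, player names and the `team_season_totals` helper are hypothetical stand-ins for the matched Cot's and fWAR data, and the "dead money" row only mimics the kind of salary-dump adjustment described above.

```python
from collections import defaultdict

# Hypothetical rows: (team, season, player, salary in $M, service years, fWAR).
# None of these figures come from the actual data set.
rows = [
    ("TOR", 2011, "Player A", 0.4, 1.2, 2.5),
    ("TOR", 2011, "Player B", 8.0, 7.0, 3.0),
    ("TOR", 2011, "Dead money", 5.0, 11.0, 0.0),  # salary eaten in a trade, no WAR
    ("LAA", 2011, "Player C", 18.0, 11.0, 1.5),
]

def team_season_totals(rows, fa_threshold=6.0):
    """Return {(team, season): {"A": ..., "B": ..., "C": ...}}.

    A = total salary, B = salary of players with at least `fa_threshold`
    years of service, C = WAR produced by those same players.
    """
    totals = defaultdict(lambda: {"A": 0.0, "B": 0.0, "C": 0.0})
    for team, season, _player, salary, service, war in rows:
        entry = totals[(team, season)]
        entry["A"] += salary
        if service >= fa_threshold:
            entry["B"] += salary
            entry["C"] += war
    return dict(totals)

totals = team_season_totals(rows)
print(totals[("TOR", 2011)])
```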
The numbers jumped around quite a bit on a year to year basis, so I aggregated the data by team. This reduces the data points to 30, but also smooths out a lot of noise so we can look at the underlying numbers. The last column shows the amount each team spent per WAR on players with 6 or more years of service time. Below, I've shown them by rank:
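A minimal sketch of the aggregation step, assuming the team-level $/WAR is the three-year total of B divided by the three-year total of C (rather than an average of the yearly ratios). The team codes and numbers are made up, and skipping teams with non-positive total WAR is my own assumption for the sketch.

```python
# Hypothetical team-season figures: (team, season, B in $M, C in WAR).
team_seasons = [
    ("AAA", 2009, 60.0, 12.0),
    ("AAA", 2010, 70.0, 10.0),
    ("AAA", 2011, 50.0, 14.0),
    ("BBB", 2009, 20.0, 2.0),
    ("BBB", 2010, 25.0, 6.0),
    ("BBB", 2011, 15.0, 2.0),
]

def dollars_per_war_by_team(team_seasons):
    """Sum B and C over all seasons per team, then rank teams by B / C."""
    spend, war = {}, {}
    for team, _season, b, c in team_seasons:
        spend[team] = spend.get(team, 0.0) + b
        war[team] = war.get(team, 0.0) + c
    # Assumption: a team with zero or negative total WAR has no meaningful ratio.
    ratios = {team: spend[team] / war[team] for team in spend if war[team] > 0}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for team, ratio in dollars_per_war_by_team(team_seasons):
    print(team, round(ratio, 2))
```

Dividing the totals keeps one low-WAR season (such as the made-up 2009 line for BBB) from dominating the result, which is the smoothing the aggregation is meant to provide.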
At first glance, it certainly doesn't appear that larger market teams spent more per WAR than other teams: the Cubs, Dodgers and Yankees are almost smack in the middle, and Boston and Philadelphia are actually on the lower side. To get a better idea of this, we can plot the total amount spent on players with free agent eligibility against $/WAR and run a linear regression to see if there's a trend:
The regression shows that the amount a team is spending in the market for players with 6 or more years of service time explains less than 5% of the variation in the $/WAR; as a two factor regression the t-stat on the total spent is -1.16, so it is not statistically significant. Moreover, there are two clear outliers: I'm sure no one's surprised that the Yankees are the team that spent about $200 million more than anyone else, and the team getting terrible free agent value is Pittsburgh. If we omit these and re-plot, we get essentially a random plot:
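For readers who want to run this kind of test themselves, here is a minimal sketch of a one-variable least-squares fit that returns the slope, the R^2 and the slope's t-stat. It is plain Python rather than whatever tool produced the numbers above, the `simple_ols` name is my own, and the sample data is made up.

```python
import math

def simple_ols(x, y):
    """Fit y = intercept + slope * x and report R^2 and the slope's t-stat."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,
        "t_slope": slope / se_slope,
    }

# Made-up figures: total spent on free-agent-eligible players ($M) and $/WAR ($M).
total_spent = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0]
dollars_per_war = [5.2, 4.6, 5.5, 4.9, 5.1, 4.7]
fit = simple_ols(total_spent, dollars_per_war)
print(fit)
```

A slope t-stat well inside roughly plus or minus 2 is the usual rule of thumb for "not statistically significant" at the 5% level with samples of this size.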
Finally, we get the same result if we disaggregate the data into individual team seasons:
In this regression, the t-stat for the total salary is -0.74, which is also insignificant. The model had a negative adjusted-R^2, meaning that the explanatory variable explains no more of the variation than a random variable would.
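The reported t-stats and R^2 figures can be cross-checked against each other. For a regression with one explanatory variable and an intercept, R^2 = t^2 / (t^2 + n - 2). The sketch below applies that identity, assuming n = 30 for the team-level regression and n = 90 for the team-season regression, and assuming both were simple one-variable fits; the function names are my own.

```python
def r_squared_from_t(t_stat, n):
    """R^2 implied by the slope t-stat in a one-variable OLS fit with intercept."""
    return t_stat ** 2 / (t_stat ** 2 + n - 2)

def adjusted_r_squared(r2, n, k=1):
    """Adjusted R^2 for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Assumed sample sizes: 30 teams, 90 team seasons.
r2_team = r_squared_from_t(-1.16, 30)
r2_season = r_squared_from_t(-0.74, 90)
adj_season = adjusted_r_squared(r2_season, 90)

print(round(r2_team, 4), round(r2_season, 4), round(adj_season, 4))
```

Under those assumptions the team-level t-stat of -1.16 implies an R^2 of roughly 0.046, in line with "less than 5%", and the team-season t-stat of -0.74 implies an adjusted R^2 slightly below zero, in line with the negative value reported.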
Having failed to find a relationship between salary and the $/WAR, we'll turn towards seeing what drives the number of WAR the team purchases in the market by plotting total payroll for players with 6 or more years experience against the WAR produced by those players:
Clearly, it's a very strong relationship - 80% of the variation in the number of WAR produced is explained by the difference in payrolls. The t-stat for the total salary is 11.35, which is highly, highly significant. This is consistent with the idea that $/WAR is linear, and therefore to predict total WAR purchased you would look at the money available to spend.
One final bonus chart, not directly related to the above, but a relationship I found interesting:
There is a very mildly negative relationship between the estimated amount of WAR produced in those three years by homegrown (controlled) players, and the total spent on non-control players. One would think, all else equal, that if you develop more players, you need to buy less production on the market. This chart suggests the relationship isn't very strong, or at least not as strong as I would have expected.
For the years 2009-11, there was no relationship between the amount of payroll that teams spent on players with 6 or more years of service time and the amount they spent on each WAR those players produced. On the other hand, there was a strong relationship between that payroll and the number of wins purchased. This would support the proposition that in the market for non-controlled players, where teams can flex their financial muscles, teams with more money improve their teams by buying more wins than teams with fewer resources, albeit at the same price, and not by paying more per win. As an example, despite the fact that the Yankees disproportionately (vs. other teams) accumulate elite players through free agency, in aggregate they paid very close to the league average per WAR from 2009-11.
One possible improvement to the methodology would be to substitute projected performance for actual performance, since projections are what teams have to rely on when bidding on players. In practice, however, this would be quite difficult and perhaps prohibitively so.