Defensive Aging Curves

I've been doing a lot of thinking and data crunching on defensive metrics recently (for reasons that will be clear in a week or two), which eventually led to thinking about how players age defensively. I ended up constructing some defensive aging curves for UZR and DRS, and thought they might be of interest to some BBBers, especially some of the slicing and dicing of the data.

One major thing to discuss at the outset is how to measure defensive progression. Both DRS and UZR measure players relative to the position they are playing. Obviously, a CF who is a -3 run defender is much better than a LF who is a -3 run defender, since playing CF requires more range and generally a better arm. Moreover, a player will often end up playing multiple positions in the same season, even if they are primarily at one position (e.g., a CF plays a few games at either corner position). One option would be to look at aging at each position individually, but that would reduce the sample size at each age, and additionally include a lot of small, partial seasons.

I felt the better approach was to roll up the data to get one data point for every player season, by using positional adjustments to account for the difference in difficulty at each position. Positional adjustments aren't perfect, especially at the level of individual players, but they work fine in big sets of data with large sample sizes. So if a player played 75 games in CF (+2.5 runs/150 games adjustment) with a -1 run rating, that's a net of 0.25 runs for CF (-1 + 1.25 position adjustment). If that player also played 25 games in each corner (both -7.5 runs/150 games adjustment) with a combined +2 run rating, that's a net of -0.5 runs for LF and RF (+2 - 2.5 position adjustment), and a total of -0.25 runs for the season. From here on, I will refer to this as the aggregate rating, either (UZR+Pos) or (DRS+Pos) depending on which metric is being used.
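The roll-up in that example can be written out as a short Python sketch. The function name and the stint layout are made up for illustration, and splitting the +2 corner rating evenly between LF and RF is an assumption (the total is the same however it is split, since both corners carry the same adjustment).

```python
# Positional adjustments in runs per 150 games, as quoted in the example above.
POS_ADJ_PER_150 = {"CF": 2.5, "LF": -7.5, "RF": -7.5}


def aggregate_rating(stints):
    """Sum (rating + prorated positional adjustment) over a player's stints.

    Each stint is (position, games, rating), where rating is the UZR or DRS
    run total accumulated at that position.
    """
    total = 0.0
    for position, games, rating in stints:
        total += rating + POS_ADJ_PER_150[position] * games / 150
    return total


# The player from the example: 75 games in CF at -1 run, 25 games in each
# corner at a combined +2 runs (assumed here to be +1 in each corner).
season = [("CF", 75, -1.0), ("LF", 25, 1.0), ("RF", 25, 1.0)]
print(aggregate_rating(season))
```

Running it on the example player gives the same -0.25 runs for the season worked out by hand.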

Fangraphs has UZR data starting in 2002, and DRS data starting in 2003, so I looked at the 10 seasons from 2003-12. In that time, there have been 6,249 non-pitcher seasons by 1,545 players (meaning the average player played in only about 4 of the 10 seasons). I used the age that Fangraphs had for each player in each season. I believe they use a June 30 cutoff (a player born June 30, 1980 would be 32 for the 2012 season, a player born July 1, 1980 would be 31 for the 2012 season). It's not perfect, but since large numbers of players are being combined, it shouldn't be an issue.
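For anyone wanting to reproduce the ages, here is a small Python sketch of that cutoff. It assumes the June 30 rule really is what Fangraphs uses, which I have not confirmed, and the function name is my own.

```python
from datetime import date


def season_age(birth_date, season):
    """Age as of June 30 of the given season (assumed cutoff)."""
    cutoff = date(season, 6, 30)
    # True if the player's birthday falls on or before the cutoff date.
    had_birthday = (birth_date.month, birth_date.day) <= (cutoff.month, cutoff.day)
    return season - birth_date.year - (0 if had_birthday else 1)


print(season_age(date(1980, 6, 30), 2012))  # the June 30 birthday from the text
print(season_age(date(1980, 7, 1), 2012))   # the July 1 birthday from the text
```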

My methodology was to sum up the innings played and aggregate defensive ratings for all players of a given age (Age X), and translate that to a rate statistic per 150 games (1,350 innings). This would be (UZR+Pos)/150 or (DRS+Pos)/150, similar to UZR/150 or DRS/150 but with the positional adjustment included. Then, for those same players, I look at what they did the following year (at Age X+1) in the same manner and take the difference in their performance.

For example, from 2003-12, players at age 19 played 2,555 IP, with a (UZR+Pos) of 9.5 and a (DRS+Pos) of 10.8, which translate to a (UZR+Pos)/150 of 5.0 and a (DRS+Pos)/150 of 5.7.

The following season, those same players totaled 2,085 IP in their age 20 seasons. Again, it is important to reinforce that these players are a subset of all players who played at age 20, which was a total of over 10,000 IP. We want to look at the progression of defensive performance, and so the groups have to stay constant. In their age 20 seasons, those players had a (UZR+Pos) of -1.6, and a (DRS+Pos) of 16.4. That works out to (UZR+Pos)/150 of -1, and (DRS+Pos)/150 of 10.6.
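The age 19 and age 20 figures above can be checked with a few lines of Python. This is just the arithmetic as described, not the actual spreadsheet I used; the dictionary keys are made-up labels.

```python
INNINGS_PER_150_GAMES = 1350  # 150 games x 9 innings


def per_150(runs, innings):
    """Convert a run total over some innings into a per-150-games rate."""
    return runs / innings * INNINGS_PER_150_GAMES


# Totals quoted in the text for the same group of players in both seasons.
age19 = {"innings": 2555, "uzr_pos": 9.5, "drs_pos": 10.8}
age20 = {"innings": 2085, "uzr_pos": -1.6, "drs_pos": 16.4}

for metric in ("uzr_pos", "drs_pos"):
    before = per_150(age19[metric], age19["innings"])
    after = per_150(age20[metric], age20["innings"])
    print(f"{metric}: {before:.1f} -> {after:.1f} (change {after - before:+.1f})")
```

The change printed for each metric is the year-to-year difference that feeds into the aging curve.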

So UZR is saying players got worse defensively from age 19 to age 20 by about 6 runs per 150 games, whereas DRS is saying they got better by about 5 runs per 150 games. The reality is that the innings totals are only the equivalent of about two full seasons, which is not big enough to draw any conclusions. Below is a chart showing the number of innings played at every age (Age X) and then for those same players in the following season (at Age X+1):

[Chart: innings played at each age (Age X) and by the same players in the following season (Age X+1)]

From the ages of 20 to 24, as a population, players get more playing time in the following season than they do in the initial season. Since we know that players tend to peak around the ages of 26 to 30, this is unsurprising since a higher skill level earns them more playing time. Starting in their age 25 seasons, they play less in the following seasons than in the initial season. Presumably, injuries overwhelm the effects of more playing time due to higher skill level and then in their late 20s and beyond skill level decline makes the decline in playing time even more precipitous.

Since defensive metrics take larger samples to stabilize, I established a minimum cutoff of 20 full seasons (27,000 IP) worth of data in the initial season in order to include an age in the analysis. This means that when looking at all players, I'm only looking at the changes starting with age 21-22 seasons, and ending with the age 38-39 seasons. Below is a chart showing the year to year changes:

[Chart: year to year changes by age, all players]

There is a fair bit of noise in the data (age 21 to 22 players improve according to the metrics, age 22 to 23 they get worse and then from 23 to 24 they again get worse), but the general trend is that players get better from age 21 to about 24, flat-line through about age 30, neither improving nor getting worse, and then decline at faster rates. Another way of showing this is to "bootstrap" these year to year changes together to create an expected aging curve, which is the cumulative sum of all these year to year changes:

[Chart: expected defensive aging curve, all players]

Personally, I find this chart easier to read than the above one, and so from here out I will be showing only these expected aging curves rather than the year to year changes (of course, the year to year changes can be inferred by looking at the difference of two years). I have assumed that the player is an overall average defensive player at age 21 for the purposes of constructing the curve, but changing the initial level would just shift the curve up or down.
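Chaining the year to year changes into a curve is just a running total. Here is a minimal Python sketch; the function name is my own and the three changes in the usage lines are made-up placeholder values, not numbers from the charts.

```python
from itertools import accumulate


def aging_curve(start_age, yearly_changes, start_level=0.0):
    """Cumulate year-to-year changes into an expected level at each age.

    yearly_changes[0] is the change from start_age to start_age + 1, and so
    on. start_level is the assumed rating at start_age (0 = average).
    """
    levels = accumulate(yearly_changes, initial=start_level)
    return {start_age + i: level for i, level in enumerate(levels)}


# Placeholder changes for ages 21->22, 22->23, 23->24 (illustrative only).
print(aging_curve(21, [1.0, -0.5, 0.5]))
```

Passing a different `start_level` adds the same constant at every age, which is why changing the initial level only shifts the curve up or down.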

On top of the actual curves, I have added a parabolic line of best fit, which fits the data quite well. By UZR, a player with an average aggregate rating at 21 increases by about 4 runs total through their late 20s before declining defensively, returning to average around their early/mid-30s and then entering a period of accelerated decline. By DRS, a player would only be expected to increase by 2 runs from age 21, peaking earlier around age 26 and then declining in a similar manner. UZR and DRS disagree slightly on the magnitude of improvement early in a player's career, but see a very similar aging pattern and are broadly quite similar.
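For reference, a parabolic line of best fit is an ordinary least-squares fit of a quadratic. The sketch below is one way to do it in plain Python and is not necessarily how the trend lines in the charts were produced. The curve it is run on is synthetic (an exact parabola built to peak at age 27), not the real UZR or DRS data.

```python
def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    # Power sums that make up the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix, unknowns ordered (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Synthetic curve: years since age 21 on the x-axis, exact parabola on the y-axis.
years = list(range(18))
levels = [-0.05 * x * x + 0.6 * x for x in years]
a, b, c = fit_parabola(years, levels)
peak_age = 21 - b / (2 * a)  # vertex of the fitted parabola, shifted back to age
print(round(a, 3), round(b, 3), round(peak_age, 1))
```

Measuring age as years since 21 rather than raw age keeps the sums small and the fit numerically well behaved.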

Slicing and Dicing the Data

The above charts and analysis consider all players who played between 2003 and 2012. As discussed earlier, the average player only played 4 seasons out of a possible 10. Some of that is players starting their career after 2003 or retiring before 2012, but most of that is due to there being a lot of players who didn't play a lot. The distribution of baseball talent is not normal (that is, not symmetric); it's very pyramidal. Of the 6,249 player seasons, in only about 25% did the player play enough to qualify for the batting title. In general, we're more interested in looking at regulars since they have the most impact for their franchise. Because of the way I accumulated the data, it's essentially weighted by playing time, which means regular players have more influence, but I thought it was worth restricting the sample to filter out more bit players and focus more exclusively on regular type players.

To accomplish this, I set a restriction that a player had to have played 500 innings (about 55 full games) in both the initial season and the following season. This will not be exclusively regulars, but it also avoids screening out too many players who miss some time due to injuries.

[Chart: expected defensive aging curve, minimum 500 innings in both seasons]

Broadly speaking, it's quite similar to the defensive aging curve for all players, with similar expected improvement and peak age. Interestingly, the decline is more precipitous, with players declining by a total of about 15 runs rather than 10 runs. I imagine this may be influenced by a selection bias whereby regular type players who are still playing in their late 30s are sticking around due to the strength of their bat, and are very limited defensively.
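For anyone wanting to replicate the 500 inning restriction, the pairing of each player's Age X season with his Age X+1 season can be sketched in Python as below. The field names and the three sample rows are hypothetical, and the sketch assumes one row per player per age.

```python
def paired_seasons(seasons, min_innings=500):
    """Return (age, season_x, season_x_plus_1) for players who reached
    min_innings in both the initial and the following season."""
    by_key = {(s["player"], s["age"]): s for s in seasons}
    pairs = []
    for (player, age), first in by_key.items():
        second = by_key.get((player, age + 1))
        if second is None:
            continue  # no following season, so no year-to-year change
        if first["innings"] >= min_innings and second["innings"] >= min_innings:
            pairs.append((age, first, second))
    return pairs


# Hypothetical rows, purely for illustration.
rows = [
    {"player": "A", "age": 25, "innings": 1200, "rating": 3.0},
    {"player": "A", "age": 26, "innings": 900, "rating": 1.0},
    {"player": "A", "age": 27, "innings": 300, "rating": -2.0},
]
print([(age, x["innings"], y["innings"]) for age, x, y in paired_seasons(rows)])
```

Only the age 25 to 26 pair survives here, since the age 27 season falls short of 500 innings and knocks out the 26 to 27 pair.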

Another way to slice the data to screen out non-regulars is to set a career minimum for playing time rather than setting yearly minimums. Below is the aging curve for all players with minimum 2,500 career IP (around two full seasons):


It is very similar, almost identical to the defensive aging curve for all players. However, 2,500 innings is not that high a bar, so below is the curve for players with 5,000 career innings:


The defensive aging curve for this set of players is quite different. By both metrics, there is no expected improvement initially. By DRS, players are expected to decline defensively immediately, though the decline is much slighter than in the above curves. By UZR on the other hand, a player basically holds his value for about 5 years, before beginning the decline. According to both metrics, the total decline is not as large: about 8 runs from peak to trough by UZR and about 11 runs by DRS.

It would be interesting to further raise the minimum career innings level and see how the curve changes, but unfortunately, the sample size falls and the number of years that can be analyzed reliably starts falling, so it's not really possible at this point. Overall, it seems players with longer careers tend to have more stable defensive aging curves. Whether that's a result of skill or due to more data reducing the noise in the evaluation, I'm not sure.

Do Players of Different Defensive Skills Age Differently?

So far, the only restrictions have been for different levels of playing time, with the implicit assumption that the aging curves are the same for all players. But it's possible that the aging effects are different - perhaps better players age more gracefully, and worse players more precipitously.

To test this, I created two groups of players: those who were in the top 15% of (UZR+Pos) or (DRS+Pos) from 2003-12 (min. 12.7 and 11.9 respectively), and those in the bottom 15% of either of those same two metrics (-15.2 and -17.2 respectively). This is a counting total as opposed to a rate metric, and so it discriminates against players who might have been great defensively but too poor offensively to accumulate much playing time. I'm okay with that, since I'm not really interested in replacement type players. The reason for choosing 15% is a tradeoff between selecting only the best and worst defensive players, and accumulating enough of a sample to construct a curve. As it is, at 15% there's only enough data to construct the curve to age 36 rather than age 38 as before.
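The grouping step can be sketched in Python as follows. This is a simplified illustration rather than my exact procedure: it ranks players on a single metric's total, and the player names and ratings are invented. To follow the "either metric" rule described above, one would run it once per metric and take the union of the resulting sets.

```python
def split_extremes(totals, share=0.15):
    """Return (top, bottom) sets of players by total aggregate rating.

    totals maps player name -> total (UZR+Pos) or (DRS+Pos) over the period.
    """
    ranked = sorted(totals, key=totals.get, reverse=True)
    k = max(1, round(len(ranked) * share))  # group size, at least one player
    return set(ranked[:k]), set(ranked[-k:])


# Invented totals for 20 hypothetical players, ranging from -10 to +9 runs.
totals = {f"player_{i}": float(i - 10) for i in range(20)}
top, bottom = split_extremes(totals)
print(sorted(top), sorted(bottom))
```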

[Chart: defensive aging curve for the top 15%]

[Chart: defensive aging curve for the bottom 15%]

Clearly, these aging curves are quite different. It's worth noting first that to some extent, this amounts to a positional screen, as SS, 2B, 3B and CF are generally better defenders and 1B, DH, RF and LF are generally poorer defenders. But it's possible for a really poor defensive SS (cough, Jeter, cough) to be in the bottom 15% and for a good defender at an easier position to be in the top 15%.

The players in the top 15% as a group improve quite significantly from age 21 through their defensive peaks around age 27, by about 6 to 8 runs. Players in the bottom 15%, on the other hand, improve very little through an earlier peak around age 25/26, but really it's better characterized as holding their value through that age. From peak to their late 30s, the top 15% decline about 20% less, losing around 7 runs compared to 9 runs for the bottom 15%. The net consequence is that the expectation for a top 15% defender is to be about 3 runs worse at age 37 than at age 21, whereas a bottom 15% defender would be about 8 runs worse at 37 than at 21.

What to make of this? On one hand, it might be evidence that better defenders put more effort into improving defensively. On the other hand, if anything, we'd expect poorer defenders to have more room to improve their defensive game. There could also be a selection bias whereby lesser defensive players are promoted and earn playing time based on their bats, whereas players with great defensive tools generally have lesser bats and get more time when they grow into their defensive skills. With that said, the better defensive players play more innings at all ages, which is not indicative of such a bias.

What this means for the Blue Jays

Overall, I think it's a mixed bag. It's really good news in terms of Lawrie. The metrics have both rated him well above average thus far, and while that has to be regressed a good deal in terms of future projection, there's good reason to expect some improvement in his defensive true talent in future years as he's both young and a good defender. For Jose Reyes, it's more mixed. At 30, he's in the decline phase, moving towards the rapid decline phase. However, he qualifies as a top 15% defender (due to position), so that would mitigate the expected decline somewhat. For Bautista, it's really not good news. As a 32 year old bottom 15% defensive player, he's very close to the age of rapid defensive decline and there's no reason to expect it to be gentle. His bat should keep him a productive player well into the future, but his overall value would be reduced by either playing the field very poorly, or being limited to 1B/DH.

Editor's Note: This is a FanPost written by a reader and member of Bluebird Banter. It was not commissioned by the editors and is not necessarily reflective of the opinions of Bluebird Banter or SB Nation.