Now that Spring Training is about to get going and pretty much everyone who does prospect lists has them out, I thought it would be interesting to see how various Blue Jay prospect lists compare with each other, in terms of overall similarities and differences. I performed a similar exercise with the overall Top 100 lists at Minor League Ball, and it's a little more difficult to do here because the lists are all of different lengths, but I thought I'd give it a crack. The idea is to compare each pair of lists, looking at how many players are common to both and at the degree of difference in player rankings between them. One note to make at the outset is that being similar or dissimilar to the others does not make a list right or wrong, or good or bad; it just means a list is more or less unique, which is something to keep in mind when thinking about those lists.
I included prospect lists from Baseball America (Nathan Rode), Batters Box, the BBB list just completed, Baseball Prospectus (Kevin Goldstein), Bullpen Banter, ESPN (Keith Law), Fangraphs (Marc Hulet), MLB.com (Jonathan Mayo) and John Sickels. For fun, and for comparison's sake, I also included the BBB Community list in the overall comparison.
Before getting to the macro comparison of lists, I thought I'd start by looking at the micro level and how each of these 10 lists ranked individual players differently. As the lists vary widely in the number of players included, I've only included players on a minimum of two lists, and determined the aggregate ranking by awarding points in descending order (50 for 1st, 49 for 2nd, etc. - the longest list went to 50) for each list a player was on.
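For anyone who wants to try the points system at home, here is a minimal sketch of how it could be coded up. The function name, the list names and the players are all made up for illustration; only the scoring rule (50 for 1st, 49 for 2nd, and so on, with a two-list minimum) comes from the description above.

```python
from collections import defaultdict

MAX_POINTS = 50  # the longest list went to 50, so 1st place is worth 50 points


def aggregate_points(lists, min_lists=2):
    """Return (player, total points, appearances) rows, highest points first.

    `lists` maps a list name to its players in ranked order (best first).
    Players on fewer than `min_lists` lists are left out.
    """
    points = defaultdict(int)
    appearances = defaultdict(int)
    for ranking in lists.values():
        for position, player in enumerate(ranking, start=1):
            points[player] += MAX_POINTS - position + 1  # 50, 49, 48, ...
            appearances[player] += 1
    eligible = [
        (player, points[player], appearances[player])
        for player in points
        if appearances[player] >= min_lists
    ]
    # Sort by points descending, breaking ties alphabetically.
    return sorted(eligible, key=lambda row: (-row[1], row[0]))


# Hypothetical three-player lists, not real rankings.
example_lists = {
    "List 1": ["Player A", "Player B", "Player C"],
    "List 2": ["Player B", "Player A", "Player D"],
    "List 3": ["Player A", "Player C", "Player B"],
}
aggregate = aggregate_points(example_lists)
```

In this toy example Player D appears on only one list and so drops out of the aggregate, which is the same filter applied to the real lists.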
No real big surprises - Gose and Marisnick 2A and 2B, a cluster of pitchers separated into two rough tiers, and then another drop off to the next position players. Looking at the standard deviation of the rankings is also interesting, to see how much of a consensus (or lack thereof) developed around players. Hopefully it's a handy reference.
Moving on to comparing the lists themselves, since they were of varying lengths and this can throw off the comparison, I only considered the top 15 rankings on each list. Two of the lists only went to ten (Baseball America and Keith Law), but I chose to leave them in, and the comparisons involving them just have to be taken with a grain of salt. The first point of comparison is to look at how many players are common to any pair of lists, and this is summarized in the table below. Note that the table is a mirror along the diagonal so each value is presented twice - I find it's easier to read this way, since you can go all the way along horizontally or vertically rather than having to switch midway. I highlighted the results involving the lists that went to 10 only, since they can't be directly compared.
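The commonality count is simple enough to sketch in a few lines. This is only an illustration under my own naming, with made-up players and a depth of 3 instead of 15 to keep the example short; both orderings of each pair are stored so the result mirrors along the diagonal like the table.

```python
from itertools import combinations


def common_counts(lists, depth=15):
    """Count players shared by the top `depth` of every pair of lists.

    Returns a dict keyed by (list_a, list_b) with both orderings present.
    """
    tops = {name: set(ranking[:depth]) for name, ranking in lists.items()}
    counts = {}
    for a, b in combinations(tops, 2):
        shared = len(tops[a] & tops[b])  # set intersection = players on both
        counts[(a, b)] = shared
        counts[(b, a)] = shared
    return counts


# Hypothetical lists, not real rankings.
example_lists = {
    "List 1": ["Player A", "Player B", "Player C", "Player D"],
    "List 2": ["Player B", "Player A", "Player D", "Player C"],
    "List 3": ["Player A", "Player E", "Player F", "Player B"],
}
counts = common_counts(example_lists, depth=3)
```

A list that only goes to ten can never share more than ten players with anyone, which is why those rows need the grain of salt.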
Among the lists that went to 15 or more, the Bluebird Banter list has the highest average number of players in common (closely followed by Marc Hulet's), and Kevin Goldstein's has the fewest. I found in the other exercise I did with Top 100s that the community list had the highest commonality, partly, I assume, because of the wisdom of crowds. Curiously, we don't see this with the BBB Community list, even though both were started before expert lists were largely published.
Of course, this isn't the only basis of comparison - it's also important how much variance in placement occurs on the individual lists. Below is a similar table showing the average difference in placement between the same prospects on pairs of lists. There's an issue when a player isn't on both lists - in this case, I just assumed the other list had that player at 25 (22 for the lists that only went to 10, to account for their shorter length; otherwise their average difference is biased and no comparison is possible). This isn't a perfect assumption, but it roughly accounts for the implicit lower ranking and is better than omitting players when they aren't on both (since this is where the largest variance in ranking occurs).
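Here is a rough sketch of that calculation. A few things in it are my own reading rather than anything spelled out above: the average is taken over every player in the top 15 of either list, a list counts as "short" if it has ten or fewer players, and the placeholder rank used is the one belonging to the list the player is missing from. The function names and players are hypothetical.

```python
def assumed_rank(ranking):
    """Placeholder rank for a player missing from `ranking`.

    22 for lists that only went to ten, 25 otherwise.
    """
    return 22 if len(ranking) <= 10 else 25


def average_difference(ranking_a, ranking_b, depth=15):
    """Average absolute gap in rank over players in the top `depth` of either list."""
    rank_a = {p: pos for pos, p in enumerate(ranking_a[:depth], start=1)}
    rank_b = {p: pos for pos, p in enumerate(ranking_b[:depth], start=1)}
    players = set(rank_a) | set(rank_b)  # union, so one-list players count too
    total = sum(
        abs(rank_a.get(p, assumed_rank(ranking_a)) - rank_b.get(p, assumed_rank(ranking_b)))
        for p in players
    )
    return total / len(players)


# Two made-up 15-player lists that agree except for the last spot.
long_a = [f"P{i}" for i in range(1, 16)]
long_b = long_a[:14] + ["Q"]
example_difference = average_difference(long_a, long_b)
```

In the toy example the only disagreements are the two players who each appear on one list at 15th, so each contributes a gap of 10 (15 versus the assumed 25) and everyone else contributes zero.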
Once again, we see a similar story - the Bluebird Banter list was far and away the most similar on average to the other lists, and Goldstein's list was by far the most dissimilar to the others. The other lists were pretty closely clustered in between. The two most similar lists were the Batters Box list and the BBB Community list. Interestingly, the Batters Box list was the only one out when the pitcher/positional lists were compiled, and there's definitely overlap in readership; in the future, I think it's worth trying to think about ways to avoid some bias in these community lists. The most dissimilar lists were Goldstein's list and the BBB Community list, though Batters Box registered almost equally strong dissimilarity with Goldstein's list and Sickels's list. Below is a summary of all lists, as well as their most similar and dissimilar list:
Finally, since the 15 limit is somewhat arbitrary and most lists went beyond that, I repeated the exercise with only the lists that went to 20, to see if anything different emerged. I show the results below, but basically the same trends were evident, with a high degree of overlap in the prospects on the Top 20 lists, and similar patterns of similarities and dissimilarities.