By Kevin Minkus (@kevinminkus)
It’s late April, which means the NFL Draft is here. Unless you’re an NFL fan, or a Union fan faced with Kafkaesque traffic closures because of the construction of a ridiculous 3,000-seat amphitheater on the steps of an art museum, that probably doesn’t matter much to you. A number of really fascinating articles were written over the last few days, though, analyzing NFL teams’ skill at drafting. To list just a couple: Reuben Fischer-Baum wrote on each team’s ability to appropriately assess the value of prospects given their pick numbers, and Michael Lopez analyzed the efficiency of the league as a whole in its evaluation of prospects. Methods from both articles carry over from football to soccer, where they can help us better evaluate the MLS draft.
One of the most interesting draft problems to solve, which can then form the baseline for further analysis, is “What is the expected value of a given draft pick?” Put another way, “How good should we expect a player to be based on his draft position?” The difficult part about soccer is how we measure ‘good’. This is much more straightforward in other sports: MLB has WAR, the NFL has Pro Football Reference’s AV, and the NBA has win shares. In soccer, though, no reasonable single-number metric of a player’s value currently exists (and one may never exist). A number of metrics have been used in public for draft analysis (here’s Howard Hamilton's, Ford Bohrmann's, and Tom Worville's), but any approach taken to this measure is going to have its tradeoffs.
I’m going to use a player’s total minutes in his first two years in the league as a proxy for career value. Importantly, I’m not saying that number is equal to a player’s value, only that it is very indicative of it. This stat has some obvious drawbacks, the big one being that some players don’t see much playing time in years one and two but go on to have perfectly successful MLS careers. Andre Blake, so far, is a good example. The opposite can also happen: Saad Abdul-Salaam had 36 starts in his first two years, but has just one in 2017.
Despite this, I think the stat is good as rough shorthand. With landscape-altering elements like allocation money, new collective bargaining agreements, and expansion, the state of the league and its roster structure are always in flux. Because of this, the importance of drafted players is always changing. Using the metric I’ve chosen, we can look at draft classes as recent as 2015 without having to wait for more data to accrue. This means that the value curve will be mostly responsive to any new trends. To balance this responsiveness with some stability, I’m going to use data back to 2007 in constructing the curve.
So here’s the plot of a player’s value (his minutes in his first two seasons) against his pick number. The curve itself gives the expected value at each pick, and is smoothed using LOESS, on all SuperDraft picks from 2007 to 2015 (supplemental drafts have been excluded):
The shape of the curve is pretty interesting. It suggests the value of a draft pick decreases linearly from pick 1 to maybe 24, which probably fits intuition. But it also suggests picks 24 to 46 are worth roughly the same. That could be because those picks are likely to need more than two years of development before being MLS-quality. If these picks are all about equal, though, it makes sense to trade down within that range in exchange for more picks or additional allocation money. At the end of the curve, from about pick 47 onward, each pick gets slightly less and less valuable.
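For readers who want to see what a LOESS curve like this involves, here is a minimal sketch in plain Python. It is not the modeling file behind the plot: the article doesn’t say which LOESS implementation or span was used, so the `span` default, the function names, and the toy numbers at the bottom are all assumptions for illustration. The idea is that, at each pick number, we fit a weighted straight line to the nearest points, with weights that fall off with distance (the tricube kernel), and read the expected value off that line.

```python
from typing import List, Sequence


def tricube(u: float) -> float:
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess(x: Sequence[float], y: Sequence[float],
          grid: Sequence[float], span: float = 0.75) -> List[float]:
    """Local linear LOESS (no robustness iterations), evaluated at each grid point.

    `span` is the fraction of points used in each local fit; 0.75 is an
    assumed default, not a value taken from the article.
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two (x, y) pairs of equal length")
    k = max(2, min(n, int(round(span * n))))  # points in each local window
    fitted = []
    for x0 in grid:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1]  # bandwidth: distance to the k-th nearest point
        if h == 0:
            h = 1.0  # every window point sits exactly at x0
        w = [tricube((xi - x0) / h) for xi in x]
        if sum(w) == 0:
            # All window points lie exactly on the bandwidth edge: weight equally.
            w = [1.0 if abs(xi - x0) <= h else 0.0 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        # With no spread in x inside the window, fall back to the weighted mean.
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Toy, made-up numbers purely to show the call; NOT the article's data.
toy_picks = [1, 2, 3, 5, 8, 13, 21, 34, 55, 70]
toy_minutes = [3200, 2900, 2500, 2100, 1500, 900, 400, 350, 300, 100]
toy_curve = loess(toy_picks, toy_minutes, grid=range(1, 71, 10), span=0.6)
```

A smaller `span` makes the curve follow the data more closely, while a larger one smooths it out, so features like the flat stretch in the middle of the curve can be somewhat sensitive to that choice.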
We can look at the rank correlation between pick number and actual value to see how good MLS clubs are, on the whole, at evaluating prospects. The average rank correlation for the drafts in this dataset is 0.48. Based on Michael Lopez’s article linked at the top, that’s a bit more efficient than MLB and NFL front offices, a bit worse than the NBA, and about on par with the NHL.
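As a rough illustration of that calculation, here is a sketch of a Spearman-style rank correlation computed separately for each draft and then averaged. Several things here are assumptions rather than details from the article: the function names, the use of Spearman’s coefficient specifically, giving tied values their average rank, and flipping the sign of the pick number so that a positive correlation means earlier picks played more minutes.

```python
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (va * vb) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def draft_rank_correlation(picks: Sequence[int], minutes: Sequence[float]) -> float:
    """Assumed sign convention: positive means earlier picks played more."""
    return spearman([-p for p in picks], minutes)


def mean_draft_correlation(
        drafts: Dict[int, Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average the per-draft correlations; keys are draft years."""
    per_year = [draft_rank_correlation(p, m) for p, m in drafts.values()]
    return sum(per_year) / len(per_year)
```

A value of 1 would mean every draft ordered its players perfectly by eventual minutes, and 0 would mean pick order told us nothing.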
Here’s the rank correlation over time:
There is a small positive trend in the data, suggesting clubs might be getting better at evaluating prospects, but nine drafts may be too few to say for sure.
For now, I’ve laid the groundwork from which to further evaluate draft decision making via the expected value curve. In part 2, I’ll look at which coaches and teams have over- and underperformed this curve with their draft picks. There’s a lot more to dive into here, though, so if you’re interested in playing around with the modeling file I’ve used, feel free! It’s on GitHub here.
The second part of this series publishes tomorrow!