Europe, Money, and the Problem with Disparity

By: Eliot McKinley, Sean Steffen, and Tiotal Football

American Soccer Analysis has been in the analytics game since 2013, and early on in this project we noticed something that has always troubled us about taking the seminal analytics studies and concepts developed in Europe and applying them to an MLS data set. To put it frankly, they don’t work as well.

No, seriously, they don’t work.

In the previous article, Eliot faithfully replicated Sander IJtsma’s classic soccer analytics study showing the predictive power of xG. Something that jumped off the page was the finding that traditionally predictive metrics, such as expected goals (xG) and total shots, are not nearly as predictive of future team performances and results in MLS as they are in Europe.

The prevailing theory behind what’s going on is that MLS just isn’t as good as the Big 5 European Leagues. The story goes that xG is a better predictor in Europe because Europe represents some notion of a ‘truer’ version of ‘football’: perhaps a more structured, solid version of the sport is played in the Old World, while in MLS, stuff just kind of breaks down more into chaotic scrambles. Additionally, you may hear about the effects of playoffs, weather, or travel, which we will maybe come back to another time.

As close followers (and fans) of the league, we admit that the level of skill on display in MLS is lower than what you’d find in the Big Five. However, using this as an explanation for why “xG doesn’t work as well” has never felt right to us.

Is it talent?

One way to test the talent hypothesis is to sample the predictiveness of xG ratio across leagues of varying levels. When you do this, one problem with the talent theory comes to light.

If talent is truly the main driver of predictiveness, why is xG ratio predicting future points per game better in the lower leagues of American soccer where the talent level is surely below that of MLS? 
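For readers who want to try this on their own data, here is a minimal sketch of one way to score "predictiveness": split each team's season at some matchday, compute xG ratio over the games already played, and correlate it with points per game over the games still to come. The function names, the dictionary keys (`xg_for`, `xg_against`, `points`), and the choice of squared Pearson correlation are our assumptions for illustration, not a description of the exact code behind the charts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def xg_ratio(matches):
    """Share of total xG that belonged to the team: xG for / (xG for + xG against)."""
    xg_for = sum(m["xg_for"] for m in matches)
    xg_against = sum(m["xg_against"] for m in matches)
    return xg_for / (xg_for + xg_against)


def points_per_game(matches):
    """Average points earned per match (3 for a win, 1 for a draw, 0 for a loss)."""
    return sum(m["points"] for m in matches) / len(matches)


def predictiveness(team_matches, n_played):
    """R^2 between xG ratio after n_played games and PPG over the remaining games.

    team_matches is a hypothetical structure: {team name: [match dicts in date order]}.
    """
    past, future = [], []
    for matches in team_matches.values():
        past.append(xg_ratio(matches[:n_played]))
        future.append(points_per_game(matches[n_played:]))
    return pearson_r(past, future) ** 2
```

Running `predictiveness` for every value of `n_played` in a season and plotting the results gives a curve in the spirit of the one discussed here, one curve per league.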

As a sidenote, we’ve plotted the predictive power of both our own ASA xG model and that of the Statsbomb figures you see on FBref, and they’re similar: basically on top of each other. It’s not that anyone’s model is broken in some very specific MLS kind of way. 

Is it finishing?

Another aspect of the talent argument can be boiled down to the controversial notion of “finishing.” The theory goes, European players are simply better at it, and this is a major reason for the predictive split. 

To us, this theory is equally unsatisfying, as it blatantly ignores the fact that there is no significant divergence between leagues in xG’s ability to predict goals. No matter the league or model, xG does a fine job of capturing how many goals will be scored. If finishing talent were the main driver, this would not be the case.

So, if it’s not talent or “finishing”, what is it?

Let’s talk about markets

While there are plenty of differences between MLS and Europe, we’d like to focus on one that we’ve found to be extremely important to the question at hand: inequality.

A common joke in the States is that European markets are regulated and their footballing transfer markets aren’t, while the opposite is true in America. American politicians are keen to let the “invisible hand” handle all the market-clearing, except in sports, where carefully constructed financial regulations are oh-so-critical to protect the market values of the jointly owned and profit-shared franchises and their owner-operators.

MLS is no different: it mostly caps or restricts wage and transfer fee spending for all teams, such that the disparity in spending between the “richest” and “poorest” teams is one or two orders of magnitude smaller than you’ll see in any of the Big Five European leagues. Salary information published by the MLSPA suggests the top team in MLS spends roughly 60% more than the median team on wages and twice as much as the lowest spenders. In Europe, while these figures are less transparent, we estimate that the biggest spenders spend, let’s say, 4X as much as the median team and 20X as much as the most frugal teams.

[Editor’s Note: Sean and Tiotal Football wrote like 4,000 words on MLS capology here. We have deleted their ramblings and revoked their logins. We care about you, readers]

Might this inequality have more to say than actual quality? Earlier, we connected the talent notion to the supposition that Europe represents a truer version of the sport, which implies that MLS is the outlier. What if we have this backwards? What if the inequality of Europe is the confounding factor distorting things in the other direction?

What if the history of all hitherto existing soccer is the history of class struggle, and the real spectre haunting Europe is disparity so rampant that leagues are entirely too predictable, resulting in an inflated sense of how predictive these metrics truly are?

Gini in a bottle

To test this, we need a way to measure disparity. Luckily, that tool has existed in economics since 1912: the Gini coefficient.

The Gini Coefficient is a measure of concentration (or dispersion) amongst a population of measured values. Its primary use in economics is to measure income or wealth inequality between say, nations or social groups. Here, we’re measuring the concentration/dispersion of points earned amongst soccer teams in a final league table instead (but I mean, it’s fun cuz that’s often measuring the financial concentration/distribution amongst clubs in a league anyway). 
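If you want to compute this yourself, here is a minimal sketch of a Gini coefficient over a list of final point totals. It uses the standard mean-absolute-difference definition (the average gap between every pair of teams, divided by twice the mean); the function name and the decision to return 0 for an all-zero table are our own choices for illustration.

```python
def gini(points):
    """Gini coefficient of a list of non-negative values (e.g. final league points).

    0 means every team finished on the same points; values closer to 1 mean
    the points are concentrated in fewer teams.
    """
    n = len(points)
    total = sum(points)
    if n == 0 or total == 0:
        return 0.0  # assumption: treat an empty or all-zero table as perfectly equal
    mean = total / n
    # Sum of absolute gaps over every ordered pair of teams.
    abs_gaps = sum(abs(a - b) for a in points for b in points)
    return abs_gaps / (2 * n * n * mean)
```

Feeding it each league's final table, season by season, gives one number per league-season that can be lined up against that league's predictiveness.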

So, what happens if we compare the Gini coefficient of a given league against how well our metrics predict future team performance within that league?

As you can see, the fit is quite impressive, suggesting that inequality does, indeed, boost the power of observed expected goal figures, total shot ratio, goal ratio, etc, at predicting future team performance. Also note that the NWSL, USL Championship, and USL League One have similar travel issues and playoffs to MLS, but inequality still tracks with predictability.

The role of money in class structure

A way to think about it is to imagine these leagues as having stratified social classes. At the top, you have your blue bloods like Barca, Juventus, Bayern, etc., who can be reasonably predicted to win almost every domestic game they play. At the bottom, you have a rotating cast of relegation fodder who can be easily predicted to lose most of their games. In between, of course, are varying degrees of “middle class”.

The classes, in this example, can be approximated by separating teams into tiers by their final table positions in each season. The top tier is the top 6, bottom is the lowest 6, and the middle is everything else. We can then look at games played by opponents in the same tiers or across tiers. 
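As a rough sketch of that split, the tiering and the same-tier versus cross-tier label could be written like this. The function names, the `band` parameter, and the guard for leagues with fewer than 12 teams are hypothetical additions of ours; only the "top 6, bottom 6, everything else" rule comes from the description above.

```python
def tier_of(position, n_teams, band=6):
    """Tier for a final table position (1 = champion), using a top-6 / bottom-6 split."""
    if n_teams < 2 * band:
        # Assumption: with fewer than 12 teams the top and bottom bands would overlap.
        raise ValueError("league too small for non-overlapping tiers")
    if position <= band:
        return "top"
    if position > n_teams - band:
        return "bottom"
    return "middle"


def matchup_type(pos_a, pos_b, n_teams, band=6):
    """Label a game by whether the two opponents finished in the same tier."""
    same = tier_of(pos_a, n_teams, band) == tier_of(pos_b, n_teams, band)
    return "same tier" if same else "cross tier"
```

With every game labeled this way, the predictiveness of each metric can be recomputed separately on the same-tier games and the cross-tier games.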

What’s most striking from this is that these predictive metrics in Europe start to struggle when you just look at games between opponents of similar quality (even when “the best football” is played between elite teams). In fact, they’re even less predictive than they are in MLS. Furthermore, and perhaps you saw it coming at this point, predictiveness increases when we restrict the population of European soccer to only games between unequal opponents (opponents of different “tiers”), compared to all games together.

Much of this may be explained by the fact that when top teams play bottom teams in Europe, they absolutely mop the floor with them, with an average goal differential of 1.45. In MLS, the goal difference drops to 1.01 when the top teams play the bottom. Top European teams win against bottom teams about 72% of the time compared to MLS’ 66%, and these games are also more likely to be lopsided, with 45% being wins by 2 or more goals compared to only 35% in MLS.
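For anyone who wants to reproduce these summary numbers on their own match data, a minimal sketch follows. It assumes each top-versus-bottom game is stored as a `(top_goals, bottom_goals)` pair and that the "wins by 2 or more" share is taken over all such games rather than only over wins; both of those are our assumptions, as is the function name.

```python
def top_vs_bottom_summary(games):
    """Summarize top-tier vs bottom-tier games from the top team's point of view.

    games is a list of (top_goals, bottom_goals) tuples.
    """
    if not games:
        raise ValueError("need at least one game")
    n = len(games)
    margins = [top - bottom for top, bottom in games]
    return {
        # Average goal difference in favor of the top-tier team.
        "avg_goal_diff": sum(margins) / n,
        # Share of games the top-tier team won.
        "win_rate": sum(1 for m in margins if m > 0) / n,
        # Share of all games won by two or more goals (assumed denominator: all games).
        "big_win_rate": sum(1 for m in margins if m >= 2) / n,
    }
```

Computing this once per league makes the Europe versus MLS comparison a side-by-side read of three numbers.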

Folks. We ask again. What if Europe is the outlier?

Conclusions and Questions 

Many times throughout the research and draft process for this article, we found ourselves asking “does any of this count as a discovery?” And we asked ourselves this question because, quite frankly, it all seems rather obvious. You don’t have to be an economics nerd to see this. Just look at Europe, where Juventus recently won 9 titles in a row, and Bayern Munich 10! Does anyone not already know that Europe is easier to predict?

Despite this obvious fact, little has actually been done in terms of exploring the impact that these leagues have on the metrics themselves. This is alarming, considering how much of the current analytics canon is written using data from these leagues where xG predictiveness is juicing in the locker room.

So this conclusion is not a ground-breaking one, but hopefully, it is a defamiliarizing one. The results of the replication work in the last article and the exploration of xG’s performance in MLS relative to other leagues in this article do not alter our understanding of how well an xG model predicts future results relative to the other available basic metrics (although Eliot did find some discrepancies there). Instead, they should be a reminder that the absolute predictive power of xG (or any metric) is still contingent upon the structure and make-up of the games themselves and the teams that contest those games. As this article highlights in particular, the concentration of strength/ability across the teams who are competing is an important and often overlooked factor.

In this way, we should be slightly more resistant to flat statements about how powerful statistical methods are at generating objective insights about the sport. More specifically, we wonder if we should be more resistant to thinking about xG’s predictive power (as imagined in the classic correlation over time chart popularized by 11tegen11) as something that is “natural” to the sport, as our mainstream framing for this “nature” is based on the structure and make-up of top European leagues and everything that comes with them (inequality).

If this article shows anything, it’s that the choices a league makes about what the competition will look like at the macro level (i.e. regulation) will impact the degree to which any data-driven insights are predictive of future performance.

While we have joked throughout that European soccer could be the outlier, the truth is, there is not one true form of soccer and then some other kind. Team sports are contests between… different teams. Rampant inequality amongst these competitors is just one of many models for how a league might operate (as is a near-perfect parity structure), and soccer analytics might need to confront this more directly.

While we have gone on long enough, we do feel that these findings raise some questions going forward that we may explore in the future:

  • When we notice the impact of competitive parity on the absolute predictive power of statistical metrics in soccer (sport?), should we (or how might we) change the way we use data-driven insights as the competitive parity increases or decreases along the scale? What does it look like practically to take these observations to heart?

  • If disparity distorts the predictive power of xG, what else might such a concept touch? Beyond stats, might a manager in MLS looking to model his tactics off of Chelsea or Man City be equally prone to mistaking a correlation between tactics and winning for causation, when the cause might just as easily be disparity?

  • How should teams take into account differences in parity when recruiting players from another league? We’ve heard of more linear concepts like the “Bundesliga Tax,” but how do we adjust for differences in the predictive power of a given team’s or player’s past performance due to the relative competitive differences within their league?

  • What are some of the advantages and disadvantages of designing a league with rampant inequality vs a league with planned parity? What can these leagues learn from each other?

We would like to thank the many members of a months-long Slack discussion, specifically Jamon Moore and Mike Imburgio, who provided valuable insights that shaped the direction of this research and the research to come.