By Jared Young (@jaredEyoung)
The topic of luck is never too far away from soccer analysis. A team peppers its opponent with 20 shots in a game but scores just one goal, while the opponent manages two goals from just six shots. We’d be prone to call the losing team unlucky because it appears they dominated the game, but the reality is the losing team was just on the wrong side of a probability distribution curve for goals scored. Call it unlucky, or just call it an unlikely but possible outcome.
Consider the binomial distribution estimate of goals scored for the Philadelphia Union and LA Galaxy this year. The binomial distribution is created using the teams' actual finishing rates and shots attempted per game.
Here we see that the LA Galaxy are more likely than the Union to score multiple goals in a game, which is not a surprise. But what might be a surprise is that the Galaxy, for all their firepower, still have a 17% chance of scoring zero goals. The Union, meanwhile, have a 38% chance of scoring exactly one goal. Multiply those two probabilities together and the Union have a 6% chance of winning 1-0. Add the odds of Philadelphia winning 2-1 and 3-2 and all of a sudden the Union have a 15% chance of winning the game via one of those three scorelines, which is a lot higher than most people would think going into the game.*
*While this approach assumes goal scoring between the two teams is independent, it turns out that assumption produces results not too far away from the correlated Poisson model we currently use for team Power Rankings.
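For readers who want to try the arithmetic themselves, here is a minimal sketch of the binomial calculation described above. The shot counts and finishing rates are made-up placeholders, not the teams' actual numbers, and the function names are my own for illustration. It makes the same independence assumption noted in the footnote.

```python
from math import comb


def goal_distribution(shots, finish_rate):
    """Binomial probability of scoring k goals, for k = 0..shots."""
    return [
        comb(shots, k) * finish_rate**k * (1 - finish_rate) ** (shots - k)
        for k in range(shots + 1)
    ]


def one_goal_win_probability(team, opponent, max_goals=3):
    """Chance the team wins 1-0, 2-1, ... up to max_goals-(max_goals - 1).

    Assumes the two teams' goal counts are independent.
    """
    top = min(max_goals, len(team) - 1, len(opponent))
    return sum(team[g] * opponent[g - 1] for g in range(1, top + 1))


# Hypothetical inputs: shots per game and finishing rate for each team.
underdog = goal_distribution(shots=12, finish_rate=0.09)
favorite = goal_distribution(shots=14, finish_rate=0.12)

p_1_0 = underdog[1] * favorite[0]  # underdog scores one, favorite scores none
p_close_win = one_goal_win_probability(underdog, favorite, max_goals=3)
print(f"1-0 win: {p_1_0:.1%}, any of 1-0, 2-1, 3-2: {p_close_win:.1%}")
```

Swapping in a team's real shots per game and finishing rate gives that team's distribution.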
These swings in fortune are also hard to keep track of over time. Do the probabilities even out over a season? Or does some team keep beating the odds while another just can’t catch a break? Now that we're in the stretch run, we can start to look at teams that have beaten the odds consistently.
One way to look at this is to quantify what should have happened in a game. Expected goals models are useful for this, and we now provide them on a game-by-game basis. It’s a typical analysis to look at the expected goals of a game and determine whether each team deserved its result. In the opening example above, the expected goals outcome might be 2.2 goals for the team that took twenty shots and 0.7 goals for the team that took six. Analysts might say that the team with 2.2 expected goals deserved to win and was unlucky.
So we can recalculate results using the expected goals from each game this year. To deal with the issue of draws, if two teams are within 0.5 expected goals of each other I’ve assumed the game should have been a draw. This number can be adjusted, but as you’ll see, it doesn't make a big difference. This produces slightly more draws than actually occurred, but the gap is small, and treating a narrow expected goals margin as a draw is a reasonable assumption. Once all of the calculations have been done, we’ll look at the difference between the expected goals outcomes and the real results. The difference could be considered luck.
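The recalculation described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact code behind the results: the function names are mine, I've treated a margin of exactly 0.5 as a draw, and I've scored results with the usual three points for a win and one for a draw.

```python
def expected_points(xg_for, xg_against, draw_margin=0.5):
    """Points each side 'deserved' based on expected goals.

    Assumption: a margin of draw_margin or less counts as a draw.
    Returns (points_for, points_against).
    """
    diff = xg_for - xg_against
    if abs(diff) <= draw_margin:
        return 1, 1
    return (3, 0) if diff > 0 else (0, 3)


def actual_points(goals_for, goals_against):
    """Points each side actually earned from the final score."""
    if goals_for == goals_against:
        return 1, 1
    return (3, 0) if goals_for > goals_against else (0, 3)


def luck(xg_for, xg_against, goals_for, goals_against, draw_margin=0.5):
    """Actual points minus deserved points; negative means unlucky."""
    earned = actual_points(goals_for, goals_against)[0]
    deserved = expected_points(xg_for, xg_against, draw_margin)[0]
    return earned - deserved


# The opening example: 2.2 expected goals vs 0.7, but a 1-2 loss.
print(luck(2.2, 0.7, goals_for=1, goals_against=2))  # -3
```

Averaging this number over a team's games gives a points-per-game gap between real and expected results.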
But first, there is one glaring problem with this method: expected goals models assume all teams are average. Given that MLS is most of the way through the season, we have the advantage of knowing that some teams are better than others. To address this, I’ve adjusted all of the expected goals results based on each team’s season-long finishing rate on offense and the finishing rate it allows on defense. So if the LA Galaxy finish 40% better than their expected goals suggest, then I would adjust each of their expected goals results up by 40%. If they played the Union, and the Union allowed a finishing rate 40% higher than their expected goals against would indicate, then I would adjust LA’s expected goals up another 40% to account for the Union’s porous defense. Adding this adjustment basically assumes that all long-term deviation from expected goals is skill, and that the remaining difference between the estimated wins and the actual wins is “luck”. Of course, these differences might not be a simple matter of "luck." Key injuries or international windows could alter the makeup of a team over short periods of time, which might account for some of the gap.
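As a rough sketch of that adjustment: the text doesn't say whether the two 40% bumps stack multiplicatively or additively, so the code below assumes they multiply, and it assumes each factor is simply season goals divided by season expected goals. The function names and sample numbers are hypothetical.

```python
def finishing_factor(goals, expected_goals):
    """Ratio of goals to expected goals over the season.

    1.4 means finishing 40% better than expected goals suggest
    (for an attack), or allowing 40% more (for a defense).
    """
    return goals / expected_goals


def adjusted_xg(raw_xg, attack_factor, opponent_defense_factor):
    """Scale a single game's expected goals by both teams' factors.

    Assumption: the two adjustments are applied multiplicatively.
    """
    return raw_xg * attack_factor * opponent_defense_factor


# Hypothetical season totals: 14 goals from 10.0 expected goals.
attack = finishing_factor(goals=14, expected_goals=10.0)
# Hypothetical opponent defense: 21 goals allowed from 15.0 expected.
defense = finishing_factor(goals=21, expected_goals=15.0)

print(round(adjusted_xg(1.0, attack, defense), 2))  # 1.96
```

The adjusted figures for both teams would then replace the raw expected goals when deciding who deserved the win or draw.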
Here are the results:
So NYCFC has actually been the unluckiest team this season; the chart suggests they should have earned another 0.3 points per game. Meanwhile, Portland has been far and away the most fortunate. These points-per-game gaps are meaningful, too. By these figures NYCFC should have 36 points and be in the thick of the playoff race, while Portland should be at the bottom of the West.
It should come as no surprise that a collection of teams atop their respective conferences have all been a little lucky this year, but check out the LA Galaxy. Not only have they been good, and especially dominant of late, but they’ve been a bit unlucky this season as well. Watch out for teams like Sporting KC and Vancouver in the playoffs too, if they make it. They’ve got a few good bounces coming their way.