By Eliot McKinley (@etmckinley)
Expected goals (xG), love ‘em or hate ‘em, are increasingly being accepted across the soccer world, with some notable, misguided exceptions. While there are multiple xG models in the soccer analytics world, the concept basically boils down to quantifying the likelihood of a shot being scored based upon where and how the shot was taken. xG quantifies what you may understand intuitively: a shot taken close to goal is more likely to be scored than a shot taken 30 yards away. There are many ways to misinterpret expected goals. One of the most common is to think that xG tells you exactly how many goals a team will score in a game. Obviously, this cannot be the case, as the sum of xG values of shots in a game is rarely a round number. A team cannot score 1.62 goals in a game, but it can score 1 or 2. The sum of xG is better read as the average number of goals a team would score from those shots if the game were replayed many times. But since goals come in discrete units of 1, and no more than 1 goal can be scored per shot, calculating the probability of each possible number of goals gets a bit complex. The number and quality of the shots that go into a team’s overall xG for a game matter, not just the sum of xG.
A team winning the xG battle in a single game does not mean that the team will win that game. A recent example of this is the Chicago Fire vs. Sporting Kansas City game on March 10, 2018. In this game, Chicago doubled up Kansas City’s total xG and had shots of higher quality on average, yet still lost the game 4-3. When each team’s shots in the game were simulated 100,000 times, Chicago scored 3 goals 31% of the time, whereas Kansas City scored 4 goals only 4.5% of the time. Looking at the probability distributions for each team, Chicago scored their most likely number of goals. However, Kansas City was most likely to score only 1 goal (34%), and the 4 goals they scored were less likely than scoring 0, 2, or 3 goals. Overall, based on the shots taken, Kansas City would be expected to win this game only 10% of the time. These probabilities are all based on the number and quality of the shots, and not all xG are created equal.
|Goals||Shots||On Target||Total xG||xG/Shot|
To further illustrate this point, let’s look at a hypothetical game where the home and away teams each take shots that add up to 1.0 xG. Let’s assume that the home team takes 3 shots, with xG values of 0.8, 0.1, and 0.1, while the away team takes 10 shots, each with an xG value of 0.1. The hypothetical home side got one huge chance (0.8 xG shots are similar to those taken by Diego Fagundez or Will Bruin) and a couple of decent chances worth 0.1 xG, like those converted by Alberth Elis or Alvas Powell. The hypothetical away side, meanwhile, had 10 of those decent chances. How do these shot profiles affect the number and likelihood of goals scored for each team and the probable game outcome based on these shots? A number of factors go into this calculation. First, the only outcomes for a shot are goal or no goal. Again, you cannot score a fraction of a goal. So when simulating games, xG is not simply added up; each shot is treated as its own event, independent of the others. Second, the number of shots matters. The maximum number of goals a team can score (leaving aside own goals) is the number of shots the team takes. Third, the quality of shots matters. High likelihood shots, by definition, are more likely to be scored, and will have a greater impact on the team’s goal probability distribution than shots with low xG values. Taken together, these determine the probabilities for the number of goals scored and, subsequently, game outcomes.
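The three factors above can be sketched in a few lines of Python. This is a minimal illustration, not the method used for the numbers in this article: it assumes every shot is an independent goal/no-goal event with probability equal to its xG, and it computes the exact goal distribution by folding in one shot at a time instead of simulating. The function name `goal_distribution` is made up for this example.

```python
def goal_distribution(shot_xg):
    """Return a list p where p[k] is the probability of scoring exactly k goals.

    Assumes each shot is independent and scores with probability equal to its xG.
    """
    dist = [1.0]  # before any shots are taken: 0 goals with certainty
    for xg in shot_xg:
        new = [0.0] * (len(dist) + 1)  # one more shot allows one more goal
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1.0 - xg)  # this shot is missed
            new[goals + 1] += prob * xg      # this shot is scored
        dist = new
    return dist


# Shot profiles from the hypothetical game: both sum to 1.0 xG.
home = goal_distribution([0.8, 0.1, 0.1])
away = goal_distribution([0.1] * 10)

for goals in range(4):
    print(goals, round(home[goals], 3), round(away[goals], 3))
```

Note that the returned list has one more entry than there are shots, which is the "you cannot score more goals than you take shots" rule in code form. Both profiles sum to 1.0 xG, yet the two distributions come out quite different.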
|Home Probability||Away Probability|
Going back to our hypothetical game, the most likely number of goals scored for each team is 1. However, the probabilities of scoring a given number of goals are very different for each team. By virtue of having one very high quality shot, the home team has only a 16% chance of not scoring, and 68%, 15%, and <1% chances of scoring 1, 2, or 3 goals, respectively. It is not possible for them to score more than 3 goals because they only took 3 shots. The away side, on the other hand, has a 35% chance of not scoring, and 39%, 19%, 6%, and 1% chances of scoring 1, 2, 3, or >3 goals, respectively. In theory, the away side could score as many as 10 goals on their 10 shots; however, this event is extremely unlikely. When the game is simulated 100,000 times using each team’s shots, this translates into the home team having a 35% chance of winning (i.e. scoring more than the away team), a 35% chance of drawing, and a 30% chance of losing. The one high xG value shot shifts the likelihood of winning the game towards the home side. In comparison, if both teams took the same number and quality of shots as the home side, each team would have roughly a 25% chance of winning, with about a 50% chance of a draw. Further into hypotheticals, imagine a game where the home team takes one shot with an xG value of 1.0 (which doesn’t exist, but Kei Kamara came the closest ever and still missed) and the away side takes 100 shots with xG values of 0.01. In this case, the home team’s win expectancy rises to 37% compared to 26% for the away side. Thus, while the headline sum of xG for each team in a game gives you a good heuristic about which team is more likely to win, the full picture is a bit more nuanced than that. Enter your own xG values in the box below to see how they affect goal probability and game outcomes.
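For readers who want to try this offline, here is a rough sketch of the kind of simulation described above. It is an assumption about how such a simulation could be written, not the code behind the interactive tool: each shot is treated as an independent coin flip weighted by its xG, and the function names (`simulate_goals`, `simulate_match`), the fixed seed, and the use of Python's standard `random` module are all choices made for this example.

```python
import random


def simulate_goals(shot_xg, rng):
    """Simulate one game's worth of shots and return the goals scored."""
    # Each shot scores if a uniform draw in [0, 1) falls below its xG.
    return sum(1 for xg in shot_xg if rng.random() < xg)


def simulate_match(home_xg, away_xg, n_sims=100_000, seed=0):
    """Return (home win, draw, away win) frequencies over n_sims simulated games."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    home_wins = draws = away_wins = 0
    for _ in range(n_sims):
        home_goals = simulate_goals(home_xg, rng)
        away_goals = simulate_goals(away_xg, rng)
        if home_goals > away_goals:
            home_wins += 1
        elif home_goals == away_goals:
            draws += 1
        else:
            away_wins += 1
    return home_wins / n_sims, draws / n_sims, away_wins / n_sims


# The hypothetical game: 3 shots (0.8, 0.1, 0.1) vs. 10 shots of 0.1 each.
win, draw, loss = simulate_match([0.8, 0.1, 0.1], [0.1] * 10)
print(f"home win {win:.1%}, draw {draw:.1%}, away win {loss:.1%}")
```

Because the results come from random draws, the percentages will wobble by a few tenths of a point from run to run if the seed is changed. Swapping in other shot lists, such as `[1.0]` against `[0.01] * 100`, is a quick way to explore the other scenarios discussed here.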