Where Goals Come From

By Jamon Moore and Carlon Carpenter

This is the first in a series of articles and videos from the Where Goals Come From project by Jamon Moore and Carlon Carpenter.

Regardless of philosophy, style, principles, and game model, a club’s on-field strategy should focus on maximizing goals scored and minimizing goals conceded. Goals win points, points are how clubs reach their objectives, and clubs that reach their objectives win fans and make money. The “Where Goals Come From” project is about how to create goals (or prevent them). By breaking soccer down to basics, we aim to provide clubs a clear roadmap to success.

Key points:

  • A club needs a common language to talk about the strategy and principles the team will use to score and prevent goals. That’s what the “where goals come from” framework aims to provide.

  • The framework can also provide context to expected goals (xG) and other types of probability models in soccer.

  • A detailed understanding of how a club plans to maximize its goal differential can help people in a variety of roles understand how their decisions fit into the strategy.

The Purpose of the “Where Goals Come From” Project

Soccer analytics has been undergoing a bit of a revolution in recent years, from widespread adoption of expected goals to possession value models like Goals Added (g+) that assign a goal value to every on-ball action. But the analytics community has been less successful at explaining to sporting directors and coaches why its metrics matter. Ian Darke summed up the communication gap:

This project seeks to help bridge the communication gap by building from the ground up, starting with a framework to describe types of goals. We believe the concepts we’ll cover help describe the game at a level of detail that has practical applications, a sort of layer between “goals” and “expected goals.” We’ll unpack questions about advanced metrics later, but for now we want to focus on what the game is really all about: goals.

While a club cannot control how many points other teams get in the table, it has direct control over how many goals it scores and allows. Whether the objective is to win the league, compete in the Champions League, or just make the playoffs, that annual target can be translated, via the historical distribution of goal differential on the table, into actionable on-field terms: score at least X goals and concede no more than Y goals. Understanding where goals come from allows a team to plan for goal differential based on team history, coach’s history, player history, and the upside potential of acquisitions.

This video provides more details for those who are interested in a deeper look at the purpose of this project and why goal differential is the key performance indicator (KPI) that should drive the decisions on the sporting side of a football club.

The Five Goal Categories

Goals are created in a common set of ways. In our research, we have identified five categories of shots and goals, mostly depending on the pass or dead ball situation that set up the shot. These five categories are:

  • Progressive Passes - Shots from a completed open-play pass that moves at least 25% closer to the goal from its origin.

  • Basic Passes - Shots from a completed open-play pass that moves less than 25% closer to the goal from its origin.

  • Set Piece Passes - Shots resulting from a dead ball pass or another pass within a few seconds of an initial dead ball pass (as determined by the data provider).

  • Individual Play - Open-play shots that do not come from a pass, but rather from shooter creativity, a defensive breakdown, or other means.

  • Set Piece Kicks - Shots from direct free kicks and penalty kicks.

The Diego Valeri of La Liga is also good at free kicks.
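The five definitions above can be read as a simple decision order. The following is a minimal sketch of that reading, not the authors' implementation: the `Shot` fields and the `goal_category` function are hypothetical names, and it assumes the data provider already flags dead ball sequences and that the share of remaining distance covered by the key pass has been computed upstream.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    # All field names are hypothetical; map them from your provider's event data.
    from_pass: bool            # shot was set up by a completed pass
    dead_ball_sequence: bool   # that pass was a dead ball pass or closely followed one
    direct_kick: bool          # shot taken directly from a free kick or penalty kick
    progress_fraction: float = 0.0  # share of remaining distance to goal covered by the key pass


def goal_category(shot: Shot, threshold: float = 0.25) -> str:
    """Assign a shot to one of the five categories described above."""
    if shot.direct_kick:
        return "Set Piece Kick"
    if not shot.from_pass:
        return "Individual Play"
    if shot.dead_ball_sequence:
        return "Set Piece Pass"
    # Open-play pass: split on how much closer to goal the pass moved the ball.
    if shot.progress_fraction >= threshold:
        return "Progressive Pass"
    return "Basic Pass"
```

Checking the dead ball cases first keeps the open-play categories from absorbing set piece shots; the order of the checks is an assumption of this sketch.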


How the five goal categories perform

We looked at over 250,000 shots and 25,000 goals from the last six complete seasons of five leagues: the English Premier League, the German Bundesliga, Spain’s La Liga, Italy’s Serie A, and Major League Soccer.

We expected to find some variation between a league such as the English Premier League and a league such as Major League Soccer, because of differences in style of play but also because of perceived quality. It turns out that, despite differences in player skill, the distribution of goals across these five categories is very much the same in every league we observed.

As you can see, the percentages don’t vary by more than 3-4% between leagues and levels, and the relative size of each category stays the same. The year-over-year values in the same league should remain largely constant.
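For analysts who want to reproduce this kind of comparison on their own data, a league's distribution can be computed from a list of category labels, one per goal. This is a rough sketch under that assumption; `category_shares` is a hypothetical helper name, not part of any provider's toolkit.

```python
from collections import Counter


def category_shares(goal_categories):
    """Return each category's share of goals, in percent.

    `goal_categories` is an iterable of category labels, one per goal.
    """
    counts = Counter(goal_categories)
    total = sum(counts.values())
    if total == 0:
        return {}  # no goals, so no distribution to report
    return {category: 100.0 * n / total for category, n in counts.items()}
```

Running this once per league and season gives the per-category percentages that can then be compared side by side.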

It was important to us to look for patterns that may emerge from leagues of varying perceived quality. As we looked deeper into available second division data, we found very similar distributions of goals in the goal categories.

After these discoveries, we now expect we would find similar distributions at most levels. The only hurdle to performing this additional layer of analysis at any level of play is the availability and quality of data.

About 40% of goals come from progressive passes

One thing that surprised us in our research was how often we saw progressive passes appearing in various types of goals. Whether it was the pass that set up the shot (the “key pass”) or the pass before the assist (secondary key pass, also called the “hockey assist”) or even the pass before that (tertiary key pass), progressive passes are an incredibly important part of a successful open play attack.

There are various definitions of the term “progressive pass,” but we’ve found fixed-distance definitions miss passes that should qualify, such as passes from the sides of the box that don’t travel 10 yards.

Our definition of “progressive pass” comes from John Muller: an open play pass that moves the ball at least 25% of the remaining distance to goal. We only count passes that start in the attacking 60% of the pitch. This definition captures all the passes that we need. We call progressive passes over 35 yards “long balls,” in keeping with leading data providers. 
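As an illustration of how this definition might be applied to event data, here is a minimal sketch. The coordinate system is an assumption: a 120 x 80 pitch measured in yards, attacking toward x = 120, with the goal centre at (120, 40). Your provider's coordinates may differ, and the function names are hypothetical.

```python
import math

# Assumed pitch model (not from the article): 120 x 80 yards, attacking toward x = 120.
PITCH_LENGTH = 120.0
PITCH_WIDTH = 80.0
GOAL = (PITCH_LENGTH, PITCH_WIDTH / 2)


def distance_to_goal(x, y):
    """Straight-line distance from a point to the centre of the goal."""
    return math.hypot(GOAL[0] - x, GOAL[1] - y)


def progress_fraction(start, end):
    """Share of the remaining distance to goal that the pass covered."""
    d_start = distance_to_goal(*start)
    if d_start == 0:
        return 0.0
    return (d_start - distance_to_goal(*end)) / d_start


def is_progressive(start, end, threshold=0.25, attacking_share=0.60):
    """True if the pass starts in the attacking 60% and covers >= 25% of remaining distance."""
    starts_in_zone = start[0] >= PITCH_LENGTH * (1 - attacking_share)
    return starts_in_zone and progress_fraction(start, end) >= threshold
```

Under these assumptions, a short pass from the side of the box such as `is_progressive((112, 28), (114, 36))` qualifies even though it travels less than 10 yards, which is the case a fixed-distance definition would miss.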

If your analysts want to use different definitions from a data provider, that’s fine, but you may end up with somewhat different numbers and overall analysis.

John and ASA visualization guru Eliot McKinley recently demonstrated many types of progressive passes from various areas of the pitch used by UEFA Champions League clubs.

There are specific types of progressive pass in the middle and final third that create or set up “finishes.” We will get into detail on these passes in a forthcoming article and video. 

What goal categories can tell us about team strengths and weaknesses

Using this new classification of goals, we can do some basic analysis on how various teams score and concede goals. Let’s take a look at how MLS teams scored in the strange 2020 season that was started, stopped, and restarted twice.

Even without any deep analysis, we can see the teams that scored more goals from progressive passes were mostly strong in the attack. In fact, the correlation between points per game (PPG) and Progressive Pass goals per game for MLS in 2020 was strong (R = 0.76), regardless of goals against. The top row of teams had the most Progressive Pass goals but also solid numbers in other categories. The second row shows teams that were strong in Progressive Pass goals and at least one other category.

You might think the teams that score the most Progressive Pass goals are just the teams that shoot the most, but quite often that’s not the case:

The correlation here between points per game and shots is fairly low (R = 0.43). This tells us that teams that are better at scoring are more efficient with their Progressive Pass shots. We will get into why this is in our next article.

Progressive Pass goals against in MLS 2020 were also very important, with a strong correlation (R = 0.67) to points per game:

Poor defenses are usually poor in at least two goal categories. The San Jose Earthquakes gave up nine or more goals across four categories, while other teams that allowed a high number of Progressive Pass goals gave up nine or more goals in only one other category. The top defensive teams all conceded 12 or fewer Progressive Pass goals. New York City FC was absolutely astounding with only four Progressive Pass goals against, but they were near the worst in Individual Play goals against. The Portland Timbers and D.C. United both gave up 14 Individual Play goals, which could indicate other issues.

The combination of Progressive Pass goals for and against (i.e., Progressive Pass goal differential) has an even stronger correlation to points per game (R = 0.88), extremely close to the correlation between points per game and overall goal differential (R = 0.91). In other words, Progressive Pass goal differential alone explains almost as much of a team’s success as its overall goal differential.

Note: Own goals are not included in the goal differential of the chart above

The league’s best and worst teams stand out for their Progressive Pass goal differential. With few exceptions, teams with positive Progressive Pass goal differential had positive overall goal differential and vice versa. We will look into other things we can learn from these charts in a future article. 
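The R values quoted in this section are correlation coefficients between a per-team metric and points per game. A minimal sketch of that calculation follows, assuming the standard Pearson coefficient (the article does not name the method). The function names are hypothetical, and no real team data is included.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / (spread_x * spread_y)


def progressive_pass_goal_differential(goals_for, goals_against):
    """Per-team Progressive Pass goals scored minus Progressive Pass goals conceded."""
    return [gf - ga for gf, ga in zip(goals_for, goals_against)]
```

With one entry per team, `pearson_r(progressive_pass_goal_differential(gf, ga), ppg)` would give the kind of figure reported above, where `gf`, `ga`, and `ppg` are lists you supply from your own data.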

Goal categories are useful for player analysis as well. Goal data is often segmented into just two categories, penalty and non-penalty goals. Even at the simplest level of the Where Goals Come From framework, the five goal categories, we can get a much more informed look at how individual players score goals. Look at the differences in how MLS’s top scorers found their goals:

We can dig further into this data using shot locations and key pass data. Over time, we can use this to better understand how much a player’s scoring and contributions may rely on a progressive passing attack or specific defensive tactics.

Here’s a video summarizing the goal category concept:

Conclusion

The five goal categories are the first aspect of the “Where Goals Come From” framework to help improve communication about goal differential in a club. Key points:

  • Regardless of league or level, goals in the professional game are scored in very similar ways and in very similar proportions.

  • There are five goal categories: Progressive Pass, Basic Pass, Set Piece Pass, Individual Play, and Set Piece Kick. About 40% of goals come from the Progressive Pass category.

  • A club can build its on-field strategy for how to maximize goal differential using these five goal categories. By focusing the team’s game model on a subset of these goal categories, a club can improve in specific areas.

  • The most important goal category is the Progressive Pass category, which is usually the best indicator of the team’s overall goal differential.

  • The categories can aid basic team- and player-level analysis. Future articles will build on this.

___________________________________

Future Analysis

Please follow American Soccer Analysis, Jamon Moore, and Carlon Carpenter on Twitter for future work on this project. Here are some of the things you can look forward to in 2021:

For coaches

  • Deep dive on Progressive Pass types and when to use each of them

  • Video examples of how to create the right key pass or prevent them

  • When to play passes in the air versus on the ground

  • Good, better, and best: making expected goals useful in your practice and game model

For club executives

  • Creating a sporting culture around goal differential: lessons on business agility for a football club

For analysts

  • Types of key passes and expected goal modeling

  • Phase-of-play and speed-of-play effects on key pass effectiveness

  • Secondary and tertiary key pass analysis across the top leagues

  • Additional deep dives into the Basic Pass, Individual Play, and Set Piece Pass categories

___________________________________

About Jamon Moore

Jamon is a twenty-five-year professional in the high-technology industry who started as a software developer and is now in executive management overseeing business agility transformations with a specialization in high-technology. Jamon is an analyst for American Soccer Analysis, runs the Quakes Epicenter website covering the San Jose Earthquakes in Major League Soccer, and is a host of the Black and Azul webshow, now beginning its third season, which also covers the San Jose Earthquakes. Jamon has been a coach and assistant coach for several competitive youth soccer teams in the Bay Area. Jamon can be contacted via Twitter, and club analysts and executives can connect with Jamon on LinkedIn.

About Carlon Carpenter

Carlon is the current Tactical & Video Analyst for StatsBomb, one of the largest soccer data companies in Europe. His role at StatsBomb is focused on providing insights through video analysis and supporting clients in incorporating data analysis into their workflow, be it through opposition analysis or scouting. Other responsibilities include providing guidance, feedback, and expertise on required functionality for video platforms, as well as applying StatsBomb’s data to “on the pitch” functionality. Carlon also works as a contract employee for the U.S. Soccer youth national teams, working as a performance analyst for the U-17 men’s national team. Previously, Carlon worked for three years at the University of Virginia as a performance analyst and coach, also coaching at a youth academy in Charlottesville, specializing in goalkeeper development. Carlon can be contacted through his LinkedIn account, or via Twitter.

A request (for the analysts)

We understand that the analytics community is constantly improving on the ideas of others, and this work would not be possible without drawing on the collective ideas of the soccer analytics community at large. Where direct attribution can be provided, we have tried to do so.

In the universe of hundreds of articles on soccer/football, it is impossible to read every bit of research that has been done on the topic of goals. If this work duplicates any existing work, please contact the authors to either have a change made or proper credit given.