Valuing goalkeepers with goals added

By Matthias Kullowatz

We have updated our goals added (g+) methodology to produce g+ components for goalkeepers. You can find these new metrics on the Goals Added window in the app under the Goalkeepers tab (MLS, NWSL, USL, USL1). Up to this point, we had not published g+ metrics for goalkeepers. We recognize that goalkeepers perform many unique tasks on the field, and the first version of our expected possession value models and g+ framework missed a lot of those. Below I’ll explain the specific keeper g+ components and what they try to measure, share a few examples, and then wrap up with some nitty-gritty details.

g+ Components

Shotstopping. This one is pretty straightforward, and something we’ve already been sharing on the Goalkeepers tab under xGoals. On that tab, we show (G – xG), the difference between goals allowed and expected goals allowed, where xG is on a post-shot basis (PSxG). There, negative numbers are good, implying that the keeper allowed fewer goals than expected. Here, we flip it around because we want positive g+ values to reflect good play. Thus the shotstopping metric is calculated as (PSxG – G).
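As a rough sketch of the (PSxG – G) arithmetic, the snippet below sums the per-shot difference over the shots a keeper faced. The `ShotFaced` class, its field names, and the sample values are hypothetical and only for illustration; they are not taken from the app or the model code.

```python
from dataclasses import dataclass


@dataclass
class ShotFaced:
    psxg: float  # post-shot expected goals for this shot, between 0 and 1
    goal: bool   # True if the shot ended up in the net


def shotstopping_gplus(shots):
    """Sum of (PSxG - G) over shots faced; positive means fewer goals than expected."""
    return sum(s.psxg - (1.0 if s.goal else 0.0) for s in shots)


# Made-up example: two saves and one goal conceded.
shots = [ShotFaced(0.30, False), ShotFaced(0.70, True), ShotFaced(0.10, False)]
total = shotstopping_gplus(shots)  # 0.30 + (0.70 - 1) + 0.10, roughly 0.10
```

The keeper here faced 1.10 PSxG and allowed one goal, so the total comes out to about +0.10 g+.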

Handling. It’s not just about saving the first shot; it’s about making sure the other team doesn’t score a goal. A keeper might make an amazing save, worth 0.5 goals of value, but cough the ball up directly in front of the net for an easy finish. The handling metric compares the typical rebound value of the shot (xRebound) to the actual rebound value of their parry (xParry), and rewards the keeper with (xRebound – xParry) g+ value. Note that if the keeper holds onto the shot, then they are rewarded with the xRebound value, which is the typical rebound value that average keepers give up on similar shots.
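A minimal sketch of the handling calculation for a single saved shot might look like the following. The function name and the convention of passing `None` for a held shot are assumptions made for this example, and the numbers are invented.

```python
from typing import Optional


def handling_gplus(x_rebound: float, x_parry: Optional[float]) -> float:
    """Handling value for one saved shot.

    x_rebound: typical rebound value average keepers give up on similar shots.
    x_parry:   realized rebound value of the parry, or None if the keeper held the ball.
    """
    if x_parry is None:
        # Held shot: no rebound conceded, so the keeper gets the full xRebound.
        return x_rebound
    return x_rebound - x_parry


held = handling_gplus(0.08, None)    # keeper holds the shot
spilled = handling_gplus(0.08, 0.30) # parried into a dangerous area
```

With these made-up inputs, holding the shot earns 0.08 g+, while the spill costs about 0.22 g+.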

Claiming. This one is specific to aerial crosses. Whenever a keeper attempts a claim on a cross, we derive their g+ value from the difference between expected possession values before and after the cross, just as we do for most actions on the field. You can imagine a failed claim where the ball falls in front of the net for a shot. Often such a blunder will increase the possession value from around 0.05 goals before the cross to upwards of 0.5 goals on the free shot in the box. In that case the keeper would lose 0.45 g+. Conversely, most successful claims will be worth around 0.05 goals of g+ value. Punches are included here as kind of a claim-specific keeper clearance, as are instances where a keeper smothers a loose ball.
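The failed-claim arithmetic above can be sketched as a simple difference. This assumes both inputs are the attacking team's possession value, so that a drop in value is good for the keeper; the function name and the value of 0.0 after a clean claim are illustrative assumptions, not model output.

```python
def claim_gplus(epv_before: float, epv_after: float) -> float:
    """Keeper value for a claim attempt.

    Both arguments are the attacking team's expected possession value
    (in goals), before the cross and after the keeper's action.
    """
    return epv_before - epv_after


failed = claim_gplus(0.05, 0.50)      # ball drops for a free shot in the box
successful = claim_gplus(0.05, 0.00)  # keeper secures the ball
```

The failed claim comes out to about -0.45 g+ and the successful one to +0.05 g+, matching the figures in the paragraph above.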

Sweeping. This is simply us renaming “Interrupting”, which is a component for all field players. This category primarily includes successful and unsuccessful tackles and clearances (with the feet).

Passing. Again, this one is intuitive. We value passes made by goalkeepers in the same way we do for field players. Note that goal kicks, which make up a healthy proportion of a keeper’s passes, are parameterized explicitly in our expected possession value models. This allows us to properly evaluate the possession values before and after the goal kick and assign the difference to the keeper (and recipient on completed passes), just as we do for all passes. What’s still missing is a way to split the blame for failed long-ball goal kicks between the keeper and teammates, since the keeper is clearly at the mercy of the team’s ability to win aerials, at least to some degree. This is a blind spot in the g+ methodology for all passes, and one we hope to solve someday with tracking data.

Fielding. We took all the other typical components of g+ for field players—dribbling, fouls conceded and won, shooting, and receiving—and we lumped them into this category. Keepers aren’t often doing this stuff, but some of these events will severely hurt their g+ totals. Fouls, for example, will often be dominated by one or two instances where a penalty was given up. It helps that we limit the foul value gained and lost on penalties to just 0.25, recognizing that it’s often not entirely the player’s fault. Dribbling is another sub-component here that is very likely to be negative based on major blunders in front of the net.
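One way to picture the penalty limit is as a symmetric clamp on the foul value. The sketch below assumes the 0.25 limit applies equally to value gained (fouls won) and lost (fouls conceded) and only to fouls that result in penalties; the function name and the raw value of -0.75 are hypothetical.

```python
PENALTY_FOUL_CAP = 0.25  # limit on foul value for penalties, per the text above


def capped_foul_value(raw_value: float, is_penalty: bool) -> float:
    """Clamp the foul value to [-0.25, 0.25] when the foul led to a penalty."""
    if not is_penalty:
        return raw_value
    return max(-PENALTY_FOUL_CAP, min(PENALTY_FOUL_CAP, raw_value))


conceded_penalty = capped_foul_value(-0.75, is_penalty=True)  # capped at -0.25
ordinary_foul = capped_foul_value(-0.03, is_penalty=False)    # left unchanged
```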

Examples

Avid ASA readers know what we think of MLS’s top goalkeeper, Matt Turner, and it will come as no surprise that his shotstopping g+ value above average is tops since 2019 (minimum 2,000 minutes). What might be surprising, though, is by how much. When aggregating regular season player performance over the past three seasons, there’s Matt Turner, and then there’s nobody, and then there’s Steve Clark.

Total goalkeeper g+ across 2019-2021 MLS seasons, per 96 minutes, minimum 2,000 minutes.

In the NWSL, Michelle Betos of Racing Louisville shows a similar lead over the field across the 2018-2021 seasons, though she has just barely played a full season’s worth of minutes in there. Among those with multiple seasons’ worth of minutes played, Kailen Sheridan sits atop the list, making her case to get some more starts for the Canadian national team.

Total goalkeeper g+ across 2018-2021 NWSL seasons, per 96 minutes, minimum 2,000 minutes.

One thing that stands out to me when looking at the leaderboards is that everything a keeper does on the ball is dwarfed by shotstopping. In many cases, it’s hard to do really well at any one keeper g+ component (except shotstopping), but a keeper can sure screw things up quickly (errant passes and dribbles, fouls in the box, etc.). This may not be too far from the truth when it comes to goalkeeping. Keepers are rarely delivering the final ball, threaded slickly between the back line of defenders, or winning a cross in the box with a header on target. Much of their job is to not screw up. That is certainly a simplification, but one that aligns with much of what we see in g+. As always, enjoy the app responsibly.

Details

Shotstopping. g+ aficionados know that we opted not to give shooters credit for actual goals scored. Instead, shooters receive xRebound shooting credit as a function of their shot placement in the goal mouth, a function that typically credits less than 25% of the post-shot xG. Placement is something that, at least in theory, shooters should have some control over. Because we give shooters some credit for placement (again, through xRebound), that’s where we start with keeper valuation. Keepers are credited with the post-shot xG, i.e. placement value, and then debited the remaining value of the shot (save = 0 or goal = 1). What we are implying is that shooters are responsible for a fraction of the outcome of their shot up to their placement of the ball, and keepers are responsible for whether it goes in the back of the net from there. What we are thinking is that this is a very murky area in soccer analytics, and we had to split the difference somewhere between shooters and keepers (and try to remain internally consistent).

Handling. xRebound, for this purpose, looks at all shots that reached the keeper and determines the expected rebound value of each such shot. That is roughly calculated as the average amount of xG during the rest of the possession (because the keeper couldn’t hold on) for similar shots on target. The post-shot xG is a big driver of this value, which makes sense: the more likely a shot is to score, the more likely it is to be parried, given that it is miraculously saved. xParry is derived from only the shots that were actually parried. Using features of the shot and the parry (such as the x/y location to which the shot was parried), we derive the expected rebound value conditional on the parry location. That’s what makes xParry more of a “realized” rebound value than xRebound. An xgboost machine learning algorithm looks through all parried shots (whether they went out for corners, throw-ins, or back into the field of play) and determines the expected remaining possession value for such a parry. We make the keeper responsible for this xParry value, or expected realized rebound value. We know that if the other team actually scores on the rebound, then the keeper will be punished for that in shotstopping.

So there you have it! We finally have g+ values for all 11 players on the field at a time. Coupled with our recent updates to the model, this is part of our ongoing effort to improve g+. Stay tuned for additional updates to our win probability model, which are also coming soon.