Collated by Alex Bartiromo
Here at American Soccer Analysis we try to publish cutting edge articles on soccer that don’t rely on narrative, anecdote, or lazy comparison, preferring instead to rely on data, experimentation, and a ruthless questioning of all of our own previously held beliefs. As you might surmise from this somewhat pompous description, doing that is hard, and our writers have more opinions and insights than we could possibly publish. That is why we are coming out with a new series, the ASA roundtable, a weekly discussion between our contributors on questions facing the soccer and soccer analytics communities today. We look forward to your feedback, and if you want to submit a question just tweet it to @analysisevolved or email us!
Welcome to the ASA Roundtable, folks!
Given the proliferation of analytics in the soccer community, we've gotten much better at examining the various contributions players make to a team, especially when it comes to goal scoring and the actions that lead up to it. However, one area that is still little understood is defensive contributions. How can analytics help us understand defending in the modern game? What are the next steps that need to be taken to improve our understanding in this area? What are some analytical concepts you think more MLS teams could take into account when game planning/building rosters that they aren't currently (if any)?
To start off, the defensive ability of a player or a team is hard to judge purely based off of simple metrics that are used for attacking players: defensive players can’t be judged off of simple accumulation of tackles, headers, etc. because the number of actions is largely skewed by team strength. Thankfully, tracking data is coming into play more and more. Through GPS data and other similar tools, we will likely be able to generate a more holistic viewpoint of what makes good defensive players.
You can look at "space control" by drawing a circle around the player based on how far they can reach and measure pressure by offensive event success, but this time looking at what defenders are nearby.
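As a rough illustration of that idea, here is a minimal sketch that counts defenders whose "reach circle" covers an offensive event and then compares offensive success with and without that pressure. The event layout (`xy`, `defenders`, `success`) and the 3-metre reach are assumptions made up for this example, not a real provider format.

```python
from math import hypot

def defenders_in_reach(event_xy, defender_xys, reach=3.0):
    """Count defenders whose reach circle (radius in metres) covers the event location."""
    ex, ey = event_xy
    return sum(1 for dx, dy in defender_xys if hypot(dx - ex, dy - ey) <= reach)

def success_rate_by_pressure(events, reach=3.0):
    """Split offensive events by whether any defender was in reach, then
    return the offensive success rate for each group (None if the group is empty)."""
    tallies = {True: [0, 0], False: [0, 0]}  # pressured? -> [successes, attempts]
    for ev in events:
        pressured = defenders_in_reach(ev["xy"], ev["defenders"], reach) > 0
        tallies[pressured][0] += int(ev["success"])
        tallies[pressured][1] += 1
    return {k: (s / n if n else None) for k, (s, n) in tallies.items()}
```

A real version would want a reach that depends on the player's speed and direction rather than a fixed circle, but the comparison of success under pressure versus without it is the core of the idea.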
To Carlon's point—and Cheuk's too—it's not just what makes a player good defensively but what makes a team good defensively. I think of Real Salt Lake, which, honestly, isn't a team that stands out in terms of defensive ball winning ability. But they've been extremely good at limiting shots over the summer and from what I can tell, a lot of that has to do more with combined team work than it does any one single player and their ability. Knowing the tactical awareness and marking ability of a group of players is just as important as understanding a player’s ball winning potential.
Cheuk mentions space control and it's true. It's not just pressuring the ball but marking and making sure that when a player receives a pass that they don't have the ability to turn or move the ball in a way that could compromise your team’s shape or create angles for passes to break lines for runners.
Cheuk Hei Ho
It seems to me defense is always about shape, being compact, etc. You could do that with tracking data now.
But even assuming we have all the tracking data that we want, event data is still gonna be needed because you still need to use the offense to infer the effectiveness of the defense.
Combining the event data with tracking data can give a pretty good outline of the area a defensive player “dominates”. The more space a defender has control over, the better the player.
Pushing the conversation towards roster construction, something to look at is the performance of a player in their team’s context. When we talk about defensive events and how a player performs, a lot of times we fail to consider how much, or how little, his team is responsible for possession. Possessions and pass attempts, or movements in general, often change the tone of how we talk about tackles, interceptions, and blocked shots.
For example, a team that plays in a low block is probably going to gather more opportunities to block a shot. So saying this player is "great" because he simply has a lot of blocked shots doesn't account for the fact that while he's on the field, his team may play with fewer possessions of the ball, giving him more opportunities to block a shot.
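To make the blocked shots example concrete, here is a small sketch that divides a player's blocks by the opponent shots taken while he was on the field, so that the count is read against his opportunities. The function name and the numbers are hypothetical, and it assumes on-field shot counts are available.

```python
def block_rate(blocked_shots, opponent_shots_while_on_field):
    """Share of opponent shots a player blocked while he was on the field."""
    if opponent_shots_while_on_field <= 0:
        raise ValueError("need at least one opponent shot to compute a rate")
    return blocked_shots / opponent_shots_while_on_field

# Hypothetical players: A has more raw blocks, but B blocks a larger share
# of the shots he actually faced.
rate_a = block_rate(30, 300)  # low-block team, lots of shots faced
rate_b = block_rate(20, 125)  # possession team, fewer shots faced
```

Here the raw count favors player A while the rate favors player B, which is exactly the kind of flip that team context can produce.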
To continue Harrison's point on roster construction: we’re at a point now where teams would be remiss not to search for players who play in similar systems and styles, to “filter out” those whose numbers wouldn’t match the output (in the same areas too!) of those in other systems. For example: if you’re a counter attacking team, you look for defenders in other counter attacking teams.
At the end of the day, players are largely products of their system.
We all agree that tracking data is going to be a game changer for defensive metrics, but how long do you think it is going to take before something useable will be widely available?
Well, do you mean publicly? For pro clubs, etc.? Because Catapult data (those sports bras players wear), is already doing a lot with this.
I’m thinking of this quote from the Athletic about hockey tracking data.
“We’re getting player tracking data in the NHL over the 2019-20 season, the first public burst of it ever, and it’s going to give us amazing, interesting, helpful statistics that hockey fans decades ago couldn’t have even fathomed. But in the process of figuring out what’s important, and how to use a lot of this information, there’s going to be a lot of shit out there. I mean flat-out shit.
Is GPS tracking good enough? Do you need optical tracking?”
I know people like STATS and SportLogiq are already looking at these things, as well as whatever Barcelona and Liverpool are doing.
Even MLS clubs are doing it. In fact Sean Davis talked about this very thing on our last podcast.
He did mention that the sports bras weren't the most comfortable though.
I’m thinking something like determining the optimal position a player should be in at any given time. Something like the ghosting work in this video or this article: http://www.yisongyue.com/publications/ssac2017_ghosting.pdf
To Carlon's point, and as a few others have mentioned, the difficulty is the need to contextualize defensive actions. It seems like the logical place to begin is comparisons by position within teams, like Cheuk has done with WOWY. It's so hard to compare players across teams and positions that it seems like it will be a long time until we get to something as easily understandable as xG for defense. Obviously that severely limits its usefulness as well.
Agreed on all the points about tracking data. One of the biggest issues with event data is situational awareness: Why was possession lost? What pressure was applied? How many players were between the ball and goal? How far were defensive players from the attacking players? Was the pass poorly aimed or well defended?
We can say RSL is great at stopping counter attacks before they happen this season, but it is difficult—unless you watch them every week—to understand what it is they collectively do differently than other teams.
Then the question will be: will clubs be developing these types of models themselves? It is going to take a lot of computational power to get these things running.
Depending on the provider, records will accumulate and create a “big data” problem. MLS clubs have never been in a position to have to make IT type decisions that other companies do in areas such as data infrastructure, security, and Cloud services.
To Eliot’s point, the data will grow to be in the billions of records, particularly if you want data on all the other teams in your league or data from other leagues for scouting.
Should they keep this data in-house, put it in the Cloud, or outsource it to providers who provide services? These are new questions.
I think it's logical that clubs will pattern themselves after other professional sports entities of the last few years in their various similar genres. The one specific example to me is the Houston Astros’ big data project and the internal staff that they collected for it. They've built a good size department filled with some pretty interesting baseball and analytical personalities over the last seven years. Now, much of the personnel has turned over but the premise of what they built and how they built it still exists. Looking over at the NFL, Cleveland hired Paul DePodesta due to his experience in building out these types of departments and his proclivity to using analytics in decision making. Teams probably will just follow the paths which have been forged by other organizations in other sports.
As for Eliot's initial question, which I think is really quite critical: when will this type of data become more publicly available? Obviously there are some great organizations that are already working on these things. Our friends at Statsbomb, for instance, are working with this data. They, as well as other organizations, have provided some sample data for people to try their hand at, and that's all well and good, but we're really not going to make real progress until we have more people involved with much larger samples. This was shown with expected goal models and I think it's true as we continue forward.
Soccer has generally lagged shortly behind the NHL and then a bit further behind the NFL in terms of how the games are progressively approached from a public perspective, and if this is the first year that [the NHL is] getting public tracking data, it may still be a couple of years before we get it on our side with MLS.
I know Dummy Run was playing around with normalized Defensive Actions (DFAs) that adjusted for the team they played on and I thought that was interesting. Could something like that work at least in theory?
I'm honestly not sure. There are a few different takes on DFAs but the more that I use the events to analyze and investigate occurrences, the more I feel like you have to take the action for itself. Lumping interceptions with tackles seems weird. They’re two different skills and two different tactical approaches to a problem and while both end in a turnover (which is a good thing), I don't know if you want to focus on the end result so much as the approach and manner in which possession was won. That feels a bit more important, but maybe it just depends on what question you're trying to solve.
As a very first baby step for working with what we've got now, team adjusting DFAs seems more promising to me than possession adjusting them, but neither approach even really starts to get at the smart stuff Cheuk was saying earlier about how what we really want to do is measure space control and how we're going to have to use offensive events to do it.
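One possible reading of "team adjusting" is to express each player's defensive actions as a share of what his team recorded while he was on the pitch, keeping tackles and interceptions separate rather than lumping them together. This is only a sketch under that assumption; it is not the method from the normalized DFA work mentioned above, and the dictionaries are a made-up input format.

```python
def team_adjusted_shares(player_counts, team_counts_while_on_pitch):
    """Player's share of the team's defensive actions, per event type.

    Event types the team never recorded are skipped to avoid dividing by zero.
    """
    shares = {}
    for event_type, team_total in team_counts_while_on_pitch.items():
        if team_total > 0:
            shares[event_type] = player_counts.get(event_type, 0) / team_total
    return shares

# Hypothetical season totals for one player and his team while he played.
shares = team_adjusted_shares(
    {"tackle": 20, "interception": 10},
    {"tackle": 100, "interception": 40, "block": 0},
)
```

A share like this tells you who does the defending within a team, but not whether the defending is any good, which is the limitation raised in the paragraph above.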
Statsbomb's Thom Lawrence already laid the groundwork years ago for the thing I've been wanting to look into.
Aside from making for a pretty cool viz, this starts to move us in the direction of using tackles and interceptions and stuff to figure out what part of the pitch a defender is responsible for.
This is important because all modern defending is fundamentally zonal, even if you play under Matias Almeyda (as the Quakes eventually figured out once they got the hang of the whole rotation thing).
If you can figure out what zone a guy is trying to prevent the ball from moving through then you can grade him on how successfully the opponent moves the ball through that zone instead of trying to count the things he did to accomplish that. Thom wrote about this back in 2016 too.
When your job is to win the ball back, disrupting a pass is (roughly) as good as making a tackle is (roughly) as good as recovering a loose touch, etc. and a lot of times your defensive contribution is not even going to show up as an event.
One big problem is that those zones of responsibility change situationally and how the hell do you measure that?
That's where I think a good next step might be to look at incorporating phases of play, which a lot of people have been thinking about lately and which some Opta guys, including ASA alum Tom Worville, published some good work on last year.
To Dummy Run's point, if you understand zones of responsibility (or at least a player's ability to cover an area, as they may not cover their assignment well), you can see what attacking actions happen in that zone compared to other zones. This allows you to find areas of weakness which may be the responsibility of a single player or a combination where responsibility overlaps.
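A minimal sketch of the zone comparison: bin the opponent's passes by the grid zone they were played into and compute the completion rate allowed in each zone. The 6x3 grid, the 105x68 metre pitch, and the `end_xy`/`completed` fields are all assumptions for illustration. Assigning a zone to a particular defender is the hard part and is not attempted here.

```python
from collections import defaultdict

def zone_of(x, y, pitch_length=105.0, pitch_width=68.0, cols=6, rows=3):
    """Map a pitch coordinate in metres to a (col, row) grid zone."""
    col = min(int(x / pitch_length * cols), cols - 1)
    row = min(int(y / pitch_width * rows), rows - 1)
    return col, row

def completion_allowed_by_zone(opponent_passes):
    """Completion rate of opponent passes, grouped by the zone they were played into."""
    tallies = defaultdict(lambda: [0, 0])  # zone -> [completed, attempted]
    for p in opponent_passes:
        zone = zone_of(*p["end_xy"])
        tallies[zone][0] += int(p["completed"])
        tallies[zone][1] += 1
    return {zone: done / tried for zone, (done, tried) in tallies.items()}
```

Zones where the opponent completes an unusually high share of passes, relative to the same zone against other teams, would be the candidates for a closer look.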
Yeah, that makes a lot of sense. That said, I would still think you'd want to sift through by event type. Interceptions or tackles, not both. Or am I missing something?
It's a good question, and perhaps more of a team style one. Are you getting stuck in or pouncing on a mistake? From this, we can see how style is reflected in execution and then look at intent. Mostly I subscribe to the thinking that stronger teams impose their style on weaker teams.
Cheuk Hei Ho
When I look at defense I often see it as a game of each "attempt to progress the ball".
A lot of the best defensive plays—especially in the medium block—don't attempt to cut or intercept the ball, they just force you to go back and back and then after a certain point they press you and force you to hit an aimless long ball. That wouldn’t be captured easily by defensive events.
Right, if a defense can trap the buildup against the sideline, press the backpass, and force a longball out of bounds it's done a good job without recording a tackle or even a recovery. The opponent's pass completion rate for that sequence probably looks decent and the successful press won't show up in Passes per Defensive Action (PPDA). So what do you measure to show that the defense did exactly what it set out to do?
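To see why that sequence is invisible to PPDA, here is a sketch of the usual calculation: opponent passes divided by the pressing team's defensive actions, counted only in the higher part of the pitch. The event format, the 0 to 100 scale for `x` (distance from the pressing team's own goal), the cutoff at 40, and the exact list of defensive actions are assumptions, since implementations differ on those details.

```python
DEFENSIVE_ACTIONS = {"tackle", "interception", "challenge", "foul"}

def ppda(events, pressing_team, high_zone_start=40.0):
    """Opponent passes per defensive action by the pressing team, in the high zone only."""
    passes = 0
    actions = 0
    for ev in events:
        if ev["x"] < high_zone_start:
            continue  # outside the area where pressing is measured
        if ev["team"] != pressing_team and ev["type"] == "pass":
            passes += 1
        elif ev["team"] == pressing_team and ev["type"] in DEFENSIVE_ACTIONS:
            actions += 1
    return passes / actions if actions else float("inf")

# The trapped buildup: four short passes, then a long ball out of play.
# The pressing team never records a defensive action.
trapped_sequence = [{"team": "opp", "type": "pass", "x": 80.0} for _ in range(4)]
trapped_sequence.append({"team": "opp", "type": "pass", "x": 70.0})
```

On its own the trapped sequence adds five passes and no defensive actions, so it pushes PPDA up, which reads as a less intense press even though the press worked.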
That's where the tracking data can come into play, as well as proxy events such as longballs against, augmented by those going out for goal kicks and throw-ins.
Cheuk Hei Ho
I think that for zone and space control, one doesn't always need to look at the outcome of the ball per se.
Eventually you need it, but you can also measure how often a defender is just positioned properly around the guy he marks.
Or is the space between the lines secured? Meaning, within about 0.5 seconds, is that space reachable by the players on the lines?
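One crude way to put a number on that with tracking data: sample points in the gap between the lines and check what fraction of them at least one defender could reach in half a second. The 7 m/s top speed and the straight-line model that ignores a player's current momentum are simplifying assumptions for this sketch.

```python
from math import hypot

def reachable(point, player_xy, seconds=0.5, top_speed=7.0):
    """True if the player could cover the straight-line distance to the point in time."""
    distance = hypot(point[0] - player_xy[0], point[1] - player_xy[1])
    return distance <= seconds * top_speed

def secured_fraction(sample_points, defenders, seconds=0.5, top_speed=7.0):
    """Fraction of sampled points that at least one defender can reach in time."""
    if not sample_points:
        raise ValueError("need at least one sample point")
    covered = sum(
        1
        for pt in sample_points
        if any(reachable(pt, d, seconds, top_speed) for d in defenders)
    )
    return covered / len(sample_points)
```

Tracked frame by frame, a fraction like this would describe positioning without needing to know what happened to the ball.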
I think two things are clear: 1) This is really hard, and usefully analyzing defensive actions is still in its infancy, but 2) there is a lot of potential growth here for analytics, and if we can come up with this many ideas on a stupid Slack channel, surely there are other smart people out there thinking about these and other ideas.
Cheuk Hei Ho
My take is that bad defensive players are easily spotted by event data but good ones aren't and using defensive stats to evaluate defensive quality can be very rudimentary.
Eight more years of bad defensive metrics will change America in a fundamental way. If you agree with me, go to joe 3 ... 0 ... 33 ... 0 and help me in this fight.
There are no sure bets here, but teams who subscribe to services to obtain tracking data, and invest in people who can create useful analysis from it, will be at the forefront. They will have the opportunity to solve these problems and benefit from the output sooner than others.
Game actions are focused on the ball, leading to better analysis of attacking actions because there is more and better data. Pretty much every touch of the ball is recorded somehow. But on defense, "the dog that didn't bark" is what comes to mind. The best evidence of a good defender would be a large hole, devoid of an opponent's successful attacking actions. But defenders move around, switch up to handle counters, and the like. It gets to be very difficult. And basic counting stats of defensive actions are somewhat misleading. In 2017, when Minnesota was setting records for goals conceded, the center backs were above average in tackles won. I suggested this was because the Minnesota defense was facing more possessions in their own third, but a simple look at the number of tackles might suggest a decent defense.