Every team has its own “style”. Some teams bunker, some teams high-press, some clog the middle, some work the wings. Where they defend is a major part of what defines their style. The recipe for a team’s defensive shape is one part tactics and 11 parts players on the field. Certain players seem to naturally gravitate toward particular areas, whether because of the wing they’re assigned to, their preferred foot, their favorite partner-in-crime or how they’re instructed to approach the opposition. In the end, the action happens in consistent general areas of the field, but in complex patterns.
One could take an Opta map from any particular game and examine the defensive spatial patterns. You can see the clusters of defensive actions as well as voids where a team hardly seems to find themselves defending at all. But that’s just one game. We all know that teams are forced to adapt their style of play to their opposition, and whatever flukey circumstances played out in that game might not be totally indicative of a team’s overall style. What would really be telling is the aggregate over multiple games.
For example, we could look at whether teams tend to do their defending on the wings or in the middle of the field. The graphs below show the one-dimensional left-to-right location (left sideline to right sideline) of every defensive action so far in 2017, separated by team. Basically, imagine that you’re the goalkeeper, and you’re simply counting how often you see one of your players attempt a tackle, block, interception or challenge.
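The counting behind a graph like that can be sketched in a few lines. This is a minimal illustration rather than the code behind the original charts: the `(team, width_pos)` record layout, the 0 to 100 coordinate scale, the team codes and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def width_histogram(actions, n_bins=10, field_width=100.0):
    """Count defensive actions per team in equal-width strips across the field.

    `actions` is an iterable of (team, width_pos) pairs, where width_pos runs
    from 0 (left sideline) to `field_width` (right sideline). Both the record
    layout and the 0-100 scale are assumptions for this sketch.
    """
    counts = defaultdict(lambda: [0] * n_bins)
    for team, width_pos in actions:
        # Clamp so an action exactly on the right sideline lands in the last strip.
        idx = min(int(width_pos / field_width * n_bins), n_bins - 1)
        counts[team][idx] += 1
    return dict(counts)

# Made-up sample data, purely for illustration.
sample = [("ORL", 5.0), ("ORL", 95.0), ("ORL", 98.0), ("HOU", 50.0), ("HOU", 100.0)]
hist = width_histogram(sample, n_bins=4)
```

Plotting each team’s list as a bar or line chart gives the goalkeeper’s-eye view described above.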
Some interesting trends can be noted. Houston and Vancouver, for example, defend pretty evenly from left to right on the field. Orlando and Seattle, on the other hand, are really heavy on the wings. Minnesota, Chicago and San Jose are pretty asymmetrical.
We could shift the perspective too, and look at the picture from the sidelines (defensive half to attacking half):
From this angle, the patterns aren’t quite as clear because the field is longer than it is wide. But still, some interesting patterns emerge. Minnesota has done a lot of defending basically everywhere behind their attacking third, while Montreal, a known counter-attacking team, confines most of their defensive actions to their own half. It’s also notable that, from this angle, some teams can be observed to do more defending than others in general. San Jose has some really high peaks, while LA’s and Vancouver’s peaks are lower.
Putting the two together, we can examine frequencies everywhere on the field:
The picture isn’t so pretty, right? You can sort of make out the hot spots, but there’s a fair amount of noise that begins to obscure the patterns everywhere on the field (and I’ve even enlarged the “bin” size to make this easier to see). What we need is a way to separate the “signal” from the noise…
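To make the “bin” idea concrete, here is a rough sketch of counting actions on a two-dimensional grid. The coordinate convention (both axes on a 0 to 100 scale), the grid dimensions and the sample points are assumptions for illustration, not details taken from the original analysis.

```python
def bin_actions(points, nx=12, ny=8, length=100.0, width=100.0):
    """Return an ny-by-nx grid of action counts.

    `points` is an iterable of (x, y) pairs, with x measured along the length
    of the field and y across its width. The 0-100 scale on both axes is an
    assumption for this sketch.
    """
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Clamp so points on the far boundary fall into the last cell.
        col = min(int(x / length * nx), nx - 1)
        row = min(int(y / width * ny), ny - 1)
        grid[row][col] += 1
    return grid

# Made-up sample points, purely for illustration.
pts = [(10.0, 10.0), (12.0, 11.0), (90.0, 95.0), (100.0, 100.0)]
grid = bin_actions(pts, nx=4, ny=2)
```

Fewer, larger cells mean more actions per cell and a less noisy picture, at the cost of spatial detail. That trade-off is what makes a smoothing model attractive.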
Welcome, friends, to the wide world of spatial statistics. My statistical colleagues will tell you that there’s a huge array of techniques for geospatial modeling, ranging from the obscenely-complex to, well, sort-of-complex. What they all have in common though is the ability to pick up on the patterns I pointed out with the first two graphs, but do it in both dimensions (or more) simultaneously.
For this analysis, I went with the “simplest” option that would pass the sniff test: regression analysis with some extra variables. (Math alert below, so feel free to skip down to the Defensive Actions Heat Map if this part doesn't interest you.) For the nerds out there, it’s negative binomial regression with fully-interacted cubic polynomial terms and an interaction to make it different by team, like this:
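Written out, a model matching that description would look roughly as follows. This is a sketch inferred from the verbal description, not necessarily the exact specification that was fit. The notation is assumed: N is the number of defensive actions by team t in grid cell c, (x_c, y_c) is the position of that cell, alpha is the negative binomial dispersion, and each team gets its own set of 16 coefficients.

```latex
N_{t,c} \sim \mathrm{NegBin}\left(\mu_{t,c},\, \alpha\right),
\qquad
\log \mu_{t,c} \;=\; \sum_{a=0}^{3} \sum_{b=0}^{3} \beta_{t,a,b}\; x_c^{\,a}\, y_c^{\,b}
```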
The point is that it’s a count model, where x and y (the horizontal and vertical position on the field from the sideline) are the main predictors of the number of defensive actions in a particular place. There’s also a bunch of squared and cubed terms in there to make it smooth and curvy in 2D space.
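As a small illustration of what “fully-interacted cubic polynomial terms” means in practice, the sketch below builds the 16 products of powers of x and y and turns a set of coefficients into an expected count through a log link. It does not fit anything. The coefficient values are hypothetical, and scaling x and y to the range 0 to 1 is an assumption made to keep the cubed terms well behaved.

```python
import math

def cubic_terms(x, y):
    """All 16 products x**a * y**b for a, b in 0..3.

    The a = b = 0 term equals 1 and plays the role of the intercept.
    """
    return [x**a * y**b for a in range(4) for b in range(4)]

def expected_count(coefs, x, y):
    """Expected number of actions at (x, y) under a log link."""
    linear_predictor = sum(c * t for c, t in zip(coefs, cubic_terms(x, y)))
    return math.exp(linear_predictor)

# Hypothetical coefficients for one team: a flat surface of 2 actions per cell.
coefs = [0.0] * 16
coefs[0] = math.log(2.0)
flat_value = expected_count(coefs, 0.3, 0.7)
```

In a per-team model, each team would have its own list of 16 coefficients, which is what lets every team end up with a differently shaped surface.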
The result looks like this:
These are heatmaps of where, on average, each team ends up doing their defensive work. Red is lots of action, blue is very little. From these, you can clearly see differences in team styles. Orlando plays high on the wings. Toronto absorbs pressure but doesn’t fall too far back, and simultaneously creates a lot of action right on top of the opposing central defenders. Montreal bunkers deep in their own end. Seattle relies on the range of Joevin Jones. San Jose’s Nick Lima-Victor Bernardez-Florian Jungwirth combination is lights-out, but it’s basically all they’ve got.
The complexities of spatial analysis highlight that there’s more to it than just high-press, bunker, etc. There’s asymmetry, playing to the strengths of your best players and islands of action in the middle of otherwise cold areas.
This kind of analysis is fun to look at, but it can also be used in a myriad of ways. For example, we could compare teams this year to last year to see how tactics and personnel are changing:
You can see that the 2016 Montreal Impact created a lot more chaos in front of the box than the 2017 Impact do. Either they’ve become more clinical, or less goal-dangerous. Perhaps a similar analysis on passes and shots would shed light on that question. New England last year also employed a higher line of confrontation than they do this year. If one simply watched the games without looking at the data, these kinds of differences might actually go unnoticed.
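A season-over-season comparison of this kind boils down to subtracting one surface from another. The sketch below does that on raw per-game counts instead of modeled heatmaps, which is a simplification. The grids and game totals are made up for the example. Dividing by games played matters because a full 2016 season is being compared with a 2017 season still in progress.

```python
def per_game(grid, n_games):
    """Scale a grid of counts to actions per game."""
    return [[count / n_games for count in row] for row in grid]

def season_difference(grid_new, games_new, grid_old, games_old):
    """Cell-by-cell change in actions per game (new season minus old)."""
    new = per_game(grid_new, games_new)
    old = per_game(grid_old, games_old)
    return [[n - o for n, o in zip(row_new, row_old)]
            for row_new, row_old in zip(new, old)]

# Made-up 2x2 grids, purely for illustration.
grid_2016 = [[34, 68], [17, 51]]  # hypothetical full season of 34 games
grid_2017 = [[20, 10], [10, 40]]  # hypothetical 10 games played so far
diff = season_difference(grid_2017, 10, grid_2016, 34)
```

Positive cells mark areas where the team defends more often per game than it did the year before, and negative cells mark areas it has vacated.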
There are other potential applications for these kinds of results too. There’s so much detail in these heatmaps that you might imagine analyzing how a particular team performs against a type of team with a very different style. Perhaps there are patterns within matchups that seem like unpredictable wins and losses when they pan out in real life, but would have been entirely predictable if the coaches (or betting lines!) had looked carefully at the history of one style vs another.
We might also imagine examining these maps with and without a key player, to see how someone’s presence and absence impacts a team (like a more advanced version of this analysis I did a little while back). Or perhaps we could identify a turning point in a coach’s tactics, and use these to assess what it led to.
We could even make player-specific maps and repeat any of the above potential analyses.
There are, of course, limitations to this. For one, it requires a huge amount of data to be able to pick up a reliable signal. Trying to estimate these heatmaps from one game to the next, useful as that would be, is not really feasible. These maps also don’t say anything about uncertainty. For some teams, the predicted heatmap might be a highly accurate representation of their style, but for others, it might just be the midpoint of very blurry patterns. In that sense, they give the reader a false sense of certainty about team style. There’s also a multitude of statistical flaws (or at least weaknesses) in the model I’ve chosen. The most important consequence I’ve noticed is that this model tends to exaggerate when there are “islands” of action (like the Jozy Altidore/Sebastian Giovinco island for Toronto, or a number of the teams with hot spots around the corner flags). Those teams really do have tendencies to defend in those areas, but the heatmaps are way too dark there to be realistic. A better model, something like integrated nested Laplace approximation (INLA), would probably avoid those exaggerations, but it’s kind of overkill for the first analysis of this kind. Finally, there’s the matter of whether the defensive action was actually successful. The maps above just show where attempts are made, but I could make completely separate maps (and I actually have already) about the probability of an attempt leading to a turnover. Those maps tell a different side of the story that I may save for another article.
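For the success question, the crudest possible version is a raw success fraction per grid cell. The sketch below is that and nothing more; it is not the separate model mentioned above. The `(cell, won)` record layout, the `min_attempts` cutoff and the sample data are assumptions for illustration. The cutoff is a nod to the data-hunger problem: a cell with one attempt tells you almost nothing.

```python
def turnover_rate(attempts, min_attempts=5):
    """Fraction of defensive attempts that won the ball, per grid cell.

    `attempts` is an iterable of (cell, won) pairs, where cell is a
    (row, col) tuple and won is True when the attempt led to a turnover.
    Cells with fewer than `min_attempts` attempts map to None.
    """
    totals, wins = {}, {}
    for cell, won in attempts:
        totals[cell] = totals.get(cell, 0) + 1
        wins[cell] = wins.get(cell, 0) + (1 if won else 0)
    return {cell: (wins[cell] / n if n >= min_attempts else None)
            for cell, n in totals.items()}

# Made-up sample data, purely for illustration.
sample = [((0, 0), True), ((0, 0), False), ((0, 0), True), ((0, 0), True),
          ((1, 2), True)]
rates = turnover_rate(sample, min_attempts=4)
```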
Whether it’s used to predict the results of future games, analyze the impact of a major change, or just summarize a team’s style at a glance, spatial analysis like this is a natural fit for a fluid, tactical game. Really, I think the potential uses for this kind of analysis are enormous. The maps are also just kind of mesmerizing.