By Jared Young (@jaredeyoung)
Last year I became interested in using statistics to measure a team’s style of play. I was inspired by a Jonathan Wilson article that laid out two extreme styles, which he labeled proactive and reactive. Proactive teams are concerned primarily with possessing the ball and with pressing high on defense to win it back as quickly as possible. This is Barcelona and tiki-taka in its purest form. Reactive teams are characterized by a desire to maintain their defensive shape; they typically apply low defensive pressure and are direct in their attack.
I've adapted the score, which I call the P Score, since last time; the details, for the curious, are below. One change worth pointing out now is that the score is on a 10-point scale: 10 indicates a high level of possession and 1 is very reactive.
Here are the P Score rankings for MLS through March. The columns to the right of the total scores show a team’s proactivity relative to its opponents. To read the table, start at data column 3: Orlando City was less proactive than its opponent in 25% of its games and averaged one point per game in those games. A game is considered even if the two teams’ P Scores for that game were within one point of each other.
| Rank | Team | P Score | Pts/Gm | Less Proactive: % of Gms | Less Proactive: Pts/Gm | Even: % of Gms | Even: Pts/Gm | More Proactive: % of Gms | More Proactive: Pts/Gm |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Orlando City SC | 9.5 | 1.3 | 25% | 1 | 25% | 3 | 50% | 0.5 |
| 3 | New York Red Bulls | 7.3 | 2.3 | 0% | 0 | 0% | 0 | 100% | 2.3 |
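The "even" rule and the per-category columns can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the table: the function names, the choice to treat a gap of exactly one point as even, and the game-level scores are all assumptions made for the example.

```python
from collections import defaultdict

# Assumption: "within one point" includes a gap of exactly 1.0.
EVEN_MARGIN = 1.0


def classify_game(team_p, opp_p, margin=EVEN_MARGIN):
    """Label one game by the team's P Score relative to its opponent's."""
    diff = team_p - opp_p
    if abs(diff) <= margin:
        return "even"
    return "more" if diff > 0 else "less"


def split_by_proactivity(games):
    """Return {category: (share_of_games, points_per_game)} for one team.

    `games` is a list of (team_p, opp_p, points) tuples, where points is
    3 for a win, 1 for a draw and 0 for a loss.
    """
    points_by_cat = defaultdict(list)
    for team_p, opp_p, points in games:
        points_by_cat[classify_game(team_p, opp_p)].append(points)

    total = len(games)
    summary = {}
    for cat in ("less", "even", "more"):
        pts = points_by_cat.get(cat, [])
        share = len(pts) / total if total else 0.0
        # Categories with no games report 0.0, as in the table above.
        ppg = sum(pts) / len(pts) if pts else 0.0
        summary[cat] = (share, ppg)
    return summary


# Made-up game-level P Scores and results, for illustration only.
example_games = [
    (6.0, 8.5, 1),   # less proactive than the opponent, draw
    (9.0, 8.5, 3),   # within one point of the opponent, win
    (10.0, 4.0, 1),  # more proactive, draw
    (10.0, 3.0, 0),  # more proactive, loss
]
summary = split_by_proactivity(example_games)
overall_ppg = sum(p for _, _, p in example_games) / len(example_games)
```

With these invented inputs, the splits come out as 25% of games less proactive at 1.0 points per game, 25% even at 3.0, and 50% more proactive at 0.5. The overall average is 1.25 points per game, which is the games-weighted average of the three categories.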
- Orlando City SC scores the highest so far with a P Score of 9.5, significantly higher than 2nd place Montreal.
- A couple of teams that are usually known for their possession-oriented style of play are at the bottom of the list. The Portland Timbers' change of style has been noted, but Sporting Kansas City anchoring the list is a big surprise given their history of playing a 4-3-3.
- Two of the best reactive teams last year, New England and Dallas, are again near the bottom of the P Score rankings.
- Looking at the table in some depth reveals some interesting early trends about where points are concentrated. I summarized the table in a visual below.
What this table says is that if a team is going to be proactive, it’s beneficial to be more proactive than its opponent. The same goes for reactive teams: results are better when a team is more reactive than its opponent. The implication is that commitment to and execution of a style of play, regardless of the style, is a key contributor to success. That’s a pretty fascinating finding, and I’ll monitor the numbers over the season as the sample sizes get bigger.
The New P Score Calculation
The P Score is built on the idea that pass type data can indicate what style of play a team is using. A proactive team will attempt a higher number of shorter passes and should, in theory, have a higher percentage of backward passes. A direct team will attempt longer passes in an effort to counterattack and will have fewer backward passes.
When I developed the P Score using the 2014 season, I was disappointed in the availability of passing data and was forced to use variables that I didn't want to use. The model simply used the percentage of long passes and total passes. Recently, WhoScored added more pass types to its match center and I've evolved the model. I tried most of the pass types available, including short, long, backward and through passes as well as crosses. I also looked at blocked shots because reactive teams block a higher percentage of shots than proactive teams. Given their penchant for defensive shape, that makes sense.
I used multivariate regression on outcomes from a collection of games from the 2014 season. You can read which games I selected for the dependent variable in the prior post. Only two pass types ended up being statistically significant: the percentage of backward passes and the percentage of long passes. Both coefficients adjust the model in the direction you would expect. A higher percentage of long passes lowers the score and a higher percentage of backward passes increases the score. I did not use total passes in the model because that variable can be strongly influenced by an opponent, whereas percentages are more likely to indicate a team’s actual intent. The R-squared of the new model was a sturdy 0.79.
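To make the regression step concrete, here is a rough sketch of an ordinary least squares fit on the two significant predictors, along with the R-squared calculation. It only shows the mechanics: the data are invented, the dependent variable is a stand-in style rating (the real one is described in the prior post), and the helper names are hypothetical. In practice a statistics package would do this work.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit of y on the predictors in rows, with an intercept."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, beta):
    """Share of the variance in y explained by the fitted model."""
    preds = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented data: (share of long passes, share of backward passes) per team-game.
pass_mix = [
    (0.10, 0.10), (0.10, 0.10), (0.10, 0.20), (0.10, 0.20),
    (0.20, 0.10), (0.20, 0.10), (0.20, 0.20), (0.20, 0.20),
]
# Invented style ratings (higher = more proactive), standing in for the real outcome.
style_rating = [6.0, 5.0, 8.0, 7.0, 3.0, 2.0, 5.0, 4.0]

intercept, long_coef, backward_coef = fit_ols(pass_mix, style_rating)
fit_quality = r_squared(pass_mix, style_rating, [intercept, long_coef, backward_coef])
```

On this invented data the long-pass coefficient comes out negative and the backward-pass coefficient positive, the same directions described above. The magnitudes and the R-squared say nothing about the real model.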
The old and new models had similar results. I scored the 2015 season both ways and the correlation between the two is 0.95. Orlando City SC is still the top team and Sporting Kansas City is the bottom team scoring both ways.
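For readers who want to run the same kind of comparison, a correlation between two sets of team scores can be computed as follows. This sketch assumes a Pearson correlation, which the post does not specify, and the two score lists are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up scores for five teams under the old and new models.
old_scores = [9.0, 7.5, 6.0, 4.0, 2.5]
new_scores = [9.5, 7.0, 6.5, 3.5, 2.0]
r = pearson_r(old_scores, new_scores)
```

A value near 1 means the two models rank and space the teams in nearly the same way.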
I strongly prefer this version of the model because it uses the mix of pass types, expressed as percentages of a team's passes, to indicate style rather than anything related to volume, which, as I mentioned, is much more likely to be manipulated by an opponent.
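A small example shows why percentages hold up better than volume. The pass counts below are hypothetical: the same team keeps the same mix of passes but completes far fewer of them against an opponent that dominates the ball.

```python
def pass_shares(long_passes, backward_passes, total_passes):
    """Return (share of long passes, share of backward passes)."""
    if total_passes <= 0:
        raise ValueError("total_passes must be positive")
    return long_passes / total_passes, backward_passes / total_passes


# Hypothetical counts: a normal game and one where the opponent limits volume.
normal_game = pass_shares(long_passes=60, backward_passes=90, total_passes=500)
limited_game = pass_shares(long_passes=36, backward_passes=54, total_passes=300)
```

Total passes drop from 500 to 300, but both games produce the same shares (12% long, 18% backward), so a model built on shares would score the two games identically.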
If you have any questions about the methodology, please leave a comment or reach out to me on Twitter @jaredeyoung. I’ll be publishing the P Score table monthly throughout the season.