A Post about Possession Stats

First of all, I had intended to have this up this past weekend rather than on Thursday; my apologies. Secondly, I hope you all went out and looked at the stat table that Matty put together. It's a great accumulation of data in a nice little format, with some information that isn't readily available anywhere else. Consume this, stat heads.

This past weekend during our recording session we talked about possession. This has been discussed time and time again by people much smarter than I am. I won't waste a lot of my own words except to kind of bring things together.


Graham MacAree wrote a brilliant piece about Opta stats and, specifically, how they calculate possession. To save you a bit of time, the finding is that Opta, at the time, calculated possession as pass volume. That is, if you take one side's pass attempts over the 90 minutes and divide them by the total pass attempts of both sides, you get that side's reported "possession".

Richard Farley gives a great summary of why that's bad.

What does this mean? Let’s take a totally fake scenario. Barcelona plays three quick passes before trying a through ball that rolls to Petr Cech. It all takes four seconds, while Petr Cech keeps the ball at his feet for eight seconds before picking it up, holding it for five seconds, then putting it out for a throw in, which takes eight more seconds to put back into play.

Despite Barcelona having possession for only four of those 25 fake seconds, they’d have 80 percent of Opta’s possession (three good passes plus one bad, while Chelsea had only Cech’s unsuccessful pass). A logical expectation of a zero-sum possession figure would have that as either 16 percent or (if you credit the time out of play as Barça’s, since they’d have the ensuing throw) 48 percent Barcelona’s. Or, if you do a three-stage model (that’s sometimes reported in Serie A matches), you’d have 16 percent Barcelona, 52 percent Chelsea, and 32 percent limbo/irrelevant.
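To make the arithmetic in that fake scenario concrete, here is a minimal Python sketch. The event list, the "limbo" label, and the function names are my own shorthand for illustration, not anything Opta or Farley published; it simply reproduces the quoted percentages under those assumptions.

```python
# Farley's fake scenario as (owner, seconds, pass_attempts) tuples.
# The layout and labels are assumptions made for illustration only.
EVENTS = [
    ("Barcelona", 4, 4),  # three completed passes plus the failed through ball
    ("Chelsea", 8, 0),    # Cech with the ball at his feet
    ("Chelsea", 5, 1),    # Cech holding it, then putting it out for a throw
    ("limbo", 8, 0),      # ball out of play before the throw-in is taken
]


def pass_share(events, team):
    """Possession as a share of total pass attempts (the pass-volume method)."""
    total = sum(passes for _, _, passes in events)
    mine = sum(passes for owner, _, passes in events if owner == team)
    return round(100 * mine / total)


def time_share(events, team, limbo_to=None):
    """Possession as a share of the clock; dead time can be credited to one team."""
    total = sum(seconds for _, seconds, _ in events)
    mine = sum(
        seconds
        for owner, seconds, _ in events
        if owner == team or (owner == "limbo" and limbo_to == team)
    )
    return round(100 * mine / total)


print(pass_share(EVENTS, "Barcelona"))                        # pass-volume method
print(time_share(EVENTS, "Barcelona"))                        # clock, dead time ignored
print(time_share(EVENTS, "Barcelona", limbo_to="Barcelona"))  # clock, throw-in time credited
print(time_share(EVENTS, "Chelsea"))                          # three-stage model
print(time_share(EVENTS, "limbo"))                            # three-stage model
```

The same 25 seconds give five different answers depending on what you count, which is the whole complaint.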

Now I say "at the time" because I attempted to reproduce that calculation tonight and my math came out a bit off. So it's possible they also incorporate another statistic or something along those lines. It's highly unlikely that they have moved to a game clock, as they still report in percentages.

Now it's fair to argue that the possession stat is meaningless. Rui Xu (follow him on Twitter, seriously, do it now) of Sporting Kansas City's Performance and Statistical Analytics department thinks the following:

In general, I think they’re useless because they don’t contain any context. There is little-to-no correlation with points (if anything, there’s a slightly negative correlation in the MLS), and it doesn’t really tell you anything on a performance analysis level. Once you start adding context is when you’ll be able to draw some narratives. What is the possession percentage of the road team after they go up 1-0?  What was it before?  What is the possession percent of the home team when up 2-0 after the 80th minute?  You still have to be very, very cautious though.  - Numerology: How valuable is possession anyway?

But it's countered very well by the Revolution's Timothy Crawford.

It’s hard to say exactly what possession necessarily does at this point. Barcelona out-possesses everyone, but they certainly dropped some points and big matches. There have been studies that have shown teams that win often lose the possession battle, but as my statistics classes taught me, correlation does not necessarily imply causation. No one is saying the way to win more matches is to never have the ball. It’s trying to find what part of possession is important, and then applying that to tactics.

Look, there is a massive amount of information that has basically flooded our ability to start analyzing the American version of "the beautiful game," and guess what, we're going to turn it into 1's and 0's because we're all a bunch of Yankee bastards. But as we do that there are going to be plenty of things that we don't know. I'm not sold that possession is useless, but I can't think of exactly what is needed to make it "good" or what "good" it could produce at some distant point in the future.

At a certain point on our podcast, Matthias mentioned the specific time spent in attack or in the opposing third, versus the ball residing in your own defensive third. These things of course matter, as they will eventually create opportunities for you or your opponent. That leads into the thought that not every chance on goal is documented by a shot. There are plenty of chances that are stopped just short by defenders or goalkeepers making last-ditch efforts.

That being said, if you sit in the attacking third and lob in crosses hoping to get lucky, Charles Reep style, against a team such as San Jose or Los Angeles that is remarkably well versed in defending such an attack, what does it really matter how much time you had in the attack? You wasted it.

Still, Zonal Marking has done some correlative work on the relationship between shots and possession.

As you might expect, there’s a fairly obvious correlation – the more possession you have, the more shots on goal you’re likely to attempt, which is hardly a revelation.

The graph is interesting, however, for two reasons. First, because there are clear differences between the five separate leagues. Second, because there’s a handful of sides that don’t fit the pattern, and a lot of variation amongst the sides who see a lot possession.

The sides who are significantly ‘higher’ on the graph compared to the line of best fit are particularly efficient with possession – they have more shots than you’d expect for the amount of the ball they enjoy. Those who are significantly ‘lower’ are less efficient – they see a lot of the ball but record relatively few shots on goal.

Of course, being more or less efficient is not necessarily ‘better’ – because the sole purpose of possession is not to score a goal. Possession can be used as a defensive tactic to play out time when a side is ahead, and can be used to tire the opposition, before attacking more directly later on. The intention here is not to ‘rank’ sides, but to show their different styles.

What stands out to me is how he ended those paragraphs: "but to show their different styles". I think the biggest thing that possession shows me, at least for the time being, is what type of team you are. Are you comfortable with the ball or without it? Do you run up and down the pitch all day, winning and then losing possession, pressing for that additional chance at goal?

Maybe current possession stats can't tell us all of that information. But looking at breakdowns and heat maps of possession on the pitch, showing where players possessed the ball the most, can give us an (albeit short) narrative. And ultimately it's all about context and applying it to the data that we use: understanding why we chose the data we did and then being able to articulate it back to others, who can point out its deficiencies as well as some of its strengths.

I'll include one additional link here on the back end of this thought. The site Soccer By Numbers, run by one Chris Anderson, co-author of the book "The Numbers Game", hosted a post last year by Andrew Brocker. It doesn't really have anything different from what you've likely seen in other places; however, I feel compelled to include it.

Brocker creates a neat little chart that displays the relationship between successful passing and maintaining possession of the ball for the national teams at the 2012 Euros. There are some interesting thoughts that go along with it and a tidy summary. I encourage you all to take a look.

Again, this isn't about coming to a conclusion; it's about continuing the talk and the attempt to raise the level of knowledge and understanding on a specific topic. I have hardly any of this figured out, and I'm sure that there are much smarter people out there who could contribute much more. Should you be one of them or find their material, make sure you point us in that direction.

Opta loosens the chains a bit

opta

Look, it's late, you'll have to forgive the hack job JPEG above. I have no idea why I'm up besides the fact that I don't have to go to work in the morning. But with the upswing of free time, I'm just perusing the internet and generally reviewing information that I often don't find time to cruise through. While sifting through data and spending my time nodding off to sleep at my keyboard, I came across Opta's playground site where they are "opening up the database."

I'm not sure how new this is or if it is just something I missed. But I know it wasn't available the last time I was around. It's basically a way for nerds like me (and possibly you...) to submit data requests.

An understatement would be to call this development "cool."

A lot of data within soccer is closed off, and what is available generally leaves a lot to be desired. Being a guy who used to write a lot about baseball, I'd find it awful--strictly speaking from my perspective--to write about a player if the information provided were as thin as modern-day soccer data.

It's safeguarded and looked after as if it were top secret defense information. To be fair, I actually think that some of that information is kept more secure than defense information. But that's not really the subject. Having the ability to submit an e-mail request for specific data is exciting. It's a marked improvement over the current status quo.

Sure, you could complain about the fact that they only accept one application in all categories per email address, but who cares? It's an improvement, and here at ASA, that's what we're all about. Improvement. And soccer. And beer. So that's not what we're all about. But it's part of what we're about.

ASA Podcast: Episode III

Hello, there, you fine traveler. If you are of a west coast bias, you'll love today's show. First I have to say I'm really saddened by the fact that neither Matty nor I worked in a "Third" joke, with as much as we like to talk about cutting the field into thirds, and with this being our third podcast. *sigh* It didn't happen; maybe we'll save the jokes for Episode XXXIII. Anyway... we have the following for you.

We chronicle the life and times of yours truly, me, Harrison Crow, and give a little background as to why I created the site.

We talk a bit about possession stats: why they're important, why they are not, and how they are sometimes misleading. I referenced a blog post from Opta during this conversation. Please take a look, as it has a lot of really good information. I'm trying to come up with a bunch of material for a post later on today. I hope you all check it out.

We do mention Montreal and their Italian-influenced defense/counter-attack system and how it's helped them to a second-place start in the Eastern Conference.

Next, we chronicle the poor start for the Sounders, and perhaps why they have produced no goals despite an excellent possession percentage. We also mention Sporting Kansas City and the LA Galaxy, as well as the Portland Timbers and their dominance in possession.

We use the segue of the Portland Timbers to talk a little bit about Will Johnson. He's an underrated pick-up who has scored some amazing goals, and his ability to troll Alan Gordon is exceptional. Yep, he's gone for 3 games. Gordon, not Johnson...

And for those of you who didn't catch his brilliant goal at home, which effectively gave the Timbers 3 points, check it out below:

[youtube https://www.youtube.com/watch?v=aHMV-zPqccc]

Lastly, we talked about our Game of the Week, the matchup between Sporting Kansas City and the Los Angeles Galaxy. While they are separated by 7 points in the table--with two games in hand for LA--Tempo Free Soccer's rankings have them 1 and 2 overall in MLS. We make our picks for who we like, explain why, and offer a few little facts to back it up.

We hope you enjoy the podcast!

[audio http://americansocceranalysis.files.wordpress.com/2013/04/american-soccer-analysis-podcast-3.mp3]

ASA Podcast: Episode II

Hope that you all enjoyed your weekend! We're back with Episode II, where Matthias and I discuss a bit about crosses and open-field play in the midfield, and what value they can add to a club. If you care to have a deeper look at some of the numbers, Matty was nice enough to put a piece together. If not, tune in to the podcast below. Hopefully we'll start producing some more content by May, and there will be a reason to check back with the site more than once a week!

[audio http://americansocceranalysis.files.wordpress.com/2013/04/american-soccer-analysis-podcast-2.mp3]

Also make sure to check out the YouTube video, linked below, of the SSAC13 Soccer Analytics panel. A lot of good stuff there.

The small effects of crossing in MLS

During the podcast that Harrison and I recorded yesterday (to be posted soon), we discussed the value that a midfielder can bring to a team even if he’s not scoring. Harrison brought up, for instance, Mauro Rosales' crossing ability as a dangerous weapon for the Sounders. The only thing I had to offer to the conversation was data-driven, as I have not seen the Sounders play this season (yes, I unfortunately missed the Timbers-Sounders match). I noticed in the 2013 data that teams that “cross too much” as part of their offenses have tended to fare slightly worse this season. Let me explain the methodology…

The game-by-game statistics for “open-play crosses” can be found in all boxscores on mlssoccer.com. An open-play cross is just one that occurs during the run of play, as opposed to free kicks and corner kicks. Crosses can be dangerous, as we’ve seen during the last decade from the German national team specifically, but not every team is the German national team.

To isolate the effectiveness of crosses, I started by dividing a team’s crosses by its scoring attempts—scoring attempts being just any shot or attempt in the general direction of the goal. The ratio of crosses to scoring attempts serves as a decent proxy for the fraction of a team’s offense that comes from crosses rather than direct play through the middle. Is it perfect? Of course not, but I hope that this generates some discussion about how to better evaluate the role of crossing in MLS. On to the results!

For every game played up through April 6th, I took the home and away teams’ ratios of crosses to goal-scoring attempts. As an example, in the Timbers-Sounders matchup on March 16th, Portland recorded 20 open-play crosses and 13 scoring attempts. Seattle played 9 crosses in, and had just 7 scoring attempts. In both cases, we see that the fraction is actually greater than one. So while this is not really a percentage at all, dividing by attempts still helps to control for a team’s overall offensive production.

Portland = 20/13 = 1.54

Seattle = 9/7 = 1.29

Then I take the home team’s ratio and divide by the away team’s ratio.

1.29/1.54 = 0.84
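For anyone who wants to repeat the arithmetic on other boxscores, here is a minimal Python sketch of the two steps above. The function name is just something made up for illustration; the inputs are the Portland and Seattle numbers already quoted.

```python
def cross_ratio(crosses, attempts):
    """Open-play crosses per scoring attempt for one team in one match."""
    return crosses / attempts


# Boxscore numbers quoted above for the March 16th match.
portland = cross_ratio(20, 13)  # away team
seattle = cross_ratio(9, 7)     # home team

# Home team's ratio divided by the away team's ratio.
home_vs_away = seattle / portland

print(round(portland, 2), round(seattle, 2), round(home_vs_away, 2))
```

A team with zero scoring attempts would break the division; this sketch doesn't handle that case.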

The fact that this number is less than one simply suggests that the home team, the Sounders, used crosses as a smaller proportion of their open-play tactics. Next, I ran a simple linear regression between this final ratio and the goal differential in all matches. Here is the formula:

Goal Differential = –0.31 x Ratio + 0.82*
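To get a feel for the scale of that formula, here is a small Python sketch that plugs a few ratios into it. The slope and intercept are the ones in the formula above; the function name is made up, and I'm assuming goal differential means home goals minus away goals, which isn't spelled out.

```python
SLOPE = -0.31     # coefficient on the home/away cross ratio, from the formula above
INTERCEPT = 0.82  # constant term, from the formula above


def predicted_goal_diff(ratio):
    """Fitted goal differential (assumed home minus away) for a home/away cross ratio."""
    return SLOPE * ratio + INTERCEPT


# 0.84 is the Seattle/Portland ratio; 1.0 and 2.0 are hypothetical comparison points.
for ratio in (0.84, 1.0, 2.0):
    print(ratio, round(predicted_goal_diff(ratio), 2))
```

With the Seattle/Portland ratio of 0.84 the fitted value is roughly 0.56 goals, against about 0.51 when both teams cross at the same rate, so the shift per match is small.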

So what does this mean? Well, basically, it says that the more an MLS home team relies on crosses relative to its opponent, the lower its goal differential tends to be. There are tons of other variables at play in soccer, but let me take a stab at explaining this correlation.

Crosses are valid goal-scoring opportunities, and any team would almost always rather get a cross in than, for instance, play the ball all the way back to the defense. But in that same vein, a team would rather attack through the middle and earn a more direct goal scoring opportunity than just whip crosses in all day. So I think what this regression tells us is something that just about every soccer player knows. If a team is forced to constantly play from the wings—if it is forced outside more often than not—its goal-scoring potential will be reduced. Crosses are a necessary part of most offenses simply because a team cannot always get a perfect opportunity in front of the net.

Teams that cross a lot may be doing so as part of team tactics, or in any given game it could be an efficient defense that forces the offense to play from the wings. Either way, there is a weak correlation suggesting that teams should not settle for crosses if they can avoid it.

I’m sure we haven’t learned anything new here, but I’m a firm believer in attempting to quantify phenomena in sports in an effort to better understand the effect, and to actually measure the size of the effect.

*P-value was 8%, and R-square was 0.05. So this was definitely a weak relationship, and there is more than enough room for certain teams with the right personnel to thrive on crosses.

SSAC13: Soccer Analytics Round Table

It's finally here! If you didn't get the opportunity to get out to the MIT Sloan Sports Analytics Conference in Boston two months ago--though I was in town for business, I wasn't fortunate enough to get to go--the videos from the round table discussions have been posted to YouTube.

This is great stuff from some great minds. Take a look.

[youtube http://www.youtube.com/watch?v=2Ye-mvV9ELI]

ASA Podcast: Episode I

The site is largely unfinished, and still in need of a lot of love. But that will come in time. Until then, and in an effort to tide you over, here is Episode I of our podcast. It's a bit rough, and it's the first time either of us has done this, so go easy. Things will get better. Right now, podcasts are recorded on Saturdays and will be released sometime Sunday or Monday, with a running time of roughly 30 to 60 minutes. Enjoy!

[audio http://americansocceranalysis.files.wordpress.com/2013/04/american-soccer-analysis-podcast-1.mp3]

(and just in case you aren't using Google Chrome)

American Soccer Analysis: Podcast I

The First Post

We're here, we're loud, we're here to make a difference. There is a lot to be said within the world of soccer or "futbol", especially concerning analysis and analytics. Using what is available is kind of the name of the game. But the thing about American soccer is that, more than ever, there is a lot out there that allows for--at the very least--a better way of getting the whole picture.

None of us here have the answers, nor do any of us think we're going to come up with them. But there are a lot of smart people out there who sometimes go without being heard, and our self-proclaimed job is to find those people, try to give them a voice, and have some fun doing it.

We're going to do a lot of things wrong, more so than right. That's cool. You're going to make mistakes. The point is that you make it public: you show what you... or we... did wrong, and you show how you corrected or fixed that mistake. That gives the process some transparency, something sorely lacking in American soccer and in analytics within the beautiful game.

For now this is just a site to host our podcasts and generally collect the little bits of wisdom that we've scoured the internet for. If you find something good, give me a shout.