Data

Shots in the Dark: how data providers tell us different versions of what happened by Eliot McKinley

Recently, this tweet created a small firestorm in the soccer analytics community. While the source of the error is unclear, there obviously weren’t 1,300 passes and 50 shots in an English League 2 match. This led to responses from prominent analysts such as StatsBomb’s Ted Knutson (including on his podcast [starts at 10:45]), Opta’s (and ASA alum) Tom Worville and Ryan Bahia, and Chris Anderson, author of The Numbers Game. All of them were saying pretty much the same thing: question the data you are using. If the data you are using to analyze a problem is not valid, then your solutions won’t be either.

So what do we know about the data that is used for soccer analysis? Previous studies have shown that people are pretty good at agreeing about what type of event occurred in a soccer game (e.g., shots, tackles). But as far as I can tell, the accuracy and precision of the locations of game events among the various data providers have not been studied. As Joe Mulberry pointed out when looking at the troubling inconsistencies between spatial tracking data and event data, small differences in locations can have big effects on downstream analysis, including expected goals (xG) models. In other words, small inconsistencies in how data is tracked can have big consequences for the models built off that data. So what are the differences between how soccer data providers collect and report their data?

Read More

Turf and Injuries: The Data Hurts by DMP

One of the most peculiar matches of the 2018 regular season occurred on August 18th. The LA Galaxy were already stretched thin from injuries to both dos Santos-es and Romain Alessandrini (their three DPs) and defender Michael Ciani for Sigi Schmid’s return to Seattle. But when they showed up in town, there was a huge name - perhaps the biggest name in MLS - missing from the lineup. That name was Zlatan, and by all indications his absence was voluntary.

By the end of the afternoon, the Galaxy really could have used one of the greatest players ever to kick a soccer ball. They ended up suffering their worst loss of the season and Seattle notched their best win (5-0). Oh yeah, and the Galaxy missed the playoffs by fewer than three points.

We all know why he missed that game. It’s because the Sounders play on FieldTurf. There’s a perception out there that playing on artificial grass increases the risk of injury, and Zlatan had hurt his knee not long before (not on turf).

The superstar is not alone in his perception. I remember being disappointed not to see Thierry Henry play at CenturyLink Field in 2013. In fact, a group of Canadian researchers surveyed 99 MLS players back in 2011 and found that the vast majority (93%) said they believe third-generation artificial turf (FieldTurf) increases the risk of injury.

Read More

Shots Not Taken: Exploring the propensity of teams to shoot from good positions by DMP

Do you ever find yourself yelling “JUST SHOOT THE BALL!” at the TV screen? Of course you do, you watch soccer! Sometimes it can be maddening to see your star striker make his/her way into the box, only to futz around with a pass or dribble. At times it doesn’t even matter whether that pass or dribble was successful. Does it seem like your team does it particularly badly? You’re probably not alone.

Psychologists will be quick to point out a thing called negativity bias. Basically, we probably all think our team dilly-dallies in the box more than others because we remember it better. The existence of this bias, by the way, is supported by a convincing amount of experimental evidence. But it raises the question: who is empirically more likely to shoot when they can?

Read More