How Data Changes My View of MLS or a Frank Exploration of Luck in Dallas

by Harrison Crow (@harrison_crow)

As I made my way to Toyota Stadium on Friday night, I was concerned about just making it to the game on time. The traffic was horrendous and it was my first time driving around Frisco; together those dragged down my pace of getting to the park, which is why I was walking up to the gate as fireworks were set off and the National Anthem finished.

I stood just outside the south gate waiting for my ticket to arrive as Dominique Badji scored the Rapids' first goal of the season, and I felt a sense of validation: this was going to be a game in which Colorado could be competitive and challenge for full points leaving Texas, as I had implied with my post Friday morning.

It wasn't that I thought Dallas was a bad team as I wrote about them. I think Dallas is a very good team even after that beating, and I'm pretty certain they'll make the playoffs out of a very stacked and competitive Western Conference. The problem is that prior to Friday night they had the second highest PDO in Major League Soccer, a metric that is a measurement of luck based upon finishing and save percentage.

FC Dallas had scored a total of eight goals as a team behind the contributions of Blas Perez with three, Tesho Akindele just behind him with two and Fabian Castillo trailing with just one. Those three are what is going to drive the Dallas bus to success, just as the trio did last season, and though goals will come from other sources these are three that you can point to as "the guys".

The problem is that all three have been scoring goals with a much higher efficiency than what we'd seen previously from them. What we've learned about scoring rates is pretty basic: they have manic highs and depressive lows. Even with the number of quality chances they've gotten, as described by our expected goal metric, it's not something that we could reasonably expect to continue. Again, that's not because they don't have fantastic goal scorers or because those players are of lesser quality than the rest of the league. There are few players in the world that can score at their current rates.

Their high PDO meant that if the volume of Dallas' shots didn't change, they weren't going to continue scoring goals at the same rate.

Likewise, Colorado was riding a similar wave of eventual regression.

While Dallas had a high PDO, Colorado had a very low one (956, tied for second lowest in MLS) that was largely driven by their complete lack of goals across 48 shots. Yes, 48 shots without a goal. They should have, by our own measurements, scored four goals by the time they arrived to play FC Dallas this week and instead were sitting on a goose egg. Few teams can take near 50 shots over any given time frame during the season from the attacking third and come up that empty.
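To get a feel for how unlikely that goose egg is, here is a rough Python sketch. It assumes, purely for illustration, that all 48 shots were independent and equally likely to go in, with the four expected goals spread evenly across them. Real shots vary a lot in quality, so treat this as a ballpark and not as the output of our expected goals model. The function name is made up for this example.

```python
def prob_no_goals(shots, expected_goals):
    """Chance of scoring zero goals if every shot has the same conversion odds.

    Assumption (not from our model): shots are independent and each one
    carries expected_goals / shots probability of becoming a goal.
    """
    per_shot = expected_goals / shots
    return (1 - per_shot) ** shots


# Colorado before the Dallas game: 48 shots, roughly 4 expected goals.
p = prob_no_goals(48, 4)
print(f"Chance of zero goals: {p:.1%}")
```

Under those simplifying assumptions the chance of a blank lands somewhere between one and two percent, which is why it was hard to expect the drought to last.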

Just for a bit of applied science: over the past five seasons only two players have taken more than 48 shots and not scored a goal, Juninho for the Galaxy in 2014 and Kalif Alhassan for the Timbers in 2011. Based on shot leverage we can tell that Juninho was shooting from long distance and wasn't finding a lot of good chances. Likewise, given that he only scored five goals through 94 matches and is now playing in the NASL, it's possible Alhassan does not possess the finishing skills required of most goal scorers at the MLS level.

I'm not trying to say that it was certain Colorado was going to win a game or even score a goal. Dallas could have very well done things differently and there is luck to account for, too. Don't think for a second that shots like Dillon Serna's happen every week; there is a reason why it was special. The shot could have gone either wide or high and I'm a bit surprised that Walker Zimmerman didn't get a boot on it. Most players across soccer LEAGUES (not just MLS) convert those shots into goals in less than three percent of opportunities.

Colorado's eruption was mind-blowing in the sense that I didn't expect them to score four goals, blow off the doors and leave Dallas with a clean sheet and all the points. But it's not as though I thought it couldn't happen, either. That's the thing about all of this: we aren't trying to get a high definition picture of the future or to take the beauty out of anything. Instead, the point is to give the accomplishment context and measurement while understanding why it could have happened and whether we should continue to expect it to happen.

Colorado isn't a team that I'm convinced is going to be anything great. They're probably at very best a 5th or 6th playoff seed if their defense holds up all the way through, which is another topic entirely. Likewise, Dallas has the attacking pieces to continue to beat their expected goal parameters and a top-3 seed isn't out of the question. But if they can't either create more opportunities or continue to finish chances at a high rate, their regression may continue.

Weekend Kick-Off: Dallas Luck, Rapids Unlucky.

by Harrison Crow (@harrison_crow)

This week I come to you live from Dallas. I love Friday night soccer so much that I had to go and finally experience one with one of the hottest teams in MLS... the Colorado Rapids. Okay, maybe not. But perhaps still an interesting matchup. If you're at the game make sure you say, "hi". Good luck, because I'm in Waldo land.

Colorado and Dallas are connected by two primary memories: an odd MLS Cup match-up of two Western Conference teams in 2010 hosted in Canada, and the defection of Óscar Pareja. While the MLS Cup contest was largely considered to be a snooze-fest, the return of Pareja to Dallas by basically forcing the Rapids' hand was crazy, awkward, and a bit surprising, too.

Now Pareja's once high-flying Rapids attack has turned a defensive cheek under the guidance of club legend and mustache aficionado Pablo Mastroeni. Despite controlling only 48% of the ball per match, Colorado have managed to turn limited opportunities into the third best possession ratio in the attacking third in MLS. That is to say, they complete more passes in the final third than their opponents by a significant margin.

They also create more shots than their opponents, with better shot position according to their expected goal differential. They clearly haven't done this in an aesthetically pleasing manner, and their inability to score is downright shocking, but it's possible this Rapids team isn't as bad as envisioned, or as bad as the current narrative being passed down suggests.

Dallas has also been a formidable side after once again making a strong start to the season and, just like deja vu, they are doing so without the creative production of Argentinian midfielder Mauro Diaz, who has missed time for a few different reasons. While I think the majority of supporters would love to see Diaz return to the lineup, the club's two points per game pace without him is impressive, and sets them on the path towards a strong season with Supporters' Shield-like ambitions. It also gives hope that his return might catapult the club that much further.

Still, I would advise a bit of tempering going forward. While the joke surrounding FC Dallas the last couple years has been the "collapse" that comes with the end of Spring and the launch of the Summer campaign, this season we might see some problems ahead based off of three sets of numbers: their PDO, TSR and of course their expected goal differential.

PDO - a predictive metric based on both finishing percentage and save percentage, which tends to regress toward the 1000 level, is floating a bit high at this point.

TSR - Total Shot Ratio lends credence to PDO, showing that they've been out-shot to this point in the season.

xGD - Expected Goal Differential works a bit in their favor. Despite surrendering more shots than their opponents, they've primarily found better position and with that a high probability of shot success which might explain some of their PDO. However, it's still only 10th best in MLS and, again, could point towards a fall back into the Western Pack.
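For anyone who wants to see how the first two numbers are put together, here is a minimal Python sketch. It assumes the commonly used definitions: PDO as finishing percentage plus save percentage (both measured against shots on goal), scaled so a typical team sits around 1000, and TSR as a team's share of all the shots taken in its matches. The function names and the sample inputs are invented for illustration; they are not Dallas' actual figures, and the exact inputs we use on the site may differ.

```python
def pdo(goals_for, sog_for, goals_against, sog_against):
    """Finishing % plus save %, scaled so that 1000 is roughly average."""
    finishing_pct = goals_for / sog_for
    save_pct = 1 - goals_against / sog_against
    return 1000 * (finishing_pct + save_pct)


def tsr(shots_for, shots_against):
    """Total Shot Ratio: share of all shots in a team's matches that it took."""
    return shots_for / (shots_for + shots_against)


# Hypothetical team: scores on 3 of 10 shots on goal, concedes on 2 of 10,
# and takes 40 shots while allowing 60.
print(round(pdo(3, 10, 2, 10)))  # above 1000 hints at some good fortune
print(tsr(40, 60))               # below 0.5 means the team is being out-shot
```

A PDO well above 1000 paired with a TSR below 0.5 is the combination described above: results running ahead of the underlying shot volume.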

While most are probably preparing for a 1-0 or even 2-0 game in Dallas' favor, I wouldn't be surprised to see some opportunities fall in Colorado's favor, to include the result. I'm not a huge fan of selling off Deshorn Brown but if that couldn't have been helped, as is some portion of the story being told, the primary concern at this point is the gap that stands between his departure and new DP striker, Kevin Doyle. But the team still possesses the attacking talents of Dillon Powers, Juan Ramírez and Vicente Sanchez. All are strong enough weapons to create chances and maybe find the back of goal. Oh and Gabby Torres, he's still there... *checks Colorado Rapids active roster* Oh, yep, he's still there too.

Or maybe Fabian Castillo and Blas Perez could just destroy everything. Either/or or perhaps neither seems conceivable.

 

FANTASY PERSPECTIVE:
 

Colorado Rapids

Axel Sjoberg (owned 32.7% - worth $4.7)
A value buy in the defense, this hulking man has taken up starts in light of the injuries and international call-ups this past year. He did not play the previous two games and I'm not sure he'll play this weekend, but he's a great reserve guy to have from a team that typically boasts a defensive posture by nature.

Dominique Badji (owned 23.4% - worth $4.4)
Badji is a speedy winger drafted this off-season by Colorado that's kind of creepy in the "dating a new girlfriend that looks kind of like your old ex-girlfriend" sort of way in a post Deshorn Brown world. As a fantasy selection he's cheap, which has garnered a bunch of manager selections, and he has seen minutes in all four Rapids games this season, making him all the more valuable.

My biggest qualm with Badji, and something to be advised about: he has only collected .31 expected goals scored from three total shots through his 202 minutes on the pitch this year. To put that in perspective, Cyle Larin has 101 minutes and has collected .71 expected goals scored off five shots. Add that to Luke Shelton taking his spot in the 18, and he might not see so much time on the pitch in the not so distant future.
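To make the Badji and Larin comparison a little easier to eyeball, here is a small Python sketch that puts both on a per-90-minute footing. The per-90 conversion is a standard way of comparing players with different playing time, though it is my framing here and not a column from our tables; the function name is made up for the example.

```python
def per_90(value, minutes):
    """Scale a counting stat to a rate per 90 minutes played."""
    return value / minutes * 90


# Numbers quoted above: expected goals and minutes played.
badji_xg90 = per_90(0.31, 202)
larin_xg90 = per_90(0.71, 101)

print(f"Badji: {badji_xg90:.2f} xG per 90")
print(f"Larin: {larin_xg90:.2f} xG per 90")
print(f"Larin is generating {larin_xg90 / badji_xg90:.1f}x Badji's rate")
```

On these small samples Larin is producing chances at more than four times Badji's rate, which is the gap the raw numbers are hinting at.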

FC Dallas

Chris Seitz (owned 22.5% - worth $5.1)
Seitz has, according to our data, shown to be worth a whole goal better than the average keeper during the course of his time between the posts for Dallas. That is saying something considering the "reactive" nature of the team and the amount of shots they've allowed thus far into the season.

Ryan Hollingshead (owned 19.6% - worth $5.5)
The former UCLA talent has come back to the States from helping his brother build churches and become the much needed answer for "what happens when Mauro Diaz goes missing" with .50 expected assists in 307 minutes, making him a great bench value pick.

FRIDAY NIGHT'S WATCHABILITY SCORE:

This week we added another factor into our watchability score: possession. The idea behind that is simple: the fewer possessions, or fewer turnovers and attrition you tend to have, the more aesthetically pleasing a match is to watch. This addition and a slight tweak in how the factors are combined help put this on a basic 5-100 scale, which makes everything a bit easier to digest and consume.

Next week I'll put up another survey asking you to rank your favorite games of the week, and then we'll compare it to the Watchability Score and hopefully it will iron out the kinks. I will say that we need more than the 11 participants of last week, seven of whom voted for the Sounders match as the most entertaining. That has me asking if people just ranked it first because it was their favorite team or because it was an entertaining game. I point this out specifically because it ranked last in our scale last week. Also, did you see that Sporting v Union game? I mean, c'mon.

THE WEEKEND MATCH-UPS:

Saturday:

Columbus Crew SC (0.25) at New England Revolution (0.46) - WS: 54
After a huge mid-week draw against a surging Vancouver, and short Federico Higuain, the Crew take on a Revolution side that might have Kelyn Rowe trying to be this year's version of Lee Nguyen, ranking 5th in xG+xA. Prediction: Revolution Win.

New York City (-0.35) at Philadelphia Union (-0.03) - WS: 45
The scale sees this as a better than average game to watch, and with David Villa out there I have to believe that's a good call. That said, both teams have been rather lackluster this season. Prediction: Draw

New York Red Bulls (-0.34) at DC United (-0.81) - WS: 37
Rivalries are fun and this one is no exception. Not even considering the extenuating circumstances, the scale has this game as the top game to watch this week. Prediction: Draw

Montreal Impact (-0.31) at Houston Dynamo (-0.45) - WS: 68
Houston play host to a team that is only the second ever MLS club to qualify for a CCL final. Despite that, I think this is probably going to be one of the toughest games to actually sit through and watch. Prediction: Draw

Real Salt Lake (-0.34) at Sporting Kansas City (1.24) - WS: 57
It's only a matter of time before Sporting just has their big break. Our metrics don't just have them as the best team in MLS, it's not even close. Oh, and RSL just continues on messing with our models. ONE OF YOU, MAYBE BOTH, NEED TO QUIT MESSING WITH US! Prediction: Sporting Wins

Vancouver Whitecaps (0.42) at San Jose Earthquakes (-0.65) - WS: 50
Could this be the Whitecaps' year? Maybe? Maybe this is the year the Earthquakes make a return to being relevant in that "we're really not that harmful" sort of way. Prediction: Whitecaps Win

Sunday:

Orlando City SC (0.28) at Portland Timbers (0.24) - WS: 51
Two very entertaining sides in this one. The scale is only at 51 on this but it's also the third lowest (read as: best) score this week. Probably a great match to sit down and watch, and if it goes south there is always that afternoon nap. Prediction: Timbers Win

LA Galaxy (0.40) at Seattle Sounders FC (0.39) - WS: 60
This game consists of two great sides that have been powerhouses in recent years. The stats still very much favor both teams being talked about in that top tier, but their overall performances have led fans to be a bit skeptical about the immediate future. Prediction: Draw

NERD IMAGERY

How Pablo Mastroeni would look dancing after winning in Dallas this weekend. Things 'bout ready to get interesting up in here.

Perfecting Our Watchability Score

This is a simple request that you would just take a moment to rank the games in the survey below in order of the most entertaining. This is to help me work out the kinks in our Watchability Score, which in turn helps predict the best games to watch for the weekend and maybe prepare you for your club playing a bit of a dud or perhaps a nail-biter.

If you didn't watch every game this weekend that's okay. I'd encourage you to still rank the games based upon how the score ended up and any highlights or comments you heard surrounding the game. Your decision to dock points based upon tissues issued is up to you; that's not yet been a factor considered... emphasis on YET.

The State of MLS Goalkeeping

For those unfamiliar with ASA’s goalkeeping stats, the long explanation can be found here and the short of it is that the “G - xG” stat column, Goals Allowed Minus Expected Goals, is how many goals goalkeepers are giving up from expected shooting areas. A negative number means they’re doing well, saving their team that many goals, while a positive number means they aren’t performing up to the standard MLS goalkeeper. The table is also reproduced below.

Keeper                Team   Min  SOG  GA   xGA  G - xG
Bill Hamid            DCU    384   20   2  5.55   -3.55
Tyler Deric           HOU    480   18   2  5.39   -3.39
Bobby Shuttleworth    NE     476   23   6  9.05   -3.05
Clinton Irwin         COL    385   11   2  3.77   -1.77
Luis Robles           NYRB   287   13   2  3.64   -1.64
Josh Saunders         NYC    383   13   2  3.52   -1.52
David Bingham         SJ     480   27   7  8.48   -1.48
Jeff Attinella        RSL     97    6   1  2.38   -1.38
Chris Seitz           FCD    480   17   4  5.27   -1.27
Nick Rimando          RSL    290   11   2  3.15   -1.15
David Ousted          VAN    486   16   4  4.74   -0.74
Steve Clark           CLB    289   11   3  3.70   -0.70
Sean Johnson          CHI    284   11   4  4.59   -0.59
Luis Marin            SKC    480   20   5  5.58   -0.58
Jaime Penedo          LA     194   12   3  3.51   -0.51
Jon Busch             CHI    196    9   3  3.05   -0.05
Eric Kronberg         MTL     97    3   2  1.87    0.13
Evan Bush             MTL    193    7   1  0.82    0.18
Brian Rowe            LA     286    8   3  2.80    0.20
Stefan Frei           SEA    383    8   3  2.79    0.21
Adam Larsen Kwarasey  POR    477   14   5  4.49    0.51
Joe Bendik            TOR    383   27   8  6.61    1.39
Donovan Ricketts      ORL    483   12   5  3.25    1.75
Rais Mbolhi           PHI    486   19   9  7.18    1.82
*Does not include own goals scored on oneself. Oh hi, Tyler.
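For those who like to check the arithmetic, the last column is simply goals allowed minus expected goals allowed. A minimal Python sketch, using three rows copied from the table above (the function and variable names are just for illustration):

```python
def goals_minus_expected(goals_allowed, expected_goals_allowed):
    """G - xG: negative means the keeper has conceded fewer than expected."""
    return round(goals_allowed - expected_goals_allowed, 2)


# (keeper, GA, xGA) taken from the table above
keepers = [
    ("Bill Hamid", 2, 5.55),
    ("Chris Seitz", 4, 5.27),
    ("Rais Mbolhi", 9, 7.18),
]

for name, ga, xga in keepers:
    print(f"{name}: {goals_minus_expected(ga, xga):+.2f}")
```

Hamid comes out at -3.55 (goals saved relative to expectation) and Mbolhi at +1.82 (goals given up beyond expectation), matching the table.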

By Bill Reno (@letsallsoccer)

2015 could be a big year for Nick Rimando. The RSL goalie is looking to capture his first MLS Goalkeeper of the Year award, something he has been robbed of for multiple years now. It’s tough enough for him to earn it this late in his career (he’ll be 36 this summer) but to think of him having a realistic chance next year doesn't sound plausible. We've already seen signs of him aging two games into the season.

Rimando earned a lot of praise in his first game with his multi-save shutout against Portland in week one. However, his follow-up against Philadelphia was three goals conceded, all of which he could have played much better.

0:55 - An obvious cross from the right sees Rimando doing his trademark “cheat” out towards the penalty spot. (Think of a defensive shift in baseball.) Rimando knows the future location of the pass - a direct shot is almost definitely not going to happen - so he scoots away from the goal. There are many problems with this idea and the 2002-US-Portugal-esque result is a large one. Rimando doesn’t need the extra step to cover the ground he is responsible for. The goalmouth is the main priority, not the penalty spot.

1:18 - As the ball is cleared out for another throw-in, Rimando turns away and steps towards his goal. He quickly looks back but does not give any call to his defenders about the ensuing danger. After the goal, he screams at his defenders for not paying attention, but the truth of the matter is that it starts with Rimando himself. He is equally at fault, if not more so, for his lack of positioning on the shot. Watch 1:27-1:28 in slow motion: his right foot steps away from the shot, a classic sign of a stance far too wide. If he closes his gate, he can perhaps take care of the shot.

2:25 - Another crossing situation. Surely the same mistake cannot happen twice.

Rimando’s, ahem, "aggressive" play has obvious problems. First, if anything were to go wrong, the goalmouth is completely open. This applies for crossing situations, 1v1 situations, scrambles in the box, and shots from distance. (Basically every part of goalkeeping.) Secondly, the defense will struggle to know exactly how to play a crossing situation now. Rimando did not play this aggressively last year and now his defense must adjust. In a quick situation, should the defender abandon marking responsibilities to cover the goal? Or do they challenge the loose ball? It is tough to say now.

If he is consistent with other goalkeepers who have taken the path of over-aggressiveness, it will translate into all aspects of his goalkeeping. 1v1 situations will be too forceful and lacking in patience, shots in the box will be met with flapping arms instead of thoughtful positioning, and footwork will become an afterthought. Unfortunately for Mr. Rimando, he shows early signs of a goalkeeper nearing retirement: relying too much on hopeful play that does not trust percentages.

As for the GOTY award, here is your author Bill’s expert opinion on the matter; don’t let this be confused as to which goalkeepers are better, as that is an entirely different question. Think of this as a popularity vote. (See last five years of the MLS GOTY Award.)

Bill Hamid - Only a transfer abroad or injury will drop him from the number one spot.
Bobby Shuttleworth - Had a successful run with the Revs last year and now a great start, despite the six conceded goals. Consistency is still in question.
Steve Clark - Best goalkeeper in MLS will undoubtedly be snubbed again.
Luis Robles - Most consistent goalkeeper might get an added boost from RBNY’s player reshuffle.
Nick Rimando - Will go down as the greatest MLS goalkeeper to never win the award.
Luis Marin - Voters love Kansas City goalkeepers for some reason.
Chris Seitz - Quietly keeping counter-happy Dallas calm in the back.
Joe Bendik - Has the tools but had trouble putting it together in 2014. Could 2015 be a breakout year?
Tyler Deric - Stellar start, minus that time when he kicked the ball off an opposing player and into his own goal.
Stefan Frei - Underrated and has shown true, tangible growth despite the ponytail.

Year-to-year Shot Correlations

By Matthias Kullowatz (@mattyanselmo)

Perhaps I played my best card too early, publishing year-to-year expected goals correlations first. I won't waste a lot of words explaining what's going on here, but basically I'm just looking to see what shooting metrics correlate from the end of last year to the beginning of this year. This time, let's go back and look at raw shot totals. Without further ado, to the pretty plots!

Notes

Shot-to-shot correlations hold up pretty well against expected goals when it comes to repeatability. However, that doesn't necessarily mean they should be used in predictive models in place of expected goals. Expected goals not only predict themselves well ("stability"), but also predict outcomes well, like goals scored and games won.

For the most part, we see stronger correlations between shots on target than between total shots. This is not the first time I've found that some form of goal mouth placement at the team level is repeatable. The expected goals model we use for goalkeeper ratings is based partially on placement, and this version of expected goals will almost surely creep into my prediction models this season.

Weekend Kick-Off: Watchability Score and DC goes to Orlando

by Harrison Crow (@harrison_crow)

I know. Oh, yes. I know. That pain of watching a boring game. I know the code; no American soccer fan may EVER admit that there are boring soccer games but the reality is that they happen. Just as there are boring NFL, NBA and NHL games, MLS is no different.

How many times have you tuned into a game and just thought "this is stupid, how did I think this was going to be a good game?" Well, I present to you a solution that can perhaps point out potentially entertaining games and help you bypass the ones that lead to the afternoon nap or watching Nicolas Cage movies on FX.

I present to you the Watchability Score. It's a pretty simple metric that is comprised of four attributes that A) keep the game flowing and B) make for entertaining circumstances. Those specific attributes are total shots in a match, total number of minutes tied in a match, fouls per match and completed dribbles per match. To get the metric, add the rankings of each team in each of those categories. For example, if Team A is 1st in shots/game, 1st in mins tied, 1st in fewest fouls, and 1st in dribbles, then Team A would have a score of four. Conversely, if Team B was 20th in each category, their score would be 20+20+20+20=80. In other words, the lower the number, the more "watchable" a team is.

1) We all hate games that are slowed down by fouls. It's annoying and doesn't frequently make for a good or aesthetically pleasing match.

2) Shots lead to goals, and while goals are exciting, the simple event of a shot being taken brings our attention back into focus. The more, the merrier.

3) Close games are obviously more entertaining than blowouts... on most occasions. Tie games mean there is something on the line for both teams and it makes people do things that are often entertaining.

4) Dempsey is a US international favorite because of the insane things he both tries and somehow manages to pull off. He's an entertainer and an amazing athlete. People that can do things with a ball that are more than simply just kicking it hard to a teammate or towards the goalie tend to make some cool things happen and most people like watching that. I like watching that.
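For the curious, the rank-and-add recipe described above can be sketched in a few lines of Python. This is only an illustration under my stated reading of the categories: more shots, more minutes tied and more completed dribbles rank better, while fewer fouls rank better. The three teams and their numbers are made up, ties are broken by input order, and the function names are hypothetical.

```python
def rank_teams(values, higher_is_better=True):
    """Return {team: rank}, where rank 1 is the most watchable in a category."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {team: position + 1 for position, team in enumerate(ordered)}


def watchability(shots, minutes_tied, fouls, dribbles):
    """Sum each team's rank across the four categories (lower is better)."""
    ranks = [
        rank_teams(shots),
        rank_teams(minutes_tied),
        rank_teams(fouls, higher_is_better=False),  # fewest fouls ranks 1st
        rank_teams(dribbles),
    ]
    return {team: sum(r[team] for r in ranks) for team in shots}


# Made-up per-match averages for three imaginary teams.
scores = watchability(
    shots={"A": 30, "B": 25, "C": 20},
    minutes_tied={"A": 60, "B": 50, "C": 40},
    fouls={"A": 18, "B": 22, "C": 26},
    dribbles={"A": 12, "B": 9, "C": 6},
)
print(scores)  # {'A': 4, 'B': 8, 'C': 12}
```

Team A is first in everything and scores four, just like the Team A example above; with 20 real teams the worst possible team score is 80.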

I'll be using the Watchability rankings, slightly tweaking them and incorporating them into these posts in the future. If you have suggestions on how you'd like to see this tweaked in the future, hit me up on twitter, email me or just comment below.
------

DC United at Orlando City Soccer Club

Watchability Score ranks this match to be the third most interesting this weekend.

The weekend is here, folks, and by God I'm feeling like it's a day late on its arrival. I'm sure for Orlando City, who are still recovering from all the missing internationals, another few days would have been nice. Tough break for them.

They play a team in DC United that has been pretty hard to figure out this season. They've had entertaining games, they've had "meh" games and then they had games last week that annoy you so very much because they managed to get points. Plural.

Whatever. Good for them--and you know what, good for their fan base. They've endured a lot of crap in the last four or five years and the organization as a whole looks to be going back towards the top, and that's a good thing, I think, for MLS as a whole. I mean, I don't see how it could be a bad thing. Unless... I don't know, something drastic happened, like Ben Olsen was actually Gabriel Gray or something weird like that. That's impossible... right?

Looking at our expected goal numbers, Orlando City is a bit of an oddity. We can see they sit a bit on the above average side with expected goal differential but are sitting 18th in expected goals for and first in expected goals against. This isn't the type of dominating split I was expecting and it kind of calls to memory the season that DC had last year with similarly awkward numbers.

A league wide perspective on how OCSC and DCU compare to the rest of MLS.

The thing that makes Orlando unique and possibly leaves me with the thought they might be good is that they are dominant with keeping possession in their attacking third. Yes, their total average possession is tied for 14th in MLS but their ratio compared to opponents is the highest in the league at 1.54. Prediction: OCSC win.

Fantasy Perspective

DC United
Jairo Arrieta ($6.6 - 18.3%) - Who would have thought even three weeks ago he'd be the top selected player on the DC roster, but here we are. I imagine much of that bump is a two-fold equation: playing at all and being cheap, while having a good first week.

Nick DeLeon ($6.5 - 13.6%) - Yet another cheap pick-up that starts regularly. Sometimes it's just about those easy points; everything else is gravy for some people. I guess.

Orlando City
Kaka ($11.3 - 28%) - The fourth most owned player in all of MLS fantasy, and he has yet to let down the owners that have taken the chance on him early. He's been brilliant early on for a team that has very much depended on him to carry them.

Rafael Ramos ($5.5 - 16.1%) - Ramos has been a great early get for a lot of fantasy managers as he plays, is super cheap and has been a part of a pretty stingy defense that doesn't relinquish a lot of shots.

The Weekend Matchups:

Numbers in parentheses are expected goals in even game state, WS is the combined Watchability score for both teams playing. WS numbers are on a scale of 12 to 156, and the lower the number, the more "watchable" the game is.
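In case the 12-to-156 range looks odd, it falls out of the ranking scheme described earlier in this post, assuming 20 teams, four categories and no tied ranks: two different teams can't both be ranked 1st in a category, so the best possible pairing holds ranks 1 and 2 everywhere and the worst holds ranks 19 and 20. A quick Python check of that arithmetic (the constant names are mine):

```python
TEAMS = 20       # clubs in MLS this season
CATEGORIES = 4   # shots, minutes tied, fouls, dribbles

# Best case: the two teams occupy ranks 1 and 2 in every category.
best_combined = CATEGORIES * (1 + 2)
# Worst case: the two teams occupy the bottom two ranks in every category.
worst_combined = CATEGORIES * ((TEAMS - 1) + TEAMS)

print(best_combined, worst_combined)  # 12 156
```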

Saturday:

Toronto FC (-0.83) at Chicago Fire (-0.21) - WS: 70
Watchability score for some reason, I haven't figured it out yet, kind of likes Chicago as a team. There are a lot of shots allowed in their games, most are close and they have some interesting things happen while flowing pretty smoothly. This game might even favor Chicago too! Prediction: Chicago, win.

New England Revolution (0.48) at Colorado Rapids (-0.16) - WS: 100
Colorado just isn't a fun team to watch right now and it's kind of sad to see any team pack in at home. Prediction: DRAW

Houston Dynamo (-0.42) at Seattle Sounders FC (0.91) - WS: 114
Speaking of not good teams, Houston is bad right now. I'm a bit surprised but that's how some of these go. The Sounders are favored, and the WS is the highest of the weekend, meaning the scale is the least in favor of watching this game. So, ugh... proceed with caution. You've been warned. Prediction: Sounders win.

LA Galaxy (0.59) at Vancouver Whitecaps (0.26) - WS: 89
An obvious flaw within the metric that I'm seeing is the lack of some sort of gauge of importance. This stands out as this is a big game, but WS doesn't rate it highly. Prediction: DRAW

FC Dallas (0.15) at Portland Timbers (0.58) - WS: 65
I'm not sure I really needed a metric to tell me this would be an interesting/fun game to watch, but either way. Portland has all the numbers; it's about them finally putting things together in a match against a very tough Dallas team. Prediction: Portland win

Sunday:

Real Salt Lake (-0.49) at San Jose Earthquakes (-0.74) - WS: 110
One of these days RSL is just going to end up being bad. The question is if this is one of those days. Could all the changes at RSL finally mean that their xGD in even game states stops being a model buster? Prediction: DRAW

Philadelphia Union (0.12) at Sporting Kansas City (1.16) - WS: 58
This is the first REAL test of the Watchability Score as it thinks this will be the match-up of the weekend. Sporting has statistically shown to be VERY good this season but the points in the right hand column shrug their shoulders with a heavy sigh. Prediction: Sporting, win.

------

NERD IMAGERY OF THE WEEK

This is basically how I see a stand-off of Bill Hamid versus Kaka looking... only represented in a Heroes gif. You're welcome.

Finding Goals for the Rapids: Is It Time for a Formation Switch in Colorado?

By Tom Worville (@worville)

(Note that in this article, "Possession Adjusted" is where you take the stat in question, divide by the team's possession, and multiply by 50% to put all teams' stats on a per-possession basis.)
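As a quick illustration of that adjustment, here is a short Python sketch. The function name and the sample numbers are hypothetical and only there to show the arithmetic: divide by possession, then multiply by 50%.

```python
def possession_adjusted(stat, possession):
    """Scale a per-game stat to what it would be at 50% possession.

    possession is the team's share of the ball as a fraction (0.40 = 40%).
    """
    return stat / possession * 0.5


# Hypothetical team: 10 crosses per game while holding 40% of the ball.
print(round(possession_adjusted(10, 0.40), 2))  # 12.5
# A team with exactly 50% possession is left unchanged.
print(round(possession_adjusted(10, 0.50), 2))  # 10.0
```

Teams that see less of the ball get their numbers scaled up and teams that dominate the ball get scaled down, so the stats are comparable across sides.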

Colorado Rapids are an intriguing side. They are the only team in MLS not to concede this season, but also the only team not to score. Having held Philadelphia, New York City and Houston Dynamo to 0-0 draws, the only thing they have going for them so far is consistency in results.

They’re not going to get easier opponents in the coming weeks either - New England, two games against Dallas, Seattle and LA Galaxy are all to come. These free scoring sides are likely to test the Rapids defence, so to have a chance at securing any points they’re going to need to start scoring. In this analysis I’m going to look into the Rapids blunt attack and what they could change to start scoring.

Teams that usually find themselves having scored no goals either have their bad luck to thank or are generating a low number of shots. For Colorado this is a mix of both. The Rapids currently have the 4th lowest shots per game numbers in the league, taking 9.7 shots per game. 17th out of 20 teams is bad, but if even the 20th team has managed to score then it’s not just your poor finishing which can be to blame.

One could argue that, on top of the team generating a low number of shots, those shots could be of poor quality. The Rapids sit 15th in the league in team shot accuracy though - once again not the worst, and teams with less accurate shooters have managed to score more. From their Expected Goals numbers taken from American Soccer Analysis, the team has an Expected Goal count of 0.8 per game (2.4 Expected Goals overall). This indicates a lack of real quality chances - and their goal total of 0 is not entirely unexpected. They have also faced two of the better keepers so far this season in Tyler Deric and Josh Saunders, both of whom are over-performing vs their Expected Goals against.

Deric and Saunders Expected Goals

Looking at the Rapids’ roster, their strike-force consists of Designated Player Gabriel Torres, Superdraft pickup Dominique Badji and veteran Vicente Sanchez. Between them they’ve only managed eight shots in three games: six for Torres and one apiece for Sanchez and Badji. This low shot creation from the strikers can’t be solely blamed on them - the Rapids sit at 6th lowest in the league for chance creation.

What’s worse is that Deshorn Brown, the team's top scorer over the past two seasons with 23 goals, has moved to Norwegian side Vålerenga. With Brown gone, Sanchez and Torres are the only strikers on the roster with any experience, and they have scored only 13 between them in the past two seasons. This is worrying, as the strike-force is unlikely to be strengthened until the arrival of Kevin Doyle in the summer. Any injury or suspension to Torres or Sanchez leaves the team with fewer options in an already unthreatening attack.

I am very much a firm believer that clubs should play the cards that they are dealt and play to their strengths. If the Rapids don’t adapt their style they are going to have a long wait until Doyle arrives - and even then things may not change. For a start I think it is worth them reconsidering their formation.

From the excellent FootballLineups.com I can see that the Colorado Rapids have played a 4-2-3-1 every game this season. This formation isolates the striker and means that play heavily relies on the three attacking players behind him. Teams who play this system usually play it with a more physical striker - capable of holding up the play long enough to involve the attacking players who play behind him. Notable examples include Olivier Giroud for Arsenal and Romelu Lukaku for Everton. For Colorado, none of their three striking options really fits the bill in terms of size or strength.

This system could work with Doyle - who possesses more of the characteristics of your traditional hold up man who would suit this formation. Until then I recommend a change of system.

One area of the game that Colorado could try to exploit is crossing, as the team currently sits 18th for crosses per game. Young DP Juan Ramirez looks a good signing from the limited minutes I've seen him play. The Argentine possesses good pace and plays like an old-school winger: drawing fouls and completing take-ons. If he could add crossing to his game, he could be a great outlet for chance creation for the team. The team looks relatively set with their back four and two holding midfielders in Sam Cronin and Lucas Pittinari. This leaves three vacant spots on the team sheet - as I make Ramirez a must-start, too.

Another player on the team who I feel is un-droppable is Dillon Powers. Powers has gathered quite a lot of interest from teams outside of MLS and with him recently getting his Italian passport I wouldn’t be surprised to see him in Serie A before long. For now though he’s on the Rapids’ roster. Powers operates most efficiently in an attacking midfield role, being the central link between the strikers and the deeper midfielders. His chance creation (highest in the team last year) and long shooting abilities make him a must-have in this side.

I would allocate the final two spots on the team sheet to Dillon Serna and Torres and thus play a 4-3-3 formation, with Powers being ahead of the other two central midfielders and Serna and Ramirez being either side of Torres. Despite Torres’ low scoring record, his pass accuracy has been better than Sanchez’s over the last two seasons. Serna certainly has the legs over the aging Sanchez, despite not creating more chances than him per 90 last season. The option of Sanchez from the bench is another dynamic that could be used to change matches.

Suggested Colorado Rapids formation

Playing a formation like this would help Colorado play to their strengths in terms of preference to play long balls and utilize their attacking players effectively. The central player in the front three is also able to drop deeper and play the false nine - relieving him of the target man-like qualities the team needs to play 4-2-3-1. The wide players have more space to run into to make the long balls a more viable option.

The reason why the formation needs to be changed can be taken from looking at the possession figures in more detail. The Rapids sit joint 19th with San Jose on 47% possession per game. Looking at the way the team passes, they sit 15th in the league in terms of short passes per game (possession adjusted) and 1st for long balls per game (possession adjusted). The team also has the highest average pass length in the league at 22 meters. As stated previously, long balls are unlikely to work for this side in the current system, yet the Rapids show a big preference for them.

These passing figures, alongside the fact that the team has the lowest passing accuracy in the league at just 72% (to put that into context, more than 1 in 4 passes is misplaced), show that Colorado’s ball retention is not the best. It is worth noting that due to the number of long passes the team plays, their pass accuracy is likely to be skewed down anyway (long passes are less accurate).

By comparing the number of chances created to short passes made we can get a feel for a side's efficiency in possession. For example, the Seattle Sounders make an average of 65 passes (possession adjusted) before they create a chance. On the other hand, Sporting Kansas City makes a mere 26 passes (possession adjusted) before they create a chance - a snip in comparison. Colorado sits 13th for this efficiency measure, making 44 passes per chance created (possession adjusted). This shows that when they are able to string passes together they are better at creating chances. The formation change to 4-3-3 and utilising Powers as a pivot between defence and attack can help capitalize on this.

Alternatively, looking at long balls per chance created we can get a similar sense of efficiency with a team's long ball usage. Colorado sit joint 19th in the league here, making 11 long balls before creating a chance. I wouldn't be so critical of the usage of the long balls if they actually helped the side create chances. This shows that they are more of a hindrance. For comparison the San Jose Earthquakes, who make a similar number of chances per game (7.75 vs 7.33), make only nine long balls per chance created. Not a huge difference, but then again San Jose have scored six goals in the league so far this season, the joint highest in the league.

Now by comparing short and long passes per shot we can see how many passes a team needs to make before a shot is taken. Colorado sit 19th in long passes per shot, but 13th for short passes per shot. This difference highlights the side's greater efficiency in terms of short passes rather than long passes - and once again the need to try to capitalize on their short passing strengths more. Within the squad they have some good passers - Cronin, Marcelo Sarvas and Powers are three examples with 75%+ passing accuracy.

Finally by looking at how much time a team spends in each third of the pitch, we can get a greater sense of how efficient they are with the ball. Colorado’s split between Own Third/Middle/Opposition Third is 29%/45%/26%. By multiplying the number of short passes by the time spent in each area we get a rough idea of the number of passes made in that part of the field. That will help get a further sense of efficiency on the ball.
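
The estimate described above can be sketched in a few lines. The inputs are figures already quoted in this piece; treating 7.33 chances per game as Colorado's half of the San Jose comparison, and backing short passes per game out of the passes-per-chance figure, are my assumptions, and the function name is illustrative only.

```python
def passes_in_third(short_passes_per_game: float, share_of_time: float) -> float:
    """Rough estimate: short passes scaled by the share of time spent in a third."""
    return short_passes_per_game * share_of_time

# Figures quoted earlier for Colorado.
passes_per_chance = 44      # short passes per chance created (possession adjusted)
chances_per_game = 7.33     # ASSUMED to be Colorado's figure in the San Jose comparison
time_split = {"own": 0.29, "middle": 0.45, "opposition": 0.26}

# Inferred, not a quoted stat: short passes per game implied by the two figures above.
short_passes_per_game = passes_per_chance * chances_per_game

opp_third_passes = passes_in_third(short_passes_per_game, time_split["opposition"])
opp_third_passes_per_chance = opp_third_passes / chances_per_game

print(round(opp_third_passes))             # about 84 passes per game
print(round(opp_third_passes_per_chance))  # about 11 passes per chance
```

Because the split is applied uniformly, the per-chance figure in any third is simply the overall figure scaled by that third's share of time, which is why this is only a rough guide.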

Colorado make the 10th lowest number of passes in the Opposition Third with 84 per game. This makes them the joint 5th most efficient side in MLS in terms of passes per chance created - needing only 11 passes in the opposition third to create a chance. For comparison, the lowest is Sporting Kansas City again, which only makes seven passes. Seattle makes 19 passes in the opposition third per chance created - making them the slowest team to build up chances in the league.

There is evidently strength in Colorado’s short passing play - which far exceeds any joy that they are getting from long balls at the minute. Were they to adopt a new formation that takes advantage of this passing strength, the side might get closer to scoring more goals. By incorporating a more fluid transition from defense to attack and utilizing crosses more, the club may get some returns in terms of chance creation and shots taken and hopefully, somewhere down the line, some goals.

Scoring The Proactivity of MLS Teams

By Jared Young (@jaredeyoung)

Last year I became interested in using statistics to measure a team’s style of play. I was inspired by a Jonathan Wilson article that laid out two extreme styles, which he labeled proactive and reactive. Proactive teams are concerned primarily with possessing the ball and high pressure on defense to get the ball back as quickly as possible. This is Barcelona and tiki taka in its purest form. The reactive teams are characterized by a desire to maintain their defensive shape, will typically offer low defensive pressure and will be direct in their attack.

I've adapted the score, which I call P Score, since last time, and the details for the curious are below. One thing about the change to point out now is that I've adjusted the scale to be a 10 point scale - 10 is a highly proactive, possession-oriented style and 1 is very reactive.

Here are the P Score rankings for MLS through March. The columns to the right of the total scores show a team’s proactivity relative to their opponent. The way to read the table (for example starting in data column 3) is that Orlando City was less proactive than their opponent in 25% of their games (shown as 0.25) and averaged one point per game in those games. A game is considered even if the two teams were within one point of each other in their P Score for that game.

                                           Less Proactive    Even              More Proactive
Rank  Team                P Score  Pts/Gm  % of Gms  Pts/Gm  % of Gms  Pts/Gm  % of Gms  Pts/Gm
1     Orlando City SC     9.5      1.3     0.25      1       0.25      3       0.5       0.5
2     Montreal Impact     7.3      0.7     0.33      0       0.33      1       0.33      1
3     New York Red Bulls  7.3      2.3     0         0       0         0       1         2.3
4     Chicago             7.3      0.8     0         0       0.5       1.5     0.5       0
5     Seattle             7        1.3     0         0       0.33      3       0.67      0.5
6     D.C. United         6.7      2       0.33      0       0         0       0.67      3
7     Toronto FC          6.7      1       0.33      0       0.33      0       0.33      3
8     Philadelphia        6.5      0.5     0.25      1       0.5       0       0.25      1
9     Houston             6.3      1.3     0.25      1       0.5       0.5     0.25      3
10    Columbus            6        1       0.67      0       0         0       0.33      3
11    L.A. Galaxy         6        1.3     0.25      0       0.5       2       0.25      1
12    NYCFC               6        1.3     0.25      1       0.25      1       0.5       1.5
13    Vancouver           5.8      2.3     1         2.3     0         0       0         0
14    Colorado            5.3      1       0.33      1       0.67      1       0         0
15    New England         5.3      1       0.25      0       0.75      1.3     0         0
16    Portland            5        0.8     0.25      1       0.5       1       0.25      0
17    Salt Lake           5        1.7     0         0       0.33      3       0.67      1
18    San Jose            4.5      1.5     0.75      2       0.25      0       0         0
19    FC Dallas           4        2.5     0.5       2       0.25      3       0.25      3
20    Kansas City         3.5      1.3     0.67      2       0.33      1       0         0
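
For anyone who wants to rebuild the three buckets from per-game scores, here is a minimal sketch of the rule described above. I am assuming "within one point" includes a gap of exactly one point, and the function names are mine. The second helper simply checks that a team's overall points per game is the share-weighted average of its three buckets, using Orlando City's row.

```python
def classify_game(team_p: float, opp_p: float, band: float = 1.0) -> str:
    """Bucket one game by the gap in P Score (ASSUMES the one-point band is inclusive)."""
    diff = team_p - opp_p
    if abs(diff) <= band:
        return "even"
    return "more proactive" if diff > 0 else "less proactive"

def overall_ppg(buckets) -> float:
    """Share-weighted average of points per game across the three buckets."""
    return sum(share * ppg for share, ppg in buckets)

# Orlando City's row: (share of games, points per game) for less / even / more proactive.
orlando = [(0.25, 1.0), (0.25, 3.0), (0.5, 0.5)]

print(classify_game(9.0, 7.5))   # hypothetical single-game scores -> "more proactive"
print(overall_ppg(orlando))      # 1.25, which the table rounds to 1.3
```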

Observations

  • Orlando City SC so far scores the highest with a P Score of 9.5, significantly higher than 2nd place Montreal.
  • A couple of teams that are usually known for their possession-oriented style of play are at the bottom of the list. The Portland Timbers' change of style has been noted, but Sporting Kansas City anchoring the list is a big surprise given their history of a 4-3-3.
  • Two of the best reactive teams last year, New England and Dallas, are again near the bottom of the league.
  • Looking at the table in some depth reveals some interesting early trends about where points are concentrated. I summed up the table in a visual below.

What this table says is that if a team is going to be proactive, it’s beneficial to be more proactive than their opponent. The same goes for reactive teams - results are better when a team is more reactive than their opponent. The implication is that commitment to and execution of a style of play, regardless of style, is a key contributor to success. That’s a pretty fascinating finding and I’ll monitor the numbers over the season as we get bigger sample sizes.

The New P Score Calculation

The P Score is built off the idea that pass type data can indicate what style of play a team is playing. A proactive team will attempt a higher number of shorter passes and should in theory have a higher percentage of backwards passes. A direct team will attempt longer passes in an effort to counterattack and will have fewer backward passes.

When I developed the P Score using the 2014 season I was disappointed in the availability of passing data and I was forced to use variables that I didn't want to use. The model simply used the percentage of long passes and total passes. Recently, WhoScored added more pass types to their match center and I've evolved the model. I tried most pass types available including short, long, backward and through passes as well as crosses. I also looked at blocked shots because reactive teams block a higher percentage of shots than proactive teams. Given their penchant for defensive shape, that makes sense.

I used multivariate regression using outcomes from a collection of games from the 2014 season. You can read which games I selected for the dependent variable in the prior post. Only two pass types ended up being statistically significant: the percentage of backward passes and the percentage of long passes. Both coefficients adjust the model in the direction you would expect. A higher percentage of long passes lowers the score and a higher percentage of backward passes increases the score. I did not use total passes in the model because that variable can be strongly influenced by an opponent, whereas percentages would be more likely to indicate a team’s actual intent. The R-squared of the new model was a sturdy 0.79.

The old and new models had similar results. I scored the 2015 season both ways and the correlation between the two is 0.95. Orlando City SC is still the top team and Sporting Kansas City is the bottom team scoring both ways.
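
To make the shape of the model concrete, here is a toy version of a two-variable linear score. The fitted coefficients are not published in this post, so the intercept and both slopes below are hypothetical placeholders chosen only to have the signs described above, and clamping the result to the 1-10 range is likewise my assumption.

```python
def p_score_sketch(pct_backward: float, pct_long: float,
                   intercept: float = 5.0,     # hypothetical
                   b_backward: float = 0.20,   # hypothetical, positive by design
                   b_long: float = -0.25) -> float:  # hypothetical, negative by design
    """Toy linear score: backward-pass share raises it, long-pass share lowers it."""
    raw = intercept + b_backward * pct_backward + b_long * pct_long
    return max(1.0, min(10.0, raw))  # ASSUMED clamp to the 10 point scale

# Two made-up passing profiles (percentages of all passes attempted).
print(p_score_sketch(pct_backward=20, pct_long=10))  # short, patient passing
print(p_score_sketch(pct_backward=20, pct_long=20))  # same side, but more direct
```

Only the direction of each effect should be read from this sketch; the numbers it prints mean nothing on their own.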

I strongly prefer this version of the model because it looks at the percentage of the type of team passes to indicate style as opposed to anything related to volume, which as I mentioned would be much more likely to be manipulated by an opponent.

If you have any questions about the methodology please leave a comment or reach out to me on Twitter @jaredeyoung. I’ll be publishing the P Score table monthly throughout the season.

How Long Does It Take a Team to Mesh?

By Kevin Minkus (@kevinminkus)

While beginning a season 0-3-0 does not a happy fan base make, Sunday's win over Philadelphia has some Chicago Fire fans feeling at least a little better about the team's rebuilding process. Throughout the beginning of the season, coach Frank Yallop has frequently stressed that the team needs time to adjust to each other. After all, they brought in three new designated players during the off-season, and are returning players who accounted for only 63% of last year's minutes (the league average over the last four seasons is around 71%). It should take a while for all of those new pieces to mesh from the somewhat disjointed side we've seen into a coherent whole. But, given the Fire's level of roster turnover, how long should we expect the meshing process to take?

The term “meshing” is a slippery one, and can be defined in any number of ways.  Is it when a team's roster turnover no longer informs its results? Is it when a team's results sufficiently indicate its performance for the rest of the season? Is it when a team reaches the level of performance it will remain at throughout the rest of the season (if, in fact, a team can ever be expected to do so)?

Each of these definitions could be argued as valid, and I'm sure there are many other possible definitions not considered here. As it stands, though, these are the three I will analyze, using MLS data since 2011, in hopes of arriving at an answer to the question of how long it takes a team to mesh.

Let's start with the first definition- meshing defined as the number of games in which roster turnover still directly informs a team's results. 

This graph shows the correlation between points after x number of games and the percentage of a team's field minutes returned from the previous season. 

A positive correlation suggests that as roster stability increases, so do points earned. Numbers below the red line are not considered statistically different from zero (at 90% confidence). Note that the correlations in general aren't huge, but they do exist. As you can see, the correlation between roster stability and points peaks at game three, and remains statistically significant until game five (after which it remains insignificant until close to the end of the season).
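
For readers wondering where a cut-off like the red line comes from, here is one way such a threshold could be computed. It uses a large-sample normal approximation to the usual t-test on a correlation, which may not be the method used for the graph, and the sample size is a hypothetical stand-in because the number of team-seasons is not stated here.

```python
import math
from statistics import NormalDist

def critical_r(n: int, confidence: float = 0.90) -> float:
    """Smallest |r| distinguishable from zero in a two-sided test (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z / math.sqrt(n - 2 + z * z)

n_team_seasons = 70  # HYPOTHETICAL sample size, for illustration only
print(f"{critical_r(n_team_seasons):.3f}")
```

The threshold shrinks as the sample grows, so even modest correlations can clear the line once enough team-seasons are included.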

A similar pattern exists if we look at defensive stability, though the correlation doesn't become insignificant until after eight games:

These two graphs, then, suggest (though perhaps not convincingly), that it may take as few as three or four games for a team in general to mesh, while it may take as many as eight for a defensive unit to come together.

Now let's take a look at the second definition- meshing defined as the point at which a team's results through some number of games “sufficiently” indicate what its results will look like for the rest of the season.

To do this, I've split teams into two groups - those with “high” roster turnover (in the top 50%), and those with “low” roster turnover (in the bottom 50%). I then regressed the team's final points total on the team's points total after x games, for each of the two groups. The R-squared values for each of these regressions are graphed below, with the linear models from the set of all teams included as well. So essentially what we are looking at is how well we can predict how a team will finish the season, based on what they've done after a given number of games.
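
As a sketch of what each point on that graph represents, the function below computes R-squared for a one-variable linear regression, which equals the squared correlation between the two series. The team-season numbers are invented for illustration and are not MLS data; in practice the function would be run once per group for each value of x.

```python
def r_squared(x, y) -> float:
    """R-squared of a simple linear regression of y on x (equals Pearson r squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# HYPOTHETICAL team-seasons: points after nine games, and final points.
points_after_9 = [8, 12, 15, 10, 18, 6]
final_points = [35, 44, 50, 47, 60, 30]

print(f"{r_squared(points_after_9, final_points):.2f}")
```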

Through six games, each game is about as predictive for each group, meaning that how well a team with high roster turnover does through six games is just as indicative of how that team will finish as how well a team with low roster turnover does through six games. That is to say, we don't gain any extra predictive power by knowing a team's level of roster turnover.

By game seven, though, high turnover teams begin to out-pace low turnover teams- by game seven we have a better idea of how high turnover teams will finish the season than low turnover teams. 

By game nine, the R-squared value for high turnover teams is 0.546, which is pretty high. We would expect predictions made using this nine game point total to be on average only about seven points off the final season total. That gets us pretty close for being barely a quarter of the way into the season.

Though it's a normative statement not a positive one, and you could really draw the line anywhere, I would probably suggest that nine games is as good a place as any to set the limit on meshing based on our second definition. At the very least, we can say that after nine games we should have a decent idea of whether the rebuilding process will be successful in year one.

Finally, let's turn our attention to the third definition- meshing as the point at which a team reaches its consistent level of performance.

Let's investigate this phenomenon a little bit. 

Here's a graph of the three game rolling expected goal difference (at x = 4, the value on the y axis is the xGD from games two, three, and four, for example) for Sporting Kansas City last season- a decently representative mid-table team.  Expected goal differences provide a pretty reasonable statistic for gauging how good a team is.

It's pretty much all over the place. 
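
The rolling window itself is simple to reproduce. This sketch averages each run of three consecutive games; whether the graph plots a three-game average or a three-game sum is not stated, so the averaging is my assumption (the shape is the same either way), and the per-game values are made up.

```python
def rolling_mean(values, window: int = 3):
    """Mean of each run of `window` consecutive games, oldest run first."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]

# HYPOTHETICAL per-game expected goal differences for games 1 to 5.
xgd_by_game = [0.4, -0.2, 1.1, -0.7, 0.3]

rolled = rolling_mean(xgd_by_game)
# rolled[0] covers games 1-3 (plotted at x = 3), rolled[1] covers games 2-4 (x = 4), ...
print([round(v, 2) for v in rolled])
```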

A three game rolling points per game graph of another mid-table team from last year, the Vancouver Whitecaps, tells a similar story:

These graphs point to something which I think is an important (though perhaps obvious) point to make: it's mostly unreasonable to expect game by game measures of a team's strength to converge over the course of a season. (Metrics like xGR (expected goal ratio), TSR (total shot ratio), and points per game will converge, but usually only when they're being calculated on aggregate.) There are a lot of reasons for this. Injuries, international call-ups, strength of schedule, and mid-season transfers are all factors which affect a team's consistency of performance. Teams, save maybe the very dominant and the very bad ones, just go through peaks and valleys throughout the year. They have good games and bad games.

What does this mean for meshing, then?

Well, we've already seen that how a team performs at the start of the year can be predictive of where it finishes, particularly for teams with high turnover. The point above, though, suggests that how a team starts the year isn't necessarily indicative of how it will perform throughout the year. 

For teams who haven't quite come together yet, then, there is certainly still hope of righting the ship. Given the above analysis, I would expect the effects of having new players brought into the system to begin to wear off by game four or five (though this may take a bit longer this season because of international call-ups). By game nine or ten, a team should have a decent idea of how well it has done in rebuilding its roster. If things remain bleak at that point, there is still the possibility of finding some success, but it may come only in limited doses.

USMNT in Switzerland: Beyond the Score

By Jared Young (@jaredeyoung)

The USMNT took on Switzerland Tuesday, their 9th friendly since the World Cup, and in the process relinquished their 6th second half lead. The 1-1 draw wouldn't have been as much of a disappointment if the result didn't tell the same story about a team unable to hold a lead against top competition. The USMNT has now conceded eleven goals and scored just one in the second half of these friendlies. And that’s all I’m going to say about that. Here are three other stats to take away from the latest international weekend.

9: Is Klinsmann too conservative? Jurgen Klinsmann’s team didn't escape Europe with double digit shot attempts, as they finished with just nine. Is the team too conservative when it comes to shot selection? Three goals in nine attempts is an excellent conversion rate and there were a few shots that could have easily been converted, Michael Bradley’s sitter against Switzerland being the most notable. But are there too few shots taken? Consider that eight of the nine attempts were taken inside the box and, even crazier, in the area around the penalty spot. There was only one shot attempted from outside the 18-yard box, and that was Brek Shea’s laser goal off of a free kick. In other words, the team didn't attempt a shot outside the box in the run of play. Pause on that one for a moment.

This weekend the USMNT attempted 18.7 passes in the final third for every shot while their opponents attempted 10.8 passes in the final third per shot. Considering the US was playing a more direct style on offense, that does imply they may be too picky once they get the ball in position. The results this weekend weren't terrible, especially offensively, but it does raise the question: does the US have the right shot selection balance offensively? More in part III of this post.

19.8: High energy, low team pressure. Colin Trainor has been publishing work on a metric that attempts to measure how much a team employs the high press. The metric takes opponent passes attempted in their defensive half plus about 20% of the offensive half of the field (so about 60% of the field that is the farthest away from their goal) and a team’s defensive actions in that same area. The lower the passes per defensive action, the more intense the high press. A measure of mid-single digits would indicate a consistent high pressure strategy. Here is the PPDA metric chart by team and area of the field.

You can see from the chart that Switzerland was much more aggressively defending up the pitch than the US. When the action was in the defensive end, both teams employed similar pressure. This resulted in the possession being strongly in favor of Switzerland at over 60%. The US did have high individual energy in their opponent’s offensive half but mainly that running around was just to disrupt the Switzerland offense as much as possible. The team as a whole was willing to wait to employ significant pressure. We didn't see a particularly aggressive US team this window, and it makes you wonder whether Klinsmann is going for results in these friendlies instead of pushing his team to be proactive, as he did during the last World Cup cycle.
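
For reference, the passes per defensive action calculation boils down to a single ratio. The sketch below takes one combined count of defensive actions rather than listing the individual action types, and the raw counts are hypothetical: the first pair was picked only so the result lands on the 19.8 in the heading.

```python
def ppda(opponent_passes: int, defensive_actions: int) -> float:
    """Opponent passes allowed per defensive action in the pressing zone."""
    if defensive_actions == 0:
        return float("inf")  # no defensive actions at all: no pressure applied
    return opponent_passes / defensive_actions

# HYPOTHETICAL counts within the pressing zone for one match.
print(ppda(opponent_passes=198, defensive_actions=10))  # 19.8 -> low pressure
print(ppda(opponent_passes=120, defensive_actions=20))  # 6.0  -> consistent high press
```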

2: Blocked shots against UEFA teams. I know the late game defense is the big issue, but I’m not done harping on the shot selection. In this nine game stretch the USMNT has taken to the road against four European foes and has managed a 1-1-2 record (W-D-L) that could easily have been 3-0-1. They did this attempting just 29 shots in the four games, an average of 7.3. The crazy stat is that only two of those shots were blocked, or just 6.9% of the total shots. A typical blocked shot percentage is roughly 25%. You can’t argue with the 17% finishing rate in those four games, but it does make you wonder whether the team is too picky on offense.
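
The arithmetic behind those figures is quick to check, and it also shows how many blocks a typical team would have expected from the same number of shots. Variable names are mine; the inputs are the numbers quoted above.

```python
shots = 29                  # US shots across the four European road games
games = 4
blocked = 2
typical_block_rate = 0.25   # "roughly 25%" benchmark quoted above

shots_per_game = shots / games
blocked_share = blocked / shots
expected_blocked = typical_block_rate * shots

print(f"Shots per game: {shots_per_game:.2f}")              # 7.25, quoted as 7.3
print(f"Share blocked: {blocked_share:.1%}")                # 6.9%
print(f"Blocks at a typical rate: {expected_blocked:.1f}")  # about 7, against 2 actual
```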

Let’s do a little thought experiment to see if this trend is something that should change. Back to the latest window and games against Denmark and Switzerland. What if the US took shots as frequently as their opponents but also finished their shots at their opponents’ lower rate? The numbers would look like this:

The US would have only scored 2.6 goals had they been as selective as their opponents, and so while the sample sizes are clearly small, at least it looks from here that Klinsmann isn't too crazy.
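
The first step of that thought experiment can be rebuilt from numbers quoted earlier in this post. The opponents' finishing rate is not given in the text, so the last line only backs out the rate implied by the 2.6 goal figure; treat it as my inference and not as a quoted stat.

```python
us_shots = 9                 # US shots across the two games
us_passes_per_shot = 18.7    # final third passes per US shot
opp_passes_per_shot = 10.8   # final third passes per opponent shot
projected_goals = 2.6        # result of the thought experiment, as quoted

# Final third passes implied by the US shot count and passes-per-shot figure.
us_final_third_passes = us_shots * us_passes_per_shot

# Shots the US would have taken if they shot as often as their opponents.
shots_at_opp_frequency = us_final_third_passes / opp_passes_per_shot

# INFERRED: finishing rate that would turn those shots into 2.6 goals.
implied_finishing_rate = projected_goals / shots_at_opp_frequency

print(f"Shots at opponents' frequency: {shots_at_opp_frequency:.1f}")  # about 15.6
print(f"Implied finishing rate: {implied_finishing_rate:.1%}")         # about 16.7%
```

Set against the actual three goals from nine shots, the extra shots would have had to come with roughly half the finishing rate, which is why the trade-off comes out slightly negative.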

Next up for the US is the rowdy rivalry with El Tri in what will hopefully be a Gold Cup Final preview (said by the guy living in Philly, home of the Gold Cup Final).