"Positions" are a lie. by Benjamin Harrison

By Benjamin Harrison (@NimajnebKH)

The idea of a player “position” is too inflexible.

We know – as fans – that there are more than 11 different types of soccer players. We simply assign them titles which match a variety of on-field roles, and some of those labels fit better than others. A “defensive” midfielder may also be a holding midfielder, is likely a central midfielder, and could even be a deep-lying playmaker. We may use the more nuanced terminology in a basic narrative description of game play – but there is no standard definition for how those roles might translate into measurable events. Soccer analytics is often left with a set of basic positions to categorize play on the field. These are reflected fairly well in the most basic statistics measured by OPTA. Consider a set of 209 players receiving starts over the 2014 season:

The raw data here is collected from whoscored.com. Pass attempts per 90’ accordingly excludes crosses and set pieces. “Defensive actions” are all tackles (successful or not), interceptions, clearances, and blocks. Where deemed useful, I used the position selection option from whoscored (this is an extremely useful tool for reasons that will hopefully become evident over the course of this post) to restrict each player's data to the role he filled in an assembled 11-man lineup (only 11 starters, a potential lineup, were chosen from each team). Although positional differences are apparent in the basic biplot, the accumulation of passes and defensive actions also incorporates aspects of style – the pace of play – which vary considerably by team. To remove team context, I summed up the pass and defense rates by team and converted the axes to share of team actions for the 2014 dataset.
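As a rough illustration of the "share of team actions" conversion, here is a minimal Python sketch. The player rows, team labels and field names are made up for the example and are not the whoscored data; the only step taken from the text is summing each rate by team and dividing every player's rate by that team total.

```python
from collections import defaultdict

# Hypothetical per-90 rates, for illustration only (not the whoscored data).
players = [
    {"name": "CB A", "team": "X", "passes": 40.0, "def_actions": 12.0},
    {"name": "CM B", "team": "X", "passes": 60.0, "def_actions": 6.0},
    {"name": "ST C", "team": "X", "passes": 20.0, "def_actions": 2.0},
    {"name": "CB D", "team": "Y", "passes": 30.0, "def_actions": 15.0},
    {"name": "ST E", "team": "Y", "passes": 10.0, "def_actions": 5.0},
]

def team_shares(rows, keys=("passes", "def_actions")):
    """Convert each player's rate into his share of the team total."""
    # First pass: sum every rate by team.
    totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        for k in keys:
            totals[r["team"]][k] += r[k]
    # Second pass: divide each player's rate by his team's total.
    out = []
    for r in rows:
        shares = {k + "_share": r[k] / totals[r["team"]][k] for k in keys}
        out.append({"name": r["name"], "team": r["team"], **shares})
    return out

shares = team_shares(players)
for s in shares:
    print(s["name"], round(s["passes_share"], 3), round(s["def_actions_share"], 3))
```

Because every team's shares sum to one, a fast-paced team and a slow-paced team end up on the same scale, which is the point of removing team context.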

We’ll be using the 2015 dataset (raw data collected from whoscored as of April 23rd) through the remainder of this post. These 232 data points have been assembled using a slightly different approach – collecting all player statistics with a cutoff of 270 minutes game time, and normalizing individual numbers to the team average. Players who change positions between games should be expected to blur some position-specific distinctions, but major changes in player role are infrequent enough to be overwhelmed by the general trends. Despite the modest differences in method, the two plots exhibit predictably comparable values – there are a finite number of actions teams can take in a game, and a limited number of general tactical formations used in MLS (and soccer, in general).

The modified plot clarifies how the team uses the particular player as a share of its overall play. When the plot is constrained to a team-specific lineup, it can be a useful tool for visualizing average tactical setup, changes between seasons/games, and tactical adjustments to game state (check out the three links for some handy case studies specific to Seattle Sounders play). Positional differences remain apparent, but considerable overlap persists between categories, and their range implies poorly-matched roles. So long as a “midfielder” can have the same share of team actions as both a striker and a central defender, it remains a poor label. Overly broad player categories force the statistical comparison of different player roles having vastly different circumstantial difficulty (see, for example, this study of players with similar attacking midfield roles to Lamar Neagle). Often, difficult behavior is associated with exactly those aspects of play that lead to team success:

“Chances” are defined here as the sum of all assists, key passes, and shots. Offensive “touches” are the sum of basic passes, cross attempts, and shots. Evaluating player performance based on skill-dependent statistics depends on a thorough assessment of player behavior. We need player typing to be as diverse as on-field roles, and as indifferent to nominal “position” as possible. The statistics used to characterize type should be characteristic of role and as far removed as possible from player quality/skill (e.g., shooting rate should discriminate attacking players, but the ability to generate shots is descriptive of quality, so it is not useful as a role-dependent statistic). Finally, we shouldn't use so many statistics in constructing a model of roles that the result becomes overfit to specific players or contains redundancy (e.g. including two different types of basic passing rates – say, short passes and long passes – would exaggerate role difference specific to distribution).

For now, with the 2015 dataset, I assessed pass and defense share as described above. Goalkeepers have been excluded (it is interesting to include them in team analysis, but their position label is relatively effective). I also calculated and recorded dribbles/touch (measuring attacking style on the ball) and crosses per touch (wide vs. central play). I then relativized each of these four role indices to its 210-player maximum and performed a hierarchical cluster analysis on the resulting data matrix:
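The two steps just described (scaling each index by its maximum, then clustering hierarchically) can be sketched in plain Python. This is only a sketch under stated assumptions: the post does not say which distance or linkage was used, so Euclidean distance and average linkage are my choices, and the six rows of role indices are invented for illustration.

```python
import math

def relativize(matrix):
    """Divide each column by its maximum so every index tops out at 1."""
    col_max = [max(col) for col in zip(*matrix)]
    return [[v / m if m else 0.0 for v, m in zip(row, col_max)] for row in matrix]

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, k):
    """Average-linkage agglomerative clustering (an assumed choice),
    merging the two closest clusters until only k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise distance between members of the two clusters.
                d = sum(euclid(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical rows: [pass share, defense share, dribbles/touch, crosses/touch]
roles = [
    [0.12, 0.14, 0.01, 0.00],  # defender-like
    [0.11, 0.15, 0.01, 0.01],  # defender-like
    [0.07, 0.06, 0.03, 0.08],  # wide-player-like
    [0.08, 0.07, 0.04, 0.09],  # wide-player-like
    [0.05, 0.02, 0.06, 0.01],  # central-attacker-like
    [0.04, 0.02, 0.07, 0.01],  # central-attacker-like
]

scaled = relativize(roles)
groups = agglomerate(scaled, 3)
print(groups)
```

Stopping the merging at k clusters is equivalent to pruning the tree at the height that leaves k branches, so k plays the role of the cutoff line.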

I chose a position for pruning the tree (dashed line) that identifies 15 discrete player clusters grouped by role similarity across the four indices (this step is arbitrary this time, but will be automated in the future). Alongside each, I’ve roughly characterized the differences picked up in the analysis on a scale of --- (well-below average) to 0 (average) to +++ (well-above). Notice, if we move the cutoff line to the left to define only 3 groups, these would be primary defenders at the top, wide players in the middle, and central attackers at the bottom. Running a principal components analysis on the same dataset, let’s take a look at the differences between nominal position and cluster identity on the first two axes of variation.

The overlap problem with position is considerably reduced (though not absent) with cluster identity. To be useful, the cluster identities must also exhibit superior discrimination of role difficulty. Short pass accuracy is a skill-dependent statistic, but highly variable depending on situation:  

Here the short pass accuracy by position is compared to that by cluster (cluster 11 is excluded, since it is simply Fabian Castillo – the point guard man who never encountered a ball he didn't want to dribble past an opponent). Many clusters exhibit a substantially tighter range of values than for the position counterparts – remember that these categories have not been defined by any values that explicitly measure skill or quality. Within clusters (or between closely related clusters) players should show similar statistical performance unless otherwise influenced by skill (as shown with the previously linked example concerning Neagle). No matter how well we characterize situational difficulty (e.g. how far from goal a shot is taken, or the direction, location and length of a pass), constraining the performance of peers provides a more complete characterization of expected result.

Providing context for player evaluation is only part of the value of this approach. The performance of individual players is strongly controlled by myriad factors even beyond team and role context. Grouping similar players may allow us to address questions that would be otherwise complicated by sample size. Take, for example, the question of whether any player can be considered to overperform or underperform expected goals.  

If a style-specific skill in finishing exists, the grouping of similar players – with the resulting increase in sample size – might allow its detection more readily than would be the case measuring goal records for an individual player subject to seasonal noise, team context, and age-related development trends. However, the modest differences between xG and G in the data above should probably be considered a vindication of the model, if anything. Attackers with substantially different on-field roles and shot selection still exhibit predicted finishing success. Still, this approach may warrant further testing in the future with more refined role discrimination and a larger dataset.

The four-index model above warrants more work. Some player groups are very effective, but others clearly could benefit from different weighting prior to clustering and/or additional indices. Take, for example, cluster 15 which mainly incorporates central attacking players with fairly average pass share. The cluster also picked up Vancouver CB Pa Modou Kah, who has exhibited abnormally low pass and defense shares for his role so far in 2015. The present dataset may also suffer from limited sample size (any set of a few games may lead to some very unusual game states and corresponding performance). Nevertheless, preliminary work suggests player typing may be a useful analytical tool.

The Weekend Kick-off: Texas Two Step by Harrison Crow

by Harrison Crow (@Harrison_Crow)

If there is one thing that we know about sports it's simply that familiarity breeds hate. Classy line... and one that I had to steal because this introduction was, for some absurd reason, killing me. Face it, Houston moving back to the Western Conference this past year probably excited a lot of fans as it could mean a more prominent and possibly resurgent Texas Derby with both clubs meeting more often than once a year.

I feel as if the most quoted thing in connection with the Dynamo is how soon can they get Erik 'Cubo' Torres. I don't want to exaggerate and call them a terrible team, but they haven't had a real good showing of late. Either their defense is terrible, their attack is anemic, or it's some gross combination of the two. The sick thing about this is that our numbers indicate that they might actually end up being the better team.

Okay, Mr. Snooty. You can point to the current standings and wag your finger at FC Dallas but indulge me for a moment. Forget about Houston being tied for seventh place in points per game; they have thus far been the inverse of FC Dallas, with a smidge more than an expected goal per match and just less than one expected goal against. This presents the possibility, despite the disparity in the standings, that these two teams are a lot closer than many would readily admit.

I think it's fair to suspect FC Dallas might go on a downward spiral at some point in their 2015 campaign. Not because they're "Dallas" and thus making it something easy to call, but because of the number of shots they're surrendering, the leverage index of those shots and the fact that they are who they are. Also, Dallas becomes unbearable in summertime, according to my own personal research and experience of it being "hot as balls" when I visited.

That being said, Dallas has a great quartet of Mauro Diaz, Tesho Akindele, Blas Perez and Fabian Castillo. While this group has been described with excess hyperbole by many early in the season, it's still a very good grouping of talent that can hurt you very quickly and through multiple delivery methods.

Michele and Diaz are both gifted at delivering from dead balls and set pieces, while Castillo and Akindele have tons of physical gifts mixed with fun technical abilities that make watching highlights a joy. Blas Perez is a brute that wins balls in the air and is excellent back to goal. Let's not attempt to convince ourselves that this attack is not going to get better at some point.

I think this game boils down to which team can find the right mix of shots and leverage opportunity. Will Texas finally start taking more regular attempts as they get those opportunities presented or will they squander them looking for the best chance that might not come?

Likewise I think Houston needs to use their creativity to find shots that aren't just shots added to a tally but are meaningful in the way that might increase the probability in their favor.



Tyler Deric (Selected 17.8% , Cost $5.0)
Surprisingly enough Deric has been a top-three keeper in MLS according to our G -xG rankings. Houston's shots allowed gives credence to the idea that he might just be able to sustain this.

DeMarcus Beasley (Selected 14.9% , Cost $7.1)
Possibly one of the best all-around fullbacks in MLS and right now the best fantasy fullback fake money can buy. The question you have to ask yourself is: do you value fullbacks on defense over centerbacks that have dominated the season thus far?


Chris Seitz (Selected 20.8%, Cost $4.9)
A solid keeper in his own right, Seitz's ownership mostly stems from the three clean sheets in the first four matches of the season. But the Dallas defense is allowing a lot of shots, which kind of limits his long-term value.

Ryan Hollingshead (Selected 20.8%, Cost $5.4)
The early minutes Hollingshead received were directly related to the injuries sustained by Mauro Diaz. His cost is reasonable, but with the return of Diaz it's a legit question how many minutes he's going to regularly see.


(expected goal differential in even game-states)


FC Dallas (0.05) @ Houston Dynamo (-0.10)
Prediction: Draw

San Jose (-0.03) @ Real Salt Lake (-0.41)
Prediction: Draw


Toronto FC (-0.23) @ Philadelphia Union (-0.06)
Prediction: Draw

Columbus Crew SC (0.30) @ DC United (-0.54)
Prediction: CCSC, FTW!

Colorado Rapids (-0.19) @ LA Galaxy (0.00)
Prediction: Draw

Vancouver (0.06) @ Portland Timbers FC (0.00)
Prediction: Draw


Chicago Fire (-0.08) @ Sporting Kansas City (0.63)
Prediction: SKC!

Seattle Sounders FC (0.04) @ New York City FC (-0.53)
Prediction: Draw




Yeah, go see it before all your friends do and spoil all the good parts.

Proactivity Doesn't Mean Success in 2015 by Jared Young

By Jared Young (@jaredeyoung)

For more on Pscore, see last month's post.

As April comes to a close, Orlando City is still, by a fair margin, the most proactive team in MLS. They are joined closely by Montreal, NYCFC and Columbus. Unlike last year, where the top seven most proactive teams made the playoffs, this season proactive play is no guarantee for success, with only Columbus playing well in the top half of the East.

Each month I’ll change up the table to give some different looks. I’ll also try to look at how Proactive Score relates to other statistics to see if Proactive Score is making sense in a larger context.

This month I split the PScore between home and away. Last month I pointed out that Portland, Real Salt Lake and Sporting Kansas City were all playing more reactively than in past seasons. What’s interesting is that they are all playing more reactively at home than they are away. On the flip side, Toronto FC was a very reactive side last year but looks like it could ultimately be one of the more proactive teams. Despite no home games this year, they currently rank 7th in the league.

Team   Rank  Last  Pscore  PPG  Pscore Home  Pscore Away  Home Dif
ORL    1     1     7.5     1    7.3          7.8          -0.5
MON    2     2     6.8     0.5  9            6            3
CLB    3     10    6.4     1.6  7.8          4.7          3.1
NYCFC  4     12    6.4     0.8  5.8          7            -1.3
NYRB   5     3     6.3     2    7.7          5            2.7
CHI    6     4     6       1.5  5.8          6.5          -0.8
TOR    7     7     5.8     1    -            5.8          -
SEA    8     5     5.7     1.9  5.8          5.7          0.1
LA     9     11    5.6     1.5  7            4.3          2.8
NE     10    15    5.4     1.8  5.5          5.3          0.3
POR    11    16    5.4     1.1  4            6.8          -2.8
DC     12    6     5.3     2    4.8          6            -1.3
VAN    13    13    5       1.8  5.6          4.3          1.4
COL    14    14    4.7     1    6.3          2.7          3.6
SJ     15    18    4.7     1.3  5.3          4.3          1.1
HOU    16    9     4.6     1.3  4.8          4.3          0.5
PHI    17    8     4.3     0.7  5.5          3.4          2.1
SKC    18    20    4.1     1.3  3.8          4.5          -0.8
RSL    19    17    3.9     1.3  3.7          4            -0.3
DAL    20    19    3.5     1.8  2.8          4.7          -1.9

This month we’ll start with a fairly easy comparison of PScore against pass completion rates and possession. Not surprisingly, PScore and pass completion are strongly correlated, with an R-squared of 0.63.
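For readers who want to check this kind of figure themselves, here is a small sketch of how an R-squared for a one-variable linear fit can be computed (it equals the squared Pearson correlation). The pass completion numbers are not in this post, so as a stand-in the example uses the Pscore and PPG columns from the table above; the function name is my own.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x
    (equal to the squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Pscore and PPG columns from the table above, in table order (ORL ... DAL).
pscore = [7.5, 6.8, 6.4, 6.4, 6.3, 6.0, 5.8, 5.7, 5.6, 5.4,
          5.4, 5.3, 5.0, 4.7, 4.7, 4.6, 4.3, 4.1, 3.9, 3.5]
ppg = [1.0, 0.5, 1.6, 0.8, 2.0, 1.5, 1.0, 1.9, 1.5, 1.8,
       1.1, 2.0, 1.8, 1.0, 1.3, 1.3, 0.7, 1.3, 1.3, 1.8]

r2 = r_squared(pscore, ppg)
print(round(r2, 3))
```

Swapping the `ppg` list for each team's pass completion rate would give the comparison described in the paragraph above.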

Given PScore is built on long passes and backward passes, it stands to reason that teams that prefer short backwards passes will complete a higher percentage of their attempts. Orlando City is the team in the upper right. This chart highlights why a team like FC Dallas can be so good but complete the 3rd lowest percentage of passes in the league: they attempt a higher volume of forward and long passes than the rest of the league.

Here is the comparison of pass completion rate and possession:

And here is the comparison of PScore and possession. 

It’s interesting that PScore predicts possession levels just as well as pass completion rate. But something else pops for me here that is worth tracking in the future. Look at the two data points for Orlando City and Montreal. Their PScore level indicates they should be enjoying a much higher level of possession. Both those clubs are underperforming in the bottom half of their conference. On the flip side, FC Dallas is enjoying much more possession than their PScore suggests they should be, and they are performing very well in the West. The other two data points well below the line and to the right of FC Dallas are Real Salt Lake and Sporting Kansas City. Both of those clubs have had rocky starts this year but are perennial contenders.

It’s another interesting angle to watch. If we look at how proactive a team is and compare that to the possession they should expect to have, can we assess how well the team is performing? I ran a quick regression that took the error in the possession-PScore relationship (actual possession minus the possession PScore predicts) and compared it to points. The R-squared was just 6% but trending in the right direction. One of the issues is there are many points with very little error. It may only be useful when looking at large error levels.
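The two-stage idea here can be sketched as follows: fit possession on PScore, keep each team's residual (actual minus predicted possession), then regress points per game on that residual. All numbers below are invented placeholders, and the function names are my own; this is a sketch of the approach, not the regression that was actually run.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def residuals(x, y):
    """Actual y minus the y predicted by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical teams, for illustration only.
pscore = [7.5, 6.4, 5.4, 4.6, 3.5]
possession = [49.0, 54.0, 50.0, 47.0, 50.0]   # percent
points_per_game = [1.0, 1.6, 1.8, 1.3, 1.8]

# Stage 1: how much more (or less) possession than PScore predicts.
possession_error = residuals(pscore, possession)

# Stage 2: does that error say anything about points?
slope, intercept = fit_line(possession_error, points_per_game)
print([round(e, 2) for e in possession_error], round(slope, 3))
```

A positive stage-two slope would mean teams with more possession than their PScore implies tend to collect more points, which is the pattern hinted at with FC Dallas above.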

Next month we’ll revisit the importance of being more proactive or reactive than your opponent.


The Weekend Kick-Off: New York City FC To Play With Fire; Hoping Not To Get Burned by Harrison Crow

by Harrison Crow (@Harrison_Crow)

This season started off with Chicago being a punch line. They looked bad against LA and were arguably worse against Vancouver a week later. They were not aesthetically pleasing and their new star Shaun Maloney wasn't doing much to inspire visions of a team turn around.

Nearly five weeks later the team has back-to-back wins and Maloney is not looking as bad (sporting an xG+xA of .81). Shockingly enough, Chicago isn't the dumpster fire it once was. There may even be enough pieces with the return of Mike Magee to make a push for a playoff spot.

I'm not trying to get ahead of myself; there are still 29 more matches to play. Chicago could still be a bad team but there is something about having either an above average defense or offense that presents a complicated variable.

Chicago might be a mess defensively (1.40 xG against) but their attack has all sorts of interesting pieces. Harry "don't call me Harrison" Shipp is perhaps one of the most interesting American attacking pieces in Major League Soccer. Kennedy Igboananike is very quietly having a strong first year. Quincy Amarikwa is still doing his thing as perhaps the most under-appreciated striker in MLS, and Joevin Jones has been a nice little pick-up too.

The sum of the team has melded to make a greater whole than the individuals. We'll see this weekend if their success can continue.

Whereas Chicago has been defeated by their poor and mistake-prone defense, it's been New York City's moments without David Villa on the ball that have been their downfall. Villa has been worth just about every penny. The problem has been outside of Villa. They've gotten league average assistance from his trio of strike partners Khiry Shelton, Adam Nemec and Patrick Mullins. But inconsistent creation from the midfield and a defense that is still trying to get on the same page has created problems.

Mikkel Diskerud has shown moments of brilliance between his slick passes and curling shots finding holes for goals. But he's still working through adjustments to the league and he's perhaps not the pure creator that Jason Kreis or NYC needs. Maybe Frank Lampard will be that person, and maybe not. Maybe it will take the summer transfer window to acquire that player.

Right now, New York City boasts a defense that has pieces and talent but somehow hasn't yet translated that to being successful. Currently averaging 1.40 xG against and standing 15th overall, the scary thing is that their PDO is sitting around 987, right near the normal resting heart rate of a club. In other words, they probably are what they are as a team. I'm sure they'll have some ups and downs through the season, but without limiting the shots this club isn't going to really take that step.

They already found out that backup striker Tony Taylor is out for the season. Should NYC lose out on Villa tonight, and that's the current rumor going around, they will have to not only figure out how to make up the difference in his ability to create and score goals but also hold at bay a team that actually has a decent attack of its own.

The real outcome of this match will boil down to whose defense holds. Will Sean Johnson show up for this match and can Josh Saunders continue to be an above average keeper? This season is still young and while a single game hardly defines the destiny of a season, I suspect these two clubs will be dancing around each other through the season in the standings.

Tonight, for mostly obvious reasons, I'm taking the Chicago Fire for all three points. That said, I wouldn't be surprised if their defense collapsed and a draw was the result, but either way I shade in the Fire's direction of earning points.


Chicago Fire

Harrison Shipp (Selected 26.1%, Cost $7.9)
There are few entertaining and redeeming qualities about the Fire and Shipp is one and perhaps all of them at the same time. I can't imagine that his cost is going to stay suppressed for much longer if he keeps putting together the goal scoring opportunities for his strikers and finding the back of the net himself.

Lovel Palmer (Selected 12.3%, Cost $5.9)
There are few players in MLS as versatile as Palmer, which translates to more minutes. He'll never be an individual that puts together huge games in terms of points. But it'll be a consistent point allotment from match to match, and in MLS Fantasy that's a huge quality to find.

New York City FC

David Villa (Selected 20.3%, Cost $10.3)
The 33-year-old Spaniard looks out this week, so he probably doesn't impact fantasy for now, but looking down the road, once he heats up, he'll be the best striker in MLS. Write it down.

Mix (Selected 10.9%, Cost $9.1)
This is one of those occasions that I don't get the price relative to the production that an owner is going to get. There are a lot of people that bought into him early (probably due to the pairing of Villa) and kind of got burned. He's a player that we're still learning about because we didn't have a lot of concrete data on him. I think he still has a bright future with the US and in MLS.


(expected goal differential in even game-states)


FC Dallas (0.04) @ Colorado Rapids (-0.20)
Prediction: Draw

Philadelphia Union (-0.03) @ Columbus Crew SC (0.29)
Prediction: Columbus

Real Salt Lake (-0.37) @ New England Revolution (0.32)
Prediction: New England

Sporting KC (0.78) @ Houston Dynamo (-0.18)
Prediction: Sporting Kansas City


DC United (-0.49) @ Vancouver Whitecaps (0.00)
Prediction: Whitecaps

LA Galaxy (0.08) @ New York Red Bulls (-0.01)
Prediction: Draw

Toronto FC (-0.46) @ Orlando City SC (0.13)
Prediction: Draw

Portland Timbers (0.22) @ Seattle Sounders FC (0.86)
Prediction: Draw



Expected Goals 3.0 Methodology by Matthias Kullowatz

By Matthias Kullowatz (@mattyanselmo)

Michael Bertin of Deadspin recently critiqued the expected goals craze that is rushing through advanced soccer metrics. He specifically noted that so many expected goals models are currently proprietary, hidden inside of black boxes. We here at ASA have sought to be as transparent as possible, and so we have published our logistic* expected goals models in the Explanation section of our xGoals 3.0 tab above.

Many of the variables in the model are intuitive. The distance from the shooter to the goal obviously affects the difficulty of the shot, as well as the angle from which the shot was taken. Shots off corner kicks have a lower chance of going in--once controlled for shot location, angle, body part, and other factors--because the box is packed. Fastbreak shots off through balls have a high chance of going in because the shooter often has time and space. The variables in the basic shooter/team model include: distance, goal mouth available, whether the shot was headed, whether the shot came off a cross or through ball, and whether the shot came from any one of the various patterns of play including corner kicks, direct free kicks, indirect free kicks, fastbreaks, or penalties. The "regular" pattern of play is included in the intercept term.

A recent change we have made is substituting a log-Distance variable into the model for what was just a linear Distance variable. This idea was admittedly inspired by Bertin. Using log-Distance will change some of the output on the blog because the results of extremely close and extremely distant shots were not being as accurately predicted as they are now. Justification for this change can be seen in the graph to the right. The trend is that of a (negative) log function rather than a linear function. Note the spike around 13 yards. These are penalties, and as you can see, our model's calibration is off a bit. Penalties average 13 yards in distance in our data set, though this will not affect the utility of the model because distances are relative.
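To make the effect of the log-Distance term concrete, here is a minimal sketch. The intercept and coefficient are hypothetical values picked only for illustration (the fitted values are in the Explanation section mentioned above), and every other model variable is left out. What it shows is the defining property of a log term: each doubling of distance shifts the log odds by the same amount, so the change from 3 to 6 yards matters as much as the change from 12 to 24.

```python
import math

# Hypothetical values, for illustration only; not the fitted coefficients.
INTERCEPT = 1.0
B_LOG_DISTANCE = -1.5

def log_odds(distance_yards):
    """Log odds of a goal using only a log-Distance term."""
    return INTERCEPT + B_LOG_DISTANCE * math.log(distance_yards)

def probability(distance_yards):
    """Convert the log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(distance_yards)))

for d in (3, 6, 12, 24):
    print(d, round(log_odds(d), 3), round(probability(d), 3))

# Every doubling of distance moves the log odds by B_LOG_DISTANCE * ln(2).
step = log_odds(6) - log_odds(3)
```

With a linear Distance term the same comparison would give a shift proportional to the yards added, which is why very close and very distant shots were the ones being mispredicted.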

I have also updated how the model treats the width of the goal mouth available to the shooter. From straight on, a shooter has eight yards from left post to right post. But as his angle gets worse, that width available can shrink considerably. To appropriately model the effect of goal mouth availability, I used a quadratic function, which is justified to the right. The plot shows how the log odds of a goal change due to angle, with diminishing returns for better angles. Here, shot distance is frozen between 9 and 15 yards. 
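A rough sketch of the goal mouth idea follows. The exact definition used in the model is not spelled out here, so this assumes one plausible version: the 8-yard goal width projected onto the shooter's line of sight to the center of the goal. The two quadratic coefficients are hypothetical and chosen only so that the curve shows diminishing returns.

```python
import math

GOAL_WIDTH = 8.0  # yards, post to post

def goal_mouth_available(x, y):
    """Assumed definition: goal width as seen from the shooter.
    x = lateral offset from the center of goal, y = distance out from
    the goal line (both in yards)."""
    return GOAL_WIDTH * y / math.hypot(x, y)

# Hypothetical coefficients for a quadratic term (illustration only).
B_WIDTH = 0.5
B_WIDTH_SQ = -0.03

def width_contribution(width):
    """Log-odds contribution of available width, with diminishing returns."""
    return B_WIDTH * width + B_WIDTH_SQ * width ** 2

print(goal_mouth_available(0.0, 12.0))             # straight on
print(round(goal_mouth_available(12.0, 6.0), 2))   # tight angle
print([round(width_contribution(w), 2) for w in (2, 4, 6, 8)])
```

The negative squared term is what produces the flattening described above: widening the view from 6 to 8 yards adds less to the log odds than widening it from 2 to 4.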


Additional Keeper Model Variables

The height of the shot in the goal mouth is also important. Players aim both low and high to try and beat the keeper, and justification for that strategy is borne out beautifully in the graph shown to the right. The log odds of a goal increase the further the shot height is from a comfortable 3.5 feet. The decline in log odds between about 6.5 and 8 feet is a bit perplexing, though. I controlled for distance on this graph, but not other factors. It turns out that 21 percent of all shots in the upper portion of the goal mouth were headed, versus just 14 percent of shots below that zone. This surely plays a role in the strange behavior between heights of 6.5 and 8 feet, and we have controlled for headed shots in the model. Here, shot distance is frozen between 15 and 21 yards.

The last variable I'm going to justify is the linear version of the lateral distance a keeper had to move to make a save. This was the hardest part of the model mathematically, as it required some tricky analytic geometry and some basic assumptions about keeper positioning that aren't always true. Basically, we assume that keepers position themselves along the angle bisector of the two rays that extend from the shot to both posts. If they don't, then they should (usually). The lateral distance to the shot is then measured along a line that goes through the near post, perpendicular to the angle bisector. The geometry, as well as justification for the linear term in the model, are shown below. Again, there is strange behavior in the log odds when the lateral distance is between 3.5 and 4. This is because very few shots are taken from straight on, and thus the sample size is incredibly small and subject to weird fluctuation. Here, shot distance is frozen between 9 and 15 yards.
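The geometry described above can be sketched in a few lines. This is my reading of the description rather than the model's actual code: the keeper is placed where the angle bisector meets the line through the near post perpendicular to that bisector, and the lateral distance is measured along that line to the point where the shot's path crosses it. The coordinate system (goal line on y = 0, posts at x = -4 and x = 4, shooter in front of goal with y > 0) and the function names are assumptions.

```python
import math

POSTS = ((-4.0, 0.0), (4.0, 0.0))  # 8-yard goal, goal line on y = 0

def _unit(v):
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def lateral_distance(shooter, target_x):
    """Distance the keeper must cover along the line through the near post
    (perpendicular to the angle bisector) to meet a shot aimed at the point
    (target_x, 0) on the goal line. Assumes the shooter has y > 0."""
    sx, sy = shooter
    # Unit rays from the shooter to each post, and their angle bisector.
    rays = [_unit((px - sx, py - sy)) for px, py in POSTS]
    bisector = _unit((rays[0][0] + rays[1][0], rays[0][1] + rays[1][1]))
    near = min(POSTS, key=lambda p: math.hypot(p[0] - sx, p[1] - sy))
    # Distance along the bisector from the shooter to the keeper's line.
    reach = _dot((near[0] - sx, near[1] - sy), bisector)
    keeper = (sx + reach * bisector[0], sy + reach * bisector[1])
    # Where the shot's path crosses that same line.
    shot = _unit((target_x - sx, -sy))
    t = reach / _dot(shot, bisector)
    crossing = (sx + t * shot[0], sy + t * shot[1])
    return math.hypot(crossing[0] - keeper[0], crossing[1] - keeper[1])

print(round(lateral_distance((0.0, 12.0), 4.0), 2))   # straight on, at the post
print(round(lateral_distance((10.0, 6.0), -4.0), 2))  # from an angle, far post
```

Under this reading the lateral distance can only approach 4 yards when the shot comes from nearly straight on, which fits the small sample noted for distances between 3.5 and 4.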


For logistic models (and many other generalized linear models and non-linear models), the R-square value is not a particularly intuitive value. I hope the p-values in the models above, in addition to the graphs and basic logic about soccer, help to justify our Expected Goals 3.0 model.

*Logistic models use a log odds response instead of a probability. This is because linear models by themselves could potentially arrive at probabilities above 1.0 or below 0.0. Log odds are the natural logarithm of the ratio of probability of success "p" to probability of failure "1 - p," or ln[p/(1-p)]. 
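Written out, the relationship in this footnote looks like the following. The symbols are my own notation rather than the model's: \beta_0 for the intercept, and \beta_i, x_i for each coefficient and its variable. Solving the log odds equation for p shows why the predicted probability always stays strictly between 0 and 1, whatever value the linear part takes.

```latex
\ln\frac{p}{1-p} = \beta_0 + \sum_i \beta_i x_i
\quad\Longrightarrow\quad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_i \beta_i x_i\right)}}
```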

The Weekend Kick-Off: A Coast-to-coast trip by Harrison Crow

By Harrison Crow (@Harrison_Crow)

A mid-week US Men's National Team game, a Thursday evening game between Philly and NYC, all leading up to a Friday game? Maybe I should have posted this "kick-off" on Wednesday? Well, forgive me. At least there has been plenty of American soccer to go around this week, which has kept me from the new seasons of Property Brothers that hit Netflix last week (why do they always go with the smaller Reno budget???). I'll live.

Let's get to this week's Friday night game of San Jose traveling to Harrison, New Jersey to take on the New York Red Bulls (notice, I didn't use sarcastic tone or put New York in parentheses? Be proud of me, this is growth).

San Jose will be without exciting newcomer Innocent Emeghara, who was suspended by MLS, and defender Shaun Francis, who is out one to two months with a fractured cheekbone. However, Dom Kinnear and company got a bit o' luck with Chris Wondolowski, who was with the US national team but played zero minutes. Wondo ranks 11th in the league in xGoals + xAssists, indicating that he is a crucial piece of the Earthquakes' offense, and he should be available tonight.

Jesse Marsch is in a much better position with his line-up. A healthy attack of Bradley Wright-Phillips, Lloyd Sam and Felipe Martins with Sacha Kljestan launching passes into the attacking third makes for an altogether overwhelming task for San Jose defenders Clarence Goodson and Victor Bernardez. That said, Marsch still will have to deal with his own set of missing personnel with the potential unavailability of both Ronald Zubar and Damien Perrinelle.

Something that we talk a lot about around these parts is the traveling conditions for teams that are traveling West-to-East and East-to-West. I haven't done research on it, but it's something that Drew has talked a lot about. I don't like speculating on things for which I have no data in front of my face, but I feel like East-West travel through time zones has been shown to have a hangover effect on away teams (anyone that wants to do a study and needs some help give us a shout). Going into this, my mind thinks San Jose has a lot to overcome.

But, we're not here to get opinions. We're here for facts. That's why you come to read this blog...mostly. It might also be my winning personality and Property Brother mentions.

The Red Bulls are tied for first in points per-game within Major League Soccer, cohabiting that position with DC United. Paradoxically, both teams are ranked toward the bottom of our expected goals tables, so perhaps some regression is coming. If not this week, soon.

Currently the Red Bulls are tied (with Real Salt Lake... you can't script this stuff) for the second-highest PDO, which, as we discussed last week, is a barometer for exceeding or falling short of likely expectations, especially early in the season. Their shots-against totals are a very under-discussed talking point that could end up costing them some points in the future, especially when their finishing rate against (6.8%) is almost sure to rise in the coming weeks.

I still think that the Red Bulls are the better team, and considering they are a very good home team and San Jose is making a cross country trip, things kind of lie in their favor. But don't be surprised if San Jose finds some cheap goals and still gets a point.

That said, PREDICTION: I'm going with the Red Bulls.


San Jose Earthquakes

Fatai Alashe (owned %6.6 - worth $5.1)
Alashe is getting plenty of selections by owners with the growing importance within MLS Fantasy of having someone cheap that is going to see minutes on your bench. His performance for SJ isn't about getting a bunch of points--though he's had some solid moments--it's about making sure you get some points. He's started four of six games played by San Jose this year.

Chris Wondolowski (owned %5.1 - worth $10.7)
Wondolowski is the most consistent goal scorer in MLS not named Robbie Keane. Goals scored isn't everything in MLS Fantasy, but it's of course the big point-getter, and there are few that are going to be worth that much of an investment. As mentioned above, Wondo is 11th in the league in combined xGoals and xAssists. He's a key part of that offense.


New York Red Bulls

Bradley Wright-Phillips (owned %12.9 - worth $10.9)
BWP is showing that he's more than just Thierry Henry's last project with two goals and two assists in four games. My initial concern is that he's creating fewer shots, especially considering that last season's 27 goals came not just from quality chances but also volume of shots (109). However, our numbers have him at 3.46 xG+xA, which is fifth in the league and first on a per-game basis.

Lloyd Sam (owned %8.9 - worth $8.8)
Sam is in a similar situation to BWP, having gathered less total xG than what he's actually scored. But, just like Wright-Phillips, it's not as if he's overachieving by much. His expected assists total is over one, putting him on pace for 8 - 10 assists this season. I really like Sam, and I fully expect there is going to be a tough moment where I have to come to the realization that he might not be worth the price relative to the other market options, which makes me sad, but for now he's doing great and I expect him to continue to do so.

The Weekend Match-ups:


Houston (-0.49) at DC United (-0.82)
Prediction: Draw

Orlando City (0.26) at Columbus Crew SC (0.11)
Prediction: Draw

Toronto (-0.56) at FC Dallas (-0.12)
Prediction: F-C-D

Seattle Sounders (0.62) at Colorado (-0.06)
Prediction: EBFG, Sounders

Vancouver (0.00) at Real Salt Lake (-0.62)
Prediction: Southsiders

Sporting KC (1.26) at LA Galaxy (0.32)
Prediction: SKC, but this is one of the more mind boggling match-ups--I may come back to this one in a few weeks.


New England (0.34) at Philadelphia Union (0.61)
Prediction:  Union, because it has to eventually happen.

Portland Timbers (0.37) at New York City (-0.91)
Prediction: Cascadia with a third win on the day. #BestCoast




One day I'm going to finish my Marvel meets MLS post and you're all going to hate it. For the time being I want you to think about how much Dax McCarty looks like Remy LeBeau. You're welcome.


Mexico at USMNT: Klinsmann stays the course by Jared Young

By Jared Young (@jaredeyoung)

The USMNT avoided their trademark collapse on Wednesday and easily defeated their arch-rival Mexico by the classic score of dos a cero. The final score was about the only stat that changed, however, for Jurgen Klinsmann’s team, as the USA continued the style of play that has characterized their post-World Cup friendlies. Klinsmann continued to experiment with new players and played a conservative style focused on getting good shots while limiting the opponents’ quality chances. He said that he was starting to home in on the Gold Cup, so fans might have expected the US to come out of their shell. Perhaps the surprise of the match was that they stayed the course, in what could be Klinsmann’s preferred strategy for the next cycle.

Klinsmann went with a 4-4-2 diamond setup, while El Tri came out in a conservative 5-3-2. Both teams offered very low defensive pressure to start the game before slowly opening up. The teams combined for just 8 shots in the first half, with only two attempted inside the 18-yard box. There was just no space for either offense to operate.

In the second half, as the teams opened up, it was brilliant play from Michael Bradley combined with a little luck and solid finishing that gave the US their only two goals of the game. Jordan Morris, a 20-year-old, scored his first goal for the USMNT. Much will be made of Jordan being a college player, but we need to remember that most of the best players in the world are not playing soccer in college. It’s simply not part of a good player’s development in any country but the US. Just over four years ago, the second goal scorer of this match, Juan Agudelo, scored a USMNT goal as a 17-year-old. Did it matter that he was or was not in college? Heck, he wasn’t old enough to be in college. The media loves a good story, but this country won’t show soccer maturity until we can bring that global perspective to the game. Celebrate a young player scoring and give that context, just please not that he’s choosing to play in college.

486 minutes from “newbies”: Klinsmann said his focus was turning to the Gold Cup, but he continued to experiment with new players. More than half of the minutes played were by players who did not play in the World Cup. This was the second highest minute total for the young guys in this series of friendlies, only exceeded by the Switzerland match.

72% pass completion percentage: Blame the poor field conditions but this pass completion percentage was the lowest from the US during this cycle. When a team is sitting deep, low completion percentages are expected, but at home this was perhaps too sloppy a number.

Four shots on target for USMNT to two for Mexico: Yet again, the USMNT gained the advantage in shots on target despite giving up more shots overall. Mexico outshot the US 12-8, but eight of Mexico’s shots were Hail Marys from outside the 18-yard box. The USMNT’s TSR (Total Shots Ratio) since the World Cup is 39%, but they make up for it by putting 44% of their shots on target and getting quality looks. That remained a key strength of the US team against Mexico.
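
As a rough illustration of the ratios quoted above, the sketch below computes TSR and on-target rates from the match totals given in this piece (12-8 in shots, 4-2 in shots on target). It assumes TSR is simply one team's shots divided by all shots taken by both teams; the function names are made up for this example.

```python
def total_shots_ratio(shots_for, shots_against):
    """Share of all shots in a match (or span of matches) taken by one team."""
    return shots_for / (shots_for + shots_against)


def on_target_rate(shots_on_target, shots):
    """Share of a team's own shots that were on target."""
    return shots_on_target / shots


# Match totals quoted in the text: Mexico outshot the US 12-8,
# with four US shots on target to Mexico's two.
usa_tsr = total_shots_ratio(shots_for=8, shots_against=12)
usa_on_target = on_target_rate(shots_on_target=4, shots=8)
mexico_on_target = on_target_rate(shots_on_target=2, shots=12)

print(f"USA TSR: {usa_tsr:.0%}")
print(f"USA on-target rate: {usa_on_target:.0%}")
print(f"Mexico on-target rate: {mexico_on_target:.0%}")
```

For this single match the US TSR works out to 40%, right in line with the 39% post-World Cup figure, while half of the US shots were on target.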

Rough go for Garza. The only space in the attacking half that Mexico found in the first half was in Greg Garza’s area. Garza has been given a long look by Klinsmann in these friendlies. He’s earned the most caps of any non-World Cup player with seven. 

The circled passes above were attempted by Mexico in Garza’s area. There was clearly space to operate and Mexico was exploiting it. Yes, it appears that El Tri was building more often down the right side, but the fact that they found so much space in that area is disturbing. Meanwhile, DeAndre Yedlin was playing very aggressive defense and his area remained primarily clean. That is, until the second half.

Mexico, perhaps seeing that Yedlin was aggressively playing the ball, shifted their focus to his side. Luckily they didn't have enough success to score a goal. It should be noted that Brek Shea kept his area on Mexico’s right hand side clean in his second half shift.

A win over your arch-rival will always be good, and this team needed to finish off a match and get a good result. With difficult road friendlies at the Netherlands and Germany on the horizon, we should expect more of the same style from Klinsmann. His speeches about playing proactively with the rest of the world seem to have quieted, but he’s found a nice recipe over the last few friendlies. The US has allowed just four goals in the last four games, and just one in the first half. At the same time they’ve put 13 shots on target and limited their opponents to just 9. The US has converted seven of those 13 shots as well. Hard to complain where the US sits as they approach the Gold Cup in July.

MLS Trade Analysis: Alex for Jason Johnson by Mike Fotopoulos

By Mike Fotopoulos (@irishoutsider)

Yesterday, the Fire traded Alex to Houston for Jason Johnson. These are the kinds of trades that make my inner MLS capgeek smile. Chicago trades a perfectly average midfielder on a perfectly average contract for a pocket full of cap room and a free player to boot. They needed cap relief and fewer midfielders, and this move gets the job done.

Alex is definitely out of the picture in the Fire midfield, with Matt Polster, Michael Stephens, Victor Perez, and likely Chris Ritter and Razvan Cocis ahead of him on the depth chart. Getting Houston to throw in Jason Johnson and his Generation Adidas contract is basically free money. Johnson’s contract is a free option to see if he pans out, so it would seem that the Fire are coming out ahead on the trade.

The question for Houston is their own need for midfield depth. Given the Dynamo’s current pairing of Nathan Sturgis and Luis Garrido, Alex seems to be bringing exactly that. He has struggled for playing time recently in Chicago, so it is hard to say whether he would be a clear starter over either. More likely, it is a straight purchase of a serviceable midfielder, which is exactly what the Fire put up on offer. They found themselves with depth to sell and were able to find someone to pay them off.

Interestingly enough, bringing in yet another forward player places Chicago back in a position where they can find themselves with more attacking depth. Mike Magee and Patrick Nyarko are still recovering from injury, but it is possible to see the Fire start the summer with an extra player up top. If Johnson can find a role on the current roster, they could see themselves ready to deal again, potentially making another deal along these lines. 

MLS cap space is a precious commodity, and as Chicago continues to repair its roster, optimizing every dollar spent is the key. Trades like this get some dead money off of the bench and also give a free look at a young player, so clubs should take advantage of these situations whenever they arise.

How Data Changes My View of MLS or a Frank Exploration of Luck in Dallas by Harrison Crow

by Harrison Crow (@harrison_crow)

As I made my way to Toyota Stadium on Friday night I was concerned about just making it to the game on time. The traffic was horrendous and it was my first time driving around Frisco, which together slowed me down enough that I was walking up to the gate as fireworks were set off and the National Anthem finished.

I stood just outside the south gate waiting for my ticket to arrive as Dominique Badji scored the Rapids' first goal of the season, and I felt a sense of validation in thinking that this was going to be a game in which Colorado could be competitive and challenge for full points leaving Texas, as I had implied in my post Friday morning.

It wasn't that I thought Dallas was a bad team when I wrote about them. I think Dallas is a very good team even after that beating, and I'm pretty certain they'll make the playoffs out of a very stacked and competitive Western Conference. The problem is that prior to Friday night they had the second-highest PDO in Major League Soccer, a metric that measures luck based on finishing and save percentage.

FC Dallas had scored a total of eight goals as a team behind the contributions of Blas Perez with three, Tesho Akindele just behind him with two and Fabian Castillo trailing with just one. Those three are what is going to drive the Dallas bus to success, just as the trio did last season, and though goals will come from other sources these are three that you can point to as "the guys".

The problem is that all three have been scoring goals with a much higher efficiency than what we'd seen previously from them. What we've learned about scoring rates is pretty basic: they have manic highs and depressive lows. Even with the number of quality chances they've gotten, as described by our expected goal metric, it's not something that we could reasonably expect to continue. Again, that's not because they aren't fantastic goal scorers or because those players are of lesser quality than the rest of the league. There are few players in the world that can score at their current rates.

Their high PDO meant that if the volume of Dallas' shots didn't change, they weren't going to continue scoring goals at the same pace.

Likewise, Colorado was riding a similar wave of eventual regression.

While Dallas had a high PDO, Colorado had a very low one (956, tied for second lowest in MLS) that was largely driven by their complete lack of goals across 48 shots. Yes, 48 shots without a goal. They should have, by our own measurements, scored four goals by the time they arrived to play FC Dallas this week and instead were sitting on a goose egg. Few teams can take near 50 shots over any given time frame during the season from the attacking third and come up that empty.
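
To get a feel for how unusual that drought is, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that all 48 shots were independent and equally likely to score, with probabilities summing to the four expected goals mentioned above. Real shots vary in quality, so treat this as an approximation rather than the output of the expected goals model itself.

```python
import math

SHOTS = 48             # shots without a goal, from the text
EXPECTED_GOALS = 4.0   # expected goals from those shots, from the text

# Assumption: every shot has the same scoring probability.
p_per_shot = EXPECTED_GOALS / SHOTS

# Probability that all 48 independent shots miss.
p_no_goals = (1 - p_per_shot) ** SHOTS

# Poisson approximation with the same mean, for comparison.
p_no_goals_poisson = math.exp(-EXPECTED_GOALS)

print(f"Per-shot scoring probability: {p_per_shot:.3f}")
print(f"Chance of zero goals (binomial): {p_no_goals:.1%}")
print(f"Chance of zero goals (Poisson): {p_no_goals_poisson:.1%}")
```

Under these simplified assumptions, a scoreless run of this length comes up somewhere between one and two percent of the time.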

Just for a bit of applied science: over the past five seasons only two players have taken more than 48 shots and not scored a goal, Juninho for the Galaxy in 2014 and Kalif Alhassan for the Timbers in 2011. Based on shot leverage we can tell that Juninho was shooting from long distance and wasn't finding a lot of good chances. Likewise, given that he only scored five goals through 94 matches and is now playing in the NASL, it's possible Alhassan does not possess the finishing skills required of most goal scorers at the MLS level.

I'm not trying to say that it was certain Colorado was going to win a game or even score a goal. Dallas could very well have done things differently, and there is luck to account for, too. Don't think for a second that shots like Dillon Serna's happen every week; there is a reason why it was special. The shot could have gone either wide or high, and I'm a bit surprised that Walker Zimmerman didn't get a boot on it. Most players across soccer leagues (not just MLS) convert those shots into goals in less than three percent of opportunities.

Colorado's eruption was mind-blowing in the sense that I didn't expect them to score four goals, blow the doors off and leave Dallas with a clean sheet and all the points. But it's not as though I thought it couldn't happen, either. That's the thing about all of this: we aren't trying to get a high-definition picture of the future or to take the beauty out of anything. Instead, the aim is to give the accomplishment context and measurement while understanding why it could have happened and whether we should continue to expect it to happen.

Colorado isn't a team that I'm convinced is going to be anything great. They're probably at very best a 5th or 6th playoff seed if their defense holds up all the way through, which is another topic entirely. Likewise, Dallas has the attacking pieces to continue to beat their expected goal parameters and a top-3 seed isn't out of question. But if they either can't create more opportunities or continue to finish chances at a high rate, their regression may continue.

Weekend Kick-Off: Dallas Luck, Rapids Unlucky. by Harrison Crow

by Harrison Crow (@harrison_crow)

This week I come to you live from Dallas. I love Friday night soccer so much that I had to go and finally experience one with one of the hottest teams in MLS... the Colorado Rapids. Okay, maybe not. But perhaps still an interesting matchup. If you're at the game make sure you say, "hi". Good luck, because I'm in Waldo land.

Colorado and Dallas are connected by two primary memories: an odd MLS Cup match-up of two Western Conference teams in 2010 hosted in Canada, and the defection of Óscar Pareja. While the MLS Cup contest was largely considered to be a snooze-fest, the return of Pareja to Dallas by basically forcing the Rapids' hand was crazy, awkward, and a bit surprising too.

Now Pareja's once high-flying Rapids attack has turned a defensive cheek under the guidance of club legend and mustache aficionado Pablo Mastroeni. Despite controlling only 48% of the ball per match, Colorado have managed to turn limited opportunities into the third-best possession ratio in the attacking third in MLS. That is to say, they complete more passes in the final third than their opponents by a significant margin.

They also create more shots than their opponents, with better shot position according to their expected goal differential. They clearly haven't done this in an aesthetically pleasing manner, and their inability to score is downright shocking, but it's possible this Rapids team isn't as bad as envisioned, or as the current narrative being passed down suggests.

Dallas has also been a formidable side after yet another strong start to the season and, just like deja vu, they are doing so without the creative production of Argentinian midfielder Mauro Diaz, who has missed time for a few different reasons. While I think the majority of supporters would love to see Diaz return to the lineup, the club's two points per game pace without him is impressive, and sets them on the path towards a strong season with Supporters' Shield-like ambitions. It also gives hope that his return might catapult the club that much further.
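
For a sense of what that pace means over a full year, the sketch below extrapolates points per game to a season total. The 34-match season length is an assumption made for this example, not a number from this post, and a straight-line projection ignores any regression along the way.

```python
def projected_points(points_per_game, season_games=34):
    """Straight-line projection of a points-per-game pace over a season."""
    return points_per_game * season_games


# Two points per game is the pace quoted in the text;
# season_games=34 is an assumed season length.
pace = 2.0
print(f"Projected points at {pace} PPG: {projected_points(pace):.0f}")
```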

Still, I would advise a bit of tempering going forward. While the joke surrounding FC Dallas the last couple of years has been the "collapse" that comes with the end of Spring and the launch of the Summer campaign, this season we might see some problems ahead based on three sets of numbers: their PDO, TSR and, of course, their expected goal differential.

PDO - a predictive metric based on both finishing percentage and save percentage, which tends to regress to the 1000 level, is floating a bit high at this point.

TSR - Total Shot Ratio lends credence to PDO, showing that they've been out-shot to this point in the season.

xGD - Expected Goal Differential works a bit in their favor. Despite surrendering more shots than their opponents, they've primarily found better position and, with that, a higher probability of shot success, which might explain some of their PDO. However, it's still only 10th best in MLS and, again, could point towards a fall back into the Western Pack.
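
Two of the metrics above can be sketched in a few lines (TSR is simply a team's shots divided by all shots taken in its matches). The numbers in this example are invented, and the formulas are one common formulation rather than necessarily the exact one behind the tables on this site: PDO is taken here as 1000 times the sum of finishing rate (goals per shot on target) and save rate (share of shots on target faced that were kept out), and xGD as expected goals for minus expected goals against.

```python
def pdo(goals_for, sot_for, goals_against, sot_against):
    """PDO: finishing rate plus save rate, scaled so that 1000 is neutral."""
    finishing_rate = goals_for / sot_for
    save_rate = 1 - goals_against / sot_against
    return 1000 * (finishing_rate + save_rate)


def expected_goal_differential(xg_for, xg_against):
    """xGD: sum of shot probabilities for, minus the same sum against."""
    return sum(xg_for) - sum(xg_against)


# Hypothetical team totals, for illustration only.
team_pdo = pdo(goals_for=8, sot_for=24, goals_against=5, sot_against=25)

# Hypothetical per-shot scoring probabilities.
xgd = expected_goal_differential(
    xg_for=[0.30, 0.10, 0.45, 0.05],
    xg_against=[0.10, 0.05, 0.20, 0.05, 0.10],
)

print(f"PDO: {team_pdo:.0f}")
print(f"xGD: {xgd:+.2f}")
```

A team whose finishing and save rates mirror those of its opponents lands at exactly 1000 under this definition, which is why values well above that level are read as likely to come back down.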

While most are probably preparing for a 1-0 or even 2-0 game in Dallas' favor, I wouldn't be surprised to see some opportunities fall in Colorado's favor, including the result. I'm not a huge fan of selling off Deshorn Brown, but if that couldn't have been helped, as some portion of the story being told suggests, the primary concern at this point is the gap between his departure and the arrival of new DP striker Kevin Doyle. But the team still possesses the attacking talents of Dillon Powers, Juan Ramírez and Vicente Sanchez. All are strong enough weapons to create chances and maybe find the back of the goal. Oh and Gabby Torres, he's still there... *checks Colorado Rapids active roster* Oh, yep, he's still there too.

Or maybe Fabian Castillo and Blas Perez could just destroy everything. Either/or or perhaps neither seems conceivable.



Colorado Rapids

Axel Sjoberg  (owned 32.7% - worth $4.7)
A value buy in the defense, this hulking man has taken up starts in light of the injuries and international call-ups this past year. He did not play the previous two games and I'm not sure he'll play this weekend, but he's a great reserve guy to have for a team that typically boasts a defensive posture by nature.

Dominique Badji (owned 23.4% - worth $4.4)
Badji is a speedy winger drafted this off-season by Colorado who's kind of creepy in the "dating a new girlfriend that looks kind of like your old ex-girlfriend" sort of way in a post-Deshorn Brown world. As a fantasy selection he's cheap, which has garnered a bunch of manager selections, and he has seen minutes in all four Rapids games this season, making him all the more valuable.

My biggest qualm with Badji, and something to be advised about: he has only collected .31 expected goals from three total shots through his 202 minutes on the pitch this year. To put that in perspective, Cyle Larin has 101 minutes and has collected .71 expected goals off five shots. Add to that Luke Shelton taking his spot in the 18, and he might not see so much time on the pitch in the not-so-distant future.
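
To put the Badji and Larin numbers on the same footing, here is a quick sketch that converts the figures quoted above into expected goals per 90 minutes and per shot. The helper names are made up for this example.

```python
def per_90(total, minutes):
    """Scale a counting stat to a per-90-minutes rate."""
    return total / minutes * 90


def per_shot(xg, shots):
    """Average chance quality: expected goals per shot."""
    return xg / shots


# Figures quoted in the text: (expected goals, shots, minutes).
players = {
    "Dominique Badji": (0.31, 3, 202),
    "Cyle Larin": (0.71, 5, 101),
}

rates = {
    name: (per_90(xg, minutes), per_shot(xg, shots))
    for name, (xg, shots, minutes) in players.items()
}

for name, (xg90, xg_shot) in rates.items():
    print(f"{name}: {xg90:.2f} xG per 90, {xg_shot:.2f} xG per shot")
```

On a per-90 basis Larin is generating more than four times the expected goals, and his average shot is also somewhat better.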

FC Dallas

Chris Seitz (owned 22.5% - worth $5.1)
Seitz has, according to our data, shown to be worth a whole goal better than the average keeper during the course of his time between the posts for Dallas, which is saying something considering their "reactive" nature and the number of shots they've allowed thus far into the season.

Ryan Hollingshead - (owned 19.6% - worth $5.5)
The former UCLA talent has come back to the States from helping his brother build churches and become the much-needed answer for "what happens when Mauro Diaz goes missing," with .50 expected assists in 307 minutes, making him a great bench value pick.


This week we added another factor to our watchability score: possession. The idea behind it is simple: the fewer possessions, or the fewer turnovers and less attrition you tend to have, the more aesthetically pleasing a match is to watch. This addition and a slight tweak in how we combine the factors help put this on a basic 5-100 scale, which makes everything a bit easier to digest and consume.

Next week I'll put up another survey asking you to rank your favorite games of the week, and then we'll compare it to the Watchability Score and hopefully it will iron out the kinks. I will say that we need more than last week's 11 participants, seven of whom voted for the Sounders match as the most entertaining. That has me asking whether people ranked it first because it featured their favorite team or because it was an entertaining game. I point this out specifically because it ranked last on our scale last week. Also, did you see that Sporting v Union game? I mean, c'mon.



Columbus Crew SC (0.25) at New England Revolution (0.46) - WS: 54
After a huge mid-week draw against a surging Vancouver, and short Federico Higuain, the Crew take on a Revolution side that might have Kelyn Rowe trying to be this year's version of Lee Nguyen, ranking 5th in xG+xA. Prediction: Revolution Win.

New York City (-0.35) at Philadelphia Union (-0.03) - WS: 45
The scale sees this as a better-than-average game to watch, and with David Villa out there I have to believe that's a good call. That said, both teams have been rather lackluster this season. Prediction: Draw

New York Red Bulls (-0.34) at DC United (-0.81) - WS: 37
Rivalries are fun and this one is no exception. Even without considering the extenuating circumstances, the scale has this game as the top game to watch this week. Prediction: Draw

Montreal Impact (-0.31) at Houston Dynamo (-0.45) - WS: 68
Houston play host to a team that is only the second-ever MLS club to qualify for a CCL final. Despite that, I think this is probably going to be one of the toughest games to actually sit through and watch. Prediction: Draw

Real Salt Lake (-0.34) at Sporting Kansas City (1.24) - WS: 57
It's only a matter of time before Sporting just has their big break. Our metrics don't just have them as the best team in MLS, it's not even close. Oh, and RSL just continues on messing with our models. ONE OF YOU, MAYBE BOTH, NEED TO QUIT MESSING WITH US! Prediction: Sporting Wins

Vancouver Whitecaps (0.42) at San Jose Earthquakes (-0.65) - WS: 50
Could this be the Whitecaps' year? Maybe? Maybe this is the year the Earthquakes make a return to being relevant in that "we're really not that harmful" sort of way. Prediction: Whitecaps Win


Orlando City SC (0.28) at Portland Timbers (0.24) - WS: 51
Two very entertaining sides in this one. The scale is only at 51 on this, but that's also the third-lowest (read: best) score this week. Probably a great match to sit down and watch; if it goes south, there is always that afternoon nap. Prediction: Timbers Win

LA Galaxy (0.40) at Seattle Sounders FC (0.39) - WS: 60
This game features two great sides that have been powerhouses in recent years. The stats still very much favor both teams being talked about in that top tier, but their overall performances have led fans to be a bit skeptical about the immediate future. Prediction: Draw


How Pablo Mastroeni would look dancing after winning in Dallas this weekend. Things 'bout ready to get interesting up in here.