I'm no mathematician. Matty maybe, but I am not. So when approaching something like game states, which can seem rather ominous and a bit intimidating, I felt it best to start with a proper introduction. Most, if not all, of the information provided here is taken from sources who are smarter than I am. That's really what this blog is all about: finding people who know and understand the principles we are trying to learn, then centralizing the material in a tidy location where people who are new (not just to the sport, but also to the concept of analytics) can go to find information and grow their knowledge.
The idea of game states is that a match consists of a sequence of states, where each state is defined by a combination or series of events, and each new event moves the match into a new state. Those events give details and help break down the match. They provide context and meaning for the data that we record. The concept of game states, as I understand it, is based on the idea of Markov chains:
...[A] mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process usually characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it.
Let's apply this idea to a sport...
In football, if a team is in a certain situation, what happened previously has no effect on what will happen next. For example, if we have a 1st-and-10 from our own 20, it does not matter if the previous play was a kickoff for a touchback or a 10-yard gain for a first down after a 3rd-and-10 from the 10-yard line. Either way, we now have a new situation that will only directly affect the next play.
To put it all into soccer terminology, if a team has possession of the ball and is progressing past the halfway line and into the attacking third, it doesn't matter if they got it on an interception or a goal kick. Regardless of how it happened, they now have the ball going into the attacking third and will have an opportunity to threaten and score. I struggle with this thought because tactically it may not be true: a player getting the ball on a break is in a different situation than one taking part in a slow build-up toward the opposition's net. But coming from a statistical and memoryless position, we simply want the facts.
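To make the memoryless idea concrete, here is a minimal sketch of a possession modeled as a Markov chain. The zone names and every probability are invented for illustration (they are not measured from any league); the only point is that the next state is drawn from the current state alone, with no look back at how the ball got there.

```python
import random

# Hypothetical states and made-up transition probabilities.
# Each row lists where play can go next from that zone; rows sum to 1.
TRANSITIONS = {
    "defensive_third": {"defensive_third": 0.5, "middle_third": 0.4, "turnover": 0.1},
    "middle_third": {"defensive_third": 0.2, "middle_third": 0.4,
                     "attacking_third": 0.25, "turnover": 0.15},
    "attacking_third": {"middle_third": 0.25, "attacking_third": 0.35,
                        "shot": 0.15, "turnover": 0.25},
}

def next_state(current, rng):
    """Pick the next state using only the current state (memoryless)."""
    options = TRANSITIONS[current]
    states = list(options)
    weights = [options[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

def simulate_possession(start, rng, max_steps=50):
    """Follow one possession until it ends in a shot or a turnover."""
    path = [start]
    # "shot" and "turnover" have no row in TRANSITIONS, so they end the chain.
    while path[-1] in TRANSITIONS and len(path) < max_steps:
        path.append(next_state(path[-1], rng))
    return path

if __name__ == "__main__":
    print(simulate_possession("middle_third", random.Random(7)))
```

Notice that `next_state` never receives the path so far. Whether the possession began with an interception or a goal kick is simply not an input, which is exactly the simplification I said I struggle with tactically.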
Let's go to another sport and put this into baseball terms, because of the progress that sport has made in analytics. Baseball analytics breaks game states down into four basic components: the score, the inning, the base runners and the outs. From these you can calculate basic run and win expectancy.
If you have the average number of runs expected to score in an inning after any game state, you can figure out how many runs a stolen base is worth, or a triple, or a strikeout. The game state essentially allows us to relate everything that happens on the diamond back to the major currencies of baseball: winning and runs.
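As a rough sketch of that arithmetic: if you have a table of expected runs for each state, the value of an event is the expectancy after it, minus the expectancy before it, plus any runs that scored on the play. The numbers below are placeholders I made up to show the mechanics; they are not real run-expectancy values, and the state labels are hypothetical.

```python
# Illustrative run-expectancy table keyed by (runners, outs).
# These values are invented for the example, not taken from real data.
RUN_EXPECTANCY = {
    ("empty", 0): 0.50,
    ("first", 0): 0.90,
    ("second", 0): 1.10,
    ("empty", 1): 0.27,
}

def event_value(before, after, runs_scored=0):
    """Change in expected runs caused by one event."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored

# Runner on first, nobody out, and he tries to steal second.
steal = event_value(("first", 0), ("second", 0))
caught = event_value(("first", 0), ("empty", 1))

if __name__ == "__main__":
    print(f"stolen base: {steal:+.2f} runs, caught stealing: {caught:+.2f} runs")
```

With these invented numbers a successful steal is worth about +0.20 runs and getting caught costs about 0.63. The same before-and-after subtraction is what we would want for a cross or a tackle, with goals in place of runs.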
We're not talking about baseball. We're talking about soccer, or for you euro snobs, 'Fútbol'. However, taking the concepts of a matrix that is already applied in practice, such as the baseball one that Tom Tango has developed (see below), can give us ideas on how we might create a corresponding one in soccer. Soccer has wins too (though I see more people think in terms of points), but our true currency in this sport is goals. Everything leads back to the price or value of a goal. Whether it be a cross or a tackle, what we ultimately want is to understand how things work together to produce goals.
Comparing the possible game states: soccer, just like baseball, has a score line that can help give us an idea of the transitions between states. A team being up one goal, or on the other side of the coin being down one goal, can give us a definition of a game state. It's simplistic, but it works. It can also show us why sometimes (looking at you, Alex Ferguson) teams take fewer shots than at other times. A team protecting a lead may reasonably choose to keep possession rather than shoot.
We also have a measurement for the length of the game, in that the teams play against one another for a total of 90 minutes. In soccer, we have time intervals. The largest problem is that the clock is constantly moving rather than sitting in a set static state, such as an inning, so it creates a great many possible states and chains. If you wanted to mitigate that to an extent, you could revert to the basics of the first half vs. the second half of a match. While these are two big time intervals, when used in conjunction with other specific game states they could still help us develop a better understanding of the game.
The next one listed is base runners, but I'm going to pass on that for now and move on to the concept of outs. At the very best this is a difficult correlation, but if you wanted to attempt one I might try changes in possession. It is a rather poor fit, and I have no idea how or whether you would want to use it... probably not. Baseball limits a team to three outs per half-inning, whereas soccer gives a team no fixed number of possessions or attempts to score. In the 45 minutes of a single half a team could have anywhere between 12 and 40 possessions, depending on who the opponent is and how the team executes its attack.
Back to base runners. This is one of the easier things to mimic, though I have no way of proving how closely the two are related. Shots on goal could conceivably give us a baseline for the probability of a goal being scored and, more importantly, for the points that are associated with winning or drawing.
One additional game state that hasn't been mentioned, and that doesn't really correlate with anything baseball has, is yellow and red cards. While baseball has ejections, they don't necessarily affect the dimensions of the game, because the team can just sub in a replacement. In soccer, a red card puts the team down a man for the rest of the match.
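Pulling the soccer pieces together, one way to represent a game state is as a small record of score line, half, and man advantage. This is only a sketch under my own assumptions: the field names are hypothetical, minutes up to and including 45 are counted as the first half (which ignores stoppage time), and only red cards change the player count.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GameState:
    goal_diff: int      # goals for minus goals against, from one team's view
    half: int           # 1 or 2
    man_advantage: int  # own players on the pitch minus the opponent's

def game_state(goals_for, goals_against, minute, reds_for=0, reds_against=0):
    """Build the state for one team at a given minute of the match."""
    # Assumption: minute <= 45 is the first half; stoppage time is ignored.
    half = 1 if minute <= 45 else 2
    return GameState(
        goal_diff=goals_for - goals_against,
        half=half,
        man_advantage=reds_against - reds_for,
    )

if __name__ == "__main__":
    # Down a goal and down a man in the 60th minute.
    print(game_state(goals_for=0, goals_against=1, minute=60, reds_for=1))
```

Because the record is frozen it can be used as a dictionary key, which makes it easy to group shots, possessions or anything else by the state they happened in.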
These are all elements which you would consider context. How do teams perform during these situations? Do their possessions last longer? Do they take more shots? What is the quality of those shots? This is the information we are after, and really the purpose behind gathering the data points. Are the Seattle Sounders just as likely to score in a "-1" goal situation as they are in a tie game?
This is all information that we are seeking. The context can provide further detail, as well as specific insight into how certain aspects and statistics can be properly correlated to goals, and then to points in the table.
I feel like there is more to write about here... and that's because it's true. There is a lot more to write about. But this is primarily to cover the basics of game states. We'll talk more about it in our podcast tomorrow, and we'll have a follow-up post to everything when we start preparing to post MLS game state information.
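Here is a small sketch of how one of those questions could be answered once the data is split by game state. The rows are hypothetical stretches of play for a single team, each with a goal difference, the minutes spent in that state, and the shots taken; none of it is real match data.

```python
from collections import defaultdict

# Hypothetical rows: (goal_diff, minutes played in that state, shots taken).
segments = [
    (0, 30, 4),
    (-1, 25, 6),
    (0, 35, 5),
    (1, 40, 3),
    (-1, 50, 9),
]

def shots_per_90_by_state(rows):
    """Total minutes and shots per goal-difference state, scaled to 90 minutes."""
    minutes = defaultdict(float)
    shots = defaultdict(int)
    for goal_diff, mins, n_shots in rows:
        minutes[goal_diff] += mins
        shots[goal_diff] += n_shots
    # Skip any state with zero recorded minutes to avoid dividing by zero.
    return {
        state: 90 * shots[state] / minutes[state]
        for state in minutes
        if minutes[state] > 0
    }

if __name__ == "__main__":
    for state, rate in sorted(shots_per_90_by_state(segments).items()):
        print(f"goal diff {state:+d}: {rate:.2f} shots per 90")
```

Scaling to a per-90 rate matters because teams spend very different amounts of time in each state; raw shot counts would mostly tell you how long a team was level, not how it behaved while trailing.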