What was it that made Ian Bell, Jonathan Trott and Michael Clarke stand out during the recently concluded Ashes 2013? A researcher uses complex network analysis to find the answer.
In the recently concluded Ashes test series, England retained the urn by beating Australia 3-0 in five games. England always looked the more confident team, a confidence both reinforced and evinced by the trust each player had in every other. They batted well, they bowled well, they fielded well.
Australian players, on the other hand, looked out of place. Often, great performances by a batsman or a bowler didn't translate into the rest of the team moving with that spirit, betraying high – if not unreasonable – dependence on some players, who were expected to bear the burden.
Now, a post-doctoral fellow from the Kellogg School of Management, Northwestern University, has put these conclusions to the test. Satyam Mukherjee has used complex network analysis to determine how England and Australia differed in their strategies during the Ashes matches, to gauge the "quality" of wins and how much of a role each player played in them.
Satyam says that, to the best of his knowledge, “this work is the first of its kind in cricket”, and it is hoped that it will motivate analysts to look at the behind-the-scenes statistics of players whose best skills may not always be brought to the fore.
Specifically, Satyam uses concepts like the PageRank algorithm (which Google uses to determine the 'influentiality' of websites), betweenness and network centrality, and treats each team as a network of players who have to perform specific roles.
Math & matters of the heart
"Two football players are linked if one player passes the ball to another, a pitcher and batter is connected if they face each other, or Nadal and Federer get connected if they play against each other," says Satyam, explaining how networks are built. "But in cricket, no such studies exist although there is no dearth of statistics."
However, a network analysis of a game of cricket is much less straightforward, as the success of a team doesn't depend solely on the ball being passed around or on batsmen like Tendulkar and Lara facing off against each other. Instead, they face different bowlers, which means their performances can't be compared directly, either.
What a self-organised social network looks like (nodes of the same colour are of the same group). Image: Wikimedia Commons
So he used publicly available data from Cricinfo to compute the network performance of players and how well they'd performed different roles. "The network based approach gives us the hidden properties of the performance of players," Satyam explains, adding that the advantage is that "it doesn't suffer from any biases which exist in traditional schemes."
In his network analysis, each player is thought of as a node (as shown above) in a network, with the lines connecting them being the runs scored by them together. This way, as the game progresses through different partnerships, nodes are added and connected, with the distance between nodes denoting the number of runs.
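As a rough illustration of the construction described above, here is a minimal sketch in Python. The player labels and run counts are invented for the example, and the function name build_partnership_network is hypothetical; it is not taken from Satyam's paper.

```python
from collections import defaultdict

# Hypothetical innings: (batsman, partner, runs the pair added together).
# These labels and numbers are made up for illustration only.
PARTNERSHIPS = [
    ("Opener 1", "Opener 2", 40),
    ("Opener 1", "No. 3", 65),
    ("No. 3", "No. 4", 12),
]


def build_partnership_network(partnerships):
    """Return network[a][b] = total runs a and b added while batting together."""
    network = defaultdict(lambda: defaultdict(int))
    for a, b, runs in partnerships:
        # Each partnership links both players, so the link is stored both ways.
        network[a][b] += runs
        network[b][a] += runs
    return {player: dict(links) for player, links in network.items()}


network = build_partnership_network(PARTNERSHIPS)
for player, links in network.items():
    print(player, links)
```

Each new partnership adds a node for the incoming batsman and a link to whoever is at the other end, which is how the network grows as the innings progresses.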
Then, Satyam uses his tools to reveal how the team, studied as a network of people with different skills trying to accomplish a common goal, strategised and where it fell short.
According to his calculations, for example, Gautam Gambhir was the most successful player in terms of centrality scores during the 2011 ICC World Cup final for India. This means that he was involved in the largest number of batting partnerships during the game (betweenness centrality). However, the man-of-the-match award went to skipper M.S. Dhoni.
"So there is a human bias coming into play," exclaims Satyam. However, this doesn't come across as a call to replace the more "spiritual" aspects of the game with a mathematical framework. Instead, Satyam is advocating the use of such analytical methods to decrease the chances of missing out on important statistics that come into play during drafting, team-selection, etc.
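To make betweenness centrality concrete, here is a small sketch that counts how many shortest paths between other pairs of batsmen pass through each player. It assumes an unweighted, undirected partnership network and uses Brandes' algorithm; the four-player chain is invented, and the paper may well weight the links differently.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each player to the list of players he batted with.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every pair is visited from both ends in an undirected graph.
    return {v: total / 2 for v, total in score.items()}


# Hypothetical chain of partnerships: A-B, then B-C, then C-D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = betweenness(chain)
print(scores)
```

In this toy chain the two middle batsmen sit on every path between the ends, so they carry all the betweenness, while the first and last players carry none.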
Teams as competing networks
These and other network analysis concepts have been around for quite a while. They have been applied to sports for the last decade or so, quite famously to football using the Girvan-Newman algorithm and others. The parameters they use to "evaluate" teams are simple.
PageRank, a relatively new measure developed by Google co-founders Larry Page and Sergey Brin and named for Page, measures the "quality" of outcomes (i.e. wins or losses).
In the context of a match, PageRank scores give a measure of the quality of wins. If a weak team wins against a relatively stronger team, it gains points. However, if the weak team loses to a strong team, it isn't penalized that much. Each outcome's PageRank is dependent on the performance of every player.
In the context of players' performance, "it gives the importance of the player in the batting line up," explains Satyam. In other words, it provides us with an idea of the importance of runs scored -- such as Graeme Swann's 34 in England's first innings of the final test.
It is calculated as:
p_i = PageRank score
w_ij = weight of a link
s_j-out = out-strength of node j (the sum of the weights of its outgoing links)
i = whichever team it is
q = control parameter = 0.15 (default)
N = total number of players in the network
δ = a correcting term
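For reference, a weighted PageRank that uses exactly the symbols listed above is commonly written in the form below. This is a reconstruction under that assumption, not a quotation from the paper; in particular, the correcting term δ is taken here to equal 1 when node j has no outgoing links and 0 otherwise, and w_ij is taken to be the weight of the link from j to i.

```latex
p_i = (1 - q) \sum_{j} p_j \, \frac{w_{ij}}{s_j^{\mathrm{out}}}
      + \frac{q}{N}
      + \frac{1 - q}{N} \sum_{j} p_j \, \delta\!\left(s_j^{\mathrm{out}}\right),
\qquad
\delta(s) =
\begin{cases}
1 & \text{if } s = 0 \\
0 & \text{otherwise}
\end{cases}
```

Read left to right: the first term passes on each neighbour's score in proportion to the link weight, the second gives every node a small baseline share, and the third redistributes the score of nodes that have nowhere to pass it on.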
In-strength is the sum of the fractions of runs a player has scored in partnership with other players.
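The sketch below shows one way in-strength and a weighted PageRank could be computed for a single innings. It assumes that a link from a partner to a batsman is weighted by the fraction of the partnership's runs that the batsman scored, and that q = 0.15 acts as the usual random-jump probability. The data and the function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical partnerships: (batsman_a, batsman_b, runs_by_a, runs_by_b).
PARTNERSHIPS = [
    ("A", "B", 30, 10),
    ("A", "C", 20, 20),
    ("C", "D", 45, 15),
]


def build_weights(partnerships):
    """weights[j][i] = fraction of the j-i partnership runs scored by i."""
    weights = defaultdict(lambda: defaultdict(float))
    for a, b, runs_a, runs_b in partnerships:
        total = runs_a + runs_b
        if total == 0:
            continue  # a scoreless stand carries no weight
        weights[a][b] += runs_b / total
        weights[b][a] += runs_a / total
    return {j: dict(links) for j, links in weights.items()}


def in_strength(weights):
    """Sum of the weights of all links pointing at each player."""
    totals = defaultdict(float)
    for links in weights.values():
        for i, w in links.items():
            totals[i] += w
    return dict(totals)


def pagerank(weights, nodes, q=0.15, tol=1e-12, max_iter=1000):
    """Weighted PageRank by power iteration (assumed standard form)."""
    n = len(nodes)
    out_strength = {j: sum(weights.get(j, {}).values()) for j in nodes}
    p = {i: 1.0 / n for i in nodes}
    for _ in range(max_iter):
        # Score held by nodes with no outgoing links is shared out evenly.
        dangling = sum(p[j] for j in nodes if out_strength[j] == 0)
        new_p = {i: q / n + (1 - q) * dangling / n for i in nodes}
        for j in nodes:
            if out_strength[j] == 0:
                continue
            for i, w in weights[j].items():
                new_p[i] += (1 - q) * p[j] * w / out_strength[j]
        change = sum(abs(new_p[i] - p[i]) for i in nodes)
        p = new_p
        if change < tol:
            break
    return p


weights = build_weights(PARTNERSHIPS)
players = sorted({name for a, b, _, _ in PARTNERSHIPS for name in (a, b)})
strength = in_strength(weights)
ranks = pagerank(weights, players)
print(strength)
print(ranks)
```

In this made-up innings, A and C each outscore or match their partners and end up with the highest in-strength and PageRank, while B and D, who contributed smaller shares, score lower.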
Closeness measures the connectedness of a player in the team. The 'closer' he is, the more open he will be to his place in the playing order being changed. For example, the 'closest' batsmen will be comfortable opening the batting, playing in the middle order, or holding up the lower order. This can be decided based on the match situation, pitch conditions, availability of other players, etc. Thus, having 'close' players increases the adaptability of the team.
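A common way to put a number on closeness is to divide the number of other players by the total number of partnership "hops" needed to reach them. The sketch below assumes that definition and an unweighted network; the four-player chain is invented, and the paper's exact formulation may differ.

```python
from collections import deque


def closeness(adj):
    """Closeness centrality on an unweighted graph, using hop counts.

    adj maps each player to the list of players he batted with.
    """
    scores = {}
    for s in adj:
        # Breadth-first search gives the hop distance from s to everyone reachable.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[s] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical chain of partnerships: A-B, then B-C, then C-D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
close = closeness(chain)
print(close)
```

Here the middle-order players B and C come out 'closer' than A and D at the ends of the chain, because they are fewer partnerships away from everyone else.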
Satyam put together a network of players in each of the five matches, and computed these scores for all of them in terms of their batting performances.
He found that, in the first and second games, both of which England won, Ian Bell and Joe Root emerged as the best batsmen, respectively. Bell, especially, had the highest PageRank, in-strength, betweenness and closeness among all batsmen. In the second match, Root had the highest in-strength, betweenness and closeness, but Usman Khawaja and Michael Clarke beat him to the top on PageRank.
In the third test, Australia dominated the game. However, the domination arose through Clarke, the man of the match, while the rest of the players put up a less-than-dominating performance. In fact, this pattern was visible in Australia throughout the series. By contrast, the betweenness scores of England’s batsmen were more evenly distributed. Everyone seemed to have contributed, not just the top order.
For example, one batsman who regularly features in the top five players in terms of PageRank is Tim Bresnan, an all-rounder. His ability to build partnerships even when most specialist batsmen had departed was crucial for England to have stayed on top – such as in the fourth test at Riverside Ground. Looking at the overall scores: Bell – most betweenness centrality; Matt Prior – most closeness; Jonathan Trott – highest in-strength; Graeme Swann – highest PageRank.
The batsmen's performance network as it transpired during the Ashes 2013. Notice how almost every player on the English side was capable of holding up partnerships while, for Australia, noticeable 'hubs' exist in the guise of Haddin, Hughes and Rogers. Image: Satyam Mukherjee
For Australia, on the other hand, Haddin, Hughes and Rogers have high betweenness centrality, which quickly drops off when other pairs are considered.
Accordingly, batsmen who received man-of-the-match awards during the series were Joe Root, Michael Clarke, and Shane Watson. This is an instance of simple mathematical concepts having encapsulated our practical considerations well enough to have reached almost the same conclusions (even though a lot of assumptions were made in the process). However, cricket is only a newcomer to this arena.
In 2003, Michael Lewis published a book titled Moneyball: The Art of Winning an Unfair Game. It brought to a wider audience the field of study called sabermetrics, which uses in-game statistics in baseball to separate objective judgments – “Who contributed the most…” – from subjective ones – “Was that a great…” – so that teams are aware of what their strongest and weakest resources are.
In 2011, this book was adapted into a successful movie, starring Brad Pitt. Although I’d heard of the book at the time, I hadn’t read it, and the movie helped me confront for the first time how managers unfamiliar with sabermetrics’ pros might react to the idea. In the movie, many of them quit (however, most of them were old, too, and just couldn’t cope with the power to pick or drop players leaving their hands and falling into those of some “new-fangled, cold-and-calculated” sabermetrician).
So what network analysis can bring to the fore in sports has to be understood well before it is dismissed. In its simplest form, it makes correcting for regional biases in selections easy and helps spot ‘hidden’ talent in the domestic circuit. At its most nuanced, it could factor in bowlers and fielders, not just batsmen, and also include an “athletic index” for each batsman to denote how agile he is between the wickets, to see who has been the best performer (a suggestion included in Satyam Mukherjee’s paper).
Of course, for the game to stay competitive and entertaining, both subjective and objective methods are important. Even with cricket, I can't imagine the BCCI resorting completely to sabermetrics’ version of cricket to choose the national cricket team – how would they be able to account for the reassuring presence of Captain Cool? Instead, they could use such tools to better inform their decision-making.
(For those interested, a more detailed presentation of Satyam's methods is available in this paper.)