Wednesday, 3 September 2014

Winning an ODI from the front

Is Cook the captain to correct England's ODI ship?
There have been a lot of debates recently about the role of the top order in an ODI. Is it more effective to come out swinging or is a more cautious approach more appropriate?

Until 1992 the expectation was that the top order's job in an ODI was to see off the new ball, and scoring at 3 an over was fine. If you look at the great opening bowlers of the 1980s you will see that most of them had economy rates near 3.5 rpo. Then in the 1992 World Cup something wonderful happened. In the 10th match, New Zealand were playing South Africa, and one of New Zealand's premier batsmen, John Wright, got injured. In came Mark Greatbatch. Rather than playing the traditional opener's role, he took advantage of the fielding restrictions and scored 68 off 60, including hitting Allan Donald back over his head for a 6 that landed on the roof of the stand. From this point onwards, New Zealand's approach changed, and the first 15 overs were seen as the best time to score quick runs.

In the 1995/6 Australian tri-series, Sri Lanka took that tactic to the next level. Kaluwitharana and Jayasuriya batted like whirlwinds, and not long afterwards they helped take their team to victory in the World Cup. At this point the world really stood up and took notice. The dashing opener was now the in thing. New Zealand had Astle, Australia had Gilchrist, India used Sehwag, Pakistan opened for a while with Afridi. An attacking opener had become as much a part of the game as using a spin bowler in the "boring middle overs."

But recent changes to the game, such as the different fielding restrictions at the end of the game and 2 new balls, have meant that some people have questioned whether going for it at the outset is such a good tactic now. Is it a better idea to keep wickets in hand and "go harder, later"?

The majority of this discussion has come out of England, where the roles of captain Cook, Bell and (before he dropped out of the game) Trott have been coming under increasing scrutiny. Are they batting too slowly? Are they putting too much pressure on the players coming in after them?

This led me to have a look at the role of the top 3 in the past 2 years. What I wanted to know was what had the biggest impact on a team winning: the openers batting a long time, them scoring a lot of runs, or them scoring at a quick run rate.

The first thing I did was get a list of the scores when the second wicket fell in each match in the past 2 years, and the outcomes of the matches.

The next step was to graph it, and see what came out.

First I looked to see if there was a different relationship between the overs taken and runs scored for the first 2 wickets for teams that won and teams that lost.




We can see that there is a fairly strong relationship with both graphs.

I set the intercept of the trend lines to 0, so that the gradients are effectively the run rates.
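As a rough illustration of why a zero intercept turns the gradient into a run rate, here is a minimal sketch of a least-squares line forced through the origin. The function name and the sample figures are hypothetical, not the data behind the graphs.

```python
def slope_through_origin(overs, runs):
    """Least-squares gradient of runs = slope * overs, with no intercept term."""
    sum_xy = sum(x * y for x, y in zip(overs, runs))
    sum_xx = sum(x * x for x in overs)
    return sum_xy / sum_xx


# Hypothetical (overs, runs) pairs at the fall of the second wicket.
overs = [5.0, 10.0, 20.0]
runs = [20, 50, 100]

rate = slope_through_origin(overs, runs)
print(f"Fitted run rate: {rate:.2f} rpo")
```

Because the line has to pass through (0 overs, 0 runs), its gradient is in runs per over, which is what lets the winning and losing trend lines be compared directly.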

There are a few noticeable differences. Firstly, the teams that win tend to score at a higher run rate for the first 2 wickets than the teams that lose. However, when I overlapped the two graphs, this difference was not as striking visually as it is numerically. (After 40 overs the trend lines are actually only 28 runs apart.)

More significantly, there's a lot more data above 15 overs on the winning graph than on the losing graph. There are also a lot more instances on the winning graph where the first two wickets have contributed 150 or more runs.

I ran a quick bootstrap analysis and found a statistically significant difference between winning and losing teams in both the median number of runs scored at the fall of the second wicket and the number of overs batted to that point.
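For anyone who wants to try something similar, here is a minimal sketch of one way to bootstrap a difference in medians. It is not necessarily the procedure I used: the percentile interval, the number of resamples and the sample figures are all assumptions for illustration.

```python
import random
from statistics import median


def bootstrap_median_diff(wins, losses, n_resamples=2000, seed=0):
    """95% percentile interval for median(wins) - median(losses)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        w = [rng.choice(wins) for _ in wins]
        l = [rng.choice(losses) for _ in losses]
        diffs.append(median(w) - median(l))
    diffs.sort()
    lo = diffs[int(0.025 * n_resamples)]
    hi = diffs[int(0.975 * n_resamples) - 1]
    return lo, hi


# Hypothetical runs at the fall of the second wicket.
wins_runs = [80, 95, 110, 120, 130, 150, 90, 105]
losses_runs = [20, 35, 40, 50, 55, 60, 30, 45]

lo, hi = bootstrap_median_diff(wins_runs, losses_runs)
print(f"95% interval for the difference in medians: {lo} to {hi}")
```

If the interval does not include 0, the difference in medians would usually be called statistically significant at roughly the 5% level.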

To highlight that difference, I drew a cumulative frequency graph. The difference in the distribution of the number of overs faced by the winning and losing teams is quite striking.

That gave me some reason to search further. It seems that there is statistical evidence of a difference between the performances of the top 3 batsmen of teams that win and teams that lose.

The next question was to see which made more of an impact: the number of runs, the number of overs or the rate that the runs were scored at.

To do this I ordered the innings by each of these three variables, and then looked at how many of the 9 innings surrounding each one were won and lost. This is not particularly intuitive, but it did give me some idea about the impact an increase in the variables would have on the likelihood of winning.
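The ordering-and-counting step can be sketched like this. I am assuming here that the 9 innings are the innings itself plus the 4 on either side in the sorted order; the field names and sample data are hypothetical.

```python
def rolling_win_share(innings, key, window=9):
    """Sort innings by `key`, then give the share of wins in each centred window."""
    ordered = sorted(innings, key=lambda inn: inn[key])
    half = window // 2
    points = []
    # Innings too close to either end do not have a full window, so skip them.
    for i in range(half, len(ordered) - half):
        block = ordered[i - half : i + half + 1]
        wins = sum(1 for inn in block if inn["won"])
        points.append((ordered[i][key], wins / window))
    return points


# Hypothetical innings: run rate for the first two wickets and the result.
sample = [{"run_rate": float(r), "won": r > 5} for r in range(1, 11)]

points = rolling_win_share(sample, "run_rate")
for run_rate, share in points:
    print(f"{run_rate:.1f} rpo: {share:.0%} of surrounding innings won")
```

Plotting the resulting points against the sorted variable gives the kind of graph described here, and changing `window` shows how sensitive the picture is to that choice.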

First I looked at run rate:

There is a trend here, but it's certainly not a strong relationship.

It's clear that increasing the run rate that the first two partnerships score at can contribute to increasing the likelihood of winning.

The important number here is the R² value of 0.35914. The closer to one this value is, the more linear the relationship is. While it is not a perfect measure of how strong a relationship is, it is a good indication.

The next graph that I looked at was the total runs scored.

This is a similar relationship, but it is clearly stronger.

The points are generally closer to the trend line and the R² value is higher (0.44074).

The data (expectedly) thins out as the number of runs increases, as it's quite rare for teams to get to 200 for the loss of only one wicket.

The R² value is less than 0.5, so there's still more of the variation that is unexplained than explained by this relationship, but again that is to be expected, as there are a lot more factors in a game than the first 3 batsmen, and it is always possible for a game to change suddenly.

The total runs scored for the first two partnerships seems so far to be a better predictor of success than the rate that they scored at.

The third factor is the one that I found the most interesting.


The relationship between the overs batted and wins looks (unsurprisingly) quite similar to the relationship between runs and wins.

The R² value for overs is lower than the corresponding value for runs, but both are higher than the relationship with run rates.

The points where teams start winning more than they lose are roughly 14 overs, 75 runs and 4.7 rpo respectively. These numbers start to give us an idea about what we should be looking for in an opener.

However, I wasn't totally convinced by these graphs.  I wondered if they would have turned out the same if I had chosen to look at 15 innings or 5 innings, or some other slightly different way of looking at it.  So I decided to try looking at the winning probability for individual points.  To do this I rounded the run rates to the nearest 0.1 rpo, the overs faced to the nearest over and the runs to the nearest 5 runs.
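The rounding approach can be sketched as below: each innings is put into a bin, and the winning probability is the share of wins in that bin. The helper names and sample values are hypothetical.

```python
from collections import defaultdict


def round_to(value, step):
    """Round to the nearest multiple of step (exact halves follow Python's round-half-to-even)."""
    return round(round(value / step) * step, 10)


def win_probability_by_bin(values, won, step):
    """Map each rounded value to the share of innings at that value that were won."""
    tally = defaultdict(lambda: [0, 0])  # bin -> [wins, innings]
    for value, result in zip(values, won):
        b = round_to(value, step)
        tally[b][0] += int(result)
        tally[b][1] += 1
    return {b: wins / total for b, (wins, total) in sorted(tally.items())}


# Hypothetical runs at the fall of the second wicket, rounded to the nearest 5.
runs = [72, 74, 76, 81]
won = [True, False, True, False]
by_runs = win_probability_by_bin(runs, won, step=5)
print(by_runs)
```

The same function works with `step=0.1` for run rates and `step=1` for overs.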

Again these graphs were quite interesting.


Visually, the strength of the relationship is indicated by how close the tops of the coloured bars get to the trend line.

The runs and overs graphs look like they are a better fit than the run rates graph. But we can get extra evidence for the strength of fit from the R² value again.

This time the R² value for overs was significantly stronger than it was for either the run rate or the total runs.

The lower (green) graph suggests that for every extra over before the second wicket falls, the probability of winning increases by about 1.7%.

Likewise, for every extra 5 runs scored, the probability of winning increases by about 1.7%.

At this point I started to feel like there was some fairly significant evidence that having a top 3 that can see off the new ball is definitely the way to go.

The rate that the top order score at is important, but it doesn't seem to be as important as the number of overs that they bat for.

This makes sense for a couple of reasons. Firstly, a cricket ball is at its most hittable when it is about 15 overs old. At this point it's still hard enough to go quickly off the bat, it has normally stopped swinging conventionally and hasn't yet started to swing unorthodoxly. The ball is the hardest to play when it is less than 10 overs old, because the ball will swing and seam, and the edges will go to hand, rather than dying as quickly as they do later on.

With two new balls, when the balls are 15 overs old the match is 30 overs in. At this point it's sensible for teams to have their best hitters at the crease. If they come in much before the 20th over, the hitters are being exposed to a swinging ball that makes it difficult to time their shots.

The next bit of analysis I did was to look at those 3 marks from above (14 overs, 75 runs and 4.7 rpo) and look at the difference in the results between teams that reached these milestones and teams that did not.

The outcomes were again quite interesting:


Here "fast" was any innings where the first two partnerships scored at more than 4.7 runs per over, "big" was where the second wicket fell when the score was over 75, and "long" was where the second wicket fell after the 14th over.

The column titled "Relative" is the relative probability.  This means that teams that have their first two wickets score at more than 4.7 rpo are 37% more likely to win than teams who score slower than that.
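To make the "Relative" column concrete, here is a minimal sketch that treats relative probability as the winning percentage of teams that reached a mark divided by the winning percentage of teams that did not. The counts below are hypothetical and are not the figures from the table.

```python
def relative_win_probability(wins_hit, games_hit, wins_missed, games_missed):
    """Ratio of P(win | reached the mark) to P(win | missed the mark)."""
    p_hit = wins_hit / games_hit
    p_missed = wins_missed / games_missed
    return p_hit / p_missed


# Hypothetical counts: 60 wins from 100 "fast" starts, 40 wins from 100 slower starts.
relative = relative_win_probability(60, 100, 40, 100)
print(f"Relative probability: {relative:.2f}")
print(f"That is {relative - 1:.0%} more likely to win")
```

A relative probability of 1.37 is what "37% more likely to win" means in this sense.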

Again we see that the biggest advantage is when the first two wickets last more than 14 overs.  It isn't a panacea, but it is important.

The final thing I did was to look at two similarly skilled options, with different approaches, and see which was best. To do this I looked at every innings where the second wicket fell between the 7th and 14th over, and the run rate was between 4.7 and 5.3 (this puts them roughly in the 2nd quartile for scoring rates and the 3rd quartile for length). I then also looked at times where the second wicket fell between the 14th and 19th over, with a run rate between 3.8 and 4.7 (which puts them roughly in the 3rd quartile for scoring rates and the 2nd quartile for length). Again the more cautious approach paid dividends, although this time with a much smaller sample size.

The more attacking start yielded 9 wins and 13 losses (a 40.9% winning record). The more cautious start also yielded 9 wins, but only 8 losses (a 52.9% winning record). The difference here, however, is not statistically significant, given the low sample size.
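One way to check a comparison like this is Fisher's exact test, which suits small 2x2 tables. This is a swapped-in choice for illustration and not necessarily the test used for the claim above; the sketch uses the win and loss counts just quoted and only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x wins in row 1, given the fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


attacking_wins, attacking_losses = 9, 13
cautious_wins, cautious_losses = 9, 8

attacking_pct = 100 * attacking_wins / (attacking_wins + attacking_losses)
cautious_pct = 100 * cautious_wins / (cautious_wins + cautious_losses)
p_value = fisher_exact_two_sided(
    attacking_wins, attacking_losses, cautious_wins, cautious_losses
)
print(f"Attacking: {attacking_pct:.1f}%, cautious: {cautious_pct:.1f}%, p = {p_value:.2f}")
```

A p-value well above 0.05 is consistent with the difference not being statistically significant at this sample size.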

Putting this all together, an ideal opener is a player who averages at least 37.5 (half of 75), lasts 42 deliveries on average (half of 14 overs) and has a strike rate of at least 78.3 (equivalent to 4.7 runs per over). If a compromise has to be made on one of these, it should be the strike rate, as that's the least important in helping the team win.

So, finally, how does Alastair Cook stack up to these criteria?

At the time of writing Alastair Cook averages 37.51 runs from 48 balls at a strike rate of 77.66.

He (just) makes two of the 3 criteria, and the one that he misses out on he is very close to, and is the least important.
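The arithmetic behind the three criteria, and the comparison with Cook's figures, can be laid out as a short sketch. The variable names are just for illustration; the numbers are the ones quoted above.

```python
BALLS_PER_OVER = 6

# Team-level marks for the first two wickets.
target_runs, target_overs, target_rpo = 75, 14, 4.7

# Share the runs and balls between two batsmen; convert rpo to runs per 100 balls.
min_average = target_runs / 2
min_balls = target_overs * BALLS_PER_OVER / 2
min_strike_rate = target_rpo / BALLS_PER_OVER * 100

cook = {"average": 37.51, "balls": 48, "strike_rate": 77.66}

checks = {
    "average": cook["average"] >= min_average,
    "balls": cook["balls"] >= min_balls,
    "strike_rate": cook["strike_rate"] >= min_strike_rate,
}
print(f"Thresholds: {min_average}, {min_balls:.0f} balls, {min_strike_rate:.1f}")
print(checks)
```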

England have not been going well in ODI cricket recently, but Cook's batting is not the right thing to blame. The issues are clearly elsewhere.

Thursday, 12 June 2014

Some stats after the first test in Jamaica

BJ Watling

BJ Watling has taken 5 dismissals again. He's now joined Parore and McCullum as the only New Zealanders to have taken 5 dismissals in an innings 4 times. Ian Smith only did it 3 times.

He also leads the way in terms of 8 dismissals in a match. He's done it 3 times now, and there have only been 3 other times a Kiwi has done it, once each for Smith, Lees and McCullum. He's 5th overall for that, behind Boucher, Gilchrist, Healy and Marsh. (But they all had much longer careers.)

He's taken 2.296 dismissals per innings. Nobody who has kept for more than 3 matches has managed that.

He also leads the way for NZ with the bat, averaging 47.25 when he is keeping. The next best is McCullum at 34.18, followed by Blain at 32.30, Parore at 26.94 and Smith at 25.56.

Globally he's 4th of all time, behind AB de Villiers (56.96), Andy Flower (53.70) and Adam Gilchrist (only 0.35 ahead on 47.60). The guy that has traditionally been considered the best ever is Ames, in 5th. He averaged 43.40.

NZ under McCullum

New Zealand have won 4 and lost 4 under McCullum. There have not been many New Zealand captains who had a winning record. Only Fleming, Coney and Howarth have winning records, and Fleming and Coney only by one match.

Under McCullum, they have averaged a collective 33.70, which is only slightly behind the 33.99 that NZ averaged under Wright, but they were ahead of it before they came out swinging to try and get quick runs in the second innings.

McCullum has led NZ to a score of 400 in 9 out of his 15 matches. Only Steve Waugh has a better percentage of 400s.

To put that in context, New Zealand averaged 26 under Taylor with roughly the same players, despite having played in South Africa and England under McCullum.

Boult - Southee combination

In matches where they have played together, Boult and Southee have a combined average of 24.08. This puts them close to the all-time great mark (McGrath - Gillespie averaged 23.02 and McGrath - Lee averaged 25.32). They are clear of New Zealand's other very good combinations - Bond & Martin averaged a collective 25.01 and Chatfield & Hadlee averaged 25.39.

Peter Fulton

Fulton has only scored 306 runs in his last 10 tests, at an average of 17. However he has still averaged about 30 since he came back, which is still quite high by NZ standards. Even with these games, and the ones where he came in earlier, he's still New Zealand's 11th highest averaging opener ever, one place ahead of Guptill.

His twin hundreds against England were not flukes. He was in very good form at the time. In the 21 innings leading up to them he averaged 52.7 in first class cricket. But in the 31 innings since then he has averaged only 18.7. I think it is possible for the selectors to drop Fulton while still keeping faith in him. They need to say "you are not in great form, but we know that you are a capable player. Go away and get some runs under your belt and we'll pick you straight away."

Wednesday, 30 April 2014

Bowling in the IPL

Last year I had a look at how much wickets cost in the IPL, and devised a formula to calculate the value of a bowler in a team. I used that formula in a number of other cases throughout last year, and it seemed to bring some fairly sensible results each time, so I've decided to try it again with this year's IPL as the first stage draws to a close.

Here are the top 15 bowlers, with their modified run rates. This takes into account the benefit that they have provided to the other bowlers in the team through the wickets that they have taken. I limited it to bowlers who had bowled at least 10 overs.

| Name | Team | Overs | Wickets | Modified Run Rate |
|---|---|---|---|---|
| Sandeep Sharma | Punjab | 11 | 7 | 2.45 |
| VR Aaron | Bangalore | 14.5 | 8 | 2.97 |
| SP Narine | Kolkata | 20 | 9 | 3.10 |
| YS Chahal | Bangalore | 19 | 7 | 3.47 |
| SL Malinga | Mumbai | 15.3 | 7 | 3.61 |
| AR Patel | Punjab | 18 | 6 | 4.06 |
| MM Sharma | Chennai | 15.5 | 8 | 4.17 |
| KW Richardson | Rajasthan | 15 | 6 | 4.40 |
| PV Tambe | Rajasthan | 20 | 7 | 4.60 |
| R Dhawan | Punjab | 13.2 | 4 | 4.65 |
| R Ashwin | Chennai | 17.5 | 5 | 4.65 |
| B Kumar | Hyderabad | 15.3 | 6 | 4.71 |
| MA Starc | Bangalore | 20 | 7 | 4.80 |
| IC Pandey | Chennai | 15 | 3 | 4.87 |
| R Bhatia | Rajasthan | 16 | 6 | 4.94 |

Somewhat unsurprisingly the top name in the list is the current rising star of the IPL - Sandeep Sharma.

His heady medium pace bowling has been a big part of the success that Kings XI have enjoyed. Often medium pacers can enjoy good results in limited overs cricket through consistency, but Sharma offers something more than that.

Right arm inswinger is a style of bowling that is normally only seen at the junior grades. Top senior batsmen normally develop a technique that allows them to avoid being dismissed by it, and then they can just wait for the inevitable bad delivery that slips into the pads and can be dispatched.

Sharma has managed to bowl consistently enough that he has only been hit for one leg-side boundary in 11 overs. If players are having to look on the off side for their runs from an inswing bowler then they are at risk of leaving the gate open.

If Sharma manages to continue to bowl as consistently, he could potentially be a real force, not just for Kings XI, but also for India. He could be particularly useful in the World Cup in Australia/New Zealand, where the ability to move the ball in the air is a real asset.

Saturday, 15 March 2014

Is it game over if you lose more than 2 wickets in the powerplay?

I recently observed this conversation on twitter:


It immediately made me wonder if Aakash was correct. Do you lose if you are more than 2 wickets down in the power play of a T20 International?

I decided to find out. I felt that it was probably best to only look at situations where a team had batted first, as there is not any external scoreboard pressure (or lack thereof) interfering with the batsmen's mindsets.

I looked at every match where there was a result inside 20 overs (I ignored matches that had ended in a super-over or bowl-off) and looked at how many wickets down the team were after 6 overs. I didn't count "retired hurt" as a wicket, despite there being a change of batsmen and the batting team losing momentum similar to when a wicket falls.

Once I did that I came up with some quite interesting numbers.

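| Wickets Down | Wins | Losses | Winning % |
|---|---|---|---|
| 0 | 41 | 18 | 69.5% |
| 1 | 74 | 48 | 60.7% |
| 2 | 52 | 51 | 50.5% |
| 3 | 11 | 36 | 23.4% |
| 4 | 5 | 10 | 33.3% |
| 5 | 0 | 3 | 0% |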

It's fairly clear here that losing wickets early hurts the probability of winning. This is not really a surprise: teams often bat their best batsmen at the top, and the subsequent batsmen have to take fewer risks if there are not many wickets left. However, while there are a lot of instances of teams losing 1 or 2 wickets, the sample size is quite small for the other numbers of wickets. I've graphed it, adding in a 95% confidence interval. This indicates the range we can expect the actual winning probability to lie in for each number of wickets lost: the shorter the line, the more reliable the data.
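For the curious, here is a minimal sketch of how a 95% interval for each winning percentage could be computed from the wins and losses in the table. It uses the Wilson score interval, which is an assumption on my part for this sketch and may differ from the method behind the graph.

```python
from math import sqrt


def wilson_interval(wins, losses, z=1.96):
    """Approximate 95% Wilson score interval for a winning proportion."""
    n = wins + losses
    p = wins / n
    scale = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / scale
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / scale
    # Clamp to [0, 1] to absorb tiny floating-point overshoots.
    return max(0.0, centre - half), min(1.0, centre + half)


# Wickets down after 6 overs -> (wins, losses), taken from the table above.
results = {0: (41, 18), 1: (74, 48), 2: (52, 51), 3: (11, 36), 4: (5, 10), 5: (0, 3)}

intervals = {wkts: wilson_interval(w, l) for wkts, (w, l) in results.items()}
for wkts, (lo, hi) in intervals.items():
    print(f"{wkts} down: {lo:.1%} to {hi:.1%}")
```

The rows with only a handful of matches produce much wider intervals, which is the "longer line" effect on the graph.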



We can clearly see the trend here. But we also notice the huge gap between being 2 down and being 3 down. There does seem to be a difference between losing 2 wickets or losing more than 2 wickets.

Accordingly I broke it down into 3 groups: fewer than 2 wickets, exactly 2 wickets, or more than 2 wickets. Here's how that looks:


Roughly, teams win two thirds of the matches where they lose fewer than 2 wickets, half of the matches where they lose two wickets and about a quarter of the matches where they lose more than 2 wickets.
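The three groups can be rebuilt from the wins and losses in the table with a few lines. The group labels and variable names are just for this sketch.

```python
# Wickets down after 6 overs -> (wins, losses), taken from the table above.
results = {0: (41, 18), 1: (74, 48), 2: (52, 51), 3: (11, 36), 4: (5, 10), 5: (0, 3)}


def group_of(wickets):
    if wickets < 2:
        return "fewer than 2"
    if wickets == 2:
        return "exactly 2"
    return "more than 2"


totals = {}
for wickets, (wins, losses) in results.items():
    group = totals.setdefault(group_of(wickets), [0, 0])
    group[0] += wins
    group[1] += losses

win_pct = {name: 100 * w / (w + l) for name, (w, l) in totals.items()}
for name, pct in win_pct.items():
    print(f"{name}: {pct:.1f}%")
```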

I also broke it down further by team, and this holds true for almost every team. The only team that has won more than half of their matches when batting first and losing more than 2 wickets in the power play is Ireland. (Interestingly Ireland has the 4th best winning record of any team batting first, and then they are not far behind Pakistan, Sri Lanka and South Africa).

Sri Lanka win just under 80% of T20s when they lose 2 or fewer wickets in the power play, but 20% when they lose 3 or more wickets. England win just over 60% if they keep their wickets in hand, but only 20% when they lose 3 or more in the power play.

With the World T20 getting underway, how the teams approach the first 6 overs could be a fascinating thing to keep an eye on.

Sunday, 9 March 2014

Who are the most reliable 6 hitters

I noticed that the ICC have set up a new game, where you need to pick a player who is going to hit a 6.

This is an interesting option, as there are not many stats out there for how reliable batsmen are at hitting 6's. We know how many 6's a player has hit, but how regularly they hit them is another issue. For example, Aaron Finch has hit 21 sixes in the 9 matches he has played in the last 2 years. However those 21 sixes came in just 4 innings. In the other 5 matches he didn't hit any. Once he gets going he really starts to pepper the boundary. In comparison, Ziaur Rahman from Bangladesh has hit 10 sixes in the 11 matches he's played in that time. However he's hit those 10 sixes in 6 matches, meaning there are only 5 that he hasn't hit a six in. In other words Finch has hit more sixes per match, but Rahman is significantly more reliable.

As the ICC game is about either hitting a six or not, the most important stat is their reliability, not their sixes per match.
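The difference between the two stats is easy to see in a short sketch. The match-by-match breakdowns below are hypothetical; only the totals (21 sixes in 4 of 9 matches for Finch, 10 sixes in 6 of 11 matches for Rahman) come from the figures above.

```python
def six_stats(sixes_by_match):
    """Return (sixes per match, share of matches with at least one six)."""
    matches = len(sixes_by_match)
    per_match = sum(sixes_by_match) / matches
    reliability = sum(1 for s in sixes_by_match if s > 0) / matches
    return per_match, reliability


# Hypothetical breakdowns that match the quoted totals.
finch = [0, 0, 14, 0, 3, 0, 2, 0, 2]
rahman = [1, 0, 2, 0, 1, 0, 3, 0, 2, 0, 1]

finch_rate, finch_reliability = six_stats(finch)
rahman_rate, rahman_reliability = six_stats(rahman)
print(f"Finch: {finch_rate:.2f} per match, hits a six in {finch_reliability:.1%}")
print(f"Rahman: {rahman_rate:.2f} per match, hits a six in {rahman_reliability:.1%}")
```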

To help out anyone who is playing that game, I've compiled a list of the 6 hitting reliability of players who had played 5 or more matches in the last 2 years. I've listed everyone who has hit a 6 in 40% or more of the matches.

If you want to join my league - here's the link.

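| Player | Matches | Sixes | Innings with a 6 | P(hits a 6) |
|---|---|---|---|---|
| SE Marsh (Aus) | 5 | 5 | 4 | 80% |
| DR Smith (WI) | 9 | 16 | 6 | 66.7% |
| Yuvraj Singh (India) | 11 | 21 | 7 | 63.6% |
| MDKJ Perera (SL) | 11 | 14 | 7 | 63.6% |
| AM Rahane (India) | 5 | 4 | 3 | 60% |
| SR Watson (Aus) | 14 | 30 | 8 | 57.1% |
| RR Patel (Kenya) | 14 | 17 | 8 | 57.1% |
| MJ Guptill (NZ) | 14 | 15 | 8 | 57.1% |
| HD Rutherford (NZ) | 7 | 9 | 4 | 57.1% |
| MN Waller (Zim) | 7 | 5 | 4 | 57.1% |
| Gulbadin Naib (Afg) | 11 | 12 | 6 | 54.5% |
| Ziaur Rahman (Ban) | 11 | 10 | 6 | 54.5% |
| MEK Hussey (Aus) | 11 | 8 | 6 | 54.5% |
| Mushfiqur Rahim (Ban) | 13 | 11 | 7 | 53.8% |
| BB McCullum (NZ) | 17 | 26 | 9 | 52.9% |
| KA Pollard (WI) | 17 | 25 | 9 | 52.9% |
| DJ Bravo (WI) | 19 | 18 | 10 | 52.6% |
| MN Samuels (WI) | 16 | 26 | 8 | 50% |
| MJ Lumb (Eng) | 12 | 13 | 6 | 50% |
| R Gunasekera (Can) | 8 | 4 | 4 | 50% |
| JL Ontong (SA) | 6 | 6 | 3 | 50% |
| MW Machan (Scot) | 6 | 4 | 3 | 50% |
| CH Gayle (WI) | 15 | 26 | 7 | 46.7% |
| DA Warner (Aus) | 15 | 22 | 7 | 46.7% |
| LMP Simmons (WI) | 11 | 12 | 5 | 45.5% |
| MR Swart (Neth) | 11 | 12 | 5 | 45.5% |
| DA Miller (SA) | 11 | 7 | 5 | 45.5% |
| AD Hales (Eng) | 20 | 18 | 9 | 45% |
| AJ Finch (Aus) | 9 | 21 | 4 | 44.4% |
| Ahmed Shehzad (Pak) | 16 | 14 | 7 | 43.8% |
| Mahmudullah (Ban) | 14 | 12 | 6 | 42.9% |
| Mohammad Shahzad (Afg) | 14 | 11 | 6 | 42.9% |
| Asghar Stanikzai (Afg) | 7 | 5 | 3 | 42.9% |
| LJ Wright (Eng) | 19 | 20 | 8 | 42.1% |
| DT Johnston (Ire) | 12 | 9 | 5 | 41.7% |
| Shakib Al Hasan (Ban) | 12 | 7 | 5 | 41.7% |
| Mohammad Hafeez (Pak) | 25 | 20 | 10 | 40% |
| GJ Bailey (Aus) | 20 | 16 | 8 | 40% |
| F du Plessis (SA) | 15 | 11 | 6 | 40% |
| RS Bopara (Eng) | 10 | 10 | 4 | 40% |
| NJ O'Brien (Ire) | 5 | 2 | 2 | 40% |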

Thursday, 6 March 2014

Who should win the NZ cricket awards

I was asked by Tony Veitch to put together some stats for the different awards on offer for the New Zealand Cricket Awards tonight.

I could have just brought up a list of averages, but that's really not the CricketGeek style, so I decided to delve into things a little more closely.

One of the difficult things in cricket statistics is to compare bowling success with batting success. For example, which is better: taking 5/84 or scoring 172? We need a device to compare the two disciplines.

I decided to compare each player's year with the historical averages for their position. For example, for batting I compared the batting average with year-end batting averages throughout history. I had a cut-off of 10 innings, as making the cut-off much higher than that excludes too many players, since most teams play fewer than 10 tests per year. I then compared a player's average to the historical average of averages and the standard deviation of averages to generate a z-score. (For more on z-scores, see this NFL blog post.)

I used batting average and bowling average for test cricket, as really what we care about is scoring runs and taking wickets. I wasn't totally happy with the results, as there was no advantage for the players who had maintained a high standard over a number of games, rather than just one. (James Neesham, for example, averaged 171 this season, but only over one match.) I first filtered out batsmen who had batted in fewer than 10 matches and bowlers who had bowled fewer than 100 overs. Then I multiplied the z-score by the square root of the number of innings that they had applied their skill in, in order to get a fairer list. It only caused a couple of positional changes, but the new lists looked more appropriate.
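As a rough sketch of the ranking calculation: take the z-score of a player's average against the historical averages, then scale it by the square root of the number of innings. Flipping the sign for bowling (where a lower average is better) and using the population standard deviation are assumptions for this sketch, and the historical figures are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def weighted_z(player_avg, historical_avgs, innings, higher_is_better=True):
    """z-score against historical averages, scaled by sqrt(innings)."""
    mu = mean(historical_avgs)
    sigma = pstdev(historical_avgs)
    z = (player_avg - mu) / sigma
    if not higher_is_better:
        # For bowling averages, being below the historical mean is good.
        z = -z
    return z * sqrt(innings)


# Hypothetical historical year-end averages (mean 40, standard deviation 10).
historical = [30, 50]

batting_rank = weighted_z(60, historical, innings=16)
bowling_rank = weighted_z(20, historical, innings=4, higher_is_better=False)
print(batting_rank, bowling_rank)
```

The square-root factor rewards a player who sustains the same z-score over more innings, which is why a one-match average of 171 would not dominate the list.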

Here are the test lists.

| Player - Skill | Average | Ranking |
|---|---|---|
| LRPL Taylor - batting | 81.60 | 12.3 |
| BB McCullum - batting | 52.73 | 5.0 |
| TG Southee - bowling | 20.07 | 3.8 |
| TA Boult - bowling | 22.36 | 3.6 |
| KS Williamson - batting | 47.21 | 3.4 |
| BJ Watling - batting | 42.27 | 2.0 |
| N Wagner - bowling | 30.42 | 1.1 |
| CJ Anderson - bowling | 30.54 | 1.0 |
| CJ Anderson - batting | 32.70 | -0.3 |
| TA Boult - batting | 32.25 | -0.4 |

I would give the award to Ross Taylor. He scored 816 runs at an average of 81.60. He passed 50 in half of his innings. McCullum, Southee, Boult and Williamson all had great years, but Taylor's average really makes his numbers stand out.

Next I looked at the ODI lists.

Here I decided to use the batting and bowling index developed by S Rajesh from Cricinfo (and, separately, by me). Again I compared the player's index to the historical data.

Here's the list:

| Player - Skill | Index | Ranking |
|---|---|---|
| CJ Anderson - batting | 84.48 | 16.1 |
| LRPL Taylor - batting | 43.77 | 6.9 |
| MJ Guptill - batting | 44.22 | 6.4 |
| KS Williamson - batting | 39.04 | 4.7 |
| MJ McClenaghan - bowling | 23.87 | 1.1 |
| NL McCullum - batting | 26.23 | 0.9 |
| JDS Neesham - bowling | 23.69 | 0.8 |
| CJ Anderson - bowling | 24.85 | 0.7 |
| KD Mills - bowling | 25.97 | 0.7 |
| L Ronchi - batting | 22.93 | -0.1 |

Again a batsman takes the title. This, however, was not particularly surprising. Anderson was immense with the bat, and generally the games were played on high-scoring pitches, which don't really flatter bowling statistics.

For the T20 award I used batting index, but my own metric for bowling. In a previous post I showed how each wicket worked out to roughly 5 runs in a t20. Accordingly we can take 5 runs off a bowler's total for every wicket they have taken. They then get a modified run rate. I used this to compare the NZ players' years to the historical data. This is a little less relevant, as there is not a lot of historical data (about 1/10 the quantity of test and ODI information) and also New Zealand only played 6 matches, so the sample size is very small.
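The modified run rate described here can be sketched as follows: take 5 runs off the bowler's runs conceded for every wicket, then divide by overs bowled. The overs conversion (scorecard notation such as 15.3 meaning 15 overs and 3 balls) and the sample spell are assumptions for illustration.

```python
RUNS_PER_WICKET = 5  # approximate value of a wicket in a T20, as described above


def overs_to_decimal(overs):
    """Convert scorecard notation such as '15.3' (15 overs, 3 balls) to 15.5 overs."""
    whole, _, balls = str(overs).partition(".")
    return int(whole) + (int(balls) if balls else 0) / 6


def modified_run_rate(runs, wickets, overs):
    """Runs per over after crediting the bowler 5 runs for each wicket taken."""
    return (runs - RUNS_PER_WICKET * wickets) / overs_to_decimal(overs)


# Hypothetical spell: 30 runs conceded and 2 wickets taken in 4 overs.
rate = modified_run_rate(runs=30, wickets=2, overs="4")
print(f"Modified run rate: {rate:.2f}")
```

A lower modified run rate is better, so a bowler who takes wickets regularly can rank ahead of one who is merely economical.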

Here is the list:

| Player - Skill | Index/Modified run rate | Ranking |
|---|---|---|
| L Ronchi - batting | 221.11 | 14.7 |
| BB McCullum - batting | 101.08 | 4.1 |
| AF Milne - bowling | 2.75 | 2.9 |
| AP Devcich - batting | 73.34 | 1.7 |
| C Munro - batting | 60.04 | 1.5 |
| JDS Neesham - bowling | 5.00 | 0.5 |
| JD Ryder - batting | 44.02 | 0.0 |
| NL McCullum - batting | 42.25 | -0.1 |
| NL McCullum - bowling | 5.64 | -0.3 |
| HD Rutherford - batting | 40.02 | -0.3 |

Luke Ronchi is a bit of a surprise here, but I remember looking up his stats and being surprised as to how effective he has been in t20s recently. During the course of the year he averaged 133 at a strike rate of 166. Those are quite ridiculous numbers.

The last major prize left is the Sir Richard Hadlee Medal, for the best overall. For me that goes to Brendon McCullum. He managed to attract the attention of the whole nation with his 300, and he also captained the side particularly well across all the formats. There would be a fair argument for Taylor and Anderson, but for me, McCullum needs to be acknowledged somehow, and that award seems appropriate.

Who would you give the overall award to?

Sunday, 9 February 2014

Mini-session Analysis 1st Test NZvInd, Eden Park 2013/14

Here is the final mini-session analysis for the first test between New Zealand and India at Eden Park, Auckland.

A mini-session is (normally) half a session, either between the start of the session and the drinks break or the drinks break and the end of the session. Occasionally a long session will have 3 mini-sessions where it will be broken up with 2 drinks breaks.