Saturday, 1 February 2020

Solving the Super Over Situation

New Zealand have an issue with super overs. We play them much more than anyone else, and we're terrible at them.

We have played in 8 super overs in the past 12 years. We have lost 7 of them. There have only been 15 super overs in the history of international cricket. We play them ridiculously often, and we lose them ridiculously often.

Losing 7 out of 8 stops being bad luck; it starts being something that needs to be dealt with.

Here's my solution: We play a single day domestic super over tournament on Waitangi Day every year.

We can either let each association have a turn to host it, or pick one venue (possibly Whangarei for the proximity to Waitangi) to host it every year.

The day would work with every team playing a super over against every other team (15 super overs), then semi-finals and a final.

It would take about 7 hours (shorter if there were quick handovers between matches) - roughly the same as an ODI match, and could have a rugby sevens type festival feeling to it.
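As a rough check on those numbers, the fixture list can be counted directly. This is only a sketch: it assumes six teams (the number implied by 15 round-robin super overs), the team labels are placeholders, and the 7 hours is the estimate given above.

```python
from itertools import combinations

# Placeholder labels: six teams is the number implied by 15 round-robin super overs.
teams = ["A", "B", "C", "D", "E", "F"]

# Every team plays every other team exactly once.
round_robin = list(combinations(teams, 2))

# Two semi-finals and a final on top of the round robin.
knockouts = 2 + 1
total_matches = len(round_robin) + knockouts

# The 7-hour estimate from the post, spread evenly across all matches.
day_minutes = 7 * 60
minutes_per_match = day_minutes / total_matches

print(len(round_robin), total_matches, round(minutes_per_match, 1))
```

On those assumptions each super over match, including the changeover, gets a little over 23 minutes.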

I can already hear the critics talking about shortening the game, and "what's next, one-ball matches?" but this is an issue that needs to be addressed.

In most of those matches we should have won in regular time. We didn't generally get to a super over because we did well, and fought back. We almost invariably got into a super over because we were in a position to win the match, and did not manage to seal it.

Having our players play in those sorts of pressure situations more often would tell us who is capable of handling that pressure. As such, we would want a variety of players involved. There should therefore be a rule that each bowler can only bowl in two round-robin matches, and each batsman can only be one of the three designated batsmen in three round-robin matches. With five round-robin matches per team, that will mean that each team will have to use at least 3 bowlers and 5 batsmen (15 batting spots, with at most 3 per player). For the semi-finals and final they can pick whoever they want.

This seems to be the only option other than just hoping that we get better.

I'd rather do something than nothing.

Over to you, New Zealand Cricket.

Friday, 3 January 2020

Some Questions ahead of the 3rd test

Questions leading into the third test

1. Who will actually be fit to play for New Zealand?

There is talk that Kane Williamson, Henry Nicholls and Mitchell Santner were all too sick to get out of bed yesterday, and all are unlikely to play. Trent Boult and Lockie Ferguson have already gone home. Glenn Phillips has been called in as a late replacement, meaning that there is a chance that New Zealand will end up playing four wicket keepers, and recalling Jeet Raval to the squad simply due to lack of other options. If those 3 are all out of contention, then New Zealand’s top 7 is likely to include Raval, Tom Latham, Tom Blundell, Ross Taylor, Phillips, BJ Watling and Colin de Grandhomme.

2. Will either side opt for two spinners, and if so, who will make way?

The Sydney Cricket Ground has a reputation as a spinner’s track, and both teams have added an extra spinner to their squads. If Australia opt for Mitchell Swepson, then they are likely to end up either dropping a batsman, or going in with only two pace bowlers and giving the 3rd seamer role to Matthew Wade. Wade’s over against New Zealand in Melbourne was considerably less threatening than his spell in Perth, suggesting that he is less effective with the red ball than the pink one. This suggests that going with two spinners is a highly risky move for them.

Another option could be to select Michael Neser as an all-rounder to replace Wade in the side, allowing more cover for the extra spinner, but lengthening the tail considerably. This is unlikely to happen, as Australia have traditionally shied away from picking five bowlers in test sides, and Tim Paine has made it clear that he does not favour changing the formula too much.

New Zealand have taken Todd Astle on a holiday so far, not playing him in any tests on this tour or in the matches in New Zealand. Will Somerville has been added to the squad, and his familiarity with the conditions and the point of difference that his height provides are likely to make him a tempting option. Somerville is a former Sydney resident, and played for New South Wales for a few years before returning to New Zealand to try to play international cricket. He is close to 2 metres tall, and so creates different challenges for batsmen by being able to extract similar bounce to a bowler bowling with loop even while bowling on a flatter trajectory.

Astle provides the advantage of being a competent batsman, so bringing him into the side in place of de Grandhomme is a possibility. That would allow a 3rd genuine seam bowler into the side. Another option is for Astle and Somerville to come in alongside Tim Southee and Neil Wagner, with de Grandhomme acting as the 3rd seamer.

3. Will the pitch actually turn, or is the spinner’s SCG a bit of a myth?

Over the past 10 years, spin bowlers have bowled over 1000 overs at the SCG, but have only picked up 82 wickets at an average of over 50. Pace bowlers have taken 192 wickets at an average just under 35 there. Nathan Lyon has averaged 47 at the ground in that time, and collectively the leg spinners used there have averaged roughly 70. The days of Stuart MacGill ripping teams apart at the SCG seem to be long gone.

However, when looking at the way that the series has progressed, Australia might consider favouring spin more. New Zealand’s two standing quick bowlers have not been much less effective than their Australian counterparts. Southee and Wagner have taken 26 wickets at less than 23 runs each, while Pat Cummins and Mitchell Starc have taken 19 wickets at just under 18 each. However, Lyon has been much more effective than Santner (10 wickets at 22.7 vs 1 wicket at 250). Giving Lyon slightly more to work with might exaggerate that difference even more.

4. Will New Zealand keep trying to outlast Australia with the ball?

New Zealand have had a clear bowling plan in this series. With the new ball: pitch it up, and try to get it to swing occasionally, but mostly bowl a 4th stump line, on a good length. With the older ball, bang it in short of a length. Both tactics have been mostly designed to get the batsmen to play risky shots and get out doing so, rather than trying to actively dismiss the batsmen.

While that tactic has been reasonably successful for Southee and Wagner, it has meant that there has been a lot asked of the other bowlers, and they have not been as successful. Perhaps bowling 1m fuller, and more at the stumps, would work better. Cricviz released some interesting data recently showing that of all batsmen who have faced 500 balls aimed at the stumps since 2006, only Steven Smith averages over 33 against those deliveries, and of players who are still active test batsmen, Virat Kohli has the third best average against balls targeting the stumps, at 24.08. That suggests that bowling straighter might be a better tactic. The odd delivery will be hit through the leg side or down the ground, but the approach may well bear more fruit.

The difference in length and line from the Australian bowlers has been clear. They have tended to bowl at the stumps more. Some of that is due to the different styles, but some of it is just that they had different plans, and those plans (especially when a batsman was new to the crease) have been much more effective.

5. Should the match even be going ahead?

Cricket is the job of the players, and of the administrators, but it is still at its heart a game. Is there a point where playing games in the midst of an ongoing natural disaster becomes a little insensitive? Should this match even be going ahead?

The smoke from the New South Wales bushfires has been so thick that the views of mountains in Southern New Zealand (over 2000 km away) have been blocked and some of the New Zealand glaciers have turned brown. In terms of distance, that would be like smoke from a fire in Dubai blocking out the view of the buildings at one end of Marine Drive in Mumbai from the other.

The question has to be asked: at what point does player welfare come to the fore? The atmosphere in Sydney is so polluted from the fires that one lung professor likened breathing it to smoking 40 cigarettes. The PM2.5 reading in some outer suburbs of Sydney was 734. To put that in context, the match in Delhi that was called off between India and Sri Lanka had a PM2.5 reading of under 400.

Sport can be important for the morale of people who are experiencing a traumatic event, but there is such a thing as being too soon, and while the bodies of the dead from the fires are still not yet buried it may be too soon to be playing games. Even if the timing is acceptable to the public, is the safety risk to the players too extreme for something so trivial?

Tuesday, 6 August 2019

Second only to Bradman?

Steven Smith has just celebrated his test comeback by scoring a century in each innings at Edgbaston in Birmingham. Not content with just scoring a "come-from-behind fighting century" when the bowlers were on top, he also added a "rub the salt in" century when the batsmen were on top.

It was such a match defining performance that the questions have been asked again, is he the best since Bradman?

I won't attempt to do a complete statistical breakdown right here now, but I will focus on a couple of statistics that suggest either "yes" or "not quite."

One thing that I've started to be more and more interested in is the performance of a batsman at their peak. It is hard to deny that a batsman's skill level changes throughout their careers. Some start off as amazing players, but then fade, others start slowly, then blossom into better players. Most start off slowly, have a strong middle period of their career then fade again at the end.

The graph below illustrates three players that had quite different career trajectories, but were all very good players.

Denis Compton started off with an amazing run of scores; only Don Bradman averaged more in his first 30 test matches. His career never really reached those heights again, however, and he had a period where he really struggled, before modifying his game and ending his career on a (less dramatic) high.

Martin Crowe was picked as a teenager, and sent on a difficult tour, before he was really ready. He struggled and was in and out of the side at first. It took him a while to really own his position. After a while, he developed into one of the best batsmen in the world. Later on he struggled with injuries and his career petered out to a shadow of what he had previously been.

Marvan Atapattu scored only one run in his first 6 innings. That start was not an easy one to recover from. Throughout his career he tended to have a mixture of exceptionally large scores and regular ducks, which meant that it looked like he had patchy form. But for the majority of his career he tended to average above 40 in any given 30 match sequence after his horrific early period.

The story is clear, however, that an overall career average does not necessarily tell us about how good a player actually was. Looking at a player's peak is actually a better idea than looking at their overall career. That's especially true when comparing former players with current ones, or comparing players who retired at their peak with ones who continued on because even though they were no longer at their best, they were still better than the alternatives.

To compare players at their peak requires finding a way to define their peak. It's difficult to know how many matches to choose as a player's peak. It will certainly differ from player to player. Some will maintain their peak form for a number of years, while others may get injured, banned for ball tampering or retire just as they are starting to hit it. Added to that, the number of tests played has greatly increased for most nations, so while an old player like Jack Cowie never missed a test for 12 years and yet never made it to 30, someone playing for England now could potentially reach 30 tests after only playing test cricket for 20 months.

There's also the issue of sampling variability in small samples. If we look at 30 tests as defining a player's peak, that makes a maximum sample size of 60 innings (more likely to be closer to 55). 50 tests would make a maximum sample size of 100 innings (more likely to be close to 90).

If we simulate innings based on a player with a batting average of 45, we can find the range of likely 30 match and 50 match averages if the results are distributed randomly. For this, I've used a geometric distribution to create random scores, and then found the average of them. This has been shown to be a reasonably useful way of simulating cricket scores, so it will give some indication of the expected variance in the averages.

The red and green lines here are the 95% bands for the simulated data. With the 30 match averages, the player who should have averaged 45 tended to average somewhere between 33 and 58. With 50 matches, the player tended to average between 36 and 54.
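A rough sketch of this kind of simulation is below. It is not the exact code behind the graph, and it makes some simplifying assumptions: every innings ends in a dismissal (no not outs), 30 matches means 55 innings and 50 matches means 90 innings (the "more likely" figures mentioned earlier), and the number of trials and the seed are arbitrary choices.

```python
import numpy as np

def simulate_average_band(true_average, innings, trials=20000, seed=1):
    """Return the 2.5th and 97.5th percentiles of simulated batting averages."""
    rng = np.random.default_rng(seed)
    # numpy's geometric distribution starts at 1, so subtract 1 to allow ducks.
    # With p = 1 / (true_average + 1), the mean simulated score is true_average.
    p = 1.0 / (true_average + 1.0)
    scores = rng.geometric(p, size=(trials, innings)) - 1
    # Assumption: every innings is a dismissal, so the average is the plain mean.
    averages = scores.mean(axis=1)
    low, high = np.percentile(averages, [2.5, 97.5])
    return low, high

band_30 = simulate_average_band(45, innings=55)   # roughly 30 tests
band_50 = simulate_average_band(45, innings=90)   # roughly 50 tests
print("30 matches:", band_30)
print("50 matches:", band_50)
```

The bands from a sketch like this should land close to the ranges quoted above, although the exact values will shift a little with the seed and the number of innings assumed.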

This needs to be remembered whenever comparing averages. A peak can be a player's skill improving, or it can be just random variation. Someone who averages 52 is not necessarily a better player than another who averages 49. It is just not possible to be confident statistically that there's a difference between these two players' abilities. That's just based on sampling variability, and not accounting for non-sampling factors such as the opposition that they faced or the conditions that they played in.

Given that, is there any point in comparing at all? Well, it's not going to definitively say who was the best, but it can tell us who played the best.

For this analysis I am only including matches for players where they actually batted. As a result Don Bradman only has 50 tests, as there are two where he got injured fielding/bowling and did not end up batting. I am also not including the WSC Supertests or any matches played for the ICC World XI.

The top 21 instances of the best 30 matches by either average or total runs are the 21 combinations of 30 in a row out of Bradman's 50 matches.

He is so far ahead of the rest of the players in history that in his worst ever 30 matches he still scored 14% more runs than the best 30 matches by any other player.
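The "best 30 in a row" idea can be sketched as a sliding window over a player's matches. The function below is an illustration only: the input format (runs and dismissals per match) and the function name are assumptions, and the scores in the example are invented rather than taken from any real career.

```python
def best_window(matches, window):
    """Find the best run of consecutive matches by batting average.

    matches: list of (runs, dismissals) per match, in chronological order.
    Returns ((best_average, start_index), number_of_windows).
    """
    n_windows = len(matches) - window + 1
    if n_windows < 1:
        raise ValueError("career is shorter than the window")
    best = None
    for start in range(n_windows):
        chunk = matches[start:start + window]
        runs = sum(r for r, _ in chunk)
        outs = sum(d for _, d in chunk)
        if outs == 0:
            continue  # average is undefined with no dismissals
        average = runs / outs
        if best is None or average > best[0]:
            best = (average, start)
    return best, n_windows

# Invented scores: four matches, one dismissal in each.
example = [(10, 1), (50, 1), (90, 1), (20, 1)]
print(best_window(example, window=2))

# A 50-match career has 50 - 30 + 1 = 21 possible runs of 30 matches.
fifty_matches = [(60, 2)] * 50
print(best_window(fifty_matches, window=30)[1])
```

The second call shows where the 21 comes from: a 50-match career contains 21 different sequences of 30 consecutive matches.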

Here are the tables of the top 10.


The top name is consistent, but the other names in the tables are much less consistent. 18 players appear at least once, with Bradman, Ponting, Sangakkara and Smith being in all 4 tables, while Sobers, Kallis and Yousuf are each in the lists 3 times.

This does not tell us definitively who is second. There is enough sampling variation alone that there's not enough evidence to say that Waugh was better in his best 30 innings than Hayden was, just that he performed better. But that's really all we can hope for.

Steven Smith may not be the best since Bradman, but he may well be.

Sunday, 14 July 2019

Statistical preview, World Cup final, New Zealand vs England

Here is a brief statistical preview.

Recent head to head:

In the past 5 years England lead 8-5.
In the past 2 years England lead 6-3.

At Lord's the ball tends to bounce a bit more. As a result it tends to not suit England as much as their other home grounds. It is the only ground that England have a losing record at over the past few years, with 3 wins and 4 losses in the last 6 years.

It is also a ground where scores have been defended quite regularly.

The slope, large straight boundaries and the bounce combine to make a more bowler friendly ground than most in England, but grounds in this world cup have not exactly gone to type.

Adding in times where New Zealand bat first, and where England bowl first, gives the following result:

New Zealand had a clear plan to use the pressure of the situation as a weapon to help them defeat India, and the pressure from playing at home may do the same against England.

The model that I used to build my simulation has England at 69.8%, while New Zealand are at 30.2%. That feels about right too: New Zealand have a realistic chance, but England are certainly favourites.

The bookies have England at 73%, CricViz have England at 68%, New Zealand at 30% and a tie at a fairly high 2%.

The two teams are close enough that nobody can say exactly who will win, but it is a World Cup final - that's exactly how it should be.

A better World Cup format

As this world cup draws to a close, I've been thinking about the positives and negatives of the format.

There are quite a few of both.

Firstly the positives:

  1. Everyone plays each other.
  2. Not too many matches that seem like a mismatch on paper.
  3. Teams that lose a couple of games early still have the chance to compete.
  4. Guaranteed 9 matches for India, so the ICC get enough money to keep growing international cricket.
  5. There was a match or two every day through the majority of the tournament, so that the momentum built towards the finals.

Then the negatives:

  1. Not enough representation from lower level teams. The qualification was too difficult, and so the goal of making the world cup became unrealistic for most teams.
  2. There were only 3 matches in the final week, meaning that the momentum was lost.
  3. Dead rubbers, or similar - 3 teams were effectively eliminated with 2 weeks to go.
  4. Incomplete rounds - India being 2 matches behind made the narratives and changes in fortune less obvious. 
  5. Pitches were too different from how they've played over the past 4 years, meaning that luck played too big a role in the event.

In my opinion, the negatives are too great for it to be a good idea to continue with the same format. But the positives are things worth keeping.

So, using those positives as constraints as much as possible, and also keeping the tournament to the same length, I have come up with a format that I believe will make for a better event.

Friday, 5 July 2019

World Cup Simulation update - 5 July

Here's the latest update for the world cup simulation. I have New Zealand at 100%, but that's simply due to the probability of Pakistan getting the required run-rate being so low that that possibility never eventuated in the 50000 trials that I used. The probability of Pakistan going through is slightly lower than the probability of someone being shot accidentally by a dog running along a beach while holding a handgun in its mouth during the next week.

The next graph is the expected points. The simulation has had the correct top 4 from the second match on; however, the expected points and the order of the teams have changed considerably.
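The 100% for New Zealand is a limit of the simulation rather than a certainty. As a rough sketch of how much a zero count actually rules out, the calculation below finds the largest probability that would still give no hits in 50000 trials 5% of the time. It assumes the trials are independent, and the 95% level is an arbitrary choice.

```python
trials = 50_000
hits = 0  # Pakistan never qualified in any of the simulated tournaments

# With zero hits, the one-sided 95% upper bound on the true probability p
# solves (1 - p) ** trials = 0.05.
upper_95 = 1 - 0.05 ** (1 / trials)

# The "rule of three" shortcut gives almost the same answer: 3 / trials.
rule_of_three = 3 / trials

print(f"exact upper bound: {upper_95:.7f}")
print(f"rule of three:     {rule_of_three:.7f}")
```

Both come out at roughly 6 in 100,000, so all that the simulation alone can say is that Pakistan's chance is probably smaller than that.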

The top 4 was looking fairly likely from about match number 6 on. There was some excitement from the two upset losses by England, but Pakistan never got beyond 40% on the simulation.

The complete make up of the semi-finalists has not yet been decided, nor has the team in 5th place. Pakistan, Bangladesh and Sri Lanka could all end up 5th. 

Next I looked at the winning probability. This is getting close to the point where it can be calculated analytically without much trouble.

The next thing to look at is the rankings. A thing to remember here is that it is all relative to Afghanistan, so everybody going up is more an indication that Afghanistan has gone down.

The order that the teams are in here is the same as David Kendix' official rankings order, with one exception - I have India ahead of England, rather than the other way round.

Finally, a little graph to show what Pakistan needs to do to make the semi-finals. They need to keep Bangladesh below the green line.

Monday, 1 July 2019

World Cup simulation update - 1 July

Here's the latest update to the simulation. The first two graphs disagree slightly, and that's because I have two different methods to calculate the expected net run rate. The first one seemed to be slightly more accurate than the second, but there was not a big difference when I tested them. (The margin of victory in cricket matches is actually really difficult to estimate - teams batting second tend to cruise to victory rather than try to win by as big a margin as possible.) I decided to use both when doing the calculations. With the first method, New Zealand and India both have a higher than 99.98% probability of going through, while it's 99% for India and 97.7% for New Zealand with the second method. The second set of numbers seems more realistic.
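To show why the margin matters so much, here is a small sketch of the standard net run rate calculation (runs scored per over faced, minus runs conceded per over bowled). It is not either of the two estimation methods mentioned above, and the match figures are invented for illustration.

```python
def net_run_rate(batting, bowling):
    """batting / bowling: lists of (runs, overs, balls) for each match.

    Note: a side that is bowled out is charged its full quota of overs,
    so the caller should pass the full quota in that case.
    """
    def rate(entries):
        runs = sum(r for r, _, _ in entries)
        overs = sum(o + b / 6 for _, o, b in entries)
        return runs / overs
    return rate(batting) - rate(bowling)

# Invented example: the opposition make 200 in their 50 overs.
conceded = [(200, 50, 0)]

# The same win, chased down quickly or at a cruise.
quick_chase = net_run_rate([(201, 40, 0)], conceded)
slow_chase = net_run_rate([(201, 48, 0)], conceded)

print(round(quick_chase, 4), round(slow_chase, 4))
```

The result is the same two points either way, but the net run rate from the match differs by more than 0.8 depending on how hard the chasing side pushes.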

The big thing to notice is the change to England's probability, and how England beating India damaged the chances of both Pakistan and Bangladesh. Pakistan's probability went down by slightly more than Bangladesh's probability because the ranking of India dropped slightly, and Bangladesh need to beat India to get through.

This graph shows expected value - not the most likely value. Those are actually different things. The expected value is the mean of all the expected outcomes. As a result, none of the teams will actually end up with the points that this shows, but they should mostly get close to it.
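A small worked example of the difference is below. The current points and win probabilities are invented, and it assumes 2 points for a win with no ties or washouts.

```python
from itertools import product

current_points = 8            # invented starting position
win_probs = [0.7, 0.5, 0.4]   # invented win probabilities for 3 remaining matches

# Build the full distribution of final points by listing every win/loss combination.
distribution = {}
for results in product([0, 1], repeat=len(win_probs)):
    prob = 1.0
    for won, p in zip(results, win_probs):
        prob *= p if won else (1 - p)
    points = current_points + 2 * sum(results)
    distribution[points] = distribution.get(points, 0.0) + prob

# Expected value: probability-weighted mean of the possible totals.
expected = sum(pts * pr for pts, pr in distribution.items())
# Most likely value: the single total with the highest probability.
most_likely = max(distribution, key=distribution.get)

print(round(expected, 2), most_likely)
```

Here the expected value is 11.2 points, which the team cannot actually finish on, while the most likely single outcome is 12 points.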

It's now looking like there's a roughly 45% chance that net run rate will be a deciding factor in who goes through to the semi-finals.

If Bangladesh beat India (which is admittedly a fairly unlikely outcome), we could then see a situation where Pakistan and Bangladesh are playing for the opportunity to be level on points with New Zealand and India on 11 points. If that is the case, then (in all likelihood) the rained out match between New Zealand and India will have allowed both to progress at the expense of the winner of Pakistan vs Bangladesh.

The most likely semi-finals at this point are Australia vs New Zealand and England vs India, but these are by no means confirmed yet.

In individual matches, England effectively has a higher ranking than that, because teams playing at home get a ranking boost of 0.86 over their opponent. That's why I have England back on top in the next graph:

This one is quite different to what the book-makers have. I have England as favourites, while they have India and Australia both tied for favourite on roughly 30%. They also have Pakistan and Bangladesh at about double the probability that I do.

I used the first net run rate model for the winning probability, but the difference in numbers suggests that the bookies are possibly using a model that is more similar to the second one.