Sunday, 14 July 2019

Statistical preview, World Cup final, New Zealand vs England

Here is a brief statistical preview.

Recent head to head:

In the past 5 years England lead 8-5.
In the past 2 years England lead 6-3.

At Lord's the ball tends to bounce a bit more, so the ground tends not to suit England as much as their other home grounds. It is the only ground where England have had a losing record over the past few years, with 3 wins and 4 losses in the last 6 years.

It is also a ground where scores have been defended quite regularly.




The slope, large straight boundaries and the bounce combine to make a more bowler friendly ground than most in England, but grounds in this world cup have not exactly gone to type.

Adding in times where New Zealand bat first, and where England bowl first, gives the following result:

New Zealand had a clear plan to use the pressure of the situation as a weapon to help them defeat India, and the pressure from playing at home may do the same against England.

The model that I used to build my simulation has England at 69.8%, while New Zealand are at 30.2%. That feels about right too: New Zealand have a realistic chance, but England are certainly favourites.

The bookies have England at 73%. CricViz have England at 68%, New Zealand at 30% and a tie at a fairly high 2%.

The two teams are close enough that nobody can say exactly who will win, but it is a World Cup final - that's exactly how it should be.

A better World Cup format

As this world cup draws to a close, I've been thinking about the positives and negatives of the format.

There are quite a few of both.

Firstly the positives:

  1. Everyone plays each other.
  2. Not too many matches that seem like a mismatch on paper.
  3. Teams that lose a couple of games early still have the chance to compete.
  4. Guaranteed 9 matches for India, so the ICC get enough money to keep growing international cricket.
  5. There was a match or two every day through the majority of the tournament, so that the momentum built towards the finals.

Then the negatives:

  1. Not enough representation from lower level teams. The qualification was too difficult, and so the goal of making the world cup became unrealistic for most teams.
  2. There were only 3 matches in the final week, meaning that the momentum was lost.
  3. Dead rubbers, or similar - 3 teams were effectively eliminated with 2 weeks to go.
  4. Incomplete rounds - India being 2 matches behind made the narratives and changes in fortune less obvious. 
  5. Pitches were too different from how they've played over the past 4 years, meaning that luck played too large a role in the event.

In my opinion, the negatives are too great for it to be a good idea to continue with the same format. But the positives are things worth keeping.

So, using those positives as constraints as much as possible, and also keeping the tournament to the same length, I have come up with a format that I believe will make for a better event.

Friday, 5 July 2019

World Cup Simulation update - 5 July

Here's the latest update for the world cup simulation. I have New Zealand at 100%, but that's simply because the probability of Pakistan getting the required run-rate is so low that it never happened in the 50000 trials that I used. The probability of Pakistan going through is slightly lower than the probability of someone being shot accidentally by a dog running along a beach while holding a handgun in its mouth during the next week.
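To see why a rare outcome can be missing from every trial, here is a minimal sketch (not the code behind the simulation). The candidate probabilities are made up for illustration, and the function names are hypothetical. It shows how likely it is that an event with a given per-trial probability appears in none of 50000 independent trials, along with the usual "rule of three" upper bound for an event that was never observed.

```python
def prob_never_observed(p: float, trials: int) -> float:
    """Chance that an event with per-trial probability p appears in none of the trials."""
    return (1.0 - p) ** trials


def rule_of_three_upper_bound(trials: int) -> float:
    """Approximate 95% upper bound on p when the event was seen 0 times in `trials` trials."""
    return 3.0 / trials


TRIALS = 50_000  # number of trials quoted in the post

# Hypothetical true probabilities, for illustration only.
for p in (1e-4, 1e-5, 1e-6):
    missed = prob_never_observed(p, TRIALS)
    print(f"p = {p:.0e}: absent from all {TRIALS} trials with probability {missed:.3f}")

print(f"95% upper bound after zero hits: {rule_of_three_upper_bound(TRIALS):.6f}")
```

So a 100% in the output really means "too rare to show up at this number of trials", not "impossible".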
The next graph is the expected points. The simulation has had the correct top 4 from the second match on; however, the expected points and the order of the teams have changed considerably.

The top 4 was looking fairly likely from about match number 6 on. There was some excitement from the two upset losses by England, but Pakistan never got beyond 40% on the simulation.

The complete make up of the semi-finalists has not yet been decided, nor has the team in 5th place. Pakistan, Bangladesh and Sri Lanka could all end up 5th. 

Next I looked at the winning probability. This is getting close to the point where it can be calculated analytically without much trouble.
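Once the four semi-finalists and the pairings are fixed, the analytic version is short. The sketch below assumes no ties and independent matches, and uses placeholder teams A to D with invented head-to-head probabilities; none of these numbers come from the simulation. A team's chance of the title is its chance of winning the semi-final, multiplied by its chance of beating whichever team comes through the other semi-final.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def title_probabilities(semi_1: Pair, semi_2: Pair,
                        p_beats: Dict[Pair, float]) -> Dict[str, float]:
    """Title probability for each semi-finalist, ignoring ties.

    p_beats[(a, b)] is the probability that a beats b; the reverse
    pairing is taken as 1 minus that value.
    """
    def beats(a: str, b: str) -> float:
        if (a, b) in p_beats:
            return p_beats[(a, b)]
        return 1.0 - p_beats[(b, a)]

    result: Dict[str, float] = {}
    for (a, b), (c, d) in ((semi_1, semi_2), (semi_2, semi_1)):
        for team, opponent in ((a, b), (b, a)):
            win_semi = beats(team, opponent)
            # Weight the final by who comes through the other semi-final.
            win_final = beats(c, d) * beats(team, c) + beats(d, c) * beats(team, d)
            result[team] = win_semi * win_final
    return result


# Hypothetical head-to-head probabilities, for illustration only.
p_beats = {
    ("A", "B"): 0.60, ("A", "C"): 0.55, ("A", "D"): 0.65,
    ("B", "C"): 0.45, ("B", "D"): 0.55, ("C", "D"): 0.60,
}
probs = title_probabilities(("A", "D"), ("B", "C"), p_beats)
for team, p in sorted(probs.items()):
    print(f"{team}: {p:.3f}")
```

While the pairings are still undecided, the same calculation would have to be repeated for each possible set of semi-finals and weighted by how likely that set is.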
The next thing to look at is the rankings. One thing to remember here is that they are all relative to Afghanistan, so everybody going up is more an indication that Afghanistan has gone down.

The order that the teams are in here is the same as David Kendix' official rankings order, with one exception - I have India ahead of England, rather than the other way round.

Finally, a little graph to show what Pakistan needs to do to make the semi-finals. They need to keep Bangladesh below the green line.



Monday, 1 July 2019

World Cup simulation update - 1 July

Here's the latest update to the simulation. The first two graphs disagree slightly, and that's because I have two different methods to calculate the expected net run rate. The first one seemed to be slightly more accurate than the second, but there was not a big difference when I tested them. (The margin of victory in cricket matches is actually really difficult to estimate, because teams batting second tend to cruise to victory rather than try to win by as big a margin as possible.) I decided to use both when doing the calculations. With the first method, New Zealand and India both have a higher than 99.98% probability of going through, while it's 99% for India and 97.7% for New Zealand with the second method. The latter figures seem more realistic.
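For reference, net run rate itself is simple to compute once the runs and overs are known; the hard part described above is predicting those margins. The sketch below shows only the standard definition (runs scored per over faced, minus runs conceded per over bowled), not either of the two estimation methods. The helper names and the sample totals are made up for illustration.

```python
def overs_to_float(overs: str) -> float:
    """Convert cricket notation such as '49.3' (49 overs, 3 balls) to 49.5 overs."""
    whole, _, balls = overs.partition(".")
    return int(whole) + (int(balls) if balls else 0) / 6.0


def net_run_rate(runs_for: int, overs_faced: float,
                 runs_against: int, overs_bowled: float) -> float:
    """Runs scored per over faced minus runs conceded per over bowled.

    Note: a side that is bowled out is counted as having faced its full
    quota of overs, so the caller should pass the full quota in that case.
    """
    return runs_for / overs_faced - runs_against / overs_bowled


# Hypothetical tournament totals, for illustration only.
print(overs_to_float("49.3"))
print(net_run_rate(600, 100.0, 550, 100.0))
```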


The big thing to notice is the change to England's probability, and how England beating India damaged the chances of both Pakistan and Bangladesh. Pakistan's probability went down by slightly more than Bangladesh's probability because the ranking of India dropped slightly, and Bangladesh need to beat India to get through.

This graph shows the expected value, not the most likely value. Those are actually different things. The expected value is the probability-weighted mean of all the possible outcomes. As a result, none of the teams will actually end up with exactly the points that this shows, but they should mostly get close to it.
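A small worked example makes the difference concrete. This is a sketch with invented numbers: a hypothetical team on 7 points with three matches left and assumed win probabilities of 0.7, 0.6 and 0.4, ignoring ties and washouts. It builds the full distribution of final points and compares the mean with the single most likely total.

```python
from typing import Dict, List


def points_distribution(current_points: int, win_probs: List[float]) -> Dict[int, float]:
    """Distribution of final points, with 2 points per win and no ties or washouts."""
    dist = {current_points: 1.0}
    for p in win_probs:
        nxt: Dict[int, float] = {}
        for pts, prob in dist.items():
            nxt[pts + 2] = nxt.get(pts + 2, 0.0) + prob * p      # win
            nxt[pts] = nxt.get(pts, 0.0) + prob * (1.0 - p)      # loss
        dist = nxt
    return dist


# Hypothetical team and probabilities, for illustration only.
dist = points_distribution(7, [0.7, 0.6, 0.4])
expected = sum(pts * prob for pts, prob in dist.items())
most_likely = max(dist, key=dist.get)

print(f"expected points:    {expected:.1f}")   # 10.4
print(f"most likely points: {most_likely}")    # 11
```

The expected value of 10.4 is not a total the team can actually finish on, while the most likely single outcome is 11.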

It's now looking like there's a roughly 45% chance that net run rate will be a deciding factor in who goes through to the semi-finals.

If Bangladesh beat India (which is admittedly a fairly unlikely outcome), we could then see a situation where Pakistan and Bangladesh are playing for the opportunity to be level on points with New Zealand and India on 11 points. If that is the case, then (in all likelihood) the rained out match between New Zealand and India will have allowed both to progress at the expense of the winner of Pakistan vs Bangladesh.

The most likely semi-finals at this point are Australia vs New Zealand and England vs India, but these are by no means confirmed yet.

In individual matches, England effectively has a higher ranking than that, because teams playing at home get a ranking boost of 0.86 over their opponent. That's why I have England back on top in the next graph:
This one is quite different to what the book-makers have. I have England as favourites, while they have India and Australia both tied for favourite on roughly 30%. They also have Pakistan and Bangladesh at about double the probability that I do.

I used the first net run rate model for the winning probability, but the difference in numbers suggests that the bookies are possibly using a model that is more similar to the second one.

Wednesday, 26 June 2019

World Cup simulation update - 26 June

Are the wheels falling off?

England have now got a 4 win, 3 loss record, and, with 2 difficult matches coming up, have a genuine chance of not going through to the semi-finals. They are still not relying on other results, but they're getting close to the point where they are.



There's been a significant change, with Australia going up, and England going down. England are now expected to get to 10 points. That might still be enough. But it also might not be.
England's ranking has now dropped well below India's, to the point where the expected probability of England winning against India has dropped by almost 10%. They're still ahead due to home advantage, but the difference is decreasing.
There's about a 15% chance that a tie-breaker (total wins or net run rate) will be required. This may count out Sri Lanka, who have had two rain affected matches, and so will probably be on fewer wins than anyone else with the same number of points.
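To illustrate why washouts hurt in a tie-break, here is a minimal sketch of a table sort, assuming teams level on points are separated first by wins and then by net run rate. The two teams and their records are placeholders, not real standings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Standing:
    team: str
    wins: int
    no_results: int  # washed-out matches, worth 1 point each
    nrr: float

    @property
    def points(self) -> int:
        return 2 * self.wins + self.no_results


def rank(table: List[Standing]) -> List[Standing]:
    """Sort by points, then wins, then net run rate (assumed tie-break order)."""
    return sorted(table, key=lambda s: (s.points, s.wins, s.nrr), reverse=True)


# Hypothetical records: both teams finish on 10 points.
table = [
    Standing("Team B", wins=4, no_results=2, nrr=0.30),
    Standing("Team A", wins=5, no_results=0, nrr=-0.20),
]
for s in rank(table):
    print(s.team, s.points, s.wins, s.nrr)
```

Team A is placed first despite the worse net run rate, because it reached 10 points with more wins.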

We see a huge drop in the semi-final probability of England, and a resultant increase for Bangladesh, Pakistan and Sri Lanka. Australia have qualified now, and there are now fewer ways for New Zealand to be knocked out as well (only 35 out of 50000 trials saw New Zealand miss the semi-finals).
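A count like 35 out of 50000 is itself only an estimate. As a rough sketch, assuming the trials are independent, the usual binomial standard error gives a feel for how much that figure could move if the simulation were rerun. The function name is hypothetical.

```python
import math
from typing import Tuple


def estimate_with_error(hits: int, trials: int) -> Tuple[float, float]:
    """Estimated probability and its binomial standard error."""
    p = hits / trials
    se = math.sqrt(p * (1.0 - p) / trials)
    return p, se


p, se = estimate_with_error(35, 50_000)
print(f"estimate: {p:.4%}, standard error: {se:.4%}")
```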


The decrease in England, and increase in probability of lower ranked teams making the semi-finals has meant that there are a lot more semi-final combinations with more than a 0.5% chance of happening. West Indies vs New Zealand was an epic match in the pool play, and that's now a reasonable possibility for a semi-final. The ICC and Star Sports will be licking their lips at the prospect of the 8th most likely outcome - an India Pakistan semi-final would be absolute ratings gold.
This is the first time that England has dipped below India on the winning probability graph, but it's hard to win the final if you don't get out of the group stage.

Monday, 24 June 2019

World Cup Simulation update 24th June


Here's the update after the South Africa vs Pakistan match.

Firstly, this pushed Pakistan's ranking back above Bangladesh's ranking, although they are both so close that the match between them is now predicted as 50.2% to 49.8%.
Looking at the expected points, Pakistan have now jumped ahead of Sri Lanka and Bangladesh.

It's looking fairly likely that 5th place will be on 9 or 10 points, while 4th will be on 10, 11 or 12 points.

My simulation only uses net run rate as the tie breaker. Accordingly, there's actually a slightly higher probability of Sri Lanka and Pakistan getting through than this shows, and a slightly lower chance of England and Bangladesh.

It takes a lot of processor time to improve the simulation, and it's likely to be less than 1% difference, but I might have a go at improving it once we get to the last 5 matches.


England are still the overwhelming favourite to be the 4th team to go through. There were still 41 out of the 50000 trials where New Zealand hadn't made it. So nobody is guaranteed through just yet.


If you have semi-final tickets - this is who you're likely to see.

The probabilities for Bangladesh and Pakistan being so low here are understandable. They both have about a 5% chance of making the semi-final, but, given that they both have about a 1/3 chance of winning each match against the top teams, it gives them a roughly 0.5% chance of winning the tournament from here. However, if Bangladesh, Australia and Pakistan win the next 3 matches, that number will rise.
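The arithmetic behind that figure can be checked in a couple of lines. This uses the rounded numbers from the paragraph above (a 5% chance of reaching the semi-final and a 1/3 chance in each of the two knockout matches) and assumes the matches are independent. The function name is hypothetical.

```python
def title_chance(p_reach_semi: float, p_win_match: float, knockout_matches: int = 2) -> float:
    """Chance of winning the tournament: qualify, then win every knockout match."""
    return p_reach_semi * p_win_match ** knockout_matches


print(f"{title_chance(0.05, 1 / 3):.4f}")  # 0.0056, i.e. roughly 0.5%
```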

It's starting to look like England's style that is so effective in series may not be so effective in one off matches. It will be interesting to see if that trend continues.

Sunday, 23 June 2019

World Cup Simulation Update, 23 June

Here are the latest outputs from the simulation.

England's loss to Sri Lanka opened the door somewhat, but we can still be fairly confident in who the semi-finalists are.
England's ranking has gone down, after two losses to fairly ordinary sides.
It's looking like 10 points will be the magic number. There's roughly a 10% chance that we'll rely on a tie-breaker.

On that count, the average expected points certainly favour England to finish fourth.


Accordingly, they have a much higher chance of making it through.

Here are the likely match-ups. (Teams are in alphabetical order, rather than by placing.)

England are still firm favourites by my model. Home advantage is massive.