Sunday 14 July 2019

Statistical preview, World Cup final, New Zealand vs England

Here is a brief statistical preview.

Recent head to head:

In the past 5 years England lead 8-5.
In the past 2 years England lead 6-3.

At Lord's the ball tends to bounce a bit more, so the ground tends not to suit England as much as their other home grounds. It is the only ground where England have a losing record over the past few years, with 3 wins and 4 losses in the last 6 years.

It is also a ground where scores have been defended quite regularly.




The slope, large straight boundaries and the bounce combine to make a more bowler-friendly ground than most in England, but grounds in this World Cup have not exactly gone to type.

Adding in times where New Zealand bat first, and where England bowl first, gives the following result:

New Zealand had a clear plan to use the pressure of the situation as a weapon to help them defeat India, and the pressure from playing at home may do the same against England.

The model that I used to build my simulation has England at 69.8%, while New Zealand are at 30.2%. That feels about right too: New Zealand have a realistic chance, but England are certainly favourites.

The bookies have England at 73%. CricViz have England at 68%, New Zealand at 30% and a tie at a fairly high 2%.

The two teams are close enough that nobody can say exactly who will win, but it is a World Cup final - that's exactly how it should be.

A better World Cup format

As this world cup draws to a close, I've been thinking about the positives and negatives of the format.

There are quite a few of both.

Firstly the positives:

  1. Everyone plays each other.
  2. Not too many matches that seem like a mismatch on paper.
  3. Teams that lose a couple of games early still have the chance to compete.
  4. Guaranteed 9 matches for India, so the ICC get enough money to keep growing international cricket.
  5. There was a match or two every day through the majority of the tournament, so that the momentum built towards the finals.

Then the negatives:

  1. Not enough representation from lower level teams. The qualification was too difficult, and so the goal of making the world cup became unrealistic for most teams.
  2. There were only 3 matches in the final week, meaning that the momentum was lost.
  3. Dead rubbers, or similar - 3 teams were effectively eliminated with 2 weeks to go.
  4. Incomplete rounds - India being 2 matches behind made the narratives and changes in fortune less obvious. 
  5. Pitches were too different from how they've played over the past 4 years, meaning that luck played too large a role in the event.

In my opinion, the negatives are too great for it to be a good idea to continue with the same format. But the positives are things worth keeping.

So, using those positives as constraints as much as possible, and also keeping the tournament to the same length, I have come up with a format that I believe will make for a better event.

Friday 5 July 2019

World Cup Simulation update - 5 July

Here's the latest update for the World Cup simulation. I have New Zealand at 100%, but that's simply because the probability of Pakistan achieving the required run rate is so low that it never happened in the 50000 trials that I used. The probability of Pakistan going through is slightly lower than the probability of someone being shot accidentally by a dog running along a beach while holding a handgun in its mouth during the next week.
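A simulated 100% only says that the event is rarer than the number of trials can resolve. As a rough sketch (this is a standard binomial bound, not something taken from the simulation itself), the largest per-trial probability that is still consistent with seeing zero occurrences can be worked out directly. The function name and the 95% confidence level are my own choices for illustration.

```python
def zero_hit_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Largest per-trial probability still consistent with zero hits.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)


trials = 50_000  # number of simulation trials quoted in the post
bound = zero_hit_upper_bound(trials)

print(f"95% upper bound on the probability: {bound:.6%}")
# The "rule of three" shortcut (3 / trials) gives nearly the same answer.
print(f"Rule-of-three approximation:        {3 / trials:.6%}")
```

So zero hits in 50000 trials only pins the probability down to somewhere below roughly 1 in 17000. Anything rarer than that needs more trials or a direct calculation.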
The next graph is the expected points. The simulation has had the correct top 4 from the second match on. However, the expected points and the order of the teams have changed considerably.

The top 4 was looking fairly likely from about match number 6 on. There was some excitement from the two upset losses by England, but Pakistan never got beyond 40% on the simulation.

The complete make-up of the semi-finalists has not yet been decided, nor has the team in 5th place. Pakistan, Bangladesh and Sri Lanka could all end up 5th.

Next I looked at the winning probability. This is getting close to the point where it can be calculated analytically without much trouble.
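Once the four semi-finalists and the pairings are fixed, the analytic calculation is short: a team has to win its semi-final and then beat whoever comes through the other one. Here is a minimal sketch of that. The team labels and the pairwise win probabilities are made up for illustration and are not the simulation's numbers; `p_win` and `title_probabilities` are hypothetical helper names.

```python
# Hypothetical pairwise win probabilities (invented for illustration only).
# beats[(x, y)] is the probability that x beats y.
beats = {
    ("A", "B"): 0.55, ("A", "C"): 0.60, ("A", "D"): 0.65,
    ("B", "C"): 0.55, ("B", "D"): 0.60, ("C", "D"): 0.55,
}


def p_win(x, y):
    """Probability that x beats y, using the complement for reversed pairs."""
    return beats[(x, y)] if (x, y) in beats else 1.0 - beats[(y, x)]


def title_probabilities(semi_1, semi_2):
    """Probability of each team winning the title, given the two semi-finals."""
    probs = {}
    for (a, b), (c, d) in ((semi_1, semi_2), (semi_2, semi_1)):
        for team, rival in ((a, b), (b, a)):
            reach_final = p_win(team, rival)
            # Weight each possible final opponent by their chance of getting there.
            win_final = (p_win(c, d) * p_win(team, c)
                         + p_win(d, c) * p_win(team, d))
            probs[team] = reach_final * win_final
    return probs


probs = title_probabilities(("A", "D"), ("B", "C"))
for team, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {prob:.1%}")
```

The four probabilities always add up to 1, which is a handy check. The sketch ignores ties and washed-out matches.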
The next thing to look at is the rankings. A thing to remember here is that it is all relative to Afghanistan, so everybody going up is more an indication that Afghanistan has gone down.

The order that the teams are in here is the same as David Kendix' official rankings order, with one exception - I have India ahead of England, rather than the other way round.

Finally, a little graph to show what Pakistan needs to do to make the semi-finals. They need to keep Bangladesh below the green line.



Monday 1 July 2019

World Cup simulation update - 1 July

Here's the latest update to the simulation. The first two graphs disagree slightly because I have two different methods of calculating the expected net run rate. The first seemed to be slightly more accurate than the second, but there was not a big difference when I tested them. (The margin of victory in cricket matches is actually really difficult to estimate: teams batting second tend to cruise to victory rather than try to win by as big a margin as possible.) I decided to use both when doing the calculations. With the first method, New Zealand and India both have a higher than 99.98% probability of going through, while with the second method it's 99% for India and 97.7% for New Zealand. The second set of numbers seems more realistic.


The big thing to notice is the change to England's probability, and how England beating India damaged the chances of both Pakistan and Bangladesh. Pakistan's probability went down by slightly more than Bangladesh's probability because the ranking of India dropped slightly, and Bangladesh need to beat India to get through.

This graph shows the expected value, not the most likely value. Those are actually different things. The expected value is the mean of all the possible outcomes, weighted by their probabilities. As a result, none of the teams will actually end up with exactly the points that this shows, but they should mostly get close to it.
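A small worked example shows the difference. This is only a sketch with invented numbers: a hypothetical team on 7 points with two matches left, win probabilities of 0.6 and 0.5, two points for a win, and no ties or washouts.

```python
from collections import defaultdict
from itertools import product

current_points = 7        # hypothetical points so far
win_probs = [0.6, 0.5]    # hypothetical win probabilities for the remaining matches

# Build the full distribution of final points by enumerating every result.
distribution = defaultdict(float)
for results in product([True, False], repeat=len(win_probs)):
    prob = 1.0
    for won, p in zip(results, win_probs):
        prob *= p if won else 1.0 - p
    distribution[current_points + 2 * sum(results)] += prob

expected = sum(points * prob for points, prob in distribution.items())
most_likely = max(distribution, key=distribution.get)

print(f"Expected points:    {expected:.1f}")
print(f"Most likely points: {most_likely}")
```

Here the expected value comes out at 9.2, a total the team cannot actually finish on, while the single most likely total is 9.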

It's now looking like there's a roughly 45% chance that net run rate will be a deciding factor in who goes through to the semi-finals.
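For reference, net run rate is the runs per over a team has scored across the tournament minus the runs per over it has conceded. The sketch below uses invented totals and works in balls rather than overs to avoid the "49.3 overs" notation. It assumes the caller has already applied the usual convention that a side bowled out is charged its full quota of overs. The function name is my own.

```python
def net_run_rate(runs_for, balls_faced, runs_against, balls_bowled):
    """Runs per over scored minus runs per over conceded (6 balls per over)."""
    return 6 * runs_for / balls_faced - 6 * runs_against / balls_bowled


# Invented tournament totals for illustration only.
nrr = net_run_rate(runs_for=1500, balls_faced=1500,
                   runs_against=1400, balls_bowled=1500)
print(f"Net run rate: {nrr:+.3f}")
```

Because every match feeds into both halves of the calculation, one heavy defeat can undo several narrow wins, which is part of why the final margin matters so much in the simulation.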

If Bangladesh beat India (which is admittedly a fairly unlikely outcome), we could then see a situation where Pakistan and Bangladesh are playing for the opportunity to be level on points with New Zealand and India on 11 points. If that is the case, then (in all likelihood) the rained out match between New Zealand and India will have allowed both to progress at the expense of the winner of Pakistan vs Bangladesh.

The most likely semi-finals at this point are Australia vs New Zealand and England vs India, but these are by no means confirmed yet.

In individual matches, England effectively has a higher ranking than that, because teams playing at home get a ranking boost of 0.86 over their opponent. That's why I have England back on top in the next graph:

This one is quite different to what the bookmakers have. I have England as favourites, while they have India and Australia tied for favourite on roughly 30% each. They also have Pakistan and Bangladesh at about double the probability that I do.

I used the first net run rate model for the winning probability, but the difference in numbers suggests that the bookies are possibly using a model that is more similar to the second one.