Tuesday, 10 March 2015

Updated QF prediction chart

In my previous post I ran a simulation to find out potential quarter-final places. I received some criticism for having England so low, and Bangladesh so high, but events over the past 48 hours have shown that the respective probabilities of the two teams qualifying may not have been so far off.

The program that I wrote to do the simulation was corrupted when my computer crashed and I foolishly hadn't saved it, so I've written a different one to re-calculate. This time I made a couple of modifications. I moved from an additive model for run rates to a multiplicative one, as that seemed more sensible (a team is realistically some percentage better than another team, rather than a fixed number of runs better, so we would expect the margins in runs to blow out more on good batting pitches than on difficult tracks).

I also slightly reduced the standard deviation of the simulation by moving it to one quarter of the mean rather than one third. This again made the results seem more sensible. There were too many teams scoring over 400 or under 100 previously.
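As a rough illustration of the two changes, here is a minimal Python sketch. The exact spreadsheet formulas aren't given in the post, so the function names and the way the ratios are combined are assumptions; only the idea (percentage-based rather than run-based modifiers, and a standard deviation of one quarter of the mean) comes from the text.

```python
import random

def expected_rate_additive(group_rate, bat_rate, opp_conceded_rate):
    """Additive model: each side shifts the group rate by a fixed number of runs per over."""
    return group_rate + (bat_rate - group_rate) + (opp_conceded_rate - group_rate)

def expected_rate_multiplicative(group_rate, bat_rate, opp_conceded_rate):
    """Multiplicative model (assumed form): each side scales the group rate by a ratio."""
    return group_rate * (bat_rate / group_rate) * (opp_conceded_rate / group_rate)

def simulate_score(mean_rate, rng, sd_fraction=0.25, overs=50):
    """Draw one innings run rate with sd = sd_fraction * mean and convert it to a 50-over score."""
    rate = rng.gauss(mean_rate, sd_fraction * mean_rate)
    return max(rate, 0.0) * overs  # guard against the (very rare) negative draw

if __name__ == "__main__":
    rng = random.Random(2015)
    mean = expected_rate_multiplicative(5.1, 5.5, 5.3)
    print(round(mean, 3), round(simulate_score(mean, rng)))
```

With the standard deviation tied to a quarter of the mean, a score of zero sits four standard deviations below the average, which is why the extreme totals become rarer than with one third.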

Here are the new results. This table shows the probability of each team qualifying in position 1, 2, 3 or 4 in their group, and then the total probability of qualifying. Again I have not factored rain into this, and with Cyclone Pam heading towards New Zealand that may be a little optimistic.

Team            1st     2nd     3rd     4th     Total
New Zealand     1       0       0       0       1
Sri Lanka       0       0.024   0.9725  0.0035  1
South Africa    0       0.976   0.024   0       1
West Indies     0       0       0       0.743   0.743

The potential group results look like this:

Group A
NZ Aus SL Ban     0.9725
NZ SL Aus Ban     0.024
NZ Aus Ban SL     0.0035

Group B
Ind SA Pak WI     0.5295
Ind SA Ire WI     0.1985
Ind SA Pak Ire    0.1345
Ind SA Ire Pak    0.1135
Ind Pak SA WI     0.011
Ind Pak SA Ire    0.006
Ind Ire SA WI     0.004
Ind Ire SA Pak    0.003

The three interesting potential quarter-final match-ups to watch for here are:

SA vs Aus     4.7%
Ind vs SL     0.35%
Ire vs Ban    0.02%

In reality the probabilities of Ireland vs Bangladesh and Australia vs South Africa are higher, as they are both much more likely if rain starts to fall.
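The match-up figures can be checked from the group orderings listed above. This is a sketch under two assumptions that are not spelled out in the post: that the two groups are independent of each other, and that the quarter-finals pair A1 with B4, A2 with B3, A3 with B2 and A4 with B1.

```python
from collections import defaultdict

# Final-order probabilities copied from the Group A and Group B lists.
GROUP_A = {
    ("NZ", "Aus", "SL", "Ban"): 0.9725,
    ("NZ", "SL", "Aus", "Ban"): 0.024,
    ("NZ", "Aus", "Ban", "SL"): 0.0035,
}
GROUP_B = {
    ("Ind", "SA", "Pak", "WI"): 0.5295,
    ("Ind", "SA", "Ire", "WI"): 0.1985,
    ("Ind", "SA", "Pak", "Ire"): 0.1345,
    ("Ind", "SA", "Ire", "Pak"): 0.1135,
    ("Ind", "Pak", "SA", "WI"): 0.011,
    ("Ind", "Pak", "SA", "Ire"): 0.006,
    ("Ind", "Ire", "SA", "WI"): 0.004,
    ("Ind", "Ire", "SA", "Pak"): 0.003,
}

def matchup_probabilities(a_orders, b_orders):
    """Probability of each (Group A team, Group B team) quarter-final pairing."""
    probs = defaultdict(float)
    for a_order, p_a in a_orders.items():
        for b_order, p_b in b_orders.items():
            # Assumed crossover: 1st in A meets 4th in B, 2nd meets 3rd, and so on.
            for a_team, b_team in zip(a_order, reversed(b_order)):
                probs[(a_team, b_team)] += p_a * p_b
    return dict(probs)

if __name__ == "__main__":
    probs = matchup_probabilities(GROUP_A, GROUP_B)
    for pair in (("Aus", "SA"), ("SL", "Ind"), ("Ban", "Ire")):
        print(pair, f"{probs.get(pair, 0.0):.5%}")
```

Under these assumptions the South Africa vs Australia and India vs Sri Lanka figures come out at roughly 4.7% and 0.35%. The Ireland vs Bangladesh figure comes out smaller than the one quoted, so that number may have been rounded or calculated differently.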

Sunday, 8 March 2015

World-cup quarter finals simulation

After Pakistan's tremendous win over South Africa, and Ireland's remarkable victory over Zimbabwe, the make-up of the quarter finals is not really much clearer.

The question of who is likely to be going through, and who will play whom, has been the subject of many, many Twitter conversations.

I thought it might be helpful to run a simulation to look at some of the possibilities.

I used Microsoft Excel as it's quite convenient, and I used the scores already made in this tournament to decide the probable scores. For each team I took their average rpo scored relative to the overall group run rate, and their average rpo conceded relative to the same rate. Hence if a team in group A averaged scoring 5.5 rpo and conceded 5.3 rpo, they got values of +0.4 for batting and +0.2 for bowling (as the average rpo in group A has been 5.1 so far). From that point I used an inverse normal, with a random number between 0 and 1 for the area. The mean was the group run rate plus the batting team's run rate modifier plus the other team's bowling run rate modifier. For the standard deviation, I used the smaller of one third of the mean and 1.6. This allowed me to make sure there was (almost) no chance of a team getting a negative score, but that the scores weren't going to blow out too much. I used 1.6 as that's the standard deviation of all innings run rates this tournament. This gave me a 50 over score for each team, and whichever was ahead got the points for the win.
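The original was done in Excel, but the same steps can be sketched in Python. This is only an approximation of the spreadsheet: the function names and the team dictionaries are made up for illustration, and `NormalDist.inv_cdf` from the standard library stands in for Excel's inverse normal.

```python
import random
from statistics import NormalDist

GROUP_RATE = 5.1  # average rpo in group A, taken from the example in the text

def innings_rate(bat_mod, opp_bowl_mod, rng, group_rate=GROUP_RATE):
    """One simulated innings run rate from the inverse normal."""
    mean = group_rate + bat_mod + opp_bowl_mod
    sd = min(mean / 3, 1.6)  # the smaller of one third of the mean and 1.6
    p = rng.random()
    while p == 0.0:  # inv_cdf needs 0 < p < 1
        p = rng.random()
    return NormalDist(mean, sd).inv_cdf(p)

def win_probability(team_a, team_b, trials=2000, seed=1):
    """Share of trials in which team_a's 50-over score beats team_b's."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        score_a = 50 * innings_rate(team_a["bat"], team_b["bowl"], rng)
        score_b = 50 * innings_rate(team_b["bat"], team_a["bowl"], rng)
        wins += score_a > score_b
    return wins / trials

if __name__ == "__main__":
    # Hypothetical modifiers: +0.4 batting / +0.2 bowling is the example from the text.
    example_team = {"bat": 0.4, "bowl": 0.2}
    average_team = {"bat": 0.0, "bowl": 0.0}
    print(win_probability(example_team, average_team))
```

Note that a positive bowling modifier means a side concedes more than the group average, so it raises the opposition's expected run rate.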

There are a few limitations with this method. I didn't take into account the quality of the teams that each side had faced. England has played Australia, New Zealand and Sri Lanka, but has yet to play Bangladesh or Afghanistan. Their numbers are not going to necessarily show how well they will do against less fancied opponents. Likewise no adjustments were made for the pitch that the match is being played on. We know that South Africa have tended to favour playing on bouncier tracks, so an innings at the 'Gaba won't necessarily tell us much about how they would go in Dunedin. I also haven't taken into account player strengths. Bangladesh's batsmen tend to struggle against tall bowlers, such as Finn and Woakes. England can expect that those two bowlers will perform better than average against Bangladesh, and hence their team is likely to do better than the numbers would suggest.

Another major limitation is that I haven't made provision for rain. That would obviously throw off all calculations. However, given the limited information I felt that a simpler model was best.

I decided to do 2000 trials, so that I could feel that the major source of uncertainty was the assumptions rather than the natural sampling variability.
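To get a feel for how much sampling variability is left after 2000 trials, the usual standard error of a proportion can be used. This is a textbook approximation rather than anything taken from the spreadsheet, and the helper names are just for illustration.

```python
from math import sqrt

def standard_error(p, trials):
    """Standard error of a simulated probability p estimated from independent trials."""
    return sqrt(p * (1 - p) / trials)

def margin_95(p, trials):
    """Approximate 95% margin of error (normal approximation)."""
    return 1.96 * standard_error(p, trials)

if __name__ == "__main__":
    for p in (0.5, 0.6347):
        print(f"p={p:.4f}  se={standard_error(p, 2000):.4f}  +/-{margin_95(p, 2000):.4f}")
```

Even in the worst case (a probability of 50%), the standard error with 2000 trials is only about 1.1 percentage points, which is small next to the uncertainty coming from the modelling assumptions.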

First I found the probability of the different teams making the quarter finals with my simulation:

New Zealand     100%
Sri Lanka       99.95%
South Africa    100%
West Indies     63.47%

We can see that Pool A has one crucial match (England vs Bangladesh).
Pool B, however, is still wide open. Ireland vs Pakistan is the last game of the round robin, and it's shaping up to potentially be one that has three teams' fortunes riding on the result.

If West Indies make the final 8, they will almost definitely face New Zealand. It's very unlikely that New Zealand will not end up on top of Pool A, and impossible that West Indies will end up 3rd or higher in Pool B.

Here are the full results for all possible matchups:
Pool A          Pool B          Probability
New Zealand     Pakistan        14.99%
New Zealand     South Africa    0.35%
New Zealand     Ireland         21.23%
New Zealand     West Indies     63.44%
Australia       South Africa    27.57%
Sri Lanka       India           18.18%
Sri Lanka       Pakistan        15.83%
Sri Lanka       South Africa    53.75%
Sri Lanka       Ireland         12.19%
Bangladesh      South Africa    15.88%
England         South Africa    2.45%

I'll redo this after tomorrow's results, and then again on Monday.

The most likely scenario at the moment is India to play Bangladesh, Australia to play Pakistan, South Africa to play Sri Lanka and New Zealand to play West Indies.

I've updated this here

Sunday, 22 February 2015

A quick look at the DRS rule with hawkeye and lbw

There is a significant issue with the way that hawkeye is used for DRS.

There is some doubt as to the exact position of the ball when captured on camera. It's only accurate to the nearest 2mm or so. While that's very accurate, once it's used to create a model, it can be dangerous. As a result there is a margin for error. Then there can be difficulty determining exactly where the ball hits the pad, especially where it brushes the front pad on the way to the second. This means that there is some doubt as to what the actual position of the ball is.

To overcome this, the ICC have ruled that more than half of the ball needs to hit the centre of the wicket. This is a user friendly option at first glance. The boundary is really clear, and the batsman needs to be clearly out in order to be given out. But near the boundaries there are occasionally situations where the ball is clearly going to hit, but instead the hawkeye system calls the ball "umpire's call."

This is particularly ridiculous when the ball has hit the batsman on the back foot. In a situation where the ball has only an extra 40cm to travel, if the middle of the ball is just outside the middle of the stump, then for the ball to miss the stumps the model would have to be out by 5.5cm. Over a distance of travel of 40cm that's allowing way too much margin for error (realistically there would be significantly less than a 1% chance of the ball missing the stumps).
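The 5.5cm figure can be reproduced with some simple geometry. The dimensions below are assumptions (a stump roughly 3.8cm wide and a ball roughly 7.2cm across), not values quoted in the post: for the ball to miss completely, its centre has to move sideways by half a stump width plus a ball radius.

```python
from math import atan, degrees

STUMP_DIAMETER_CM = 3.8  # assumed, approximate
BALL_DIAMETER_CM = 7.2   # assumed, approximate

def lateral_error_to_miss(stump_diameter=STUMP_DIAMETER_CM, ball_diameter=BALL_DIAMETER_CM):
    """Sideways error needed for a ball centred on the middle of the outer stump to miss it."""
    return stump_diameter / 2 + ball_diameter / 2

def angular_error_deg(lateral_cm, travel_cm):
    """Angle by which the predicted path would have to be wrong over the remaining travel."""
    return degrees(atan(lateral_cm / travel_cm))

if __name__ == "__main__":
    miss = lateral_error_to_miss()
    print(f"lateral error needed: {miss:.1f} cm")
    print(f"angular error over 40 cm: {angular_error_deg(miss, 40):.1f} degrees")
```

Over the last 40cm the predicted path would have to be wrong by several degrees for the ball to miss, which is far more than a tracking error of a couple of millimetres would suggest.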

A solution would be to look at a cone that was using a realistic model for the uncertainty. That would be more sensible for the commentators, fans and players to understand, and would actually provide a more sensible answer to the question "would the ball hit the stumps?"

I've put together a short video to demonstrate what I mean as well.

Sunday, 15 February 2015

South Africa vs Zimbabwe - things to watch for

Here are 5 things I'm going to be watching for in this match.

1. I really enjoy watching Elton Chigumbura. He's the sort of player who plays to win the match, rather than playing to have a good average. He gives himself the difficult jobs, and then puts everything into them.

2. Will Zimbabwe get Amla early? The Zimbabwean attack is quite suited to most New Zealand grounds, but if they don't get Amla early, then they will struggle to get him at all.

3. Quinton de Kock - can he rein his game in against the slower paced (but subtle and tricky) opening attack of Zimbabwe?

4. Brendan Taylor - has he regained the touch that made him one of the best batsmen in the world in 2011?

5. South Africa's movement off the ball in the field. De Villiers has made it clear that he wants to see his team moving around more off the ball, like the New Zealand players do. This will be a chance to see if his talk has worked.

Tuesday, 3 February 2015

Martin Guptill and the form myth

Every season there seems to be a cause célèbre among NZ cricket fans. In 2013 the call was that Brendon McCullum wasn't scoring enough runs, and needed to be dropped. In 2013-14 it was that Peter Fulton wasn't scoring enough runs and needed to be dropped. This season the overwhelming majority of cricket talk in New Zealand has been about one man: Martin Guptill. Apparently he isn't scoring enough runs and needs to be dropped.

In calls to Radio Sport and comments on the Veitchy on Sport Facebook page there have been at least 21 players suggested as being a better option as an opener than Martin Guptill. People have even suggested different ways that he might get injured in order to get him replaced in the squad.

But the opinion that Guptill is significantly out of form is not just confined to the uninformed public (I consider anyone who suggests Michael Pollard, Peter Ingram or Kyle Mills as replacements for Guptill uninformed). A number of the country's sports journalists have joined in. In a quite well written and balanced piece, Andrew Alderson noted that Guptill "struggled for form." Charlie Bristow talked of Mike Hesson needing "to handle Martin Guptill's stuttering form." Mark Geenty commented that the top order was carrying "significance and concern." Guy Heveldt said that Guptill is "under immense pressure to find some form before the World Cup begins." Daniel Richardson said that Guptill is "out of touch", "has done little to inspire confidence" and that his "form is a concern."

Saturday, 24 January 2015

David Warner vs Rohit Sharma

Over the past couple of days I've been called a troll, a Jonathan Agnew fan and even an Australian on Twitter, because I have a position that is somewhat different from others on the David Warner vs Rohit Sharma incident. The problem is that a nuanced view doesn't fit neatly inside a 140 character window, and so my views have been misinterpreted. Part of that is because people seem to have very absolute views on the matter, when I don't think what happened is really very black and white.

First of all I'll talk about my system of ethics with sledging and other play, and what I consider acceptable, then I'll look at the Warner-Sharma confrontation specifically.

Sledging is an attempt to get a psychological advantage over another player. For me this is part of the game. However, there are limits to what is acceptable. One form that is acceptable (in my opinion) is fielders encouraging the bowler in a way that the batsman can hear and that might get into the batsman's head. For example: "That's 4 dot balls in a row now", "He's got no idea about the short one" or "Look at how he's holding the bat with his bottom hand, I reckon his coach will have words with him about that afterwards. It's causing him to push the bottom of the bat in. I reckon a half volley outside off will see him nick out here." These comments make the batsman doubt either their technique or their form, and can cause them to play false shots.

Likewise batting advice to the batsman is acceptable, even if it's not always genuine. The below example (about 1:20 in) where Hadlee gives Botham some advice on how to play his bowling is a classic. Botham may well have been late on the shot because he was thinking about what Hadlee had said and had anticipated a different delivery.

If the fielding side feel that the batsmen are doing something underhanded, such as taking a run when the ball was dead, they are entitled to express their displeasure to them.

The more interesting question is what is unacceptable. Here is my list:

Threats of violence that don't involve the playing of the game. For example "I'm going to break your ribs with the next ball" is acceptable. Likewise "If those close fielders stay there, I'm going to still play my shots and they will get hurt." Both of these, however, need to be in context. A bowler/batsman shouldn't be randomly threatening violence willy-nilly, but in the heat of an exchange they are fine. "I'll see you in the car park afterwards and smash your face in," however is not acceptable.

Racial slurs are not acceptable. They are not acceptable directed at a player or spoken about a player. There's a story about some things that Shane Thompson said to Wasim Akram and Waqar Younis to try and goad them into bowling short at him (rather than yorkers) that are totally unacceptable things to have been said on a cricket pitch.

Abuse for the sake of it is unacceptable. This includes most (but not all) send-offs. There can be time for a witty send off, provided it is brief and concludes an ongoing conversation. Prolonged send-offs, especially abusive ones, are completely unacceptable.

Likewise abusing someone to get under their skin, without there being any relation to the game or without it being in the context of an ongoing conversation is not on. The way that Fleming subjected Smith to a torrent of nastiness when he arrived at the crease may have helped New Zealand tie the series, but it was not something that New Zealand fans should be proud of.

There are other difficult situations, but generally it is fine to sledge, provided it is done in a way that has a purpose, and doesn't cross the line into pure abuse.

Now let's look at the Warner-Sharma situation. Here's my summary of what happened, as far as I understood it.

1. Rohit Sharma was slightly outside his ground, as he's entitled to be.
2. David Warner threw the ball towards the stumps.
3. The ball was very wide of the mark, and (only just) missed Sharma, and then evaded Haddin.
4. Sharma and Raina then proceeded to run an overthrow.
5. Warner thought that the ball had deflected off Sharma and got angry that they ran an overthrow contrary to established protocol.
6. Warner told Sharma that he was unimpressed.
7. Sharma said something to Warner in Hindi. Warner speaks a few words of Hindi and didn't understand the full message but was upset by what he did understand.
8. Warner shouted at Sharma to speak English.
9. Sharma repeated his message in English as the umpires separated the players.

The one key point here is number 2. David Warner is a fantastic fielder. He has produced a few blinding run outs from direct hits. One of the impressive things about his fielding is just how often he hits the stumps. Given his ability, the fact that he missed the stumps by about 3m from close range is peculiar. The fact that he almost hit Sharma was concerning. How off target it was can be seen by the fact that Haddin stepped twice, then dived full length, and still didn't get to the ball.

He thought that he had hit Sharma with the throw, and that, therefore, Sharma shouldn't take a run. He didn't apologise for hitting Sharma, which would normally happen. It makes me wonder if he was aiming to hit Sharma with the throw. For me that is the key thing that was wrong with that incident.

What Warner said after that was in keeping with his understanding that Sharma had taken a run he would not normally be entitled to take. Sharma speaking Hindi successfully got in the head of Warner, and I don't have a particular problem with that. Warner's reaction, likewise, was totally understandable in context. The only issue, and it's a big one, was if Warner deliberately tried to hit Sharma with the ball.

If (in the opinion of the match referee) he did, then it would be a level 2 offence and he should be banned for a couple of games. Instead Warner was charged with a level 1 offence for "using language or a gesture that is obscene, offensive or insulting." As he had been found guilty of a similar offence within the past 12 months it was automatically raised to a level 2 offence, but he received the minimum fine for that offence, of 50% of his match fee.

The thing that I don't understand is how he was found guilty of that at all. As far as I can see he didn't abuse Sharma, and he didn't use any offensive gestures that I could see. If Sharma had spoken to him in English, then asking him to "speak English" would have been offensive, but given that Sharma didn't actually speak English, it was a perfectly reasonable request (despite not being delivered in a particularly reasonable manner). In the verbal altercation, Warner and Sharma acted equally badly, but not nearly badly enough for a charge.

If the ICC Code of Conduct was applied correctly here, either Warner would have been charged with deliberately throwing the ball at Rohit Sharma or he wouldn't have been charged at all.

Friday, 23 January 2015

Comparing between eras part 2. The survey results

In the previous post I looked at some New Zealand batsmen throughout the years and compared them, by trying to take into account some of the factors that might have made batting either easier or harder for them.

I did this by looking at the runs that each player scored at a particular ground, and then looking at how easy/difficult that ground was to score at during that player's career. After that I allocated each ground a modifier value, and multiplied the runs scored at each ground by that ground's modifier. As a result (for example) the 188 runs that Martin Crowe scored at the Bourda in Georgetown were worth 164.5, because (during Crowe's era) it was a batting friendly pitch. However, his 120 runs that he scored at Karachi were worth 135.1 because that ground favoured bowlers.
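Here is a small sketch of that calculation in Python. How each ground's modifier is worked out isn't covered in this post, so the modifiers are simply taken as inputs; the two values used are back-calculated from the Crowe example (164.5/188 and 135.1/120), and the function names and the number of dismissals are hypothetical.

```python
def normalised_runs(runs_by_ground, modifiers):
    """Sum of runs at each ground multiplied by that ground's modifier."""
    return sum(runs * modifiers[ground] for ground, runs in runs_by_ground.items())

def normalised_average(runs_by_ground, modifiers, dismissals):
    """Normalised runs divided by the number of times the batsman was dismissed."""
    return normalised_runs(runs_by_ground, modifiers) / dismissals

if __name__ == "__main__":
    # Modifiers implied by the example in the text (below 1 = batting friendly ground).
    modifiers = {"Bourda": 164.5 / 188, "Karachi": 135.1 / 120}
    crowe_runs = {"Bourda": 188, "Karachi": 120}
    print(round(normalised_runs(crowe_runs, modifiers), 1))
    # Hypothetical: 4 dismissals across those innings, purely to show the average step.
    print(round(normalised_average(crowe_runs, modifiers, dismissals=4), 2))
```

A modifier below 1 marks down runs made at a batting friendly ground, while a modifier above 1 rewards runs made where bowlers were on top.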

I wanted to try the technique across a wider range of batsmen, so I put a simple request on twitter, for people to send me their top 5 batsmen. The tweets started pouring in.

I received a few humorous replies such as 5 votes for Rohit Sharma, 5 votes for Graham Thorpe and my personal favourite:

But eventually I had 159 serious lists of 5.

From the top 20 (plus ties) I then worked out their Normalised Averages. I left out two players, Barry Richards and WG Grace, as neither of their test careers were really the reason that people put them in the list. For both, test matches made up less than 5% of their first class career. I'll deal with them (and Charles Bannerman) in a future post.

Here's the list:

Rank  Name                Votes  Average  Norm Average
1     Don Bradman         119    99.94    101.03
2     Sachin Tendulkar    112    53.79    54.10
3     Brian Lara          108    52.89    54.41
4     Viv Richards        84     50.24    54.96
5     Ricky Ponting       55     51.85    52.50
6     Kumar Sangakkara    52     58.45    58.27
7     Gary Sobers         31     57.78    57.71
8     Rahul Dravid        28     52.31    52.73
9     Jacques Kallis      27     55.37    59.55
10    Jack Hobbs          24     56.95    63.01
11    Barry Richards      12     72.57    *
11    Wally Hammond       12     58.46    58.44
13    AB de Villiers      11     52.10    52.99
13    Steve Waugh         11     51.06    53.56
15    WG Grace            10     32.29    *
16    Graeme Pollock      9      60.97    59.91
16    Sunil Gavaskar      9      51.12    54.76
18    Herbert Sutcliffe   4      60.73    62.00
18    Dennis Compton      4      50.06    53.44
18    Martin Crowe        4      45.37    47.91
18    Adam Gilchrist      4      47.61    49.24
18    Allan Border        4      49.54    54.30

There are a couple of interesting things here. Less than 3/4 of people picked Bradman. Often they said that it was because they had never watched him bat, and that's understandable, but I would have thought his extraordinary average alone was sufficient to put him in the mix. You don't need to know much about batting averages to know that Bradman's numbers are almost unbelievable.

The tendency to only vote for batsmen that people had seen meant that players who had played since 2000 had to score at a lower average than players who had played before that. Here's a graph comparing the number of votes that a batsman got with their normalised average:

There was also a tendency for people to nominate players who had done well against their sides. Most votes out of England included Brian Lara, who hit both of his triple centuries against England, while votes from India often included Ricky Ponting, who averaged mid fifties against the Indians.

Here's the list ordered by their Normalised Average. I've added in two other older players who only got one vote each, Ken Barrington and Everton Weekes, but who both had exceptional records.

Name                Average  Norm Average
Don Bradman         99.94    101.03
Ken Barrington      58.67    64.00
Jack Hobbs          56.95    63.01
Herbert Sutcliffe   60.73    62.00
Graeme Pollock      60.97    59.91
Jacques Kallis      55.37    59.55
Everton Weekes      59.46    59.39
Wally Hammond       58.46    58.44
Kumar Sangakkara    58.45    58.27
Gary Sobers         57.78    57.71
Viv Richards        50.24    54.96
Sunil Gavaskar      51.12    54.76
Brian Lara          52.89    54.41
Allan Border        49.54    54.30
Sachin Tendulkar    53.79    54.10
Steve Waugh         51.06    53.56
Dennis Compton      50.06    53.44
AB de Villiers      52.10    52.99
Rahul Dravid        52.31    52.73
Ricky Ponting       51.85    52.50
Adam Gilchrist      47.61    49.24
Martin Crowe        45.37    47.91

A couple of interesting things here are the way that players are rewarded for scoring on the harder pitches. Sutcliffe and Hobbs played together through a large part of their careers. But Hobbs was the one that scored the most runs when the conditions were the hardest for batting. As a result Hobbs' average increased by 6.06 while Sutcliffe's only increased by 1.27.

Jacques Kallis likewise scored a lot of runs at Newlands, which has been a graveyard for batsmen, and he has been rewarded for that. Kumar Sangakkara, however, has scored a lot of his runs at the SSC, which is a place where batsmen have prospered, and so his normalised average ended up lower than his actual average.

I still have a number of players that I'd like to look at such as Victor Trumper, Bruce Mitchell, Zaheer Abbas and Andy Flower. But there's plenty of time for that in the next installment.