Friday 29 June 2018

Using Added Value to measure cricket performances - Part 2 ODI bowling

Reid and Matthews chat before the final over.
It was the summer of 1990/91, just before Christmas. I was on holiday at my Aunty's place in Mount Maunganui. My cousin and I were sleeping in the glass conservatory, looking out over the sand-dunes. The air smelt like salt and sand.

In our little room was a little TV, and on the TV was the cricket coming out of Australia.

New Zealand were playing against Australia in Hobart.

I was 11 years old, and I was enthralled.

Danny Morrison was bowling. A year earlier, he had come to speak to my primary school assembly, then signed autographs by the school pavilion. I got him to sign my cricket bat, and I tried really hard to not get it scratched off. He was my favourite bowler. Australia needed 6 runs to win. Greg Matthews was on strike. I wasn't sure why, but as nobody seemed to like Greg Matthews, I didn't either.

Morrison bowled from around the wicket, and speared a fullish ball into leg stump. Matthews drove it, inside-out, through point for four. I really didn't like Matthews now.

Thursday 28 June 2018

Using Added Value to measure cricket performances - Part 1

The palpable drop in air pressure from 41000 people collectively inhaling is something that is hard to understand unless you've experienced it. Dale Steyn was bowling to Grant Elliott.

Eden Park in full World Cup mode.
Elliott had been a controversial selection. In the first 5 matches he averaged 19.25 at a strike rate just below 82. That's hardly justifying the selectors' faith in you. However, an important innings against Bangladesh, followed by a breathtaking 27 off 11 against West Indies, gave him momentum going into this match. Often the concept of momentum is a case of us over-fitting. We see a couple of good performances back to back and assume a causative effect, where there really probably isn't one. However, regardless of the causes, Elliott was playing the sort of innings that quickly made the critics forget that they had been calling for the coach's head for selecting him before the tournament.

There was a significant obstacle in his road to glory. Dale Steyn. Quite possibly the best bowler to have ever strapped on a pair of boots. Throughout history there have been terrifying bowlers. Fred Spofforth, Jeff Thomson, Andy Roberts, Waqar Younis and Brett Lee are examples. There have also been bowlers who could work a batsman out and exploit weaknesses unmercifully. Alec Bedser, Richard Hadlee, Wasim Akram and Glenn McGrath all shared this trait. In Steyn, South Africa had someone who was in both camps. Possibly only Larwood, Trueman, Lillee, Holding and Marshall have been there like Steyn.

But now there were 2 balls left, and 5 needed. Steyn with his tail up bowling to Elliott on 78 off 72. Elliott had previously hit a bouncer for 6, and Steyn's last yorker had gone for 4. Eden Park's short straight boundary meant that bowling full was risky, and bowling short could ask for a top edge carrying for 6. Better instead to try to get him to hit to the longest boundary, so Steyn opted for a length ball....

The palpable increase in air pressure from 41000 people collectively screaming with exhilaration as the ball clears the boundary is something hard to understand unless you've experienced it.

Wednesday 13 June 2018

How well will Afghanistan's spinners go?

It's a fascinating question - at least two out of 4 spinners who have been very successful in limited overs cricket are likely to make their test debuts tomorrow. They will be playing in India, a country known for spinning tracks.

The Afghan captain, Asghar Stanikzai, is certainly bullish about their ability: "In my opinion, we have good spinners, better spinners than India."

That's a big call. India have some very high quality spinners in test cricket. But does he have a point?

There is no doubt that Afghanistan's spinners have been good in limited overs cricket, but does that mean anything?

Some spinners have excelled in both test cricket and limited overs. Warne, Murali, Shakib and Swann all have excellent stats in all forms of the game. But others haven't seen such correlation. Amit Mishra, Ish Sodhi, Abdur Razzak and Sunil Narine all have very strong limited overs stats, but have not converted that to test matches.

The comparison of Sodhi and Rashid is an interesting one. When they've played against similar opponents, they have had very similar stats. See for example this graph of combined IPL and Big Bash statistics. (These are the 8 players who played a reasonable number of matches in both tournaments over the past 2 years.)

They have clearly been two of the stand out bowlers in the two major domestic T20 tournaments, and yet, Sodhi averages over 40 in test cricket with the ball.

Part of that will be that Sodhi has to play half his cricket in New Zealand, but part of it is also that it is not always possible to reliably predict test success based on limited overs success. They are sometimes linked, but not always.

I wondered if there was a general relationship in the numbers. Were Narine and Sodhi the outliers, or were Warne and Muralitharan the odd ones?

So I tried to construct a model, to see what happened. It turns out, to my surprise, that it is possible to predict, vaguely, the average and strike rate of a bowler in test cricket based on their ODI and T20 performances. I looked at the 37 players who had played as spinners in all 3 formats, had played at least 40 combined limited overs matches and had taken at least one test wicket.

The model is not very reliable, but it was better at predicting the bowling statistics of the players than just taking the average for the group. So it did provide some interesting numbers.
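As a rough illustration of that comparison, here is a minimal sketch that fits a straight-line model and checks it against simply guessing the group average. The post does not say exactly what form the model took, so ordinary least squares is an assumption here, and the numbers below are made up purely for illustration; they are not the career figures of any real bowler.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients to a table of predictors."""
    return coef[0] + np.asarray(X) @ coef[1:]

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    diff = np.asarray(actual) - np.asarray(predicted)
    return float(np.sqrt(np.mean(diff ** 2)))

# Made-up numbers for illustration only (NOT real career statistics).
# Columns: ODI bowling average, T20 bowling average.
limited_overs = np.array([
    [25.0, 20.0], [30.0, 24.0], [35.0, 22.0],
    [28.0, 27.0], [40.0, 30.0], [33.0, 19.0],
])
test_average = np.array([28.0, 34.0, 41.0, 33.0, 47.0, 36.0])

coef = fit_linear(limited_overs, test_average)
model_error = rmse(test_average, predict(coef, limited_overs))

# Baseline: predict the group mean for every bowler.
group_mean = np.full(len(test_average), test_average.mean())
baseline_error = rmse(test_average, group_mean)

print(f"model RMSE {model_error:.2f} vs group-mean RMSE {baseline_error:.2f}")
```

Measured on the same data it was fitted to, a regression with an intercept can never do worse than the group mean, so a fairer check is to compare the two on players held out of the fit.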

The equations that it came out with were as follows:
Only 4 of the 6 Afghan spinners had played enough limited overs cricket to have meaningful numbers, and here are their predicted results:

That would relate to the following results based on overs bowled:

Those wouldn't be particularly bad returns for a first ever test against India, but they also aren't really the results that Stanikzai is hoping for.

But the question comes up, how reliable is this anyway?

There is a degree of randomness in cricket results that means that any predictions are always quite unreliable. Looking at the graphs of the predicted vs actual for the career stats, it shows that there is quite a bit of variation about the trend.

I've added in the red lines manually. They show where roughly 95% of the data fits: within 17 of the average and within 20 of the strike rate. Those are huge variations, which shows that it's very difficult to predict test performances based on limited overs performances.
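One simple way to turn those bands into a range for a single bowler is to add and subtract them from the prediction. The sketch below assumes exactly that, and the predicted figures in it are hypothetical placeholders rather than the model's output. Clamping the lower end at zero is also my assumption, since an average or strike rate can't be negative.

```python
# Band widths read off the graphs (roughly 95% of the data).
AVERAGE_BAND = 17.0
STRIKE_RATE_BAND = 20.0

def realistic_range(predicted, band, floor=0.0):
    """Return (best, worst) for a bowling figure, where lower is better."""
    best = max(predicted - band, floor)  # clamp: the figure cannot go below zero
    worst = predicted + band
    return best, worst

# Hypothetical predictions, for illustration only.
predicted_average = 35.0
predicted_strike_rate = 70.0

print(realistic_range(predicted_average, AVERAGE_BAND))
print(realistic_range(predicted_strike_rate, STRIKE_RATE_BAND))
```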

I have added in the best realistic and worst realistic expected figures, based on these confidence bands:

It will be interesting to see which one of these the Afghan spinners actually get closest to.

Tuesday 12 June 2018

The greatest ever wicket keeping batsman.

The first article that I wrote that garnered any attention was a look at Matt Prior's career as a wicket-keeper batsman, to see how he stacked up against some of the greats: Gilchrist, Flower and Ames. It had about 200 reads, until Jarrod Kimber tweeted out a link to it, and then it had about 1000, doubling the total number of reads that my blog had had up until that point. Then there was a rain break in the England vs India test, and one of the Cricinfo commentators decided to link to the article with "here's something to look at while you wait for the rain to clear." I was swamped. About 24000 people read that article in the next 4 hours, and my little project blog became something that people started to read.

I also included a brief comparison with MS Dhoni, which got me a couple of death threats, for daring to suggest that Prior was better than Dhoni. (My favourite was "I'm going to come to England and burn your house down, you biased English." - not particularly concerning at the time, as I lived on the opposite side of the world from England).

I'm going to attempt to play with fire again, and re-look at the question.

Over the past 6 months, I've given up my job, and gone back to university to study statistics. This post is in part me attempting to use some of the tools that I've learned in that process.

One thing that comes up when discussing this is how difficult it is to bat and keep, and whether it's easier to bat with the tail or at the top of the order.

To try to answer those questions, I took some information for a few keepers, and had a go at running some models on them. The list of keepers that I've looked at is: Adam Gilchrist, Kumar Sangakkara, Andy Flower, Matt Prior, Brendon McCullum, Mahendra Singh Dhoni, Mushfiqur Rahim, Alec Stewart and BJ Watling. Initially, I also looked at Clyde Walcott and Les Ames, but it was difficult to get some of the information to build the models for them, so I've left them out. I'm only looking at batting. This is comparing the batting of players who kept wickets.

The variables that I looked at were as follows:
1. Average of partnerships when they came to the wicket. For example if a player came to the crease at 20/4, this was 5, if they came to the wicket at 380/2 it was 190. For opening batsmen I used 21, as this is the median opening partnership, so it is a reasonable expectation of how difficult it is to bat.
2. How many balls have passed. If the player comes to the wicket in the 51st over, it's likely to be different conditions to coming to the wicket in the 3rd over.
3. Are they the designated keeper or not. For some players, being the designated keeper hurts their batting, but for others it has the opposite effect.
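The first variable is just the score divided by the wickets down when the batsman walks in, with the median opening stand standing in for openers. A minimal sketch, with a function name that is mine rather than anything from the original analysis:

```python
MEDIAN_OPENING_PARTNERSHIP = 21.0  # value the post uses for opening batsmen

def average_partnership(runs, wickets):
    """Average partnership so far when a batsman comes to the crease.

    With no wickets down (an opener) there is no partnership to average,
    so the median opening partnership is used instead.
    """
    if wickets == 0:
        return MEDIAN_OPENING_PARTNERSHIP
    return runs / wickets

print(average_partnership(20, 4))   # coming in at 20/4
print(average_partnership(380, 2))  # coming in at 380/2
print(average_partnership(0, 0))    # opening the batting
```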

I then randomly split the data in two, created the most parsimonious model that I could with half of one particular batsman's data, and then tested it on the other half of the data. I repeated this 100 times, then took the average coefficients from the 10 models that tested the best.

There are possibly better options for how to do this, but it seemed to return reasonably sensible results.
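The repeated split-and-test procedure might look something like the sketch below. It makes two assumptions that are not in the post: it uses a plain least-squares fit on a fixed set of predictors in place of the "most parsimonious model" search, and it ranks models by mean squared error on the held-out half. The data is randomly generated for illustration, not real innings.

```python
import numpy as np

def averaged_top_models(X, y, n_repeats=100, n_keep=10, seed=0):
    """Fit on a random half, test on the other half, repeat, and
    average the coefficients of the best-testing models."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    half = len(y) // 2
    results = []
    for _ in range(n_repeats):
        order = rng.permutation(len(y))
        train, test = order[:half], order[half:]
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        test_mse = float(np.mean((y[test] - A[test] @ coef) ** 2))
        results.append((test_mse, coef))
    # Sort on the test error only, lowest (best) first.
    results.sort(key=lambda pair: pair[0])
    best = np.array([coef for _, coef in results[:n_keep]])
    return best.mean(axis=0)

# Randomly generated stand-in data (NOT real innings):
# two predictors per innings, with runs depending on the first one.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(0, 100, size=(60, 2))
y = 10 + 0.3 * X[:, 0] + data_rng.normal(0, 5, size=60)

coef = averaged_top_models(X, y)
print(coef)  # [intercept, slope for predictor 1, slope for predictor 2]
```

Picking the 10 best-testing splits and then averaging them will tend to flatter the model a little, which is one reason there may be better options.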

The next thing that I did was to apply those models to a number of different scenarios, some as a keeper, and some as a batsman.

That resulted in the following graph:

The answer to the question, based on these models, is quite comprehensively Andy Flower. If you're wanting to select another one, as a pure batsman Sangakkara is your man. If you want someone to bat in a crisis, then Gilchrist is the second best option, but to grind the opposition into the dust, after Flower, you would want BJ Watling.

I quite like the idea of thinking about players based on situations, rather than overall averages. There are certainly more options that I could look at to build the model, including controlling for opposition and location. But for now this is an interesting look at a difficult problem.