Saturday 24 November 2018

Historical statistical preview of the Second Test, Pakistan vs NZ

I've decided to put together a short summary of some of the historical trends at Dubai, before this match.

First, the probability of different results based on first innings scores. This suggests that a score of 300 is roughly the point where a team is more likely to win than lose, while the 50% winning score is roughly 370.


Thursday 22 November 2018

Can we determine a batsman's ability based on how he gets out?

I saw an interesting discussion online recently, suggesting that we could tell that some players had a better technique than others, based on how often they got out to different types of dismissals.

The theory was that players who get bowled or lbw have technical issues, while players who get out caught more often don't have those same issues.

This immediately struck me as a multivariate statistics problem. Can we tell how good a batsman is based on the proportions of dismissals?

So I gathered a sample of 160 players, all of whom had been dismissed in the 5 most common ways at least once each, and looked at what we could tell from those players.

I grouped them based on batting average into 5 roughly equally sized groups. The groups were: under 27.5, 27.5 to 37.5, 37.5 to 43.5, 43.5 to 48.5 and over 48.5.

Once I filtered out players who hadn't played enough innings, I ended up with 29 from group 1, 30 from group 2, 28 from group 3, 36 from group 4 and 37 from group 5.
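
As a rough illustration of the grouping, here is a minimal Python sketch. The cut-off values come from the text above, but the function name is made up for this example, and putting an average that sits exactly on a boundary (such as 27.5) into the higher group is an assumption, since the post doesn't say which way those cases went.

```python
from bisect import bisect_right

# Upper edges of groups 1 to 4, taken from the text above.
EDGES = [27.5, 37.5, 43.5, 48.5]


def average_group(batting_average):
    """Return the group number (1 to 5) for a batting average.

    Assumption: an average exactly on an edge goes to the higher group.
    """
    return bisect_right(EDGES, batting_average) + 1


print(average_group(41.2))  # falls between 37.5 and 43.5, so group 3
```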

Their distributions were as follows:

There are some differences between the groups, but there seems to be more variation within the groups than between them.

I also looked at the raw numbers, without grouping, and adding in trend lines.

A pattern emerges - batsmen who have a very low average or a very high average tend to get bowled and run out more often than players who have an average between 20 and 50. Instead of getting out bowled, the players who have their average between 20 and 50 tend to get out caught more often.

This made me wonder if I could find some technique to group them effectively. I wasn't hopeful, because again the variation within the different groups seemed to be greater than the difference between them, other than between the very edges and the middle.

The methods that I chose to try were Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Random Forest (categorical) and Random Forest (regression then rounding). I'm aware that most readers won't have studied multivariate statistics, so I'll briefly explain how these methods work. If you don't care how they work, skip ahead to the part where I test the four methods.

Linear and Quadratic Discrimination (LDA and QDA) can be imagined like plotting all the information on a giant multi-dimensional graph, then rotating the axes until the different groups are as separated as possible. The Linear version assumes that there are straight lines that separate the different groups, while the Quadratic version allows the groups to be separated by a curve. The Quadratic version is a more powerful technique, but it needs more data to be able to get an answer.

An example of how LDA works is in the two graphs below. There's a set of data that has two variables and is in two groups, and the groups can't be easily distinguished by splitting on any one variable alone (displayed on the graph on the left). By making a decision based just on Variable 1, the best split gets 20 out of the 30 points classified correctly, and the best split on Variable 2 gets 22 out of the 30 points classified correctly. But if we put in a set of axes that are rotated (the green lines) and redraw the graph (on the right), the two groups can be split quite well by being greater than or less than -0.45 on the new rotated x axis. By splitting on the rotated axis, 27 out of the 30 points are now classified correctly.
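
The "rotate the axes, then split" idea can be sketched in a few lines of Python. This is only the classification step: it assumes the rotation angle and threshold are already known, whereas LDA works those out from the data. The eight points, the group labels "A" and "B" and the function names are all made up for illustration and are not the data in the graphs.

```python
import math


def rotated_x(v1, v2, angle_degrees):
    """Position of the point (v1, v2) along an x axis rotated by angle_degrees."""
    theta = math.radians(angle_degrees)
    return v1 * math.cos(theta) + v2 * math.sin(theta)


def classify(v1, v2, angle_degrees, threshold):
    """Call a point group A if it sits below the threshold on the rotated axis."""
    return "A" if rotated_x(v1, v2, angle_degrees) < threshold else "B"


def count_correct(points, angle_degrees, threshold):
    """Count how many (v1, v2, label) points the split classifies correctly."""
    return sum(
        classify(v1, v2, angle_degrees, threshold) == label
        for v1, v2, label in points
    )


# Hypothetical data: two groups lying along parallel diagonal bands.
points = [
    (1, 2, "A"), (2, 3, "A"), (3, 4, "A"), (4, 5, "A"),
    (2, 1, "B"), (3, 2, "B"), (4, 3, "B"), (5, 4, "B"),
]

print(count_correct(points, 0, 3))      # splitting on Variable 1 alone
print(count_correct(points, -45, 0.0))  # splitting on the rotated axis
```

With an angle of 0 the rotated axis is just Variable 1, so the first call is the single-variable split and the second is the rotated one.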


Random Forest is a technique that creates lots of decision trees, each based on a random sample of the data and only some of the variables. Then it combines them, taking a majority vote when predicting groups and an average when predicting a number. This is a harder method to explain easily, but the process itself is actually reasonably simple. It's often remarkably effective for making predictions, but it can be hard to explain how the final model actually works.

To test the four methods, I first of all tried doing leave-one-out cross validation to see how they performed. This is where I use all but one data point to build the model, then test that model on the remaining data point and see how well it was allocated - I do this for all 160 batsmen.
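
Here is a minimal sketch of leave-one-out cross validation in Python. To keep it short and self-contained it uses a nearest-centroid classifier as a stand-in, which is not one of the four methods in this post, and all of the function names are invented for the example.

```python
def nearest_centroid_fit(rows, labels):
    """Return the mean point (centroid) of each group."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Pick the group whose centroid is closest to the row."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def leave_one_out_accuracy(rows, labels):
    """Fit on all rows but one, test on the row left out, repeat for every row."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        model = nearest_centroid_fit(train_rows, train_labels)
        if nearest_centroid_predict(model, rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)
```

Each row would be one batsman's dismissal proportions and each label his average group, so the returned value is the proportion of batsmen put in the right group.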

The groups are roughly evenly split, so randomly allocating the different batsmen to different groups saw about 20% get put in the right place. The different methods saw these results:

Method                     Success
LDA                        25.00%
QDA                        28.13%
Random Forest (groups)     18.75%
Random Forest (average)    23.75%

I wondered if these were within the range of what I would expect from just randomly allocating batsmen to groups, so I randomly allocated the batsmen to groups 10000 times, and saw what the distribution of the number correct looked like. Below is a graph of that, with the 4 methods on it.
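
A random allocation test like this could be sketched as follows. The group sizes are the ones given earlier in the post, but the post doesn't say exactly how the random allocation was done, so guessing each batsman's group with equal probability is an assumption here (shuffling the real labels would be another reasonable choice). The function names and the fixed seed are also just for this example.

```python
import random

GROUP_SIZES = [29, 30, 28, 36, 37]  # groups 1 to 5, from earlier in the post


def simulate_random_allocation(group_sizes, trials, seed=0):
    """Return the number of correct guesses in each random trial.

    Assumption: every batsman is guessed into one of the groups uniformly at random.
    """
    rng = random.Random(seed)
    actual = [g for g, size in enumerate(group_sizes, start=1) for _ in range(size)]
    groups = list(range(1, len(group_sizes) + 1))
    results = []
    for _ in range(trials):
        guesses = [rng.choice(groups) for _ in actual]
        results.append(sum(a == g for a, g in zip(actual, guesses)))
    return results


def share_beating(results, observed_correct):
    """Proportion of random trials that got more correct than a given method."""
    return sum(r > observed_correct for r in results) / len(results)


results = simulate_random_allocation(GROUP_SIZES, trials=10000)
print(share_beating(results, observed_correct=45))  # 45 of 160 is 28.13%
```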


The method that worked the best here (QDA) was still beaten by about 1% of the randomly allocated trials.

One of the issues with just looking at the proportion correct is that it doesn't tell us about how many are close to being right. I wondered how they would go if I plotted the actual group against the expected group, and found a measure of goodness of fit for it. The values for the goodness of fit here go from 1 down, where 1 is a perfect fit, 0 is the result of just giving every point the average, and negative values are even worse than that.

Here's how the 4 methods stacked up:

Method                     Goodness-of-fit
LDA                        -0.710
QDA                        -0.505
Random Forest (groups)     -0.624
Random Forest (average)    -0.064

The Random Forest (average) method was the best, but was still not as good a fit as just allocating every batsman to group 3. (That had a goodness of fit value of -0.009). 
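
The post doesn't name the goodness-of-fit measure, but the description (1 is perfect, 0 is what you get from giving every point the average, negative is worse than that) matches the usual coefficient of determination, R squared. Assuming that is the measure, here is a small Python sketch that rebuilds the "everybody is group 3" benchmark from the group sizes given earlier. The function name is made up for the example.

```python
def goodness_of_fit(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


GROUP_SIZES = [29, 30, 28, 36, 37]  # groups 1 to 5, from earlier in the post

# One entry per batsman, holding his actual group number.
actual = [g for g, size in enumerate(GROUP_SIZES, start=1) for _ in range(size)]
all_group_3 = [3] * len(actual)

print(round(goodness_of_fit(actual, all_group_3), 3))  # -0.009
```

It comes out slightly below zero because the mean group number is a little above 3 (there are more players in groups 4 and 5), so guessing 3 for everyone is very slightly worse than guessing the mean.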

I put these methods against a random allocation, to see how well they actually did:

Even though the Random Forest average method was better than random arrangement, it was not as good as just classifying everybody as group 3. 

What this tells us is that this data really is not very useful at all for making a classification. We know that the methods would all have done a reasonable job of distinguishing between the different groups if there was actually a difference between them that could be found. However, there is not any real way to distinguish how good a batsman is by the way that they got out.

I also tried some other methods that theoretically shouldn't have been as good, just to see how well they went. I managed to get a couple that had a goodness-of-fit as high as -0.235 with 26% correct classification. These seem to be reasonable, but they were still not as good as just classifying every batsman as group 3. Any method that isn't as good as that is really worthless for making a classification.

In conclusion, looking at the proportion of ways that a batsman has been dismissed is not particularly helpful in deciding how good they are. The differences within groups are much larger than the differences between groups. What that means is that if you're having an argument with someone on the internet, and they say that a player should be dropped because they get out LBW so often that it shows that they have a bad technique, you can smile smugly to yourself, knowing that they are speaking nonsense. You could even send them a link to this article if you want.

Thursday 1 November 2018

Comedy Run Out Set to Music 5

In a field of strong contenders, this might be the best yet.

Will Somerville got hit by Ben Horne, dived, probably made his ground (based on fairly low quality video), then was given out.

It was clearly not his day.

Wednesday 31 October 2018

Comedy run out set to music 4

Sean Solia has had a bit of a golden run recently. His List A averages are scarcely believable, averaging 56 with the bat and 18 with the ball. But he learned today that it's not a good idea to go out walking when the ball isn't dead.

Also watch for Henry Nicholls getting high fived in the face by Tom Latham.

Monday 29 October 2018

Comedy run out 3

Time for the 3rd instalment of comedy run outs set to music.

The context here was that it was the start of the 46th over, Auckland vs Wellington in a 50 over match. The pair at the wicket had added 66 runs in just over 8 overs, but this started a collapse where Auckland lost 4 wickets for 21 over 3.4 overs.

I hope you enjoy.

Sunday 21 October 2018

Plunket Shield update - Round 2

At the end of round 2, I thought it would be good to do an update on the progress of the tournament, and look at some trends that have emerged.

One thing that I thought I would focus on is how the runs have been scored, rather than just how many.

I looked at each innings and looked at the total runs from boundaries, and the total other runs (I called them run runs, but they include no balls and wides, as they were too hard to separate).

I plotted them on a graph, to see if any interesting patterns emerged.


There were a couple of things that I noticed. Auckland, Otago, Wellington and Canterbury have all had similar rates across the different innings that they've batted, while Northern Districts and Central Districts have had more variety in how they've accumulated their runs.

The triangles seemed to be higher up the chart on average, with all of them being above the median boundary rate, so I thought that I'd see if there was a correlation between the rates and the total competition points gathered in a match.

There is a reasonably strong relationship between the boundary rate and the points earned in a match. However, there's almost no relationship at all between the speed of accumulation of non-boundary runs and the points earned.
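
The post doesn't say which measure of relationship was used. Assuming it was the ordinary Pearson correlation coefficient, a minimal Python version looks like this. The function name and the numbers in the example call are made up, and are not the Plunket Shield figures.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical boundary rates and match points, purely for illustration.
print(pearson([1.2, 1.5, 1.9, 2.4], [3, 8, 12, 19]))
```

A value near 1 or -1 means a strong straight-line relationship, and a value near 0 means almost none.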

There is a theory that regularly rotating the strike makes it easier to survive a match, as it doesn't allow the bowlers to settle. I certainly know that I hated batsmen hitting singles off my bowling, and I remember Dale Steyn saying in a press conference something to the effect of "I don't mind dropped catches that much. Dropped catches happen. But I get really upset when a fielder lets a batsman get off strike when I had him under pressure."

Of the 7 innings played by a losing side, 5 of them had a run-runs rate below the median. That made me wonder if there was a pattern there. I looked at the final innings by teams that batted out a draw or lost, and looked to see if there was a difference in the rates for the teams that lost vs the teams that drew.

This graph isn't particularly meaningful at the moment, with only 5 innings to look at, but I intend to build this up as the season goes on.

Looking at it as individual points makes it clearer:

I've circled the point at the bottom, because that was an innings where Canterbury lost their last wicket with only 6 balls remaining, and so it was very close to being a saved match. Interestingly the teams that have scored a lot of boundaries have lost, but it is a very small sample to be drawing too many conclusions from.

The final table, with other information, looks like this:


This made me wonder which correlation was stronger, scoring rate with total points, or the traditional value of Net Average Runs Per Wicket (batting average minus bowling average).
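
For anyone unfamiliar with it, Net Average Runs Per Wicket is simple to work out. Here is a short Python sketch, taking a team's batting average as runs scored per wicket lost and its bowling average as runs conceded per wicket taken. The function name and the numbers in the example are made up.

```python
def net_average_runs_per_wicket(runs_scored, wickets_lost, runs_conceded, wickets_taken):
    """Team batting average minus team bowling average."""
    batting_average = runs_scored / wickets_lost
    bowling_average = runs_conceded / wickets_taken
    return batting_average - bowling_average


# Hypothetical team: 600 runs for 20 wickets, 500 runs conceded for 20 wickets.
print(net_average_runs_per_wicket(600, 20, 500, 20))  # 5.0
```

A positive number means the side is scoring more per wicket than it is conceding.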

The Net Average Runs Per Wicket seems to be a better predictor of success, but there is a clear relationship with the scoring rate also.

I'll be interested to see how these develop as the season progresses, but for now we seem to have a separation between the sides, with Auckland, Canterbury and Otago all needing to find another gear for the next round.

Friday 19 October 2018

Another comedy run out

Auckland went from 54/1 to 66/4. And it started off with this piece of gold.

Comedy run out

Everyone loves a good comedy run out, and most batting collapses have one.

This one happened in the process of Otago going from 114/4 to 120/8 in 4 overs...


Tuesday 31 July 2018

Fakhar Zaman's amazing start

18 matches. That's all it took.

1 year and 45 days to go from relative obscurity to being one of the most talked about cricketers in the world. Fakhar Zaman has certainly burst onto the world stage.

I wanted to look into the context around the record, and also to see how his start compares to others.

I will then look at how well we can predict someone's eventual career record based on their start.


Tuesday 3 July 2018

Using Added Value to assess cricket performances - Part 3 ODI all rounders

Viv Richards - by Aditya Naikdesai
Growing up in the 1980s, cricket was all about three things: the mighty West Indies, Lance Cairns biffing 6s and the "Big 4" all rounders.

I remember sitting on the school bus as someone was sharing the updates of a match that they'd been listening to on the radio: Viv Richards hit a century, Malcolm Marshall took 3 wickets, Lance Cairns hit a ball onto the roof. I may have been mixing up 3 different games, but that's what every match report felt like.

Other matches were all about the battle between Imran Khan and Ian Botham, or Richard Hadlee vs Kapil Dev. Those 4 players transfixed a generation - each of them had the ability to win a match with either the bat or the ball.

Throughout history, all rounders have been both highly sought after, and incredibly rare. To find a player who was capable of playing as a batsman or a bowler was unusual. To find someone who was a star with both was phenomenal.

There are 43 players who took more than 100 wickets and scored more than 2000 runs in ODI cricket. Some of them were batsmen who bowled a bit, some were bowlers who occasionally contributed with the bat, and some were "bits and pieces players." An interesting question is whether any were truly all rounders. Were any players both above average batsmen and above average bowlers?

Friday 29 June 2018

Using Added Value to measure cricket performances - Part 2 ODI bowling

Reid and Matthews chat before the final over.
It was the summer of 1990/91, just before Christmas. I was on holiday at my Aunty's place in Mount Maunganui. My cousin and I were sleeping in the glass conservatory, looking out over the sand-dunes. The air smelt like salt and sand.

In our little room was a little TV, on the TV was the cricket coming out of Australia.

New Zealand were playing against Australia in Hobart.

I was 11 years old, and I was enthralled.

Danny Morrison was bowling. A year earlier, he had come to speak to my primary school assembly, then signed autographs by the school pavilion. I got him to sign my cricket bat, and I tried really hard to not get it scratched off. He was my favourite bowler. Australia needed 6 runs to win. Greg Matthews was on strike. I wasn't sure why, but as nobody seemed to like Greg Matthews, I didn't either.

Morrison bowled from around the wicket, and speared a fullish ball into leg stump. Matthews drove it, inside-out, through point for four. I really didn't like Matthews now.

Thursday 28 June 2018

Using Added Value to measure cricket performances - Part 1

The palpable drop in air pressure from 41000 people collectively inhaling is something that is hard to understand unless you've experienced it. Dale Steyn was bowling to Grant Elliott.

Eden Park in full World Cup mode.
Elliott had been a controversial selection. In the first 5 matches he averaged 19.25 at a strike rate just below 82. That's hardly justifying the selectors' faith in you. However, an important innings against Bangladesh, followed by a breathtaking 27 off 11 against West Indies gave him momentum going into this match. Often the concept of momentum is a case of us over-fitting. We see a couple of good performances back to back and assume a causative effect, where there really probably isn't one. However, regardless of the causes, Elliott was playing the sort of innings that quickly made the critics forget that they had been calling for the coach's head for selecting him before the tournament.

There was a significant obstacle in his road to glory. Dale Steyn. Quite possibly the best bowler to have ever strapped on a pair of boots. Throughout history there have been terrifying bowlers. Fred Spofforth, Jeff Thomson, Andy Roberts, Waqar Younis and Brett Lee are examples. There have also been bowlers who could work a batsman out and exploit weaknesses unmercifully. Alec Bedser, Richard Hadlee, Wasim Akram and Glenn McGrath all shared this trait. In Steyn, South Africa had someone who was in both camps. Possibly only Larwood, Trueman, Lillee, Holding and Marshall have been there like Steyn.

But now there were 2 balls left, and 5 needed. Steyn with his tail up bowling to Elliott on 78 off 72. Elliott had previously hit a bouncer for 6, and Steyn's last yorker had gone for 4. Eden Park's short straight boundary meant that bowling full was risky, and bowling short could ask for a top edge carrying for 6. Better instead to try to get him to hit to the longest boundary so Steyn opted for a length ball....

The palpable increase in air pressure from 41000 people collectively screaming with exhilaration as the ball clears the boundary is something hard to understand unless you've experienced it.


Wednesday 13 June 2018

How well will Afghanistan's spinners go?

It's a fascinating question - at least two out of 4 spinners who have been very successful in limited overs cricket are likely to make their test debut tomorrow. They will be playing in India, a country known for spinning tracks.

The Afghan captain, Asghar Stanikzai, is certainly bullish about their ability: "In my opinion, we have good spinners, better spinners than India."


That's a big call. India have some very high quality spinners in test cricket. But does he have a point?

There is no doubt that Afghanistan's spinners have been good in limited overs cricket, but does that mean anything?

Some spinners have excelled in test cricket and limited overs. Warne, Murali, Shakib, Swann all have excellent stats in all forms of the game.  But others haven't seen such correlation. Amit Mishra, Ish Sodhi, Abdur Razzak and Sunil Narine all have very strong limited overs stats, but have not converted that to test matches.

The comparison of Sodhi and Rashid is an interesting one. When they've played against similar opponents, they have had very similar stats. See for example this graph of combined IPL and Big Bash statistics. (This is the 8 players who played a reasonable number of matches in both tournaments over the past 2 years)

They have clearly been two of the stand out bowlers in the two major domestic T20 tournaments, and yet, Sodhi averages over 40 in test cricket with the ball.

Part of that will be that Sodhi has to play half his cricket in New Zealand, but part of it is also that it is not always possible to predict test success with any certainty based on limited overs success. They are sometimes linked, but not always.

I wondered if there was a general relationship in the numbers. Were Narine and Sodhi the outliers, or were Warne and Muralitharan the odd ones?

So I tried to construct a model, to see what happened. It turns out, to my surprise, that it is possible to predict, vaguely, the average and strike rate of a bowler in test cricket based on their ODI and T20 performances. I looked at the 37 players who had played as spinners in all 3 formats, had played at least 40 combined limited overs matches and had taken at least one test wicket.

The model is not very reliable, but it was better at predicting the bowling statistics of the players than just taking the average for the group. So it did provide some interesting numbers.

The equations that it came out with were as follows:

Only 4 of the 6 Afghan spinners had played enough limited overs cricket to have meaningful numbers, and here are their predicted results:


That would relate to the following results based on overs bowled:


Those wouldn't be particularly bad returns for a first ever test against India, but they also aren't really the results that Stanikzai is hoping for.

But the question comes up, how reliable is this anyway?

There is a degree of randomness in cricket results that means that any predictions are always quite unreliable. Looking at the graphs of the predicted vs actual for the career stats, it shows that there is quite a bit of variation about the trend.


I've added in the red lines manually. They show where roughly 95% of the data fits: within 17 of the predicted average and within 20 of the predicted strike rate. Those are huge variations, which shows that it's very difficult to predict test performances based on limited overs performances.
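
Turning a prediction into a best and worst realistic figure with those bands is just a matter of adding and subtracting. A small Python sketch, where the function name is made up and stopping the lower end at zero is my own assumption rather than anything in the model:

```python
AVERAGE_BAND = 17      # rough 95% band for bowling average, from the graphs
STRIKE_RATE_BAND = 20  # rough 95% band for strike rate, from the graphs


def realistic_range(predicted, half_width):
    """Return (best, worst) realistic figures around a prediction.

    For a bowler a lower number is better, so the best case is the lower end.
    Assumption: the lower end is floored at zero.
    """
    return max(predicted - half_width, 0), predicted + half_width


# Hypothetical predicted bowling average of 30.
print(realistic_range(30, AVERAGE_BAND))  # (13, 47)
```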

I have added in the best realistic and worst realistic expected figures, based on these confidence bands:



It will be interesting to see which one of these the Afghan spinners actually get closest to.

Tuesday 12 June 2018

The greatest ever wicket keeping batsman.

The first article that I wrote that garnered any attention was a look at Matt Prior's career as a wicket-keeper batsman, and to see how he stacked up against some of the greats: Gilchrist, Flower and Ames. It had about 200 reads, until Jarrod Kimber tweeted out a link to it, and then it had about 1000, doubling the total number of reads that my blog had had up until that point. Then there was a rain break in the England vs India test, and one of the Cricinfo commentators decided to link to the article with "here's something to look at while you wait for the rain to clear." I was swamped. About 24000 people read that article in the next 4 hours, and my little project blog became something that people started to read.

I also included a brief comparison with MS Dhoni, which got me a couple of death threats, for daring to suggest that Prior was better than Dhoni. (My favourite was "I'm going to come to England and burn your house down, you biased English." - not particularly concerning at the time, as I lived on the opposite side of the world from England).

I'm going to attempt to play with fire again, and re-look at the question.

Over the past 6 months, I've given up my job, and gone back to university to study statistics. This post is in part me attempting to use some of the tools that I've learned in that process.

One thing that comes up when discussing this is how difficult it is to bat and keep, and whether it's easier to bat with the tail or at the top of the order.

To try to answer those questions, I took some information for a few keepers, and had a go at running some models on them. The list of keepers that I've looked at is: Adam Gilchrist, Kumar Sangakkara, Andy Flower, Matt Prior, Brendon McCullum, Mahendra Singh Dhoni, Mushfiqur Rahim, Alec Stewart and BJ Watling. Initially, I also looked at Clyde Walcott and Les Ames, but it was difficult to get some of the information to build the models for them, so I've left them out. I'm only looking at batting. This is comparing the batting of players who kept wickets.

The variables that I looked at were as follows:
1. Average of partnerships when they came to the wicket. For example, if a player came to the crease at 20/4, this was 5; if they came to the wicket at 380/2, it was 190. For opening batsmen I used 21, as this is the median opening partnership, so it is a reasonable expectation of how difficult it is to bat.
2. How many balls have passed. If the player comes to the wicket in the 51st over, it's likely to be different conditions to coming to the wicket in the 3rd over.
3. Are they the designated keeper or not. For some players, being the designated keeper hurts their batting, but for others it has the opposite effect.
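
The first variable in the list above is easy to get wrong, so here is how it works as a short Python sketch. The function name is made up for the example; the numbers are the ones from the list.

```python
MEDIAN_OPENING_PARTNERSHIP = 21  # value used for opening batsmen, from the list above


def average_partnership_on_arrival(score, wickets_down):
    """Average partnership so far in the innings when a batsman walks in."""
    if wickets_down == 0:
        # Openers have no earlier partnerships, so use the median opening stand.
        return MEDIAN_OPENING_PARTNERSHIP
    return score / wickets_down


print(average_partnership_on_arrival(20, 4))   # 5.0
print(average_partnership_on_arrival(380, 2))  # 190.0
```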

I then split the data into two randomly, created the most parsimonious model that I could with half of one particular batsman's data, and then tested it on the other half of the data. I repeated this 100 times, then took the average coefficients from the 10 models that tested the best.
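
The split, fit, test and average process can be sketched in Python like this. It is a simplified stand-in rather than the actual models: it uses a straight-line fit with a single predictor, scores each fit by mean squared error on the held-out half (the post doesn't say how "tested the best" was judged), and skips the step of finding the most parsimonious model. All the function names are made up.

```python
import random


def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(coeffs, xs, ys):
    """Average squared gap between the line's predictions and the real values."""
    intercept, slope = coeffs
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def averaged_model(xs, ys, repeats=100, keep=10, seed=0):
    """Repeat a random half split, then average the best-testing coefficients."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    half = len(indices) // 2
    scored = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:half], indices[half:]
        coeffs = fit_line([xs[i] for i in train], [ys[i] for i in train])
        error = mean_squared_error(coeffs, [xs[i] for i in test], [ys[i] for i in test])
        scored.append((error, coeffs))
    best = sorted(scored, key=lambda item: item[0])[:keep]
    intercept = sum(c[0] for _, c in best) / len(best)
    slope = sum(c[1] for _, c in best) / len(best)
    return intercept, slope
```

Here `xs` would be something like the average partnership on arrival for each innings, and `ys` the runs scored in that innings.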

There are possibly better options for how to do this, but it seemed to return reasonably sensible results.

The next thing that I did was to apply those models to a number of different scenarios, some as a keeper, and some as a batsman.

That resulted in the following graph:



The answer to the question, based on these models, is quite comprehensively Andy Flower. If you want to select another one, as a pure batsman Sangakkara is your man. If you want someone to bat in a crisis, then Gilchrist is the second best option, but to grind the opposition into the dust, after Flower, you would want BJ Watling.

I quite like the idea of thinking about players based on situations, rather than overall averages. There are certainly more options that I could look at to build the model, including controlling for opposition and location. But for now this is an interesting look at a difficult problem.