Tuesday 6 August 2019

Second only to Bradman?

Steven Smith has just celebrated his test comeback by scoring a century in each innings at Edgbaston in Birmingham. Not content with a "come-from-behind fighting century" when the bowlers were on top, he also added a "rub the salt in" century when the batsmen were on top.

It was such a match-defining performance that the question is being asked again: is he the best since Bradman?

I won't attempt a complete statistical breakdown here, but I will focus on a couple of statistics that suggest either "yes" or "not quite."

One thing that I've become more and more interested in is the performance of batsmen at their peak. It is hard to deny that a batsman's skill level changes throughout their career. Some start off as amazing players but then fade; others start slowly, then blossom into better players. Most start off slowly, have a strong middle period, then fade again at the end.

The graph below illustrates three players who had quite different career trajectories, but were all very good players.

Denis Compton started off with an amazing run of scores: only Don Bradman averaged more in his first 30 test matches. His career never really reached those heights again, however, and he had a period where he really struggled, before modifying his game and ending his career on a (less dramatic) high.

Martin Crowe was picked as a teenager and sent on a difficult tour before he was really ready. He struggled and was in and out of the side at first, and it took him a while to really own his position. He then developed into one of the best batsmen in the world. Later on he struggled with injuries, and by the time his career petered out he was a shadow of the player he had previously been.

Marvan Atapattu scored only one run in his first 6 innings. That start was not an easy one to recover from. Throughout his career he tended to have a mixture of exceptionally large scores and regular ducks, which made his form look patchy. But after his horrific early period, he tended to average above 40 in any given 30 match sequence for the majority of his career.

The lesson is clear, however: an overall career average does not necessarily tell us how good a player actually was. Looking at a player's peak is actually a better idea than looking at their overall career. That's especially true when comparing former players with current ones, or comparing players who retired at their peak with ones who continued on because, even though they were no longer at their best, they were still better than the alternatives.

To compare players at their peak requires finding a way to define their peak. It's difficult to know how many matches to choose as a player's peak. It will certainly differ from player to player. Some will maintain their peak form for a number of years, while others may get injured, banned for ball tampering or retire just as they are starting to hit it. Added to that, the number of tests played has greatly increased for most nations. A player from an earlier era like Jack Cowie never missed a test for 12 years and yet never made it to 30, while someone playing for England now could potentially reach 30 tests after only 20 months of test cricket.

There's also the issue of sampling variability in small samples. If we look at 30 tests as defining a player's peak, that makes a maximum sample size of 60 innings (more likely to be closer to 55). 50 tests would make a maximum sample size of 100 innings (more likely to be close to 90).

If we simulate innings based on a player with a batting average of 45, we can find the range of likely 30 match and 50 match averages if the results are distributed randomly. For this, I've used a geometric distribution to create random scores, and then found the average of them. This has been shown to be a reasonably useful way of simulating cricket scores, so it will give some indication of the expected variance in the averages.

The red and green lines here are the 95% bands for the simulated data. With the 30 match averages, the player who should have averaged 45 tended to average somewhere between 33 and 58. With 50 matches, the player tended to average between 36 and 54.
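
For anyone who wants to try this sort of simulation, here is a minimal sketch in Python. It is not the code used for the graph, and it makes some assumptions of its own: the player is dismissed in every innings (so the average is just the mean score), a 30 match block is treated as 55 innings and a 50 match block as 90, and each band comes from 10,000 simulated blocks. The function names simulate_innings and average_band are made up for this illustration.

```python
import math
import random


def simulate_innings(mean_score, rng):
    """Draw one score from a geometric distribution on 0, 1, 2, ... with the given mean."""
    p = 1.0 / (1.0 + mean_score)   # chance of getting out before each run
    u = 1.0 - rng.random()         # uniform on (0, 1], so log(u) is always defined
    return int(math.log(u) / math.log(1.0 - p))


def average_band(mean_score, innings, trials=10000, seed=1):
    """Return the (2.5%, 97.5%) points of the simulated averages."""
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        total = sum(simulate_innings(mean_score, rng) for _ in range(innings))
        # Assumption: no not-outs, so average = total runs / innings.
        averages.append(total / innings)
    averages.sort()
    lower = averages[int(0.025 * trials)]
    upper = averages[int(0.975 * trials) - 1]
    return lower, upper


band_30 = average_band(45, 55)   # roughly a 30 match block
band_50 = average_band(45, 90)   # roughly a 50 match block
print("30 matches:", band_30)
print("50 matches:", band_50)
```

The exact figures will shift a little with the seed and with how not-outs are handled, but the 50 match band should come out noticeably narrower than the 30 match band.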

This needs to be remembered whenever comparing averages. A peak can be a player's skill improving, or it can be just random variation. Someone who averaged 52 is not necessarily a better player than someone who averaged 49. It is just not possible to be statistically confident that there's a difference between the two players' abilities. That's just based on sampling variability, and does not account for non-sampling factors such as the opposition that they faced or the conditions that they played in.

Given that, is there any point in comparing at all? Well, it's not going to definitively say who was the best, but it can tell us who played the best.

For this analysis I am only including matches for players where they actually batted. As a result Don Bradman only has 50 tests, as there are two where he got injured fielding/bowling and did not end up batting. I am also not including the WSC Supertests or any matches played for the ICC World XI.

The top 21 instances of the best 30 matches by either average or total runs are the 21 possible runs of 30 consecutive matches out of Bradman's 50.

He is so far ahead of the rest of the players in history that in his worst ever 30 matches he still scored 14% more runs than the best 30 matches by any other player.
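
The idea of checking every run of consecutive matches can be sketched with a simple sliding window. This is only an illustration, not the code behind the tables: the (runs, dismissals) format for each match and the function names window_averages and peak are assumptions, and the five match example uses made-up numbers.

```python
def window_averages(matches, window=30):
    """matches: list of (runs, dismissals) per match, in date order.

    Returns a list of (average, start_index) for every run of
    `window` consecutive matches with at least one dismissal.
    """
    results = []
    runs = outs = 0
    for i, (r, d) in enumerate(matches):
        runs += r
        outs += d
        if i >= window:
            # Drop the match that has just slid out of the window.
            old_runs, old_outs = matches[i - window]
            runs -= old_runs
            outs -= old_outs
        if i >= window - 1 and outs > 0:
            results.append((runs / outs, i - window + 1))
    return results


def peak(matches, window=30):
    """Best (average, start_index) over all windows, or None if there are none."""
    averages = window_averages(matches, window)
    return max(averages) if averages else None


# Made-up data: five matches, looking for the best run of three.
example = [(10, 2), (120, 1), (60, 2), (200, 1), (5, 2)]
print(peak(example, window=3))   # (95.0, 1)

# A 50 match career has 50 - 30 + 1 = 21 runs of 30 consecutive matches.
print(len(window_averages([(50, 2)] * 50, window=30)))   # 21
```

Ranking by total runs instead of average would only need the tuple to hold runs rather than runs divided by dismissals.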

Here are the tables of the top 10.


The top name is consistent, but the other names in the tables are much less so. 18 players appear at least once, with Bradman, Ponting, Sangakkara and Smith in all 4 tables, while Sobers, Kallis and Yousuf each appear 3 times.

This does not tell us definitively who is second. Sampling variation alone is large enough that there is not enough evidence to say that Waugh was better in his best 30 matches than Hayden was, just that he performed better. But that's really all we can hope for.

Steven Smith may not be the best since Bradman, but he may well be.