Seiyashi

Kaminariyuki's Spirited Rikishi question; or, Spirited Rikishi and the Kanto-sho


TL;DR:


There is a statistically significant correlation between appearing in the voting poll for Spirited Rikishi and winning the kanto-sho.

@Kaminariyuki recently asked, in the Spirited Rikishi Aki 2020 thread, whether wrestlers showing up in the Spirited Rikishi daily voting rankings (hereinafter the "voting rankings") were more likely to get a kanto-sho at the end. So for the hell of it, I pulled the results from the last 22 of @Akinomaki's threads (since Kyushu 2016, excluding Haru 2020 as kindly pointed out by @Jakusotsu), as well as the data on kanto-sho winners from the SumoDB, and did a little processing in Python with pandas and scipy.

Rephrasing Kaminariyuki's question within a hypothesis testing framework, the question is whether rikishi in the voting rankings are more likely to get the kanto-sho compared to rikishi not in the rankings. For the purposes of this question, all rikishi each basho can be divided into two categories - those that appear in the voting rankings, and those that don't.

The null hypothesis - or the default assumption that the voting rankings have nothing to do with the award of the kanto-sho - is that rikishi in the voting rankings are as likely as other rikishi to get the kanto-sho. If so, the proportion of rikishi in the voting rankings who are awarded the kanto-sho should be approximately equal to the proportion of rikishi not in the voting rankings who are awarded it. Conversely, the alternative hypothesis is that rikishi in the voting rankings are more likely to get the kanto-sho; in that case, the proportion of kanto-sho winners among rikishi in the voting rankings should be noticeably higher than among rikishi not in the rankings. We can test these two hypotheses against the data.

A simple analysis shows the following:

[Image: summary table of kanto-sho winners and voting-ranking appearances]

From this, we can see that of 34 kanto-sho awarded over this period, 26 went to rikishi who also appeared in the voting rankings. Out of the whole top division of 42 rikishi, an average of 16 rikishi feature in the voting rankings. Therefore, approximately 40% of the rikishi in the top division feature in the rankings, for a 2:3 split between ranked and unranked rikishi, but kanto-sho go to rikishi in the voting rankings over rikishi outside them at 26:8, or a little over 3:1.
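To make that arithmetic concrete, here is a rough sketch in plain Python that uses only the figures quoted above. The variable names are made up for illustration, and this is not necessarily how the numbers were produced for this post.

```python
from fractions import Fraction

# Figures quoted in the post; the variable names are illustrative only.
total_kantosho = 34      # kanto-sho awarded over the period
ranked_kantosho = 26     # of those, won by rikishi in the voting rankings
top_division = 42        # rikishi in the top division
avg_ranked = 16          # average number appearing in the voting rankings

# Share of the top division that shows up in the voting rankings.
share_ranked = Fraction(avg_ranked, top_division)
# Share of all kanto-sho that went to rikishi in the voting rankings.
share_awards = Fraction(ranked_kantosho, total_kantosho)
# Kanto-sho that went to rikishi outside the voting rankings.
unranked_kantosho = total_kantosho - ranked_kantosho

print(f"In the rankings: {float(share_ranked):.0%} of the division")
print(f"Awards to ranked rikishi: {float(share_awards):.0%}")
print(f"Ranked vs unranked winners: {ranked_kantosho}:{unranked_kantosho}")
```

So a group making up well under half of the division collects roughly three quarters of the awards, which is the lopsidedness the rest of the post tries to test properly.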

So, clearly, the voting rankings have an effect on the award of the kanto-sho, right?

Well, it's not that simple. "But Seiyashi," some of you may say, "not every rikishi in the top division is eligible for the kanto-sho, and neither is every rikishi in the voting rankings!" And that is entirely true. To examine this a bit more closely, we'll have to review each banzuke more specifically to determine the true ratio of voting ranking rikishi and top division rikishi who are eligible for the kanto-sho.

A quick recap - the kanto-sho is awarded to maegashira, komusubi, or sekiwake with a kachikoshi. Juryo visitors are ineligible, as are yokozuna and ozeki. To take an example, let's look at the Kyushu 2016 banzuke and results.
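As an aside, the eligibility rule in that recap can be written down as a small check. This is only a sketch: the function name, the lowercase rank labels, and the assumption of a 15-bout basho (so kachikoshi means 8 or more wins) are mine, and the example records are hypothetical rather than real results.

```python
# Hypothetical helper; rank labels and the 15-bout assumption are illustrative.
ELIGIBLE_RANKS = {"maegashira", "komusubi", "sekiwake"}
KACHIKOSHI_WINS = 8  # majority of 15 bouts in the top division (assumption)


def is_kantosho_eligible(rank: str, wins: int) -> bool:
    """Return True if the rank is eligible and the record is a kachikoshi."""
    has_kachikoshi = wins >= KACHIKOSHI_WINS
    return rank.lower() in ELIGIBLE_RANKS and has_kachikoshi


# Hypothetical records, purely to exercise the rule.
print(is_kantosho_eligible("maegashira", 8))   # eligible rank, kachikoshi
print(is_kantosho_eligible("sekiwake", 7))     # eligible rank, makekoshi
print(is_kantosho_eligible("ozeki", 12))       # ozeki are ineligible
print(is_kantosho_eligible("juryo", 10))       # juryo visitors are ineligible
```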

On the banzuke, only 14 men out of 42 were eligible for the kanto-sho: Tamawashi, Shodai, Shohozan, Takarafuji, Tochinoshin, Takekaze, Ikioi, Myogiryu, Chiyoshoma, Arawashi, Hokutofuji, Sokokurai, Ishiura, and Gagamaru.

On the list of rikishi in the voting rankings, only 5 out of 13 are eligible for the kanto-sho: Ishiura, Shodai, Tamawashi, Tochinoshin, and Ikioi. Endo, Mitakeumi and Yoshikaze are ineligible due to their makekoshi, and the remaining 5 are the yokozuna and ozeki Hakuho, Kakuryu, Kisenosato, Goeido and Terunofuji.
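The Kyushu 2016 numbers above can be double-checked with a couple of set operations. This is only a sketch: the names are typed in from the two lists in this post, and the variable names are made up for illustration.

```python
# Rikishi eligible for the kanto-sho at Kyushu 2016, as listed above.
eligible_on_banzuke = {
    "Tamawashi", "Shodai", "Shohozan", "Takarafuji", "Tochinoshin",
    "Takekaze", "Ikioi", "Myogiryu", "Chiyoshoma", "Arawashi",
    "Hokutofuji", "Sokokurai", "Ishiura", "Gagamaru",
}

# Everyone who appeared in the voting rankings for that basho, as listed above.
voting_rankings = {
    "Ishiura", "Shodai", "Tamawashi", "Tochinoshin", "Ikioi",  # eligible
    "Endo", "Mitakeumi", "Yoshikaze",                          # makekoshi
    "Hakuho", "Kakuryu", "Kisenosato", "Goeido", "Terunofuji", # yokozuna/ozeki
}

# Eligible rikishi who were in the voting rankings, and those who were not.
eligible_ranked = eligible_on_banzuke & voting_rankings
eligible_unranked = eligible_on_banzuke - voting_rankings

print(len(eligible_on_banzuke), len(voting_rankings))
print(sorted(eligible_ranked))
print(len(eligible_unranked))
```

The intersection has 5 names and the difference has 9, which together account for the 14 eligible rikishi.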

Applying this methodology to the remainder of the basho, we arrive at the following table:

[Image: table of eligible rikishi and kanto-sho awards per basho]

So for Kyushu 2016, out of 14 eligible rikishi on the banzuke ("EROB"), 9 of them were not on the voting rankings ("NVRR"), 5 of them were ("VRR"), and of the 2 kanto-sho awarded, 0 went to rikishi not on the voting rankings ("NVRKTS") and 2 went to rikishi on the voting rankings.
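For anyone who wants to see how per-basho rows like this turn into the 2x2 table that gets tested, here is a minimal sketch. The `BashoRow` class and `to_contingency` function are hypothetical, the `vrkts` field name is my own label for kanto-sho won by rikishi in the voting rankings, and only the Kyushu 2016 row is filled in because the other rows live in the image.

```python
from dataclasses import dataclass


@dataclass
class BashoRow:
    basho: str
    erob: int    # eligible rikishi on the banzuke
    nvrr: int    # eligible rikishi not in the voting rankings
    vrr: int     # eligible rikishi in the voting rankings
    nvrkts: int  # kanto-sho won by rikishi not in the voting rankings
    vrkts: int   # kanto-sho won by rikishi in the voting rankings (my label)


def to_contingency(rows):
    """Sum the rows into [[ranked winners, ranked non-winners],
    [unranked winners, unranked non-winners]]."""
    ranked_win = sum(r.vrkts for r in rows)
    ranked_none = sum(r.vrr - r.vrkts for r in rows)
    unranked_win = sum(r.nvrkts for r in rows)
    unranked_none = sum(r.nvrr - r.nvrkts for r in rows)
    return [[ranked_win, ranked_none], [unranked_win, unranked_none]]


# Only the Kyushu 2016 figures quoted in the post; add further rows as needed.
rows = [BashoRow("Kyushu 2016", erob=14, nvrr=9, vrr=5, nvrkts=0, vrkts=2)]

# Sanity check: the two groups must add up to the eligible total.
for r in rows:
    assert r.erob == r.nvrr + r.vrr, f"row for {r.basho} does not add up"

table = to_contingency(rows)
print(table)  # [[2, 3], [0, 9]]
```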

Summing them all and using a statistical technique known as the Fisher exact test, we can figure out how unlikely this lopsided distribution is - or, to put it another way, how likely it is that this ratio arises by chance. The result:

[Image: Fisher exact test result]

This means that, if the voting rankings really had nothing to do with the kanto-sho, there would be only a 0.0004, or 0.04%, chance of seeing a distribution of kanto-sho at least this lopsided in favour of rikishi in the voting rankings. In other words, taking an alpha/significance level of 5%, there is a statistically significant correlation: rikishi who appear in the voting rankings are more likely to be awarded the kanto-sho.
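For the curious, here is a rough idea of what the Fisher exact test computes, written with only the standard library. It is a sketch under assumptions, not the script behind the result above: the function name is made up, the test is taken as one-sided ("more likely"), and the input is just the Kyushu 2016 counts quoted in this post, since the summed totals are only in the image.

```python
from math import comb


def fisher_exact_greater(table):
    """One-sided Fisher exact p-value for a 2x2 table [[a, b], [c, d]].

    With all row and column totals held fixed, this is the probability of
    seeing at least `a` in the top-left cell if group and outcome are unrelated.
    """
    (a, b), (c, d) = table
    row1 = a + b              # eligible rikishi in the voting rankings
    col1 = a + c              # kanto-sho awarded
    n = a + b + c + d         # all eligible rikishi
    max_a = min(row1, col1)   # largest possible top-left cell
    ways = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, max_a + 1)
    )
    return ways / comb(n, col1)


# Kyushu 2016 only: ranked rikishi won 2 of 2 kanto-sho (3 ranked missed out),
# unranked rikishi won 0 (9 unranked missed out).
kyushu_2016 = [[2, 3], [0, 9]]
p_value = fisher_exact_greater(kyushu_2016)
print(f"{p_value:.3f}")  # 0.110

# With scipy installed, scipy.stats.fisher_exact(kyushu_2016,
# alternative="greater") should report the same p-value.
```

A single basho on its own gives a p-value of about 0.11, which is not significant at the 5% level; the 0.0004 figure comes from the totals summed over every basho in the table.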

However, as very kindly pointed out by my partner, a statistically significant correlation does not imply causation. The best that can be said is that the sansho committee and the pollsters are both generally attracted to the same factors. It is certainly an overreach to say the sansho committee takes the voting rankings into account.

So, to answer Kaminariyuki's question in the affirmative, yes, rikishi who appear in the voting rankings are more likely to be awarded the kanto-sho.

Edited by Seiyashi
2 hours ago, Seiyashi said:

TL;DR:

Love this research. Should be peer-reviewed and published as the first journal of “The International Journal of Sumo Forum” (Injosufo)!

Edited by code_number3
1 minute ago, code_number3 said:

Love this research. Should be peer-reviewed and published as the first journal of “The International Journal of Sumo Forum” (Injosofo)!

Hence I am putting it before the Sumo Forum of peers for review :D 


Seiyashi you absolute legend!  

Awesome post and thanks for taking the time to compile the data for this :)


It's not my specialization, but I have been up through multi-variable regression analysis, and I don't know what a Fisher Exact test is. I'm assuming it's a variation of the Student's t-test, but I'm not doing due diligence on a Sunday morning, even though I have had coffee. I am devastated by the effort Seiyashi has put into that simple question, one to which I suspected I already knew the answer.

If I had one to offer, I'd give you a job on the basis of that analysis.

 

4 hours ago, code_number3 said:

Love this research. Should be peer-reviewed and published as the first journal of “The International Journal of Sumo Forum” (Injosufo)!

There are a ton of online-only journals now, if one wanted to take it to yet another level. Perhaps better to start with a special issue of the Journal of Irreproducible Results. I could most likely be prevailed on to contribute.  ;)

 

Edited by Kaminariyuki
grammar
6 minutes ago, Kaminariyuki said:

It's not my specialization, but I have been up through multi-variable regression analysis, and I don't know what a Fisher Exact test is. I'm assuming it's a variation of the Student's t-test, but I'm not doing due diligence on a Sunday morning, even though I have had coffee. I am devastated by the effort Seiyashi has put into that simple question, one to which I suspected I already knew the answer.

If I had one to offer, I'd give you a job on the basis of that analysis.

 

Domo arigato gozaimasu. I am very humbled by your kind words.

It was a small enough dataset that it could be confirmed and processed largely by hand. I haven't quite gotten around to scraping the DB yet - I wrote to, I think, Doitsuyama to ask for permission prior to joining the SF, but I didn't get a reply at the time. 


I was so entertained that I went back and read the second half again, and this time rather carefully for a Sunday morning. It seems quite sound. And, quite likely, there are confounding variables. One would hope so...

Oh, I wonder if any of the awards committee are also contributing to the VR?

Please don't devise a statistical analysis. I'd rather not know. Some things should remain a mystery. I will attempt to refrain from asking questions for a bit, although that may pose a challenge.

Edited by Kaminariyuki
typos, I need to proof better...

Wonderful work! Next perhaps you can try to find out which of these rikishi remarks is more likely to be followed by a victory the following day: "I shall gambarize" / "I want to do my brand of sumo" / "I take it day by day" / "I'm not thinking about the yusho" :-D

18 minutes ago, hakutorizakura said:

Wonderful work! Next perhaps you can try to find out which of these rikishi remarks is more likely to be followed by a victory the following day: "I shall gambarize" / "I want to do my brand of sumo" / "I take it day by day" / "I'm not thinking about the yusho" :-D

Off the top of my head, I suspect you will find no correlation. B-)

On 20/09/2020 at 09:54, hakutorizakura said:

Wonderful work! Next perhaps you can try to find out which of these rikishi remarks is more likely to be followed by a victory the following day: "I shall gambarize" / "I want to do my brand of sumo" / "I take it day by day" / "I'm not thinking about the yusho" :-D

I thought that all rikishi said all three things every time they are in front of a camera?
