Sumo Elo Ratings

Onibushou · November 13, 2011

My attempt at ranking the sekitori through the Elo formula. Just something I started playing around with after the last basho. I figured it was a good place to start, since Aki was the first one back to normal (numbers of sekitori rikishi, anyways). It appears there's already been some great work trying to compare them historically. This was a little different though, and also extends into the Juryo division so I figured I'd post the results anyways.

Note- When new players are first ranked under this system (as they all are for now) their rating is usually followed by a p for 'provisional'. It takes a fair few matches for the rating to stabilize and drop the p, and right now it's just the one basho. During the provisional period, the ratings fluctuate more wildly and tend to be a little inaccurate. As with most things of this nature, the more data entered, the more accurate the results. Assuming I get the chance to update after Kyushu, I'll probably end up tweaking it a little, and maybe after January too. If anyone cares to know, I can also post a little about the formula itself.

East	Shikona	Rating		West	Shikona	Rating
Y	Hakuho	2,507		-	-	-
O1	Baruto	2,371		O1	Harumafuji	2,328
O2	Kotooshu	2,150		O2	Kotoshogiku	2,373
S	Kisenosato	2,368		S	Kakuryu	2,284
K	Toyonoshima	2,230		K	Homasho	2,241
M1	Okinoumi	2,200		M1	Goeido	1,966
M2	Tochinoshin	2,018		M2	Kyokutenho	1,773
M3	Aran	2,142		M3	Gagamaru	1,815
M4	Tochinowaka	1,811		M4	Tochiozan	2,081
M5	Kitataiki	1,756		M5	Yoshikaze	2,091
M6	Aminishiki	1,664		M6	Miyabiyama	1,882
M7	Tokitenku	1,924		M7	Takekaze	2,002
M8	Shotenro	1,644		M8	Takayasu	1,744
M9	Wakanosato	2,005		M9	Wakakoyu	1,768
M10	Kokkai	1,617		M10	Fujiazuma	1,719
M11	Toyohibiki	1,684		M11	Myogiryu	1,526
M12	Daido	1,569		M12	Sagatsukasa	1,652
M13	Tamawashi	1,561		M13	Asasekiryu	1,552
M14	Kaisei	1,647		M14	Takarafuji	1,407
M15	Shohozan	1,459		M15	Sadonofuji	1,351
M16	Aoiyama	1,393		M16	Tsurugidake	1,384
M17	Kimurayama	1,440		-	-	-

Juryo-

East	Shikona	Rating	West	Shikona	Rating
J1	Tenkaiho	1,353	J1	Yoshiazuma	1,435
J2	Bushuyama	1,304	J2	Masunoyama	1,407
J3	Chiyonokuni	1,388	J3	Takanoyama	1,399
J4	Tamaasuka	1,394	J4	Nionoumi	1,292
J5	Tosayutaka	1,377	J5	Hochiyama	1,405
J6	Kyokushuho	1,311	J6	Kimikaze	1,285
J7	Masuraumi	1,242	J7	Hokutokuni	1,362
J8	Shironoryu	1,219	J8	Kotoyuki	1,283
J9	Tochinonada	1,207	J9	Sadanoumi	1,236
J10	Chiyoarashi	1,249	J10	Takamisakari	1,178
J11	Sotairyu	1,136	J11	Tokushoryu	1,190
J12	Tamanoshima	1,140	J12	Oiwato	1,180
J13	Asahisho	1,170	J13	Satoyama	1,159
J14	Ikioi	1,160		Chiyozakura	1,094

Non-sekitori by rating-

Shikona	Rating
Hitenryu	1,072
Hishofuji	1,001
Kaonishiki	996
Hamanishiki	934

Largest Risers- Gagamaru, Hokutokuni, Kokkai

Largest Fallers- Masunoyama, Kotooshu, Hamanishiki, Kaonishiki

Smallest movement- Takekaze

Edited November 13, 2011 by Onibushou

Asojima · November 13, 2011

The extra spaces come from the new line [enter] characters embedded within the table. Do not use [enter] within the table code.

Onibushou · November 13, 2011

Ah, much better. Thank you.

Kotomikey · November 13, 2011

If anyone cares to know, I can also post a little about the formula itself.

I would like to know about the formula.

Thank You, for posting this info.

Jakusotsu · November 13, 2011

When looking at Aran (only 5 wins against low-rankers at Komusubi), I think you either put too much emphasis on banzuke rank for the starting values or your measure of fluctuation is too conservative.

Vikanohara · November 13, 2011

Interesting (though I say that as a former chess player, I must admit)!

Some observations and strange differences imo:

- Kotooshu would only be worth a top Maegashira or just Komusubi, which might actually be his true strength at the moment

- Homasho slightly outscoring Toyonoshima struck me, as Toyonoshima has been in and out of the Sanyaku ranks for a long time now, while Homasho has only just got there

- Goeido quite far behind Okinoumi, Aran, Tochinoshin, Tochiozan and even Yoshikaze, Takekaze & Wakanosato is simply stunning for a man with almost 100 more victories than defeats

- Aminishiki's rating is also rather low

Curious about how it will continue though.

Randomitsuki · November 14, 2011

Thanks for the ratings!

You have quite a spread in those data - 1500 points between Hakuho and lower Juryo. In comparison, my strength ratings have a 1500 points difference between Hakuho and Daishiryu, the lowest-rated rikishi on the banzuke. It's interesting that the same basic approach can get such different results.

FWIW, here are my current ratings for Makuuchi and Juryo:

East	Shikona	Rating		West	Shikona	Rating
Y	Hakuho	2,606		-	-	-
O1	Baruto	2,339		O1	Harumafuji	2,321
O2	Kotooshu	2,177		O2	Kotoshogiku	2,354
S	Kisenosato	2,339		S	Kakuryu	2,309
K	Toyonoshima	2,236		K	Homasho	2,185
M1	Okinoumi	2,142		M1	Goeido	2,193
M2	Tochinoshin	2,146		M2	Kyokutenho	2,023
M3	Aran	2,099		M3	Gagamaru	2,035
M4	Tochinowaka	1,976		M4	Tochiozan	2,122
M5	Kitataiki	2,028		M5	Yoshikaze	2,100
M6	Aminishiki	2,060		M6	Miyabiyama	2,017
M7	Tokitenku	1,989		M7	Takekaze	2,075
M8	Shotenro	1,944		M8	Takayasu	1,923
M9	Wakanosato	2,014		M9	Wakakoyu	1,988
M10	Kokkai	1,884		M10	Fujiazuma	1,901
M11	Toyohibiki	1,943		M11	Myogiryu	1,966
M12	Daido	1,898		M12	Sagatsukasa	1,903
M13	Tamawashi	1,927		M13	Asasekiryu	1,943
M14	Kaisei	1,917		M14	Takarafuji	1,857
M15	Shohozan	1,894		M15	Sadonofuji	1,813
M16	Aoiyama	1,847		M16	Tsurugidake	1,814
M17	Kimurayama	1,900		-	-	-

Juryo-

East	Shikona	Rating	West	Shikona	Rating
J1	Tenkaiho	1,832	J1	Yoshiazuma	1,807
J2	Bushuyama	1,811	J2	Masunoyama	1,888
J3	Chiyonokuni	1,821	J3	Takanoyama	1,826
J4	Tamaasuka	1,777	J4	Nionoumi	1,795
J5	Tosayutaka	1,917	J5	Hochiyama	1,798
J6	Kyokushuho	1,792	J6	Kimikaze	1,792
J7	Masuraumi	1,775	J7	Hokutokuni	1,843
J8	Shironoryu	1,777	J8	Kotoyuki	1,786
J9	Tochinonada	1,789	J9	Sadanoumi	1,763
J10	Chiyoarashi	1,754	J10	Takamisakari	1,785
J11	Sotairyu	1,819	J11	Tokushoryu	1,774
J12	Tamanoshima	1,745	J12	Oiwato	1,743
J13	Asahisho	1,747	J13	Satoyama	1,746
J14	Ikioi	1,774		Chiyozakura	1,745

Asashosakari · November 14, 2011

Some observations and strange differences imo:
- Kotooshu would only be worth a top Maegashira or just Komusubi, which might actually be his true strength at the moment

- Homasho slightly outscoring Toyonoshima struck me, as Toyonoshima has been in and out of the Sanyaku ranks for a long time now, while Homasho has only just got there

- Goeido quite far behind Okinoumi, Aran, Tochinoshin, Tochiozan and even Yoshikaze, Takekaze & Wakanosato is simply stunning for a man with almost 100 more victories than defeats

- Aminishiki's rating is also rather low

Either you or I missed something here big-time - aren't these ratings based only on a starting rating + the (at most) 15 Aki bouts?

When looking at Aran (only 5 wins against low-rankers at Komusubi), I think you either put too much emphasis on banzuke rank for the starting values or your measure of fluctuation is too conservative.

Perhaps a combination of both? Even accounting for the fact that Wakanosato faced mostly upwards in last basho's torikumi and Goeido faced mostly downwards, it's unfathomable to me that Goeido ended up with the lower ranking despite 12 point differential (+5 wins versus -7 wins) between them. It almost seems to imply that Goeido's expected win-loss record against his slate of opponents was near his actual 10-5 record, which just seems out of whack for starting values.

Reverse-engineering the whole thing a bit - Takekaze is at 2002 points after the basho and listed as "smallest movement" so I guess his starting rating at M3e was just about 2000. Kimurayama at the bottom of the division presumably didn't move a whole lot with his 7-8 either (probably even improved given that by necessity nearly all his opponents were ranked higher), so probably 1400 or so for the guy second from the bottom. 27 spots distance between them, so some 20 points per spot? That would make Goeido's initial rating something near 1900...? Tochinowaka would have been around 1820 if those assumptions hold; he's at 1811 now after a 9-6 record. Hmm. Maybe I'm way off, I dunno.

Edited November 14, 2011 by Asashosakari

shumitto · November 14, 2011

1st, thank you both for posting those tables. As to the first, if the ratings take into account only Aki and Kotooshu is 10th with a single win then the formula really overemphasizes rank and thus is very conservative. I would like to know a bit more about the formula employed, the starting values and how you've gauged the scores to get to this result.

Doitsuyama · November 14, 2011

Indeed, the starting values seem to be off. Of course there are at least six basho needed for the ratings to have meaningful values even with appropriate starting values, but I suspect we can wait quite a while here as the spread is too big.

Onibushou · November 17, 2011

Sorry for the late replies, busy few days here already and then trying to keep up with Kyushu. Anyways, to the replying...

I would like to know about the formula.
Thank You, for posting this info.

It turned into a bit of a novel, but for those who really want to know... (and maybe look like this :-P)

The Elo system was developed by Arpad Elo (Born in Hungary I think, immigrated to the US, taught physics, highly ranked chess player). He thought there to be some inaccuracies in the old rating system, so he made his own system and it was adopted by the USCF (US Chess Federation) and eventually FIDE (the international chess body). Elo basically looks at two things- How well you should have done, and how well you actually did. For Hakuho, you would expect him to do well. If he doesn't, rating would drop significantly, if he does it would go up. His expected score would be so high though, that it isn't going to go up a lot. His Aki score was only good for +5pts. Gagamaru on the other hand, came out of left field to put up some good wins. His expected score was pretty low, especially after getting paired with wrestlers like Baruto. His good score thus carried more weight, and caused his rating to shoot up. On the other end are the Maegashira who struggle early and are already losing points (Though possibly not enough, as most of their opponents will be rated higher causing a fairly low expected score). They are likely to get matched with some Juryo wrestlers, further hurting them. A win doesn't help recover from the losses much, and another loss drops them even further (As the Juryo wrestler should theoretically be rated lower).

When calculating an expected score for a match, you should get a result between 0 and 1. A 1 means they are 100% likely to win, and 0 would be a guaranteed loss. You'd never actually hit either of those, instead getting a decimal followed by a long string of numbers. If their ratings are identical, then it will come out to .5, a coin flip basically. Actual score is basically what they did, +1 for a win, +0 for a loss, and if they tie/draw +.5 for both of them. High-level chess has a lot of draws, but I suppose sumo doesn't have any, so just the 1s and 0s.

For a quick example, let's take Baruto's (2,371) opening loss to Toyonoshima (2,230). Elo says Baruto would have an expected score of 0.692463393 (a 69.2% chance of winning). Baruto lost, making his actual score 0. So, 0 - 0.692463393 = -0.692463393; multiply this by the variable "K" to find out how many points he lost. Depending on K, he'd lose ~20pts. Of course, that was just one match as an example. The real ratings update for several matches at one time, so for this application, all 15 days of a basho. That's about it for the base formula as chess uses it, but all kinds of little changes have crept up to tailor it to an intended use*.
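The Baruto example can be reproduced with a few lines of Python. This is only a sketch of the textbook Elo formula on the usual 400-point scale; the function names are made up for illustration, and K=30 is an assumed value chosen because it lands near the "~20pts" mentioned above (the post does not say which K was actually used).

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score: 0.5 for equal ratings, approaching 1 for a big favorite."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def rating_change(rating: float, opponent: float, actual: float, k: float) -> float:
    """Points gained (positive) or lost (negative) for one bout."""
    return k * (actual - expected_score(rating, opponent))


baruto, toyonoshima = 2371, 2230
e = expected_score(baruto, toyonoshima)

# K=30 is an assumption for illustration, not the value used for the tables above.
loss = rating_change(baruto, toyonoshima, actual=0.0, k=30)

print(f"expected score: {e:.9f}")
print(f"change for the loss at K=30: {loss:+.1f}")
```

For a whole basho, the same `rating_change` would simply be summed over all of a rikishi's bouts before the rating is updated.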

The Elo system works well (IMO, at least), but has two major things you must figure out/decide in order to make it accurate- Where to put the rankings for the first time/for new entrants to the rankings, and what to set the value of "K" to. I figured 2,500 for Hakuho looked like a good starting place, as that is the minimum rating for earning the Grandmaster title. Using the last banzuke, I just worked down from there. I've never tried doing this from scratch, so this one was something of a test run. The spacing between rikishi is probably too large. I ended up redoing the low M and Juryo starting points after noticing this (and before I had calculated their new score), so hopefully the bottom looks a little better. The formula will eventually correct itself, but of course the further out it is, the longer it will take to do so.

K is tricky. Too small and the ratings won't move up/down enough. Too high and the ratings become too sensitive (especially for the elevator rikishi). FIDE uses K=10 for anyone who has ever been ranked over 2,400 (regardless of title, or how long they stay there), K=15 for anyone who has not reached 2,400, and, effective November 1, 2011, K for anyone with fewer than 30 total matches has gone from 25 to 30. I believe the USCF uses different K values, but the idea is the same- At the beginning your rating is "provisional" (denoted with a p following it) and has a really high K value. After X number of games it stabilizes and K drops significantly, dropping further as you near the top. For my K, I erred on the side of large, figuring they were all provisional at this point. I think it's mostly a starting point error on my part, and not K, that gave some weird results. I'll have to play around with the K value for sure though.

*Such as bonuses/penalties for Yusho, sansho, playing someone outside of your division, etc. (and for a lot of team sports, home/away)
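To see how much the choice of K matters, here is a rough sketch of the tiered scheme as it is described in this post (K=30 under 30 games, K=10 once 2,400 has been reached, K=15 otherwise). The function name and the exact boundary handling are assumptions made for illustration, not FIDE's official wording.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def tiered_k(games_played: int, peak_rating: float) -> int:
    """K tiers as paraphrased in the post; boundary choices are assumptions."""
    if games_played < 30:       # still provisional
        return 30
    if peak_rating >= 2400:     # has reached 2,400 at some point
        return 10
    return 15


# Baruto's opening loss to Toyonoshima under each tier.
e = expected_score(2371, 2230)
for k in (10, 15, 30):
    print(f"K={k}: {k * (0.0 - e):+.1f} points")
```

With only 15 bouts entered, every sekitori would still fall in the provisional tier under this reading.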

Onibushou · November 17, 2011

Some observations and strange differences imo:
- Kotooshu would only be worth a top Maegashira or just Komusubi, which might actually be his true strength at the moment

- Homasho slightly outscoring Toyonoshima struck me, as Toyonoshima has been in and out of the Sanyaku ranks for a long time now, while Homasho has only just got there

- Goeido quite far behind Okinoumi, Aran, Tochinoshin, Tochiozan and even Yoshikaze, Takekaze & Wakanosato is simply stunning for a man with almost 100 more victories than defeats

- Aminishiki's rating is also rather low

Either you or I missed something here big-time - aren't these ratings based only on a starting rating + the (at most) 15 Aki bouts?

Yes, so far it's just the 15. Kotooshu's low rating was mostly a result of his withdrawal (the big droppers were the 3 withdrawals and Hamanishiki's 2-13). And trying to figure out how to accurately calculate kyujo was one of the things I was most looking at. Still not sure what the best way to do that is.

When looking at Aran (only 5 wins against low-rankers at Komusubi), I think you either put too much emphasis on banzuke rank for the starting values or your measure of fluctuation is too conservative.

Perhaps a combination of both? Even accounting for the fact that Wakanosato faced mostly upwards in last basho's torikumi and Goeido faced mostly downwards, it's unfathomable to me that Goeido ended up with the lower ranking despite 12 point differential (+5 wins versus -7 wins) between them. It almost seems to imply that Goeido's expected win-loss record against his slate of opponents was near his actual 10-5 record, which just seems out of whack for starting values.

Reverse-engineering the whole thing a bit - Takekaze is at 2002 points after the basho and listed as "smallest movement" so I guess his starting rating at M3e was just about 2000. Kimurayama at the bottom of the division presumably didn't move a whole lot with his 7-8 either (probably even improved given that by necessity nearly all his opponents were ranked higher), so probably 1400 or so for the guy second from the bottom. 27 spots distance between them, so some 20 points per spot? That would make Goeido's initial rating something near 1900...? Tochinowaka would have been around 1820 if those assumptions hold; he's at 1811 now after a 9-6 record. Hmm. Maybe I'm way off, I dunno.

I think it to be mostly the former, though the latter might have played some part too. And you are correct about Kimurayama gaining a few points.

Kotomikey · November 17, 2011

Thank you very much for sharing. Excellent explanation.

Andreas21 · May 22, 2012

I would like to push this thread up as I'm also interested in Rankings.

I would enjoy a fresh ranking list of the two systems described above. Are they kept regularly?

My take on the subject is:

The Banzuke is sort of a ranking list obviously but with serious drawbacks:

- the sticky feature of the Yokozuna and Ozeki ranks

- peculiarities at the division borders

- accounting only for the last tournament, leading to larger than useful oscillations

The Elo system is really clever, especially in the rather irregular dates of the chess tournaments (not so much necessary for the very regular honbasho). It suffers also from the oscillations, I would guess. The challenge would be to get the parameters right to make it look reasonable.

I could fancy a system which averages the last year (as in many other sports ranking lists.)

A very simple system (which is used sometimes in threads) is to add the wins over 2,3 or 6 basho but that only works among the jo-i regulars.

Another very simple system would work for the rest: the average Banzuke position over the last year (have not seen this one yet).
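Both of these simple systems are easy to sketch. The records below are invented purely for illustration (two made-up rikishi, three basho each), and the function names are hypothetical.

```python
from statistics import mean

# Made-up records, oldest basho first: (wins, banzuke position counted from the top).
history = {
    "Rikishi A": [(9, 12), (8, 9), (10, 7)],
    "Rikishi B": [(11, 20), (6, 10), (8, 14)],
}


def total_wins(records, last_n):
    """Sum of wins over the most recent last_n basho."""
    return sum(wins for wins, _ in records[-last_n:])


def average_position(records, last_n):
    """Mean banzuke position over the most recent last_n basho (lower is better)."""
    return mean(position for _, position in records[-last_n:])


for name, records in history.items():
    print(name, total_wins(records, 3), round(average_position(records, 3), 1))
```

Neither measure looks at who the opponents were, which is the main thing an Elo-style rating adds.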

Randomitsuki · May 22, 2012

I would like to push this thread up as I'm also interested in Rankings.

I would enjoy a fresh ranking list of the two systems described above. Are they kept regularly?

There are several people who use Elo rating systems, particularly in preparation for sumo games (and this might hint at some unwillingness to share the data...): Doitsuyama did that, Zentoryu did that, and I am pretty convinced that nomadwolf does that as well. As for myself, I have computed and regularly update complete Elo ratings for all divisions since 1934.

I could fancy a system which averages the last year (as in many other sports ranking lists.)

A very simple system (which is used sometimes in threads) is to add the wins over 2,3 or 6 basho but that only works among the jo-i regulars.

Another very simple system would work for the rest: the average Banzuke position over the last year (have not seen this one yet).

I do not quite understand this: Elo systems should be far superior as they do not rely on the last basho, but take the entire history of a rikishi into account. The main issues are:

1) Which initial values to take? As for my approach, I am using the average Elo points score for all retired rikishi in the past as the starting value for new rikishi. In addition, I add or subtract a bonus for all shin-deshi based on their position on their first banzuke.

2) How to ensure that the average ratings do not change too much over time? This is an incredibly tricky part, as the ratings inevitably change by factors such as size of the banzuke, number of bashos per year, number of bouts per basho etc..

As for my approach, I use a mechanism that adds at least some stability to the rankings by ensuring that the average Elo rating of all rikishi on the banzuke remains at 1500. If the actual value goes lower (e.g. because a very highly rated rikishi retires), the values for each rikishi will be increased in order to arrive at the 1500 points average.
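A minimal sketch of that kind of stabilization, assuming it is done as a uniform shift applied to everyone on the banzuke (the post does not say exactly how the increase is distributed). The names and ratings are made up.

```python
def renormalize(ratings: dict, target_mean: float = 1500.0) -> dict:
    """Shift every rating by the same amount so the average equals target_mean."""
    shift = target_mean - sum(ratings.values()) / len(ratings)
    return {name: rating + shift for name, rating in ratings.items()}


# Hypothetical banzuke averaging 1500 until the highest-rated rikishi retires.
banzuke = {"A": 2100.0, "B": 1500.0, "C": 1400.0, "D": 1000.0}
del banzuke["A"]                      # average drops to 1300
adjusted = renormalize(banzuke)

print(adjusted)
```

Because everyone moves by the same amount, rating differences (and therefore all expected scores) are unchanged by the adjustment.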

Edited May 22, 2012 by Randomitsuki

Randomitsuki · May 22, 2012

FWIW, here are my Elo ratings for sekitori before Natsu (I haven't computed the pre-Nagoya values yet). The values were really crappy predictors this time around, but usually they are quite accurate.

Hakuho           2585
Harumafuji       2343
Baruto           2397
Kisenosato       2335
Kotoshogiku      2308
Kotooshu         2231
Kakuryu          2369
Toyonoshima      2224
Goeido           2209
Homasho          2152
Aminishiki       2141
Aran             2094
Takayasu         2048
Myogiryu         2073
Gagamaru         2078
Toyohibiki       2007
Takekaze         2051
Tochiozan        2078
Tochinowaka      2041
Okinoumi         2066
Miyabiyama       2018
Wakakoyu         2050
Aoiyama          1952
Shohozan         1977
Kyokutenho       2034
Tochinoshin      2035
Kitataiki        2009
Tokitenku        1975
Yoshikaze        2031
Wakanosato       1993
Chiyotairyu      1890
Shotenro         1951
Sadanofuji       1888
Kaisei           1878
Daido            1912
Tenkaiho         1871
Kimikaze         1879
Asasekiryu       1887
Chiyonokuni      1885
Fujiazuma        1862
Tamawashi        1884
Takarafuji       1867
Asahisho         1830
Masunoyama       1856
Ikioi            1855
Tamaasuka        1797
Hochiyama        1806
Yoshiazuma       1804
Takanoyama       1811
Kotoyuki         1793
Kyokushuho       1814
Sagatsukasa      1794
Takamisakari     1798
Tosayutaka       1840
Sotairyu         1794
Oiwato           1804
Bushuyama        1787
Chiyootori       1826
Nionoumi         1780
Kimurayama       1786
Satoyama         1763
Masuraumi        1773
Tokushinho       1768
Kokkai           1734
Masakaze         1757
Jokoryu          1773
Tokushoryu       1741
Homarefuji       1742
Hokutokuni       1767
Kitaharima       1742

Edited May 22, 2012 by Randomitsuki

Andreas21 · May 22, 2012

... in preparation for sumo games (and this might hint at some unwillingness to share the data...

Okay, I didn't expect that. I didn't want to be intrusive here! I thought it was an idea which was tested but not followed up.

I do not quite understand this: Elo systems should be far superior ...

For sure.

But from my perspective, I have nothing but the Banzuke, and would like to have a more realistic ranking. Thank you for that, for now!

The main issues are:

...

and 3) the K-value, isn't it? It's challenging e.g. for the steep rise of new ama-sumo entries.

There is obviously a huge difference between the FIDE-Elo and any secondary Elo model. The former is official, a lot of things depend on it: tournament entries, payment, it's even a matter of identification for chess players (same as Banzuke ranks for Rikishi). So the Elo-parameters for the FIDE are fixed but here anybody can have his own.

I was actually thinking about an open ranking which is consensus between some interested people and published regularly.

Randomitsuki · May 22, 2012

The main issues are:

...

and 3) the K-value, isn't it? It's challenging e.g. for the steep rise of new ama-sumo entries.

Yeah, the K-value is another issue. For those who don't know: the K-value expresses how much a bout counts. If you set it too low, the ratings do not change that much (which of course is bad if someone is rising through the ranks very fast). If you set the K-value too high, the Elo system reacts too strongly to a single very good or very bad basho. I've experimented with some of these values, and settled on a K-value of 20 for all bouts. In other words, if two rikishi of equal strength face each other, the winner's rating increases by 10 points, and the loser's rating decreases by 10 points. If there is an extremely lop-sided bout on paper, and the clear favorite loses, he might lose up to 20 points (and the winner will get those 20 points). Conversely, if the favorite wins, he might not earn a single point, and the loser might not lose a single point.
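Those three cases fall straight out of the standard update with a single K of 20. The sketch below assumes the textbook formula; the function names and the example ratings are made up.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update(winner: float, loser: float, k: float = 20.0):
    """Zero-sum update: whatever the winner gains, the loser gives up."""
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


print(update(1800, 1800))   # equal strength: +10 / -10
print(update(1200, 2600))   # huge upset: close to the full 20 points
print(update(2600, 1200))   # favorite wins: almost nothing changes
```

Because both rikishi use the same K, the total number of points in the system is unchanged by any single bout.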

As far as I know, Zentoryu used a different K-value for sekitori bouts and toriteki bouts. I try to avoid that because it appears somewhat inelegant to me. But it could be that his predictions are even better.

Edited May 22, 2012 by Randomitsuki

Pandaazuma · May 23, 2012

Whenever I beat someone who uses these systems in Sumo Game, I feel like I've beaten Deep Blue at chess or something!

By the way, ELO...some great songs:

Check out that hair!

nomadwolf · May 24, 2012

As far as I know, Zentoryu used a different K-value for sekitori bouts and toriteki bouts. I try to avoid that because it appears somewhat inelegant to me. But it could be that his predictions are even better.

I started to have different values primarily because of the different number of bouts per basho. If K is the same, the change per basho will be half of what the sekitori can achieve.

In the end, I started to use different K values for each division. 20 for Makuuchi, 24 for Juryo, 50 for makushita, 55 for sandanme, 64 for jonidan, and 70 below that.

Main purpose is because fast risers are nearly always underrated when they arrive at sekitori, so I try to adjust for that. If there are some erroneous ELO ratings down in sandanme, I don't really care since I don't actually use those values except to calculate everyone else's rating.

On the other side, I think my code does leave an uneven (and inelegant) result in that for inter-division bouts 2 different K-values will be used, meaning the total point change for the bout is non-zero. Too lazy to fix it.
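The non-zero total is easy to see in a small sketch. The K values are the ones listed above (with "70 below that" read as jonokuchi, which is an assumption); the function names and the example ratings are invented for illustration and this is not the actual code being described.

```python
# Per-division K values as listed in the post; "jonokuchi" for "below that" is an assumption.
DIVISION_K = {
    "makuuchi": 20, "juryo": 24, "makushita": 50,
    "sandanme": 55, "jonidan": 64, "jonokuchi": 70,
}


def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def bout_changes(winner_rating, winner_div, loser_rating, loser_div):
    """Each rikishi is updated with his own division's K, so the changes need not cancel."""
    surprise = 1.0 - expected_score(winner_rating, loser_rating)
    gain = DIVISION_K[winner_div] * surprise
    loss = -DIVISION_K[loser_div] * surprise
    return gain, loss


# Hypothetical inter-division bout: a makushita rikishi beats an equally rated juryo rikishi.
gain, loss = bout_changes(1300, "makushita", 1300, "juryo")
print(gain, loss, gain + loss)
```

Here 13 points are created out of nothing, which is the unevenness mentioned above. Doing the same for a bout within one division gives a net change of zero.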

However, Random, I'm curious what you do for lower division bouts before 1988, since there are generally few torikumi for those basho! For myself, I list the points of the surrounding 7 rikishi, calculate the # of wins they should have achieved against them, and adjust points accordingly.

Not sure if it actually helps or not, but I'd expect better than doing nothing.

On the other side, I don't use the strength ratings to make my picks, but rather as a guidance. I'll make my Oracle picks on my own, and if there's a discrepancy to the ELO calculation, I'll review.

For Quad, I sort the bouts by ELO calculated winning odds, but still pick on my own. (But I do keep track of a computer-generated picks to see if they do better than me... same for bench. It's on and off when they do better or I do better).

For Bench, this past basho, my 2 "drones" would have been 7-8 and 8-7 (I was 9-6). (The first uses the same rikishi as I have... the other has its own lineup that can be different from mine). But against me (instead of opponents), it would've been 8-7 and 10-5.

For Quad, my drones would have been 6-9 and 8-7 (I was 5-10). 1st drone just takes top 4 picks every day. 2nd drone has at least 2 Juryo for the first half of the days, and at least 1 for the remaining days (mostly the same strategy I use to avoid running out of Makuuchi rikishi by day 11).

Randomitsuki · May 24, 2012

I started to have different values primarily because of the different number of bouts per basho. If K is the same, the change per basho will be half of what the sekitori can achieve.

In the end, I started to use different K values for each division. 20 for Makuuchi, 24 for Juryo, 50 for makushita, 55 for sandanme, 64 for jonidan, and 70 below that.

Main purpose is because fast risers are nearly always underrated when they arrive at sekitori, so I try to adjust for that.

Now that's interesting because I do not have the same problem. What is your average Elo rating for divisions anyway? Here are my values before Haru 2012:

Makuuchi: 2059

Juryo: 1799

Makushita: 1637

Sandanme: 1474

Jonidan: 1332

Jonokuchi: 1274

The average value that I assign for shin-deshi is 1441 (ranging from 1541 for the highest-ranked shin-deshi to 1341 for the lowest-ranked shin-deshi). In other words, I treat the top new guy of each basho as someone of high Sandanme strength. That's probably the reason why they arrive at sekitori with a reasonable rating.
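One way such a rank-based starting value could be computed is sketched below. The even spacing between 1541 and 1341 is an assumption (the post only gives the two ends and the average), and the function name is made up.

```python
def shin_deshi_start(position: int, cohort_size: int,
                     top: float = 1541.0, bottom: float = 1341.0) -> float:
    """Starting rating for the position-th ranked newcomer (1 = highest ranked).

    Assumes ratings are spaced evenly between top and bottom.
    """
    if cohort_size == 1:
        return (top + bottom) / 2.0
    fraction = (position - 1) / (cohort_size - 1)
    return top - fraction * (top - bottom)


cohort = [shin_deshi_start(p, 5) for p in range(1, 6)]
print(cohort)                      # evenly spaced from 1541 down to 1341
print(sum(cohort) / len(cohort))   # averages out at 1441
```

With even spacing the cohort average is 1441 regardless of how many newcomers there are, which keeps the number of points entering the system predictable.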

However, Random, I'm curious what you do for lower division bouts before 1988, since there are generally few torikumi for those basho! For myself, I list the points of the surrounding 7 rikishi, calculate the # of wins they should have achieved against them, and adjust points accordingly.

My method is somewhat similar. For example, if the Makushita average is 1650 points, and the Sandanme average is 1450 points, I expect that someone at Ms60 will meet seven opponents with a strength of 1550 points.
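A rough sketch of that idea: when the actual torikumi are unknown, treat all seven opponents as having one assumed strength and compare the real win count with the expected one. The 1550 figure is the midpoint from the example above; using K=20 for the adjustment and the function names are assumptions made for illustration.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def estimated_adjustment(rating: float, wins: int, bouts: int = 7,
                         opponent_rating: float = 1550.0, k: float = 20.0) -> float:
    """Rating change when every opponent is assumed to have the same strength."""
    expected_wins = bouts * expected_score(rating, opponent_rating)
    return k * (wins - expected_wins)


# A rikishi at Ms60 rated exactly 1550 is expected to win 3.5 of 7 bouts.
print(estimated_adjustment(1550, wins=4))   # a 4-3 record is slightly above expectation
print(estimated_adjustment(1550, wins=2))   # a 2-5 record is well below it
```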

Edited May 24, 2012 by Randomitsuki

Asashosakari · May 24, 2012

The average value that I assign for shin-deshi is 1441 (ranging from 1641 for the highest-ranked shin-deshi to 1241 for the lowest-ranked shin-deshi). I just realized that I assume that the top new guy of each basho is already of mid-Makushita strength. That's probably the reason why they arrive at sekitori with a reasonable rating.

I know I've told you that before, and I still think that's a very weird way of handling it for current entrants, because the maezumo results are so useless as a proper ranking. (In Haru because the sign-up order plays a huge role in the finishing order, in all other bashos because there are usually fewer than 10 entrants.)

Collegiates and "old" foreigners = bottom of makushita strength (1550 or so in your system?), everybody else of high-school graduation age and older [yes, including 20+ year-olds without a college background] = bottom of sandanme strength (1400?), all younger ones = whatever's appropriate (1300, maybe). Voilà...

Randomitsuki · May 24, 2012

I know I've told you that before, and I still think that's a very weird way of handling it for current entrants, because the maezumo results are so useless as a proper ranking. (In Haru because the sign-up order plays a huge role in the finishing order, in all other bashos because there are usually fewer than 10 entrants.)

Collegiates and "old" foreigners = bottom of makushita strength (1550 or so in your system?), everybody else of high-school graduation age and older [yes, including 20+ year-olds without a college background] = bottom of sandanme strength (1400?), all younger ones = whatever's appropriate (1300, maybe). Voilà...

College experience, nationality, and birth dates are not available for many rikishi of the past. And I'd like to have a consistent approach of handling the data. Otherwise, I completely agree that your method would be superior.

Oh, and one small correction: I gave a wrong range for Jonokuchi debutants two posts up. The average is 1441 points, and the range I set is between 1541 and 1341 points.

Doitsuyama · May 24, 2012

Collegiates and "old" foreigners = bottom of makushita strength (1550 or so in your system?), everybody else of high-school graduation age and older [yes, including 20+ year-olds without a college background] = bottom of sandanme strength (1400?), all younger ones = whatever's appropriate (1300, maybe). Voilà...

Your shin-deshi average would be significantly below the 1441 though which would have long-term ramifications about rating stability.

1441 seems a bit high anyway to me for the average shin-deshi (the average shin-deshi probably isn't sandanme strength).

As far as I know, Zentoryu used a different K-value for sekitori bouts and toriteki bouts. I try to avoid that because it appears somewhat inelegant to me. But it could be that his predictions are even better.

I started to have different values primarily because of the different number of bouts per basho. If K is the same, the change per basho will be half of what the sekitori can achieve.

I think there is another reason to have variable K-values: It's the dilemma to get a stable rating system with rated persons who develop. The average rikishi certainly is much stronger when he retires compared to his entry level strength, resulting in more points going away than newly coming in (if the entry level is done correctly). To reflect the development curve in the ratings, it is only fair to have different K values (but depending on age/experience rather than division). If a young rikishi indeed develops to get stronger, beating older rikishi in the process, it is simply correct to have more rating points added to him than getting subtracted from the older opponent. This asymmetry is getting counterbalanced by retiring at a higher strength.

nomadwolf · May 24, 2012

I started to have different values primarily because of the different number of bouts per basho. If K is the same, the change per basho will be half of what the sekitori can achieve.

In the end, I started to use different K values for each division. 20 for Makuuchi, 24 for Juryo, 50 for makushita, 55 for sandanme, 64 for jonidan, and 70 below that.

The main purpose is that fast risers are nearly always underrated when they arrive at sekitori, so I try to adjust for that.
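A rough sketch of how the per-division K-values interact with the number of bouts per basho. The K-values are the ones listed above ("70 below that" is read here as jonokuchi); the function name and the simplification of one batch update per basho against equally rated opponents are assumptions made for illustration.

```python
# K-values per division as listed in the post above.
K_BY_DIVISION = {
    "makuuchi": 20,
    "juryo": 24,
    "makushita": 50,
    "sandanme": 55,
    "jonidan": 64,
    "jonokuchi": 70,
}

# Sekitori fight 15 bouts per basho, the lower divisions 7.
BOUTS_PER_BASHO = {
    "makuuchi": 15,
    "juryo": 15,
    "makushita": 7,
    "sandanme": 7,
    "jonidan": 7,
    "jonokuchi": 7,
}


def even_zensho_gain(division, k=None):
    """Rating gain for winning every bout against equally rated opponents.

    Simplification: expected score is 0.5 in every bout and the rating
    is updated once at the end of the basho.
    """
    if k is None:
        k = K_BY_DIVISION[division]
    return k * BOUTS_PER_BASHO[division] * (1.0 - 0.5)


print(even_zensho_gain("makuuchi"))         # K=20 over 15 bouts
print(even_zensho_gain("makushita", k=20))  # same K, only 7 bouts
print(even_zensho_gain("makushita"))        # K=50 over 7 bouts
```

With a shared K of 20, the makushita rikishi can move 70 points against 150 for the sekitori, which is the "half" mentioned above. With K=50 the makushita figure becomes 175.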

Now that's interesting because I do not have the same problem. What is your average Elo rating for divisions anyway? Here are my values before Haru 2012:

Makuuchi: 2059

Juryo: 1799

Makushita: 1637

Sandanme: 1474

Jonidan: 1332

Jonokuchi: 1274

The average value that I assign for shin-deshi is 1441 (ranging from 1641 for the highest-ranked shin-deshi to 1241 for the lowest-ranked shin-deshi). I just realized that I assume that the top new guy of each basho is already of mid-Makushita strength. That's probably the reason why they arrive at sekitori with a reasonable rating.

I have lower entry levels, and thus lower overall levels: [pre-natsu]

Hakuho: 2140

Makuuchi: 1603

Juryo: 1265

makushita: 908

sandanme: 532

jonidan: 118

jonokuchi: 355 (! I hadn't noticed it's higher than jonidan, but that's probably because poorly performing jonidan rikishi will not drop to jk, though maybe I misunderstand how things work down here. I see Daishohoma jumping from jk4e to jd106w with a 1-7 result, but Wakayamanaka only drops from jd100w to jd102w with a 2-4-1 result. Seemingly only kyujo will drop you to jonokuchi, but Hakubizan is an exception to that: jd60w to jk1w with a 0-7 result.)

But really, for Elo, all that matters is the difference between two players' ratings, not the absolute values, so if we shift my values up by 456 points (to make the makuuchi averages match up), we get:

Hakuho: 2596

Makuuchi: 2059 (yours = 2059)

Juryo: 1721 (1799)

makushita: 1364 (1637)

sandanme: 988 (1474)

jonidan: 574 (1332)

jonokuchi: 812 (1274)

You have average makushita much closer to Juryo level than I do. Intuitively, I'm not sure I buy that. Then again, it is probably due to my higher K-values...
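The shift can be checked with a few lines. The sketch below assumes the standard logistic Elo expectation (400-point scale), which may not be exactly what either system uses, and shows that adding a constant to every rating leaves the predicted win probability unchanged.

```python
def expected_score(r_a, r_b):
    """Standard Elo expectation for A against B (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


# nomadwolf's pre-Natsu values as posted above.
averages = {
    "Hakuho": 2140,
    "Makuuchi": 1603,
    "Juryo": 1265,
    "makushita": 908,
    "sandanme": 532,
    "jonidan": 118,
    "jonokuchi": 355,
}

REFERENCE_MAKUUCHI = 2059  # Randomitsuki's makuuchi average
shift = REFERENCE_MAKUUCHI - averages["Makuuchi"]
shifted = {name: value + shift for name, value in averages.items()}
# The post lists 812 for jonokuchi where this gives 811, presumably
# because the shift was applied to unrounded averages.

# Only differences matter: the prediction is identical before and after.
p_before = expected_score(averages["Makuuchi"], averages["Juryo"])
p_after = expected_score(shifted["Makuuchi"], shifted["Juryo"])
print(shift, shifted, p_before, p_after)
```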

For shin-deshi, I use a rank-based starting value that I calculated long ago and that may no longer be quite valid for my current calculation method. I just set a given value based on division (regardless of rank within the division). Crude, but whatever.

(not listing above Maegashira... it's in the code, but probably never used except as starting values in the first basho of my DB (1950))

Maegashira: 1021

Juryo: 800

ms: 650

sandanme: 450

all others: 400 (the starting value I found for new players in the Wikipedia article on Elo)... however, with my large K-values in the lower divisions, I sometimes end up with negative Elo values!

I got these values by running the calculations from 1988 onward (when I had full torikumi data) and then taking averages for each division (or maybe I took the average of the bottom half... can't remember since it was 3 years ago).

As for players being underrated when entering juryo, your starting values together with the re-averaging to 1500 clearly make a big difference. Without the re-averaging, higher shin-deshi values would simply shift up the rating of everyone else. This is because the formulas are purely additive and don't take any ratios (thus the actual value doesn't matter, only the difference between 2 rikishi). [however, there are still some consequences of this that I have to think through...]

On the other hand, I don't know if the re-averaging is conceptually valid. In your system (with a constant K), the only way the average should change is when rikishi enter or leave the banzuke. If you use the same K-value for both sides of a bout (which I don't for inter-division bouts... that's a bug, not a feature), then the points added to the winning rikishi are exactly the same as the points subtracted from the losing rikishi, so the average will stay the same.... Actually, this is even valid if you use a different K for each division, as long as both sides of a given bout have the same K.

Only new & leaving rikishi will change the average. I need to ponder more on the consequences of re-averaging.
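A small sketch of that conservation argument, again assuming the standard logistic Elo expectation. The three rikishi, their ratings, and the newcomer's starting value are made up for illustration.

```python
def expected_score(r_a, r_b):
    """Standard Elo expectation for A against B (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def play_bout(ratings, winner, loser, k):
    """Apply one bout in place, using the same K for both sides."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def mean(ratings):
    return sum(ratings.values()) / len(ratings)


# Hypothetical ratings.
ratings = {"A": 1700.0, "B": 1500.0, "C": 1300.0}
before = mean(ratings)

# K may differ from bout to bout, as long as both sides share it.
play_bout(ratings, "C", "A", k=20)
play_bout(ratings, "B", "C", k=50)
after_bouts = mean(ratings)

# A newcomer entering below the average pulls the average down.
ratings["newcomer"] = 1200.0
after_entry = mean(ratings)

print(before, after_bouts, after_entry)
```

The average is unchanged by the bouts (up to floating-point rounding) and only moves once someone enters.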

But I think I remember why I had such high K-values for the lower divisions. A 7-0 rikishi in jonidan will jump right away to sandanme. With a K of 20 or 40, he will be sorely underrated there despite being a good prospect. Now, for going 7-0 in sandanme, he'll get a bigger bump since he's the underdog compared to the others, but it doesn't seem "right". I haven't considered which result will give a higher ending rating (two 7-0s with K = 40, or two with K=70 then 50). The 2nd seems more likely, but the first might work... maybe. I should probably be working now, so no time for in-depth research.
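For what it's worth, the comparison can be sketched quickly under strong simplifying assumptions: the riser starts at the 400-point default, every opponent sits exactly at the division average from the pre-Natsu list above (118 for jonidan, 532 for sandanme), only the riser's rating is tracked, and the standard logistic Elo expectation is used. In reality 7-0 rikishi meet other undefeated opponents, so this is an illustration rather than an answer.

```python
def expected_score(r_a, r_b):
    """Standard Elo expectation for A against B (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def run_zensho(rating, opponent_rating, k, wins=7):
    """Win every bout against a fixed opponent rating, updating after each bout."""
    for _ in range(wins):
        rating += k * (1.0 - expected_score(rating, opponent_rating))
    return rating


START = 400.0         # default starting value mentioned above
JONIDAN_AVG = 118.0   # assumed opponent strength in jonidan
SANDANME_AVG = 532.0  # assumed opponent strength in sandanme


def two_zensho(k_jonidan, k_sandanme):
    """7-0 in jonidan followed by 7-0 in sandanme."""
    after_jonidan = run_zensho(START, JONIDAN_AVG, k_jonidan)
    return run_zensho(after_jonidan, SANDANME_AVG, k_sandanme)


flat = two_zensho(40, 40)     # K = 40 in both basho
stepped = two_zensho(70, 50)  # K = 70, then K = 50
print(round(flat), round(stepped))
```

Under these assumptions the 70-then-50 schedule ends higher: each per-bout update is increasing in both the current rating and K, so the schedule with the larger K at every step never falls behind.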

[this is with my starting values...]

But I'm a little confused about how your jonokuchi average is much less than the bottom of your shin-deshi range... sure, there are some stragglers, but at least for pre-Natsu, 75% of the rikishi in jonokuchi were in mae-zumo the previous basho, so the average should be close to your shin-deshi average. Maybe Hatsu had a smaller mae-zumo...

Edited May 24, 2012 by nomadwolf
