Randomitsuki

Yokozuna Strength, Opponent Strength, and Yokozuna Dominance


Are you still predicting career paths based on age, early results etc? How has that been working out? I guess you've been doing it more than a decade now right? Be interesting to see how accurate your predictions have been.

I basically gave up on that project. While I still create the predictions as part of bimonthly sumo routines, I do not even care to save the prediction files. The predictions have been surprisingly accurate when it comes to run-of-the-mill rikishi. But let's face it: nobody (except me) is delighted when the prediction says this guy will make it to Makushita 7, and then he has a career high of Makushita 9 or so. The only thing that people are really interested in (even from the "moneyball" perspective :-)) is whether the prediction gets Sekiwake, Ozeki, and Yokozuna right. Which it doesn't. Sometimes on an embarrassing scale.

I just took a look at the last prediction file that I didn't delete (from November 2012). Some predictions went fairly well:

Osunaarashi was on a career-high Sd20, and I predicted Ozeki.

Iwasaki/Hidenoumi was on a career-high Sd18, and I predicted M13.

Wakamisho/Terunofuji was on Ms37 (career high of Ms15), and I predicted Ozeki.

Other predictions are likely to fail miserably: anybody else betting on Sadanofuji or Daikiho (now named Yamaguchi, and in Sandanme) as future Ozeki? (Laughing...)

Edited by Randomitsuki

With the availability of data for Sumo, I was thinking of doing something similar until I saw this thread! I already am working on similar data for rugby league.

Good to see that my team - Leeds Rhinos - are where they ought to be :-)


Many results are of course similar, with the biggest deviation being that I took only results from juryo and makuuchi as the lower divisions varied so much during the decades that I feel the assumption of a constant average of 1500 is a fallacy possibly leading to bad results.

This almost certainly will lead to different results, and it has nothing to do with differences between the decades: not everyone enters Juryo at the same skill level relative to others there. If you assign Ichinojo and Mitakeumi the same initial rating upon entering Juryo as Kotoeko and Wakanoshima, you're throwing away a lot of information. While any deviations from their true relative rating will be corrected over time, the cumulative effect is that too few points are injected into those rikishi who are far more likely to make it through Juryo and the bottom of Makuuchi, which depresses ratings at the top of the division.

I don't see much of an argument here. The rating for new entrants rather quickly converges to the true rating, most certainly by the time they reach yokozuna, and this actually is the topic of the debate.
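Both sides of this exchange can be illustrated with a minimal sketch using the textbook Elo formulas (expected score 1 / (1 + 10^((R_b - R_a) / 400)) and update R + K * (S - E)). The K-factor of 20, the ratings, and the trick of scoring each bout with the newcomer's long-run average result are assumptions for illustration only; neither poster's actual system is spelled out in this thread.

```python
def expected_score(r_a, r_b):
    """Textbook Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def simulate_entrant(start, true_rating, opponent, bouts, k=20.0):
    """Track a newcomer whose results match `true_rating` on average.

    Each bout is scored with the newcomer's long-run average result,
    which keeps the sketch deterministic (an assumption, real bouts
    are wins or losses). Opponents are held at one fixed rating for
    simplicity; `pool_change` only tallies the points they give up.
    """
    rating = start
    pool_change = 0.0
    for _ in range(bouts):
        actual = expected_score(true_rating, opponent)  # average real result
        predicted = expected_score(rating, opponent)    # what the system expects
        delta = k * (actual - predicted)
        rating += delta
        pool_change -= delta  # zero-sum: opponents lose what he gains
    return rating, pool_change


if __name__ == "__main__":
    # Hypothetical newcomer: enters at 1800 but really performs like 2000.
    for bouts in (15, 45, 90, 180):
        rating, pool_change = simulate_entrant(1800, 2000, 1900, bouts)
        print(bouts, round(rating), round(pool_change))
```

Under these assumptions the newcomer's rating does close in on his true level within a couple of years of sekitori bouts, as the reply says. The points he gains on the way are taken from his opponents, though, which is the effect the quoted objection is about.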

But I am still bothered (a lot) about the artificial assumption that the average rating of the whole banzuke is 1500 at all times. I assume not only Hakuho and his direct competition are at an all-time high, but also the (for example) rikishi at positions 31 to 40? Actually this is a question for Randomitsuki as well. If this is indeed the case this just means the makuuchi division is (theoretically) beating up the middle of the banzuke better than ever which might well be true. But how do you know this middle of the banzuke hasn't gotten weaker now than in other times? The structure of rikishi in the deshi ranks is totally different now than decades ago in age and experience, while the makeup of juryo and makuuchi is basically unchanged, but you (by definition even!) declare it the same. Therefore I still like my method better.


But I am still bothered (a lot) about the artificial assumption that the average rating of the whole banzuke is 1500 at all times. I assume not only Hakuho and his direct competition are at an all-time high, but also the (for example) rikishi at positions 31 to 40? Actually this is a question for Randomitsuki as well. If this is indeed the case this just means the makuuchi division is (theoretically) beating up the middle of the banzuke better than ever which might well be true. But how do you know this middle of the banzuke hasn't gotten weaker now than in other times? The structure of rikishi in the deshi ranks is totally different now than decades ago in age and experience, while the makeup of juryo and makuuchi is basically unchanged, but you (by definition even!) declare it the same. Therefore I still like my method better.

It's fine and dandy to like your system better! :-)

Maybe I am on the wrong track here, but I have a feeling that you are draining your system as rikishi obviously tend to improve over time. AFAIK, all your rikishi (sekitori) enter the system with a fixed rating of 1800. Now I guess that on average they go intai with a somewhat higher rating. If their average intai rating is at, say, 1900 points (educated guess), every rikishi drains the system by 100 points over his career. That would be only about a 1-point deduction for the remaining sekitori, but if you do this over 50 years of simulation, you have a classically deflated system. The fact that your difference between Taiho and Chiyonofuji is so big (Taiho entered into a fresh system and Chiyonofuji entered into an already deflated system), the fact that your system has Asashoryu clearly behind these two (rather than in front of them, but he entered a seriously deflated system), plus your anecdotal assumption that "rikishi were much better back then" might be further evidence for that.
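The drain arithmetic can be written out as a small sketch. The entry rating of 1800 and the guessed exit rating of 1900 come from the post; the pool size of 70 sekitori and the function names are assumptions added here for illustration.

```python
def drain_per_remaining(entry_rating, exit_rating, pool_size):
    """Points each remaining rikishi loses when one rikishi retires.

    Elo exchanges are zero-sum, so whatever a rikishi gained between
    entry and retirement was taken from the rest of the pool.
    """
    return (exit_rating - entry_rating) / pool_size


def cumulative_deflation(entry_rating, exit_rating, pool_size, retirements):
    """Naive linear total after several retirements (no feedback)."""
    return retirements * drain_per_remaining(entry_rating, exit_rating, pool_size)


if __name__ == "__main__":
    # Assumed pool: 70 sekitori (makuuchi plus juryo).
    per_retirement = drain_per_remaining(1800, 1900, pool_size=70)
    print(f"per retirement: {per_retirement:.2f} points per remaining sekitori")
    print(f"after 10 retirements: {cumulative_deflation(1800, 1900, 70, 10):.1f}")
```

This is only the uncorrected linear figure. In a running system the gap between entry and exit ratings would presumably shrink as the pool deflates, so the real drift should level off rather than grow without limit.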

But I can only repeat: I'd love to compare the performance of both systems, as I really have no clue which one is better.

Edited by Randomitsuki

Of course, since it's quite possible that the average rikishi today is somewhat better than the average rikishi in the past - not for any intrinsic reason but simply because a lot more toriteki stay active into their mid and late 20s now when rikishi tend to reach their physical peak - keeping the average rating fixed at 1500 might even mean that today's rikishi are being underrated in Randomitsuki's system, even though the upper makuuchi ratings have already exploded anyway...

I have a feeling I've already asked this in the past - in percentile terms, where exactly on the banzuke do you have the rikishi with the average rating of 1500? And has this position moved over the last 80 years? I assume it's not actually in the middle of the banzuke (i.e. around Sd65 right now)...?

Edited by Asashosakari


I have a feeling I've already asked this in the past - in percentile terms, where exactly on the banzuke do you have the rikishi with the average rating of 1500? And has this position moved over the last 80 years? I assume it's not actually in the middle of the banzuke (i.e. around Sd65 right now)...?

The mid-points (1500 points) historically, have been:

1934, first banzuke: 60% (with 100% being the top of the banzuke)

1945: 68%

1955: 68%

1965: 65%

1975: 65%

1985: 65%

1995: 65%

2005: 64%

2015: 61%

What insights do you gain from these numbers?
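For readers who want to reproduce a figure like this from their own ratings, here is one way it could be computed. This is a guess at the definition (the share of rikishi rated below the mid-point, with 100% at the top), not necessarily how the numbers above were produced, and the ratings in the example are made up.

```python
def midpoint_percentile(ratings, midpoint=1500.0):
    """Share of the banzuke (in percent) rated below `midpoint`.

    With 100% as the top of the banzuke, this is the percentile at
    which a rikishi with exactly the mid-point rating would sit.
    """
    below = sum(1 for rating in ratings if rating < midpoint)
    return 100.0 * below / len(ratings)


if __name__ == "__main__":
    # Made-up ratings whose average is exactly 1500.
    ratings = [2100, 1700, 1450, 1400, 1350, 1300, 1200]
    print(sum(ratings) / len(ratings))
    print(round(midpoint_percentile(ratings), 1))
```

A value above 50% means the distribution is skewed: a few very high ratings at the top pull the average up, so most rikishi sit below it.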


What insights do you gain from these numbers?

I'm not sure. :-) I was mostly wondering if the point spread has just become wider in recent years (the best getting better, the worst getting worse), or if the distribution has also become more skewed. Interesting that there are now more rikishi above the mid-point than before...maybe that is a result of increased career longevity in the lower ranks? That might cut both ways - more rikishi fulfilling their potential, but also more rikishi staying around long beyond their prime (such as Kyokuhikari), or rikishi who never develop at all.

Then again, 3-4 percentile points isn't a huge difference either, about 10-15 ranks. Hmm.
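The conversion from percentile points to ranks can be checked with a short sketch. The banzuke size of about 640 is an assumption derived from the earlier remark that the middle of the banzuke is around Sd65; the function name is made up for this example.

```python
def percentile_points_to_ranks(points, banzuke_size, rikishi_per_rank=2):
    """Convert a shift in percentile points into banzuke ranks.

    Each rank normally holds an East and a West rikishi, hence the
    default of two rikishi per rank.
    """
    rikishi = banzuke_size * points / 100.0
    return rikishi / rikishi_per_rank


if __name__ == "__main__":
    # Assumed size: 42 makuuchi + 28 juryo + 120 makushita + 130 down to
    # Sd65 gives 320 rikishi in the top half, so roughly 640 in total.
    banzuke_size = 640
    for points in (3, 4):
        ranks = percentile_points_to_ranks(points, banzuke_size)
        print(f"{points} percentile points ~ {ranks:.1f} ranks")
```

With these assumptions the shift works out to roughly 10 to 13 ranks, in line with the estimate above.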


maybe that is a result of increased career longevity in the lower ranks?

Is that actually the case? Have you crunched the numbers? Somehow it seems counterintuitive.
It's a long-running process:

| Year | Ms1-30  | Ms1-30(x)       | Ms31-60 | Ms31-60(x)      | Sd1-50 | Sd51-100 | Jd1-50  | Jd51+   |
|------|---------|-----------------|---------|-----------------|--------|----------|---------|---------|
| 1970 | 23y 0m  | 22y 7m (13.1%)  | 22y 7m  | 22y 4m (2.8%)   | 21y 1m | 19y 10m  | 18y 6m  | 17y 0m  |
| 1975 | 24y 4m  | 23y 5m (29.7%)  | 23y 0m  | 22y 9m (5.5%)   | 21y 6m | 20y 3m   | 19y 3m  | 17y 5m  |
| 1980 | 24y 0m  | 23y 1m (24.7%)  | 23y 5m  | 23y 0m (5.8%)   | 22y 0m | 20y 7m   | 19y 5m  | 17y 4m  |
| 1985 | 24y 5m  | 23y 5m (35.3%)  | 23y 6m  | 23y 3m (8.0%)   | 22y 6m | 21y 5m   | 20y 0m  | 17y 11m |
| 1990 | 24y 1m  | 23y 7m (17.8%)  | 23y 9m  | 23y 6m (5.2%)   | 22y 9m | 21y 6m   | 20y 7m  | 18y 1m  |
| 1995 | 25y 4m  | 24y 3m (35.8%)  | 24y 0m  | 23y 8m (9.6%)   | 23y 5m | 22y 1m   | 20y 10m | 18y 8m  |
| 2000 | 25y 9m  | 24y 10m (31.7%) | 25y 2m  | 24y 9m (9.9%)   | 24y 4m | 22y 10m  | 22y 5m  | 19y 6m  |
| 2005 | 26y 6m  | 24y 11m (31.3%) | 25y 8m  | 25y 4m (8.1%)   | 24y 8m | 23y 8m   | 22y 4m  | 19y 8m  |
| 2010 | 26y 10m | 25y 5m (31.9%)  | 25y 9m  | 25y 1m (11.1%)  | 25y 3m | 24y 7m   | 23y 3m  | 20y 11m |
| 2015 | 26y 1m  | 25y 0m (27.4%)  | 25y 10m | 25y 3m (10.0%)  | 26y 0m | 25y 0m   | 23y 9m  | 21y 3m  |

For each year, I've averaged the data from all 6 tournaments (4 for 2015) to eliminate the possibility of one-basho outliers.

The two (x) columns for makushita only include rikishi who haven't been to juryo before; the percentage is how many rikishi were excluded by that (i.e. how many were ex-sekitori). You can see that the age increases in makushita aren't only driven by ex-sekitori hanging around longer. But now I really wonder what happened from 1985 to 1990...why so few ex-sekitori in makushita all of a sudden?

The most noteworthy recent development might be that the average sandanme rikishi used to be quite a bit younger than the lower makushita guys, but that age gap has almost completely disappeared. That only happened over the last 10 years.
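For anyone who wants to work with the table, here is a small sketch for handling ages in the "23y 0m" format: parsing them, averaging several basho values into one yearly figure, and formatting the result. Averaging the per-basho averages is an assumption about the method (the post only says the tournaments were averaged), and the per-basho values in the example are invented.

```python
import re


def age_to_months(text):
    """Parse an age such as '23y 0m' into a number of months."""
    match = re.fullmatch(r"(\d+)y (\d+)m", text.strip())
    if match is None:
        raise ValueError(f"unrecognised age: {text!r}")
    years, months = (int(part) for part in match.groups())
    return years * 12 + months


def months_to_age(months):
    """Format a number of months as 'Xy Ym', rounded to whole months."""
    total = round(months)
    return f"{total // 12}y {total % 12}m"


def yearly_average(basho_ages):
    """Average several per-basho ages into one yearly figure."""
    months = [age_to_months(age) for age in basho_ages]
    return months_to_age(sum(months) / len(months))


if __name__ == "__main__":
    # Invented per-basho averages for one division in one year.
    print(yearly_average(["23y 0m", "23y 2m", "22y 11m", "23y 1m", "23y 3m", "23y 1m"]))
    # Change in Jd51+ between 1970 and 2015, taken from the table.
    print(months_to_age(age_to_months("21y 3m") - age_to_months("17y 0m")))
```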


That four year three month increase at the Jd51+ level is interesting. Stables not being able to pick and choose, and having to accept guys who took more time to decide sumo was an option, is one reason I guess. A higher number of university guys starting down there too, maybe?

In general deshi who start out of high school or university don't hang around for long below sandanme, so they're not going to affect the average jonidan age by much. The no-hopers who are old on-join but have no sumo experience probably do contribute to it, but my impression is that they often quit rather quickly, so their overall number on the banzuke doesn't build up too much and they also shouldn't play much of a role.

I suspect your previous suspicion was partly correct...my impression is that it's become more fashionable to become a professional rikishi now if you like doing sumo but you're not very good at it. Prior to the last 15 years or so I suspect oyakata still felt free to send home those who were enthusiastic but untalented (or maybe just told them to become gyoji or yobidashi instead...), but with the lower applicant numbers these days that's not an option.

But the major driver is that a significant number of rikishi simply stay for a long time now. Especially at the extreme end - in the Jd1-50 section, there are more rikishi older than 32 now than there were rikishi older than 25 back in 1995. But even if we eliminate the effect of those very old guys, the average age has still gone up: The median Jd1-Jd50 age was 20y 0m in 1995, and it's 22y 2m now. Less than the mean increase (20y 10m to 23y 9m as per the above), but still quite remarkable.
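The gap between the mean and the median increase is what one would expect when a few much older rikishi are present. A minimal sketch: the two increases are computed from the figures quoted above, while the sample division is entirely hypothetical and only there to show how a couple of veterans move the mean far more than the median.

```python
from statistics import mean, median


def months(years, extra_months=0):
    """Convert an age in years and months to months."""
    return years * 12 + extra_months


# Increases for Jd1-50 between 1995 and 2015, from the figures in the post.
mean_increase = months(23, 9) - months(20, 10)
median_increase = months(22, 2) - months(20, 0)

# Hypothetical division: mostly young rikishi plus two long-serving veterans.
sample_ages = [
    months(18), months(19), months(20), months(20, 6),
    months(21), months(33), months(36),
]

if __name__ == "__main__":
    print("mean increase (months):", mean_increase)
    print("median increase (months):", median_increase)
    print("sample mean:", round(mean(sample_ages) / 12, 1), "years")
    print("sample median:", round(median(sample_ages) / 12, 1), "years")
```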

I think there's a cascading effect at work now. Makushita-quality rikishi stay in sumo for longer, somewhat increasing the competitive depth at this level, so it takes longer for sandanme-quality rikishi to reach makushita-quality (and in addition, fewer of them are giving up early now), and consequently it also takes longer to go from jonidan-quality to sandanme-quality for those rikishi who aren't already at sandanme-quality (through amateur sumo) when they join.

The obvious conclusion one might draw from that and from Randomitsuki's ratings is that today's sekitori have to fight harder to get to and through makushita, so naturally they should be better and more battle-hardened. And then the cascading effect continues...the lower juryo being a bit better than in the past, the upper juryo even "more better", and so on, up to the yokozuna ranks where the difference would be the most pronounced. It's a fairly intuitive idea, but I must admit I'm also still skeptical of that. The ratings boost over the last ~15 years is just so large that it's hard not to wonder what's going on.

---

As for the very old guys in the lower ranks - is it possible that running a stable (or shisho life in general) is more complex than a few decades ago, and there's more of a need for active veterans serving as player-managers in their stable? There's of course the expansion in the number of stables that went on from the late 1980s until a few years ago, but that alone doesn't seem to explain it, especially now that quite a few heya have had to shut down.


Interesting to follow the - at least to me - new discipline of

Applied Sumology

Maybe the best IBB* thread ever?

Anyway, I can only (try to) follow in awe....

(*) Intra Basho Boredom

Edited by kuroimori

[entering a heya] Not fashionable but certainly easier.

These days there is much more equal opportunity and young people have choices their parents did not have. But economic growth is not guaranteed and as the group of overeducated unemployed youngsters grows and chances of success on the job market decline, some may choose to exchange the rat race for a different kind of competition, hard life notwithstanding.

And when this produces an abundance of talented new Japanese recruits, entering a heya might become more difficult again.

Edited by orandashoho


Maybe the best IBB* thread ever?

Anyway, I can only (try to) follow in awe....

(*) Intra Basho Boredom

Or Inter Basho? :-P


And when this produces an abundance of talented new Japanese recruits, entering a heya might become more difficult again.

I don't think there's really a shortage of newer Japanese talent in sumo - sure, there hasn't been one to get to the very top in a long time, but plenty of them do get to juryo and makuuchi, often relatively young, too. The thing is that the recruiting has almost bifurcated. There are now relatively many deshi who come in very well prepared through amateur sumo and have bright career prospects right from the start, and then there are also a lot of kids who come in lacking almost everything needed to become successful...they're not tall, they're not strong, they haven't done much sumo at all, and consequently it's 99% sure they're never going to get anywhere.

In turn, the number of applicants that would be considered "pure tools prospects" in e.g. baseball has dropped a lot; athletic kids who still need to be taught everything about professional sumo, but who bring the physical makeup needed to become successful. Many of those have probably been lost to other sports, and I doubt they're coming back. We're not going to see years with 150+ shindeshi again, and without that it's unlikely that oyakata can afford to become picky again. As Nishi says it takes a lot of manpower to run a heya, and while that role used to be filled by a revolving cast of talented-but-green rikishi in their early years, now it often falls mostly to the no-hope lifers.

It's a bit anachronistic - imagine a football youth academy where the cook and the equipment guy aren't normal employees but rather classed (and paid) as playing team members even though they have no chance to contribute to team success on the pitch. But as long as there's no push to change how stables work, there are going to be plenty of spots for guys whose primary task it has become to keep heya life going, not to reach new personal heights on the banzuke.

I did ask one jonidan lifer once who endured almost daily hazing and abuse why he stuck with it and his answer was "I like sumo". Doubt if that explains the increase but maybe it's a combination of factors already mentioned.

I don't know if it applies to that specific guy, so as a general comment: Lifer guys like that, if they're filling a substantial role in keeping the stable going, I would hope that they'd be treated with respect in most stables even though they're failing at a core task of doing sumo (achieving competitive success). But I suspect that's not actually the case everywhere.

(Then again, I suppose there are equipment guys at football youth teams who also get mistreated by the jocks. The difference is that those guys have a normal employee contract and all the protections that come with that.)

Edited by Asashosakari


as the group of overeducated unemployed youngsters grows

Which group? You can't just transplant social conditions from one country to another. The undergraduate to lifetime employment system still absolutely dominates no matter what people might want to believe. That group you mentioned is minuscule if it exists at all here.

I am sorry if I misread the situation. I thought that competition for those jobs was very fierce. If the Japanese university admission system is better tuned to the economy than elsewhere, that is great.


Maybe the best IBB* thread ever?

Anyway, I can only (try to) follow in awe....

(*) Intra Basho Boredom

Or Inter Basho? :-P

Your rite, off coarse... :-)


I've kept my own ELO-type rating system for the makuuchi division rikishi since late 2007. It is not an ELO system, merely like one in most key aspects. I'd probably keep an ELO system if I could figure out how to do it for sumo, but I can't, as the lower divisions are just too difficult - and time consuming - for me to assess. Randomitsuki and I have compared some data in the past and our systems produce generally similar results, with some differences. I'm not sure mine is as precise, but for the purpose I keep it for - to track active rikishi strength - it works well for me.

I do know a lot about the ELO system in chess, though. And one thing that is undeniable is that there has been inflation of ELO numbers. For example, the top 9 players in chess strength achieved their top score since 1999. The 10th player is Bobby Fischer. The 11th is Anatoly Karpov. The next 9 again achieved their top score since 1999. In fact, of the top 20 players in ELO strength, 10 of them achieved their highest rating since 2010. Now, is it really possible that in the hundreds of years of chess playing, half of the 20 strongest players in history have played at their peak in the last 6 years? The answer is a pretty obvious "no" to me.

There are literally hundreds of grandmasters now, whereas there were perhaps a few dozen when Jose Raul Capablanca and Emanuel Lasker played. And the top players play each other much more often. (The term "grandmaster" was used as an honorific a century ago and wasn't "earned" by norms (specific criteria) then in the way it is now, so it's a little harder to say exactly who a grandmaster was back then. It's a bit similar to how the title yokozuna evolved). These top 10 ELO strength players undoubtedly know more chess theory than either Capablanca or Lasker did. But are they really "stronger" players? I have no doubt that if Capablanca and Lasker were alive today and had the same computers and methods of training they would blow away people such as Peter Leko (number 17) and Alexander Morozevich (number 7).

Similarly, is Harumafuji (who I love) really the 6th strongest rikishi in sumo history? Is Kakuryu (he of the one yusho) really the 10th strongest rikishi? I think these questions are easily answered in the negative. That isn't to say that the rating system Randomitsuki uses isn't useful and interesting. It's just to note that any rating system is limited (as I'm well aware of every time I update mine), because it applies artificial secondary criteria to something for which there are really only two criteria: the number of yusho a rikishi wins and the number of victories a rikishi wins.


First of all, I highly appreciate Randomitsuki for sharing the results of his ELO ranking system.

Fixing the average at 1500 seems appropriate to me to counter inflation/deflation issues. I made a brief attempt to implement a system myself, and took the opposite path. I modified the update function, with some special cases, and had a complicated starting Elo value function. Finally, the ELO average was stable over time. I confess, it is no less artificial.
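As an illustration of what fixing the average can look like in practice, here is a minimal sketch that shifts every rating by the same amount after an update so that the mean returns to 1500. The additive shift is an assumption; the thread does not say how the 1500 average is actually enforced, and the names and ratings below are made up.

```python
def recenter(ratings, target_mean=1500.0):
    """Shift all ratings equally so that their mean equals `target_mean`.

    Differences between rikishi are untouched, so every Elo win
    expectation stays exactly the same.
    """
    current_mean = sum(ratings.values()) / len(ratings)
    shift = target_mean - current_mean
    return {name: rating + shift for name, rating in ratings.items()}


if __name__ == "__main__":
    # Made-up ratings whose average has drifted below 1500.
    drifted = {"A": 1980.0, "B": 1490.0, "C": 1000.0}
    for name, rating in recenter(drifted).items():
        print(name, round(rating, 1))
```

Because only rating differences matter for predictions, such a shift changes nothing about who is expected to beat whom. What it does change is how peak ratings from different eras line up, which is the point under debate here.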

In my opinion, historical ELO comparisons show the relative strength of the Yokozuna, not the absolute. Relative to the peers of the respective period, of course. It simply doesn't answer who did the better Sumo, or even whether Taiho would beat Hakuho if he arrived with a time machine at the same age.

However, it precisely shows the level of dominance. It also shows the pattern of strength development over time. It shows the structure of the Banzuke much better than the actual ranks themselves.

Viewed from this perspective, deflation is not that big an issue at all.

I still think, Elo's system is more appropriate for Sumo than for chess. The system in chess suffers from the highly divergent number and quality (strength of the opponent) of the matches. In Sumo, most Sekitori have 90 matches per year, with opponents very similar to the own strength. Lower division rikishi about 42. It should be much more reliable.

Now when cyborg sumo starts your analogy will have more weight.

**cough** Aminishiki **cough**

Edited by Benevolance

Nice topic. It’s really too bad that we will not get to see these dream matches in reality.


Excellent work.

I understand ELO ratings have a tendency to inflate over time. But chart 3 controls for this, and interestingly Hakuho remains far ahead of the pack, leading Taiho, Kitanoumi and Chiyonofuji.

Edited by HenryK


I think you have the 90's too low and are forgetting about how important being in a good stable was then. Akebono and Musashimaru had much much harder paths to victory than Takanohana and Wakanohana did. I also think the 2000's and up are way too high.

Edited by rzombie1988


I changed my prediction system and don't keep track of anything that can be used to assess someone's peak rating.  While I have the ability to run the ratings as of any date, I don't keep track of them and I don't have a need for old ratings after new results come about.  I don't even really look at the rating numbers, but just the outputs after all the calculations regarding who to pick for which game, which take into account probable scheduling and head-to-head results as well as raw rating (and are influenced somewhat by each game's rules).

On 21/08/2015 at 07:02, ScreechingOwl said:

I do know a lot about the ELO system in chess, though. And one thing that is undeniable is that there has been inflation of ELO numbers. For example, the top 9 players in chess strength achieved their top score since 1999. The 10th player is Bobby Fischer. The 11th is Anatoly Karpov. The next 9 again achieved their top score since 1999. In fact, of the top 20 players in ELO strength, 10 of them achieved their highest rating since 2010. Now, is it really possible that in the hundreds of years of chess playing, half of the 20 strongest players in history have played at their peak in the last 6 years? The answer is a pretty obvious "no" to me.

What I like about chess is that we have an "objective" (with an asterisk) method of verifying Elo scores: by comparing them to engine assessment of playing strength. There's an asterisk because obviously a chess engine is not really objective at all, and as they get improved over time they might yield different results. However, modern engines have a proven track record of assessing positions accurately, so it's a good approximation if nothing else.

Unfortunately I forgot who did research with this method. I need to look this up again, because it was very interesting work. As I recall, the main conclusion was that there was some rating inflation, but not very much. I personally don't find it that hard to believe that chess players have gotten much better in recent years with the advent of modern computer analysis which totally revolutionized preparation work. A footnote is also that since 1985, far more lower rated players started getting FIDE ratings, which had the effect of transferring extra points to the higher rated players.

edit: believe I found the study I was looking for: http://www.chessanalysis.ee/Quality of play in chess and methods for measuring.pdf

His conclusion seems to be that FIDE ratings inflated by about 5 points per decade since 1970.
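To put that figure in perspective, here is a tiny sketch that removes a linear inflation of 5 points per decade from a rating. Treating the inflation as exactly linear is a simplification, and the 2800 rating in the example is just a round hypothetical number.

```python
def inflation_adjusted(rating, year, points_per_decade=5.0, base_year=1970):
    """Express a rating in `base_year` terms, assuming linear inflation."""
    decades = (year - base_year) / 10.0
    return rating - points_per_decade * decades


if __name__ == "__main__":
    # Hypothetical 2800 rating achieved in 2015.
    print(inflation_adjusted(2800, 2015))
```

Over 45 years that comes to a little over 20 points, which is small compared with the rating gaps between eras discussed earlier in the thread.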

Edited by dada78641
