FargoRate question, time?

poolscholar

AzB Silver Member
Does time factor into the rating calculation or robustness at all?

For example, Wu, Jiaqing is the world number one, but we have a lot more recent results from other top players: Shane, Dennis, Alex, etc.

I'd think that old results (a few years old or older) would be weighted less than recent results, or would perhaps just decay in robustness. I guess I'm not sure if the system is supposed to define relative skill over a certain period of time, or forever. It would seem silly to enter results from more than 10 years ago, as players get old and also go inactive over time.

There is some sort of time factor in this system:
https://en.wikipedia.org/wiki/Glicko_rating_system
 
Mike Page has already explained that matches decay with a half-life of three years: games nine years old count 1/8th as much as current games.
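That half-life weighting is easy to sketch. This assumes a plain exponential decay, the simplest form consistent with the description above; the actual FargoRate implementation may differ:

```python
# Sketch of time-decay weighting with a 3-year half-life.
# Assumption: plain exponential decay; FargoRate's actual scheme may differ.

def game_weight(age_years: float, half_life_years: float = 3.0) -> float:
    """Relative weight of a game played age_years ago."""
    return 0.5 ** (age_years / half_life_years)

print(game_weight(0))  # 1.0: current games count fully
print(game_weight(3))  # 0.5: half weight after one half-life
print(game_weight(9))  # 0.125: three half-lives -> 1/8th, as stated above
```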
 
I'm going to articulate three different questions, Dave, two of which I think you might be asking.

(1) Is Jiaqing Wu REALLY currently performing as the best player in the world?
(2) Does FargoRate include a reasonable time-decay factor generally?
(3) Is there a legitimate reason to keep old data in the system?

I'll focus on the first one now.

(1) The answer to this will be surprising to many. It is a resounding yes. We have nearly 500 games for Mr. Wu from just the last 12 months, and nearly all of these have been played against top-100-in-the-world players. Against this top-100 crowd, he can consistently give 3 games on the wire in a race to 10. He is 30 to 19 against the Ko brothers, and that is not an anomaly. So he was a 16-year-old phenom a decade ago, and we seem not to have heard much since.

What is going on? Why would nobody really be talking about this player if it weren't for FargoRate? The answer is that we, as a community, put irrational weight on very few games and ignore the others. Ko Pin Yi and Albin Ouschan are at the front of our minds because they have won key single-elimination events. But in some cases the players who win these events may have been one game or one roll away from not even making the elimination rounds. We basically ignore most of the games, and we treat someone who lost on the hill to SVB in the first round of the elimination stage as completely forgettable, even though that player may be playing top world-class pool.

Below is Wu's record from the last 12 months with the world rank of his opponents in parentheses.

I want to be clear that no person on planet Earth has a record that rivals this one.

And this, once again, is games just from the past 12 months.
 

Attachments

  • Screen Shot 2016-06-13 at 11.37.04 PM.png (71.2 KB)
Mike, thanks for responding. So far you have answered every question thrown at you, and not only answered but always in a clear, concise manner. I applaud you.
 
Good info, thanks. On another note, do you throw out any results?

Let's say two players have equal ratings and both are established. They play a race to 100, but one guy got sick the night before and plays horribly. He loses 100-50. The chance of this result is incredibly low, close to zero. This would seem to be an anomaly.
 
I am no statistics whiz, but if they play even, isn't 100-50 equally as likely as 50-100? If so, I feel it's not the aberration you think.
 
Yes, it is equally likely.

It seems that a score line of 24-9 (see above, Wu vs. Chang) is a pretty unlikely result for two very strong players.

In fact, if you enter 24-9 (a 13-game spot) into the FargoRate calculator with Wu at 827 and Chang at 803, Wu would only win about 1% of the time. I'm guessing this is fine in a very large data set, but could it skew smaller data sets?

A real-world data collection example would be this: what happens if someone fat-fingers the score entry and suddenly someone has a score of 77-5 instead of 7-5? Statistical methods might throw out this outlier because it is so far off the average result.

I'm not a math/stats expert, so I can't speak to how someone would identify outliers vs. valid results. Just an interesting question to me.

http://pareonline.net/getvn.asp?v=9&n=6
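The roughly-1% figure above can be sanity-checked. FargoRate is built so that a 100-point rating gap corresponds to 2-to-1 game-win odds; here is a minimal sketch under that assumption (the function names are made up for illustration, not anything from FargoRate's software):

```python
from math import comb

def game_win_prob(rating_a: float, rating_b: float) -> float:
    # FargoRate convention: every 100 rating points doubles the game-win odds.
    return 1.0 / (1.0 + 2.0 ** (-(rating_a - rating_b) / 100.0))

def race_win_prob(p: float, need_a: int, need_b: int) -> float:
    """P(A reaches need_a wins before B reaches need_b),
    with A winning each game independently with probability p."""
    # A wins the race iff A takes at least need_a of the first
    # need_a + need_b - 1 games.
    n = need_a + need_b - 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(need_a, n + 1))

p = game_win_prob(827, 803)               # Wu wins ~54% of games vs Chang
print(round(race_win_prob(p, 24, 9), 3))  # on the order of 1-2%
```

So a 24-9 race with a 13-game spot really is a 1-in-50-to-1-in-100 event for these two ratings, consistent with the calculator.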
 
... They play a race to 100 but one guy got sick the night before and plays horribly. He loses 100-50. ... This would seem to be an anomaly.
It is a dangerous, slippery slope if the statistician starts to look for reasons to exclude data. Maybe the "sick" player was reacting to his chemo session and this is his new normal.

It is better to look only at the numbers.

But if the FargoRate team wanted to, they could extract a player's "beta" from the match results which would indicate how erratic their scores were in single matches. A player who always won 9-0 and always lost 0-9 almost regardless of who he was playing would have a very high beta.

(The "beta" of a stock describes how volatile its price is.)
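The "beta" idea could be sketched as an overdispersion ratio: how spread out a player's per-match game-win fractions are compared to what a plain binomial model predicts. Everything below (the name volatility_beta, the exact formula) is an illustration, not anything FargoRate computes:

```python
import statistics

def volatility_beta(match_results, p: float = 0.5) -> float:
    """Observed variance of per-match game-win fractions divided by the
    variance a binomial model predicts. Near 1 is ordinary variation;
    much greater than 1 is the streaky always-9-0-or-0-9 player."""
    fractions = [won / (won + lost) for won, lost in match_results]
    observed = statistics.pvariance(fractions)
    expected = statistics.mean(p * (1 - p) / (won + lost)
                               for won, lost in match_results)
    return observed / expected

# The streaky player described above: always 9-0 or 0-9
print(volatility_beta([(9, 0), (0, 9), (9, 0), (0, 9)]))  # ~9
# A steadier player with the same 50% overall game-win rate
print(volatility_beta([(5, 4), (4, 5), (5, 4), (4, 5)]))  # ~0.11
```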
 
... What happens if someone fat-fingers the score entry and suddenly someone has a score of 77-5 instead of 7-5? ...

Fat-fingers issue: we have done various internal checks, like requiring that in a race-to-5 match the total number of games be 5, 6, 7, 8, or 9. That helps catch a 5-3 score being reported as 5-33.

We might also be flagged by an unusual result to check for errors. For instance, if 50 games from a tournament for pro player Mike Davis get assigned in error to low-level league player Mike Davis from Peoria, IL, then being flagged to check it is reasonable.
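The race-to-5 check described above amounts to a couple of inequalities; a minimal sketch (the function name is mine, not FargoRate's):

```python
def plausible_race_score(race_to: int, winner_games: int, loser_games: int) -> bool:
    """Fat-finger check for a race to N: the winner has exactly N games,
    the loser fewer, so the total falls between N and 2N-1 inclusive
    (5..9 for a race to 5, as in the post above)."""
    return (winner_games == race_to
            and 0 <= loser_games < race_to
            and race_to <= winner_games + loser_games <= 2 * race_to - 1)

print(plausible_race_score(5, 5, 3))   # True:  5-3 is a legal race-to-5 score
print(plausible_race_score(5, 5, 33))  # False: 5-3 fat-fingered as 5-33
```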
 
I asked a related question on another thread, but I'll ask it here again:

How far back does it go? And do all the tours (Mezz, Joss, Predator) give information to the FargoRate system?

How about leagues? BCAPL/CSI, APA, VNEA, TAP, etc.

I play on a local San Diego team (growing in size)... How do we get those scores into FargoRate? They use something similar to NPL / Argonne.

Freddie <~~~ rateless
 
It is a dangerous, slippery slope if the statistician starts to look for reasons to exclude data. ...

It's not that dangerous to remove outliers if you have a known distribution, and it may be useful if you have a small sample size. For example, M8 leagues take the last 20 match scores and then drop the best couple of matches and the worst couple of matches before calculating a rating. I'm guessing most players' match scores are rather typical, bell-curved.

FargoRate probably requires enough data that wildly inaccurate ratings are unlikely. However, I would be interested to see what Wu's rating would be if you dropped some of the lopsided results from the data set.
 
On the other side of the coin, remove the close matches, as they may be outliers.


 
Right... you don't really know if a score is valid or an outlier until you have enough match scores. After 20 matches you could probably notice a couple of flukes just by visual inspection.

Also, people have "starter ratings" which are supposed to approximate their skill. So if a 525 starter rating gets crushed by a pro, it's probably not a fluke. Similarly, if they have an even match with a 525, it's probably not a fluke.
 
Wu's record against Chang is actually two different matches:

11 to 2 at the CBSA tournament in China
13 to 7 at the Trailer tournament following the China Open

I don't know which you think we should remove, or whether we should remove Alex's 15 to 2 win against Deuel last weekend or Dennis's 15 to 6 win against Shane. What if Chang beat Wu in a match? Should we remove that?
 
... However, I would be interested to see what Wu's rating would be if you dropped some of the lopsided results from the data set.
One of the results appeared to have about a 1% chance of being as lopsided as it was. I think that's not far enough from the mean to throw out, especially if there are 20 or so matches recorded. Vandenberg's result appears to be lopsided in the other direction; would you also throw it out?

In my former day job I was faced with measurements of components in which a few items came in at 15 or 20 sigma from the mean of the "normal" components. I was willing to exclude those from characterization of the "true" average. However, the presence of such a tail on the distribution may indicate something very unusual is going on with either the process or the measurement and it is cleaner to fix what is wrong rather than "fix" the data.

For the odds of the Yu Lung Chang result, I think you need to do the calculation differently. They played a total of 33 games. The expected number of wins for Chang out of 33 games is close to 15, but he won 9. The sigma on that number is roughly sqrt(33)/2, or 2.87 games. The observed result is only about 2.14 sigma from the expected result. That is roughly a 1-in-30 occurrence if you include both tails.
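The arithmetic above can be replayed directly. The only added assumption is FargoRate's 2-to-1-odds-per-100-points convention, used to get Chang's per-game chance; the rest follows the post:

```python
from math import erf, sqrt

n, observed = 33, 9                              # Chang won 9 of 33 games vs Wu
p = 1.0 / (1.0 + 2.0 ** ((827 - 803) / 100.0))   # Chang's per-game chance, ~0.46
expected = n * p                                 # ~15.1, "close to 15"
sigma = sqrt(n) / 2                              # ~2.87 games
z = (expected - observed) / sigma                # ~2.1 sigma below expectation

# Two-tailed normal probability of a deviation at least that large
two_tail = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))
print(round(z, 2), round(two_tail, 3))           # about 2.1 sigma, ~0.03 (1 in 30)
```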
 
Bob and Mike can have, and more than likely have had, some interesting conversations. They both understand this.


 
... I don't know which you think we should remove, or whether we should remove Alex's 15 to 2 win against Deuel last weekend or Dennis's 15 to 6 win against Shane. What if Chang beat Wu in a match? Should we remove that?

I'm sure you can figure out some criteria; you're the math guy.

https://en.wikipedia.org/wiki/Outlier
 
... The observed result is only about 2.14 sigma from the expected result. That is roughly a 1-in-30 occurrence if you include both tails.

I just punched a race into the FargoRate calculator. I could have been off by a game or so, but I doubt it's 3%.

Mike says one of Wu's matches was 11-2 vs. Chang.

If you put in a race to 11 vs. a race to 3, the odds of Wu winning are 2.3%. For even-rated players the chance drops to 1%. Maybe they are even without that result? In a small sample of 20 matches, I'd think dropping this might be useful, but again, I'm not a stats expert.
 
... In a small sample size of 20 matches ...
The sample size is the number of games, which is the number of independent events (pretty much, in the statistical sense, at least at nine ball) being tallied. If a match is a race to 100, you will have nearly 200 independent events.

As mentioned above, Wu has about 500 games within the last year. I think that is a pretty good sample. Does it give a perfectly accurate measure of his rating? No; of course such a thing is impossible to achieve.

A more interesting problem is to estimate how much error the rating is likely to have. It is possible to calculate that. My very rough calculation says that 500 games gives a rating accurate to ±9 rating points (one sigma).
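For the curious, here is a back-of-the-envelope version of that error estimate. It ignores decay weighting and uncertainty in the opponents' ratings, so it only reproduces the order of magnitude, not the exact ±9:

```python
from math import log, sqrt

def rough_rating_sigma(n_games: int, p: float = 0.5) -> float:
    """Crude one-sigma rating error from n_games against opponents whose
    ratings are treated as exactly known. A Fargo rating gap is
    100 * log2(p / (1 - p)), so propagate the binomial error in the
    observed win fraction p through that formula."""
    sigma_p = sqrt(p * (1 - p) / n_games)                  # error in win fraction
    d_rating_dp = (100 / log(2)) * (1 / p + 1 / (1 - p))   # derivative of the rating
    return d_rating_dp * sigma_p

print(round(rough_rating_sigma(500)))  # ~13 points: same ballpark as the +-9 above
```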
 