How to calculate your Fargo performance in a particular tournament?

Does anybody know if there is a way to calculate your speed you played at, Fargo wise, in a tournament that has finished? I've always been curious about this. When my Fargo goes up, it only moves a few points (1200 games in system, doesn't move much), but would be curious how I performed for certain individual tournaments.
Yes
 
Does anybody know if there is a way to calculate your speed you played at, Fargo wise, in a tournament that has finished? [...]

DigitalPool does it per tournament.

Under the Performance tab.

Here is an example, effective rating vs current FargoRate (skill level):

Screenshot_20230515_061129_Chrome.jpg
 
So, all you have to do is convince tournament directors to use DigitalPool....

Isaac, who owns and runs it, had direct help from the FargoRate team to get the math right.
So it is legit.
 
I made a web page that does this, here:


Hope somebody finds it useful!
I believe the method that digitalpool.com uses and the method used on Tom Kerrigan's home page differ. By experimenting a few times, I can replicate outputs on Tom Kerrigan's website by maximizing the likelihood function over all games played against all (different) opponents, and replicate the digitalpool.com result by maximizing the likelihood function over all games played against a single (theoretical) opponent whose FargoRate F_A is a games-weighted average of the opponents' FargoRates. Below is a link to a Desmos calculator where you can experiment yourself. You can type in FargoRates and games won/lost for up to 12 opponents, and you can see how the results are calculated by scrolling down the left side and opening the folders.

https://www.desmos.com/calculator/yegsxondiu

In many situations the two methods produce similar results, but the example I used shows the difference can become significant (651 vs 591). I believe the method on Tom's website is more in tune with FargoRate, though Mike Page would be the authority on this. That method is also a bit more challenging to implement, in that I had to use Newton's iterative method to compute a solution, though the solution essentially converges in just a few iterations. The digitalpool method has an easy closed-form solution: F_A + 100 · log(total wins / total losses) / log 2.
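That closed form is simple enough to sketch in Python, assuming the standard Fargo model in which a 100-point rating gap implies 2:1 game odds (the function name and the `(opponent rating, wins, losses)` match format are mine):

```python
import math

def digitalpool_performance(matches):
    """Single-representative-opponent method: games-weighted average of
    opponent ratings (F_A), plus 100 * log2(total wins / total losses)."""
    total_games = sum(w + l for _, w, l in matches)
    total_wins = sum(w for _, w, l in matches)
    total_losses = total_games - total_wins
    # Games-weighted average opponent rating: the theoretical opponent F_A
    f_a = sum(r * (w + l) for r, w, l in matches) / total_games
    return f_a + 100 * math.log2(total_wins / total_losses)

# Example: you go 6-3 against a single opponent rated 500.
# Doubling their game count puts you 100 points above them.
print(digitalpool_performance([(500, 6, 3)]))  # 600.0
```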
 
I believe the method that digitalpool.com uses and the method used on Tom Kerrigan's home page differ. [...]

Very nice. Good explanation, and I agree with this.
 
Does that include the robustness of the opponents?
I can see that it doesn't.

Here is an example of the two approaches. Both assume opponents are providing resistance implied by their rating.

You play two matches: 30 games against a 500 and 12 games against a 600.
Against the 500 you win 20 to 10. At this point you're looking like a 600, right?
Then against the 600, you win 8 to 4.
For that one match you're looking like a 700, but over fewer games than the stretch where you looked like a 600.

Intuitively, you're performing higher than 600 but lower than 650.

If we say you performed like a 600 for 30 games and like a 700 for 12 games and make a weighted average, we get 628.6. That's the weighted average result.
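That weighted average is easy to check numerically; here is a quick sketch, taking the per-match performance to be opponent rating plus 100·log2(wins/losses) under the Fargo model:

```python
import math

# (opponent rating, games won, games lost) for the two matches above
matches = [(500, 20, 10), (600, 8, 4)]

# Per-match performance rating and the number of games it is based on
perf = [(r + 100 * math.log2(w / l), w + l) for r, w, l in matches]

# Average the per-match performances, weighted by games played
weighted_avg = sum(p * g for p, g in perf) / sum(g for _, g in perf)
print(round(weighted_avg, 1))  # 628.6
```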

The other way to do it (what DCMike and Tom do) requires you first make a guess of the performance rating and compare the expected wins to the actual wins.

If we guess you're a 600, then your actual 20 wins in the first match match the expected 20 wins. But in the second match we would have expected you to win 6 of the 12 games, and you actually won 8. So your aggregate (actual − expected) is 2.0 games.

If we bump the guess up to 628.6, then the amount you underperformed in the first match (1.28 games) is a smidge smaller than the amount you overperformed in the second match (1.41 games). Seems like we need to bump up just a little more.

At 630.7, it is -1.37 and +1.36

When DCMike suggested this latter approach is more in tune with FargoRate, he was recognizing that this kind of balancing, done simultaneously across all the players and games in the database, is what the daily FargoRate optimization is.
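The balancing described above can be sketched with a simple bisection on the rating (names are mine; Newton's method, as DCMike used, would converge faster):

```python
def win_prob(rating, opp_rating):
    # Fargo model: a 100-point rating gap implies 2:1 game odds
    return 1.0 / (1.0 + 2.0 ** ((opp_rating - rating) / 100.0))

def performance_rating(matches, lo=100.0, hi=1000.0, tol=1e-6):
    """Find the rating at which total expected wins equal actual wins,
    i.e. the balancing point described above, by bisection."""
    actual_wins = sum(w for _, w, _ in matches)
    def excess(r):
        # Expected wins minus actual wins; increases with r
        return sum((w + l) * win_prob(r, opp)
                   for opp, w, l in matches) - actual_wins
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# The example above: 20-10 against a 500, then 8-4 against a 600
print(round(performance_rating([(500, 20, 10), (600, 8, 4)]), 1))  # ~630.7
```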
 
Does that include the robustness of the opponents?
No, robustness is not included in either, and this brings up an important point in addition to the obvious one. The obvious point is that, for either method, a performance rating can be considered reliable for assessing performance only if opponent robustness numbers are high.

The second point is this: the two methods are fundamentally dissimilar, and because robustness values are not used, the two methods DO NOT become more "in agreement" as opponent FRs become more robust. Therefore, if one wishes to calculate a performance rating for purposes beyond just fun information, say for tweaking tournament handicaps, the choice of method can make a big difference, even if all opponent FRs are extremely robust.

Here's another example of how dissimilar the two methods can be: If you play Andy (a player with a 200 FR) and beat him 16-1 and then play Josh Filler (841 FR) and tie him 1-1, then
  • The single representative player method assigns you a performance rating of 576.2
  • The multiple player method (more "FR-like") assigns you a performance rating of 694.9
These values are produced whether Andy and Josh both have 200 games in the system or 20,000 games in the system. Interestingly, these same ratings result even if you played more games against both of them, provided the wins and losses for each are scaled by a common multiple, e.g. you beat Andy 64-4 and tie Josh 4-4.
 
...
The other way to do it (what DCMike and Tom do) requires you first make a guess of the performance rating and compare the expected wins to the actual wins.
...
Yes, basically correct.

My page has a function which computes the probability of a rating, given the data entered.

For example, let's say you play a 500 and win two games and lose one game.

What's the probability that you're a 500 (50% odds of beating a 500)? 0.5 * 0.5 * 0.5 = 0.125

What's the probability that you're a 600 (66% odds of beating a 500)? 0.66 * 0.66 * 0.33 = 0.143

So the 600 rating better matches the data entered.

I used a binary search to find the most probable rating. It requires more computation than Newton's method but is easier to implement.
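Here is a sketch of that likelihood-plus-binary-search approach under the Fargo model (names are mine, and a slope-sign test stands in for whatever comparison the real page uses):

```python
import math

def log_likelihood(rating, matches):
    """Log-probability of the observed game results at a given rating,
    using the Fargo win-probability model (100 points = 2:1 odds)."""
    ll = 0.0
    for opp, wins, losses in matches:
        p = 1.0 / (1.0 + 2.0 ** ((opp - rating) / 100.0))
        ll += wins * math.log(p) + losses * math.log(1.0 - p)
    return ll

def most_probable_rating(matches, lo=100.0, hi=1000.0, tol=1e-6):
    # Binary search on the slope sign of the (unimodal) log-likelihood
    eps = 1e-4
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if log_likelihood(mid + eps, matches) > log_likelihood(mid, matches):
            lo = mid  # likelihood still rising; peak is to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

# The example above: two wins and one loss against a 500.
# The best fit is exactly 100 points above the opponent.
print(round(most_probable_rating([(500, 2, 1)])))  # 600
```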
 
[...]

Here's another example of how dissimilar the two methods can be: If you play Andy (a player with a 200 FR) and beat him 16-1 and then play Josh Filler (841 FR) and tie him 1-1, then
  • The single representative player method assigns you a performance rating of 576.2
  • The multiple player method (more "FR-like") assigns you a performance rating of 694.9
[...]
Though this is a big difference, it is important to note that the likelihood function is quite flat when the opponents are either much weaker or much stronger. Look at the top curve. The most likely rating is around 700. But you can go 100 points in either direction and the likelihood is still more than 80% of what it was at the top. In other words these 19 games don't nail down the person's rating very well.

Contrast that with the likelihood curve below it. This rating is also based on 19 games, but they come from going 9-10 against an opponent rated 711. Here we have more confidence that the most probable rating is not too far off.

The expression for the curvature/flatness at the top weights each game by p*(1-p). When an opponent is close, that's 0.5*0.5 = 0.25. When an opponent is further away, it might be 0.9*0.1 = 0.09. So the curvature is lower (the curve is flatter) when the match is lopsided.

1685035253658.png
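The p*(1-p) weighting is easy to see numerically; a small sketch (function name mine):

```python
def curvature_weight(rating, opp_rating):
    """Per-game weight in the likelihood curvature at the peak: p*(1-p),
    where p is the Fargo-model probability of winning a single game."""
    p = 1.0 / (1.0 + 2.0 ** ((opp_rating - rating) / 100.0))
    return p * (1.0 - p)

# A near-even match contributes close to the maximum weight of 0.25;
# a lopsided one (e.g. a 700 playing a 200) contributes far less.
print(round(curvature_weight(700, 711), 3))  # ~0.25
print(round(curvature_weight(700, 200), 3))  # ~0.029
```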
 
Though this is a big difference, it is important to note that the likelihood function is quite flat when the opponents are either much weaker or much stronger. [...]

This makes sense... and I imagine that in some extreme cases the output of the direct method would be so unrealistic as to be considered pathological, but not so in the other case, due to the filtering effect of averaging.
 