Does that include the robustness of the opponents?
I can see that it doesn't.
Here is an example of the two approaches. Both assume opponents are providing resistance implied by their rating.
You play two matches: 30 games against a 500 and 12 games against a 600.
Against the 500 you win 20 to 10. Winning 2 to 1 is what a 100-point rating gap predicts, so at this point you're looking like a 600, right?
Then against the 600, you win 8 to 4.
For this one match, you're looking like a 700. But that is over fewer games than the stretch where you were looking like a 600.
Intuitively, you're performing higher than 600 but lower than 650.
If we say you performed like a 600 for 30 games and like a 700 for 12 games and make a weighted average, we get 628.6. That's the weighted average result.
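Here is a rough sketch of that arithmetic in Python. It assumes a 100-point gap means 2-to-1 win odds, which is what the 20-to-10 and 8-to-4 numbers above imply; the function and variable names are made up for illustration.

```python
import math

# Assumption: a 100-point rating gap corresponds to 2-to-1 win odds.
POINTS_PER_DOUBLING = 100

def match_performance(opponent_rating, wins, losses):
    """Rating at which this win/loss ratio would be the expected result."""
    return opponent_rating + POINTS_PER_DOUBLING * math.log2(wins / losses)

# (opponent rating, wins, losses) for the two matches in the example
matches = [(500, 20, 10), (600, 8, 4)]

total_games = sum(w + l for _, w, l in matches)

# Weight each match's performance rating by the number of games played.
weighted_average = sum(
    match_performance(opp, w, l) * (w + l) for opp, w, l in matches
) / total_games

print(round(weighted_average, 1))  # 628.6
```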
The other way to do it (what DCMike and Tom do) requires that you first guess a performance rating and then compare the expected wins to the actual wins.
If we guess you're a 600, then your actual 20 wins in the first match equal the expected 20 wins. But in the second match we would have expected you to win 6 of the 12 games, and you actually won 8. So your aggregate (actual - expected) is +2.0 games, which says the guess is too low.
If we bump up the guess to 628.6, then the amount you underperformed in the first match (1.28 games) is a smidge smaller than the amount you overperformed in the second match (1.41 games). Seems like we need to bump up just a little more.
At 630.7, it is -1.37 and +1.36, so the two essentially cancel.
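The guess-and-adjust steps can be automated with a simple bisection search. This is only a sketch of the idea for these two matches, not how FargoRate itself is implemented; it again assumes 100 points means 2-to-1 odds, and every name in it is hypothetical.

```python
# Assumption: a 100-point rating gap corresponds to 2-to-1 win odds.
POINTS_PER_DOUBLING = 100

def expected_wins(guess, opponent_rating, games):
    """Games a player rated `guess` is expected to win against this opponent."""
    win_prob = 1 / (1 + 2 ** ((opponent_rating - guess) / POINTS_PER_DOUBLING))
    return games * win_prob

def surpluses(guess, matches):
    """Actual minus expected wins, one value per match."""
    return [w - expected_wins(guess, opp, w + l) for opp, w, l in matches]

def balanced_rating(matches, lo=0.0, hi=1000.0, iterations=60):
    """Bisect for the guess where over- and underperformance cancel out."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if sum(surpluses(mid, matches)) > 0:
            lo = mid  # won more than expected overall: guess is too low
        else:
            hi = mid  # won fewer than expected overall: guess is too high
    return (lo + hi) / 2

# (opponent rating, wins, losses) for the two matches in the example
matches = [(500, 20, 10), (600, 8, 4)]

for guess in (600.0, 628.6, balanced_rating(matches)):
    per_match = ", ".join(f"{s:+.2f}" for s in surpluses(guess, matches))
    print(f"guess {guess:.1f}: {per_match}")
```

The search bounds of 0 and 1000 are arbitrary; any range that brackets the answer works, because the total surplus only goes down as the guess goes up.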
When DCMike suggested this latter approach is more in tune with FargoRate, he was recognizing that this kind of balancing, done simultaneously across all the players and games in the database, is what the daily FargoRate optimization does.