[...] Is there a theoretical reason to believe that an exponent rather than an offset gives a better result?
[..]
I know of none, but comparison with simulations shows the HFS with optimized exponent works quite well.
Here's what we did.
A "simulation" means playing out a tournament where each match winner is determined by flipping a bent coin in the computer. Knowing the rating difference between the players and that it is a race to 11 (what we used) tells us how much to bend each coin.
We did 100,000 64-player SE simulations for each of several tournament fields (Florida Open, US Open, Peri, etc.) with random draw, and for each tournament we assigned points to finish places of 1, 2, 4, 8, 16, or 32 for R32 finish, R16, QF, SF, runner-up, and winner. Then for each player we compute an average number of points per tournament. So this tracks typical $$ distribution pretty well.
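For anyone who wants to try the bent-coin idea, here is a minimal sketch in Python. It is not the code we ran. It assumes the usual Fargo convention that a 100-point gap means 2:1 odds of winning any one game, and the function names (`game_win_prob`, `match_win_prob`, `simulate_once`, `average_points`) are made up for illustration.

```python
import random
from math import comb

def game_win_prob(rating_gap):
    # Assumption: a 100-point rating gap means 2:1 odds of winning a single game.
    return 1.0 / (1.0 + 2.0 ** (-rating_gap / 100.0))

def match_win_prob(rating_gap, race_to=11):
    # How far to bend the coin: chance the first player wins the race.
    p = game_win_prob(rating_gap)
    q = 1.0 - p
    # Sum over k, the number of games the opponent wins before losing the race.
    return sum(comb(race_to - 1 + k, k) * p ** race_to * q ** k
               for k in range(race_to))

# Points for losing in each of the six rounds of a 64-player bracket:
# R64 loss (no points), R32 finish, R16, QF, SF, runner-up.
LOSER_POINTS = [0, 1, 2, 4, 8, 16]
WINNER_POINTS = 32

def simulate_once(ratings, rng):
    # Play one random-draw, single-elimination event for a 64-player field.
    assert len(ratings) == 64
    alive = list(range(64))
    rng.shuffle(alive)  # random draw
    points = [0] * 64
    for loser_points in LOSER_POINTS:
        survivors = []
        for a, b in zip(alive[0::2], alive[1::2]):
            a_wins = rng.random() < match_win_prob(ratings[a] - ratings[b])
            winner, loser = (a, b) if a_wins else (b, a)
            points[loser] = loser_points
            survivors.append(winner)
        alive = survivors
    points[alive[0]] = WINNER_POINTS
    return points

def average_points(ratings, n_sims=100_000, seed=0):
    # Average points per tournament for each player in the field.
    rng = random.Random(seed)
    totals = [0] * 64
    for _ in range(n_sims):
        for i, pts in enumerate(simulate_once(ratings, rng)):
            totals[i] += pts
    return [t / n_sims for t in totals]
```

Calling `average_points(ratings)` with a list of 64 ratings gives the per-player averages described above. Every simulated event hands out the same total number of points, so a player's average only goes up when the rest of the field gets weaker.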
The first tournament we'll call THE BEAST: the field is the top 64 players in the world by Fargo Rating.
And because THE BEAST is the hardest tournament field, all players will earn fewer points in THE BEAST than in any real-world field. Take YAPP, for instance, because he is in every field. If YAPP or any given player earns more points in a tournament, that's what we mean by a weaker tournament field. YAPP earns 2.86 points per tournament in THE BEAST, and he earns 5.43 points per tournament in the China Open, close to twice as many, i.e., a ratio of close to 2. There is a ratio like this for all 24 players who played both in the China Open and in THE BEAST, and the geometric mean of those ratios is our tournament difficulty factor. For the China Open that factor really is 2.0. We express that 2.0 in "rating speak," meaning we call the China Open 100 points in some sense below THE BEAST. Rating-speak (converting a ratio to a rating gap) allows a cleaner comparison to Harmonic Field Strength, which already lives in that rating-type world.
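The arithmetic for that step can be sketched like this. The function names are hypothetical, and the base-2 log conversion is my assumption here: it is the mapping that turns a factor of 2.0 into 100 points, matching the China Open example. Only YAPP's two numbers come from the actual runs.

```python
import math

def difficulty_factor(points_in_event, points_in_beast):
    # Geometric mean of the per-player ratios, over players in both fields.
    common = points_in_event.keys() & points_in_beast.keys()
    log_ratios = [math.log(points_in_event[p] / points_in_beast[p])
                  for p in common]
    return math.exp(sum(log_ratios) / len(log_ratios))

def ratio_to_rating_gap(ratio):
    # Assumed conversion to "rating speak": a factor of 2 is 100 points.
    return 100.0 * math.log2(ratio)

# YAPP alone: 5.43 points per China Open vs 2.86 points per BEAST.
yapp_factor = difficulty_factor({"YAPP": 5.43}, {"YAPP": 2.86})
yapp_gap = ratio_to_rating_gap(yapp_factor)
```

With all 24 common players in the two dictionaries, the same call gives the field's difficulty factor. The geometric mean is the natural average here because the quantities being averaged are ratios.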
If the brute-force tournament difficulty factor and HFS really are measuring the same thing (what we want), there should be a nice straight line relationship between them. A weaker nice-to-have is that the fields are ordered the same way in terms of difficulty (field strength).
Tuning that exponent is really like finding the sweet spot for that top dogs vs depth competition.
This is a good straight line, meaning HFS/0.77 is an easy way to assess and compare tournament field strengths for pro tournaments without doing extensive simulation.