For example, would a rollout with 5184 trials
at xg-roller level be as reliable as a rollout
with 2592 trials at xg-roller+ level and as a
rollout with 1296 trials at xg-roller++ level?
There's a subtle distinction between "precision" and "accuracy."
...
Accuracy is another matter. Murat of all people should understand
that "what the bot thinks the correct play is" is not necessarily
the same as "the correct play"; indeed, in some positions, it is
debatable what "the correct play" is since that can depend on who
your opponent is, what their emotional state is at the time, etc.
But even setting those things aside, suppose for the sake of
argument that we define "the correct play" as what game theorists
would call an (expectiminimax) "equilibrium" play. We can ask whether
stronger settings are more likely to yield the correct play. The
answer is that we can't ever be completely sure, but one can give
heuristic arguments in support of this principle. For example,
equilibrium play has a certain self-consistency property, so you
can "cross-examine" the bot and see its answers are self-consistent.
Experience suggests that stronger settings exhibit greater
self-consistency. Bob Wachtel's book "In the Game Until the End"
has some examples of this. But again, the arguments are only
heuristic, and we certainly can't be completely sure in any
particular instance that stronger settings are giving us more
"accurate" answers.
On December 13, 2023 at 7:27:56 AM UTC-7, Timothy Chow wrote:
If you have a lot of trials then you can be very
confident that you are learning "what the bot
really thinks" and that it is very unlikely to
change its mind even if you increase the number
of trials to infinity.
This isn't necessarily true, and it is incomplete.
While random errors decrease, systematic errors
may increase (accumulate and compound), thus
cause the bot to change its mind.
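
To make the precision/accuracy distinction concrete, here is a toy simulation. It is not a model of any real bot: the true equity, the constant bias, and the noise level are all made-up numbers, and the bias is assumed to stay fixed rather than compound. It only shows that the random part of the error shrinks roughly as 1/sqrt(trials), while a systematic offset is untouched by adding trials.

```python
import random
import statistics

def rollout(n_trials, true_equity=0.10, bias=0.03, noise=1.0, seed=0):
    """Simulate a rollout: each trial returns the true equity plus a fixed
    systematic bias plus zero-mean random noise (all values hypothetical)."""
    rng = random.Random(seed)
    results = [true_equity + bias + rng.gauss(0.0, noise)
               for _ in range(n_trials)]
    mean = statistics.fmean(results)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_err = statistics.stdev(results) / n_trials ** 0.5
    return mean, std_err

for n in (1296, 2592, 5184):
    mean, std_err = rollout(n)
    print(f"{n:5d} trials: mean={mean:+.4f}  standard error={std_err:.4f}")
```

Quadrupling the trials from 1296 to 5184 roughly halves the standard error, but the mean still settles near true_equity + bias, not near true_equity.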
I assume you mean look-ahead plies? Can you (or
someone else) expand on this and explain/clarify
how plies work during play and during rollouts?
I won't argue against self-consistency if you can
prove that your equilibrium play is actually that.
so you can "cross-examine" the bot and see its
answers are self-consistent.
This would be most interesting for me to see. Has
any bot been cross-examined for this and how?
But again, the arguments are only heuristic, and
we certainly can't be completely sure in any
particular instance that stronger settings are
giving us more "accurate" answers.
I argue that we can if we have unbiased bots that
are trained not only through cubeless, single-game
play but also through cubeful and "matchful" play,
eliminating extrapolated cubeful/matchful equities.
The fact that the main protagonist here has a ridiculous interest in hawking fast cars around various US "strips" is irrelevant.
Now that I do, my immediate reaction is that it
sounds really bad. Shouldn't it be the other way
around? That is, evaluate at a higher ply first?
On 12/22/2023 12:18 PM, MK wrote:
I assume you mean look-ahead plies? Can you (or
someone else) expand on this and explain/clarify
how plies work during play and during rollouts?
The GNU team can answer this better than I can. One thing to note
is that during rollouts, the bots will apply some kind of move
filter to screen out unpromising plays. That is, if you perform
a 3-ply rollout, the bot doesn't necessarily evaluate every legal
move at 3-ply and pick the highest-scoring one. It will evaluate
all the options at the lowest ply but then discard a lot of them
as not likely to emerge as the top play.
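
A minimal sketch of the filtering idea described above, assuming a two-stage scheme: score every candidate with a cheap evaluator, keep only the few that are close to the cheap leader, and re-score just those with the expensive evaluator. The function name, the keep and margin parameters, and the equities in the example are hypothetical, not the settings of GNU Backgammon or any other bot.

```python
def filtered_best_move(candidates, cheap_eval, deep_eval, keep=4, margin=0.16):
    """Pick a move by scoring everything cheaply, then re-scoring survivors.

    candidates   : list of candidate moves
    cheap_eval   : fast, low-ply evaluator, move -> equity
    deep_eval    : slow, higher-ply evaluator, move -> equity
    keep, margin : hypothetical filter settings, not real bot defaults
    """
    # Stage 1: rank every legal move with the cheap evaluator.
    ranked = sorted(candidates, key=cheap_eval, reverse=True)
    best_cheap = cheap_eval(ranked[0])
    # Keep at most `keep` moves, and only those within `margin` of the leader.
    survivors = [m for m in ranked[:keep]
                 if best_cheap - cheap_eval(m) <= margin]
    # Stage 2: only the survivors get the expensive evaluation.
    return max(survivors, key=deep_eval)

# Made-up equities: move C looks poor to the cheap evaluator but best to the deep one.
cheap = {"A": 0.30, "B": 0.28, "C": 0.05, "D": -0.20}
deep = {"A": 0.25, "B": 0.31, "C": 0.40, "D": -0.25}
print(filtered_best_move(list(cheap), cheap.get, deep.get))  # B
```

In this example the filter discards C before the deep evaluator ever sees it, so B is chosen. That is the trade-off: far fewer expensive evaluations, at the risk of throwing away a play that only looks good at the higher ply.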
On December 27, 2023 at 5:22:06 AM UTC-7, Timothy Chow wrote:
On 12/27/2023 2:16 AM, MK wrote:
Now that I do, my immediate reaction is that it
sounds really bad. Shouldn't it be the other way
around? That is, evaluate at a higher ply first?
It's done for speed. Each additional ply slows
things down by a factor of (about) 21.
Ah, that magic number 21 again. :) The number
of possible dice rolls at every turn... ;)
But why is the factor imprecise, i.e. "about 21"?
Can't you give us the exact math...?
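
For reference, the 21 comes from counting unordered rolls of two dice. The short sketch below counts them and shows how a naive position count grows if every extra ply multiplies the work by 21. It is only a rough count under that assumption: it ignores caching, pruning, and the fact that the work per roll (for example, the number of legal plays to consider) varies from position to position.

```python
from itertools import combinations_with_replacement

# Unordered rolls of two dice: (1, 1), (1, 2), ..., (6, 6).
distinct_rolls = list(combinations_with_replacement(range(1, 7), 2))
print(len(distinct_rolls))  # 21

# Naive growth in positions to evaluate per extra ply of lookahead.
for extra_plies in range(4):
    print(extra_plies, 21 ** extra_plies)
```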
On January 8, 2024 at 6:55:38 AM UTC-7, Timothy Chow wrote:
The speed at which a complex piece of code
You mean like this one?:
=======================================
GNU Backgammon Manual V1.00.0
10.4.5.4 n-ply Cubeful equities
..... so how does GNU Backgammon calculate cubeful
2-ply equities? The answer is: by simple recursion:
Equity=0
Loop over 21 dice rolls
Find best move for given roll
Equity = Equity + Evaluate n-1 ply equity for resulting position
End Loop
Equity = Equity/36
=======================================
So, you should be able to explain the reason based
on the above pseudo code.
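
One thing worth checking in the quoted pseudocode is how a loop over 21 rolls relates to the final division by 36. A runnable sketch of the same recursion, assuming each of the 21 distinct rolls is weighted by the number of ways it can occur (1 for doubles, 2 otherwise, 36 in total), could look like this. The static_eval and best_move callbacks are hypothetical stand-ins for a real evaluator, and, like the pseudocode, the sketch ignores whose turn it is and the cube.

```python
from itertools import combinations_with_replacement

def weighted_rolls():
    """Yield (roll, weight): doubles occur 1 way in 36, other rolls 2 ways."""
    for a, b in combinations_with_replacement(range(1, 7), 2):
        yield (a, b), (1 if a == b else 2)

def n_ply_equity(position, n, static_eval, best_move):
    """Average the (n-1)-ply equity over all rolls, weighted out of 36.

    static_eval(position) -> equity and best_move(position, roll) -> position
    are hypothetical callbacks, not functions from any real bot.
    """
    if n == 0:
        return static_eval(position)
    total = 0.0
    for roll, weight in weighted_rolls():
        next_position = best_move(position, roll)
        total += weight * n_ply_equity(next_position, n - 1,
                                       static_eval, best_move)
    return total / 36

# Sanity check with a toy "game" whose equity never changes:
# a correctly weighted average must return the same value.
print(n_ply_equity(0.5, 2, lambda p: p, lambda p, roll: p))  # 0.5
```

Without the weights, summing 21 terms and dividing by 36 would shrink the toy equity of 0.5 at every ply instead of preserving it.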
This is not it. Just like dice rolls even out (or can
be forced to artificially even out faster), number
of legal ways to play for given dice rolls at given
positions will also average out.
On 1/11/2024 4:17 AM, MK wrote:
This is not it. Just like dice rolls even out
(or can be forced to artificially even out
faster), number of legal ways to play for given
dice rolls at given positions will average out.
Of course. That's what "approximately" means.
Check your dictionary.
=======================================
GNU Backgammon Manual V1.00.0
10.4.5.4 n-ply Cubeful equities
..... so how does GNU Backgammon calculate cubeful
2-ply equities? The answer is: by simple recursion:
Equity=0
Loop over 21 dice rolls
Find best move for given roll
Equity = Equity + Evaluate n-1 ply equity for resulting position
End Loop
Equity = Equity/36
=======================================
Oh, I almost forgot. There is a kind of rotten easter
egg in the above pseudocode. Let's see how long it
will take for you whizzes to find it...? :)