• Re: Balancing number of plies and number of trials

    From Timothy Chow@21:1/5 to All on Wed Dec 13 09:27:53 2023
    On 12/13/2023 4:47 AM, MK wrote:
    For example, would a rollout with 5184 trials
    at xg-roller level be as reliable as a rollout
    with 2592 trials at xg-roller+ level and as a
    rollout with 1296 trials at xg-roller++ level?
    There's a subtle distinction between "precision" and "accuracy."

    An "accurate" verdict is one that gives the correct answer.

    A "precise" estimate has very little statistical noise.

    Increasing the number of trials increases the precision. If you
    have a lot of trials then you can be very confident that you are
    learning "what the bot really thinks" and that it is very unlikely
    to change its mind even if you increase the number of trials to
    infinity.
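    The precision point can be illustrated with a toy Monte Carlo sketch
    (made-up equity and noise figures, not a real rollout): the statistical
    noise in the average shrinks like one over the square root of the
    number of trials.

```python
import random
import statistics

def rollout_estimate(n_trials, true_equity=0.12, noise=1.0, seed=0):
    """Toy 'rollout' (made-up numbers, not a real bot): each trial
    returns the true equity plus independent game-to-game noise."""
    rng = random.Random(seed)
    samples = [true_equity + rng.gauss(0, noise) for _ in range(n_trials)]
    mean = statistics.fmean(samples)
    stderr = statistics.stdev(samples) / n_trials ** 0.5
    return mean, stderr

# Quadrupling the trials roughly halves the statistical noise.
for n in (1296, 2592, 5184):
    mean, stderr = rollout_estimate(n)
    print(f"{n:5d} trials: equity estimate {mean:+.3f} +/- {stderr:.3f}")
```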

    Accuracy is another matter. Murat of all people should understand
    that "what the bot thinks the correct play is" is not necessarily
    the same as "the correct play"; indeed, in some positions, it is
    debatable what "the correct play" is since that can depend on who
    your opponent is, what their emotional state is at the time, etc.
    But even setting those things aside, suppose for the sake of
    argument that we define "the correct play" as what game theorists
    would call an (expectiminimax) "equilibrium" play. We can ask whether
    stronger settings are more likely to yield the correct play. The
    answer is that we can't ever be completely sure, but one can give
    heuristic arguments in support of this principle. For example,
    equilibrium play has a certain self-consistency property, so you
    can "cross-examine" the bot and see its answers are self-consistent.
    Experience suggests that stronger settings exhibit greater
    self-consistency. Bob Wachtel's book "In the Game Until the End"
    has some examples of this. But again, the arguments are only
    heuristic, and we certainly can't be completely sure in any
    particular instance that stronger settings are giving us more
    "accurate" answers.

    ---
    Tim Chow

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bradley K. Sherman@21:1/5 to tchow12000@yahoo.com on Wed Dec 13 14:46:35 2023
    Timothy Chow <tchow12000@yahoo.com> wrote:
    ...
    Accuracy is another matter. Murat of all people should understand
    that "what the bot thinks the correct play is" is not necessarily
    the same as "the correct play"; indeed, in some positions, it is
    debatable what "the correct play" is since that can depend on who
    your opponent is, what their emotional state is at the time, etc.
    But even setting those things aside, suppose for the sake of
    argument that we define "the correct play" as what game theorists
    would call an (expectiminimax) "equilibrium" play. We can ask whether
    stronger settings are more likely to yield the correct play. The
    answer is that we can't ever be completely sure, but one can give
    heuristic arguments in support of this principle. For example,
    equilibrium play has a certain self-consistency property, so you
    can "cross-examine" the bot and see its answers are self-consistent. >Experience suggests that stronger settings exhibit greater
    self-consistency. Bob Wachtel's book "In the Game Until the End"
    has some examples of this. But again, the arguments are only
    heuristic, and we certainly can't be completely sure in any
    particular instance that stronger settings are giving us more
    "accurate" answers.

    Related:
    |
    | Man beats machine at Go in human victory over AI
    |
    | Amateur exploited weakness in systems that have otherwise
    | dominated grandmasters.
    | ... <https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/>

    --bks

  • From Timothy Chow@21:1/5 to All on Sat Dec 23 08:57:17 2023
    On 12/22/2023 12:18 PM, MK wrote:
    On December 13, 2023 at 7:27:56 AM UTC-7, Timothy Chow wrote:
    If you have a lot of trials then you can be very
    confident that you are learning "what the bot
    really thinks" and that it is very unlikely to
    change its mind even if you increase the number
    of trials to infinity.

    This isn't necessarily true and is indeed incomplete.

    While random errors decrease, systematic errors
    may increase (accumulate and compound), thus
    causing the bot to change its mind.

    No, this is not correct, at least when you are simply extending
    a specific rollout. Systematic errors can indeed accumulate and
    compound over the course of a game, but a rollout trial repeatedly
    samples an entire game, so *each individual* trial is subject to
    the accumulated systematic error. There will be some randomness
    involved from trial to trial, of course; some trials may be "lucky"
    enough to avoid the variations that suffer from a lot of accumulated
    systematic error, while other trials may be "unlucky" enough to hit
    those variations, but in the long run these fluctuations will even
    out, and the rollout will converge. The final result will be an
    average over all accumulated systematic errors.
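    A minimal sketch of the point above, with made-up numbers: if every
    trial is produced by the same flawed evaluator, extra trials shrink the
    random scatter, but the estimate converges to the biased value, not to
    the true equity.

```python
import random
import statistics

def biased_rollout(n_trials, true_equity=0.10, bias=0.05, noise=0.8, seed=1):
    """Toy sketch (made-up numbers): every trial replays the whole game
    with the same flawed evaluator, so an identical systematic bias
    enters each sample along with the random noise."""
    rng = random.Random(seed)
    samples = [true_equity + bias + rng.gauss(0, noise) for _ in range(n_trials)]
    return statistics.fmean(samples)

# More trials shrink the random scatter, but the estimate settles on
# true_equity + bias = 0.15, not on the true equity 0.10.
for n in (100, 10_000, 100_000):
    print(f"{n:6d} trials: {biased_rollout(n):+.4f}")
```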

    I assume you mean look-ahead plies? Can you (or
    someone else) expand on this and explain/clarify
    how plies work during play and during rollouts?

    The GNU team can answer this better than I can. One thing to note
    is that during rollouts, the bots will apply some kind of move
    filter to screen out unpromising plays. That is, if you perform
    a 3-ply rollout, the bot doesn't necessarily evaluate every legal
    move at 3-ply and pick the highest-scoring one. It will evaluate
    all the options at the lowest ply but then discard a lot of them
    as not likely to emerge as the top play.

    I won't argue against self-consistency if you can
    prove that your equilibrium play is actually that.

    The *theoretical* equilibrium play is *defined* in terms of a
    system of equations that expresses self-consistency. If you insist
    on an empirical definition, though, then self-consistency can't be
    proved.

    so you can "cross-examine" the bot and see its
    answers are self-consistent.

    This would be most interesting for me to see. Has
    any bot been cross-examined for this and how?

    I don't know if anyone has done this in a systematic fashion, but
    certainly, if you take some crazy superbackgame or containment
    position, you can observe inconsistency yourself. Note down the
    3-ply equity (for example). Then run through all the possible rolls,
    and note down their 3-ply equities. Average them, and you'll find
    that they don't average out to the original 3-ply equity. This means
    that the 3-ply equity isn't (entirely) self-consistent. In many
    positions, the top play will still be the top play, but in the crazy
    superbackgame positions, this experiment can result in wild swings
    that drastically change the top play.
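    The cross-examination described above can be sketched as follows. The
    `evaluate` and `best_play` callables are hypothetical stand-ins, not a
    real bot API; only the dice arithmetic is concrete, and the sign
    convention (negating the child equity because the opponent is on roll)
    is a modeling assumption.

```python
from itertools import combinations_with_replacement

def roll_weights():
    """The 21 distinct rolls; doubles occur 1 way in 36, non-doubles 2."""
    return {(a, b): (1 if a == b else 2)
            for a, b in combinations_with_replacement(range(1, 7), 2)}

def consistency_gap(position, ply, evaluate, best_play):
    """Difference between evaluate(position, ply) and the probability-
    weighted average of the same evaluator one roll later. A perfectly
    self-consistent evaluator returns a gap of zero."""
    direct = evaluate(position, ply)
    total = 0.0
    for roll, weight in roll_weights().items():
        child = best_play(position, roll, ply)
        total += weight * -evaluate(child, ply)  # opponent on roll: negate
    return direct - total / 36.0
```

    With real n-ply evaluations plugged in, a large gap in a crazy
    superbackgame position is exactly the inconsistency described above.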

    But again, the arguments are only heuristic, and
    we certainly can't be completely sure in any
    particular instance that stronger settings are
    giving us more "accurate" answers.

    I argue that we can if we have unbiased bots that
    are trained not only through cubeless, single-game
    play but also through cubeful and "matchful" play,
    eliminating extrapolated cubeful/matchful equities.

    There are certainly ways to improve the way bots are trained, but it
    will still be true that we won't be *completely* sure that we're getting
    more accurate answers in every position. That would require more
    computing power than is available in the observable universe.

    ---
    Tim Chow

  • From Timothy Chow@21:1/5 to Tikli Chestikov on Tue Dec 26 23:08:39 2023
    On 12/26/2023 1:03 PM, Tikli Chestikov wrote:
    The fact that the main protagonist here has a ridiculous interest in
    hawking fast cars around various US "strips" is irrelevant.

    He's back!! Yay!!

    Been busy helping Hans Niemann file lawsuits, I presume?

    ---
    Tim Chow

  • From Timothy Chow@21:1/5 to All on Wed Dec 27 07:22:02 2023
    On 12/27/2023 2:16 AM, MK wrote:
    Now that I do, my immediate reaction is that it
    sounds really bad. Shouldn't it be the other way
    around? That is, evaluate at a higher ply first?

    It's done for speed. Each additional ply slows things
    down by a factor of (about) 21.

    ---
    Tim Chow

  • From Philippe Michel@21:1/5 to Timothy Chow on Thu Dec 28 22:09:57 2023
    On 2023-12-23, Timothy Chow <tchow12000@yahoo.com> wrote:

    On 12/22/2023 12:18 PM, MK wrote:

    I assume you mean look-ahead plies? Can you (or
    someone else) expand on this and explain/clarify
    how plies work during play and during rollouts?

    The GNU team can answer this better than I can. One thing to note
    is that during rollouts, the bots will apply some kind of move
    filter to screen out unpromising plays. That is, if you perform
    a 3-ply rollout, the bot doesn't necessarily evaluate every legal
    move at 3-ply and pick the highest-scoring one. It will evaluate
    all the options at the lowest ply but then discard a lot of them
    as not likely to emerge as the top play.

    This is not specific to rollouts. Interactive play, hints, and analysis
    all use this.

    To answer issues raised later in the thread by Murat, this is done for
    speed as already mentioned by Timothy.

    The cost in accuracy seems perfectly acceptable although it is not
    entirely negligible. For instance there are two predefined 2-ply
    settings: world class and supremo.

    The first one evaluates at 2-ply up to the top 8 0-ply moves, provided
    they are no more than 0.16 points weaker than the best. The second one
    evaluates up to 16 moves within 0.32 points. On the Depreli benchmark,
    the cost of errors from world class is about 4% more than from supremo.

    The differences between either 1-ply or 3-ply and either of these
    2-ply settings are much larger than this.

    You can change this in the analysis or rollout settings (look for
    Advanced settings and then Move filter). As far as I know, the default
    settings are conservative compared to what is used by the similar
    feature in eXtreme Gammon.
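    The move-filter idea described above can be sketched like this (an
    illustration of the idea, not actual GNU Backgammon code): rank all
    legal plays with a cheap 0-ply evaluation, keep at most `max_moves`
    plays within `threshold` of the best, and re-evaluate only the
    survivors at the expensive higher ply. The defaults mirror the world
    class numbers quoted above (top 8 within 0.16).

```python
def filter_moves(moves, eval_0ply, max_moves, threshold):
    """Keep at most max_moves plays within threshold of the 0-ply best."""
    scored = sorted(moves, key=eval_0ply, reverse=True)
    best = eval_0ply(scored[0])
    kept = [m for m in scored if best - eval_0ply(m) <= threshold]
    return kept[:max_moves]

def best_play(moves, eval_0ply, eval_deep, max_moves=8, threshold=0.16):
    """World-class-style defaults: top 8 within 0.16 (supremo: 16 / 0.32).
    Only the surviving candidates get the expensive deep evaluation."""
    candidates = filter_moves(moves, eval_0ply, max_moves, threshold)
    return max(candidates, key=eval_deep)
```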

  • From Timothy Chow@21:1/5 to All on Mon Jan 8 08:55:34 2024
    On 1/6/2024 7:50 PM, MK wrote:
    On December 27, 2023 at 5:22:06 AM UTC-7, Timothy Chow wrote:

    On 12/27/2023 2:16 AM, MK wrote:

    Now that I do, my immediate reaction is that it
    sounds really bad. Shouldn't it be the other way
    around? That is, evaluate at a higher ply first?

    It's done for speed. Each additional ply slows
    things down by a factor of (about) 21.

    Ah, that magic number 21 again. :) The number
    of possible dice rolls at every turn... ;)

    But why is the factor imprecise, i.e. "about 21"?
    Can't you give us the exact math...?

    The speed at which a complex piece of code runs depends on many
    factors beyond the simple math of how many different rolls there
    are.
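    A quick enumeration shows where the "about 21" branching factor comes
    from, and why a loop over 21 distinct rolls can legitimately divide by
    36 when each roll is weighted by its probability:

```python
from itertools import combinations_with_replacement

# The 36 ordered outcomes of two dice collapse into 21 distinct rolls.
rolls = list(combinations_with_replacement(range(1, 7), 2))
print(len(rolls))  # 21

# Doubles occur 1 way in 36, non-doubles 2 ways, so the weights sum to 36.
# A loop over the 21 distinct rolls can therefore still divide by 36,
# provided each roll's contribution is weighted accordingly.
weights = {r: (1 if r[0] == r[1] else 2) for r in rolls}
print(sum(weights.values()))  # 36
```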

    ---
    Tim Chow

  • From Timothy Chow@21:1/5 to All on Mon Jan 8 21:26:04 2024
    On 1/8/2024 1:14 PM, MK wrote:
    On January 8, 2024 at 6:55:38 AM UTC-7, Timothy Chow wrote:
    The speed at which a complex piece of code

    You mean like this one?:

    =======================================
    GNU Backgammon Manual V1.00.0
    10.4.5.4 n-ply Cubeful equities
    ..... so how so GNU Backgammon calculate cubeful
    2-ply equities? The answer is: by simple recursion:
    Equity=0
    Loop over 21 dice rolls
    Find best move for given roll
    Equity = Equity + Evaluate n-1 ply equity for resulting position
    End Loop
    Equity = Equity/36
    =======================================

    That's pseudocode, not code.

    ---
    Tim Chow

  • From Timothy Chow@21:1/5 to All on Tue Jan 9 08:48:16 2024
    On 1/9/2024 1:28 AM, MK wrote:
    So, you should be able to explain the reason based
    on the above pseudocode.

    Finding the best move for a given roll isn't necessarily going
    to take the same amount of time for every roll. To find the
    best move, one must first generate all the legal moves and
    evaluate them. The number of legal ways to play 11 is not
    necessarily going to be the same as the number of legal ways
    to play 66. It will depend on the position.

    ---
    Tim Chow

  • From Timothy Chow@21:1/5 to All on Thu Jan 11 17:44:13 2024
    On 1/11/2024 4:17 AM, MK wrote:
    This is not it. Just like dice rolls even out (or can
    be forced to artificially even out faster), number
    of legal ways to play for given dice rolls at given
    positions will also average out.

    Of course. That's what "approximately" means. Check your
    dictionary.

    ---
    Tim Chow

  • From MK@21:1/5 to Timothy Chow on Fri Jan 12 03:17:33 2024
    On 1/11/2024 3:44 PM, Timothy Chow wrote:

    On 1/11/2024 4:17 AM, MK wrote:

    This is not it. Just like dice rolls even out
    (or can be forced to artificially even out
    faster), number of legal ways to play for given
    dice rolls at given positions will average out.

    Of course. That's what "approximately" means.

    Absolutely not!

    Check your dictionary.

    I would prefer to check your dictionary instead.
    Please tell us what dictionary you have checked?

    Noo-BG manual says "on average there are about
    20 legal moves", but that's because those
    chimpanzees are incapable of human language either.

    An average is just a single number result, like
    the average winning/losing PR in your contrived
    example.

    Once you compute an average, you treat it as a
    constant in your later calculations. There is
    no such thing as an "approximate average".

    Indeed, the following paragraph in the Noo-BG
    manual says: "GNU Backgammon needs to consider
    21 rolls by the opponent, 20 and possible legal
    moves per roll) = 420 positions to evaluate."

    Do you understand why it doesn't say *about*
    420 positions to evaluate? Because neither the
    21 possible combinations of rolls, nor the
    average 20 possible legal moves, nor their
    product is approximate..!

    Thus, the reason for each additional ply being
    approximately 21 times slower has to do with
    something other than the number of possible
    legal moves.

    Ask someone who knows math. Axel, Paul, et al.
    are looking up to you but maybe Bob Coca can
    help you with this on bgonline... ;)

    MK

  • From MK@21:1/5 to All on Fri Jan 19 18:14:24 2024
    On 1/8/2024 11:28 PM, MK wrote:

    =======================================
    GNU Backgammon Manual V1.00.0
    10.4.5.4 n-ply Cubeful equities
    ..... so how so GNU Backgammon calculate cubeful
    2-ply equities? The answer is: by simple recursion:
    Equity=0
    Loop over 21 dice rolls
    Find best move for given roll
    Equity = Equity + Evaluate n-1 ply equity for resulting position
    End Loop
    Equity = Equity/36
    =======================================

    Oh, I almost forgot. There is a kind of rotten easter
    egg in the above pseudocode. Let's see how long it
    will take for you whizzes to find it...? :)

    Bzzzt! Time's up.

    Loop over 21 dice rolls and divide by 36...?

    I keep telling you folks that your venerated
    bots are garbage... :(

    MK
