Forum: >>> Magnum BBS <<<

Try to be right even when it makes very little difference

From pepstein5@gmail.com@21:1/5 to All on Sat Dec 4 15:01:36 2021

The two legal 33s differ in equity by about one trillionth.
But (particularly if you're having trouble sleeping), why not try
to use your combinatorial logic to figure out which is correct?
I guessed wrong, by the way.
I suppose zero is close to a trillionth so the two equities could
be the same.
But they aren't -- your opinion really does matter.

Paul

XGID=-ECE--A-----------A-aaa---:1:1:1:33:6:8:3:0:10
X:Daniel O:eXtremeGammon

Score is X:6 O:8. Unlimited Game, Jacoby Beaver
+13-14-15-16-17-18------19-20-21-22-23-24-+
|                X |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          X     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   |          X  X  X | | 2 |
|                  |   | X        X  X  X | +---+
+12-11-10--9--8--7-------6--5--4--3--2--1-+
Pip count X: 50 O: 12 X-O: 6-8
Cube: 2, X own cube
X to play 33

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From pepstein5@gmail.com@21:1/5 to peps...@gmail.com on Sat Dec 4 15:17:36 2021

Actually, I don't believe XG's answer. So I'll post it and leave it to others to explain.
XG says:

1. 4-ply 18/6 eq:-1.114
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:11.40% B:0.00%)

2. 4-ply 18/9 6/3 eq:-1.121 (-0.007)
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.10% B:0.00%)

But what is wrong with my proof that the choices are equivalent?
Since our opponent is guaranteed to bear off in four rolls, we are simply trying
to max our probability of saving the gammon.

If our opponent takes exactly two rolls to bear off then 18/6 loses a gammon to 54,
and 18/9 6/3 loses a gammon to 21.
If the opponent takes more than two rolls to bear off, both plays save the gammon.

So what am I missing??
What makes this particularly puzzling is that it's hard to see how a miscounting could
drop as little equity as 0.7%. If the plays were between 2% and 5% apart, I'd be less puzzled.

Paul
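
The anti-joker claim in the post above can be checked by brute force. What follows is a minimal sketch, not a description of anything XG does internally. It assumes a pure race (no point is ever blocked, nothing can be hit), represents X's checkers as a sorted tuple of point numbers, and uses hypothetical helper names (plays_one_die, can_bear_off, anti_jokers). For each roll it asks whether any sequence of legal die plays bears off at least one checker.

```python
def plays_one_die(pos, d):
    """Yield (new_position, bore_off) for each legal way to play die d.

    pos is a sorted tuple of X's checker points (1 = ace point).
    Assumption: pure race, so no point is ever blocked.
    """
    top = max(pos)
    for p in set(pos):
        rest = list(pos)
        rest.remove(p)
        if p > d:
            rest.append(p - d)                  # ordinary move
            yield tuple(sorted(rest)), False
        elif top <= 6 and (p == d or p == top):
            # bear-off: exact die, or an oversized die on the highest checker
            yield tuple(sorted(rest)), True


def can_bear_off(pos, d1, d2):
    """True if some sequence of legal die plays bears off at least one checker."""
    orders = [(d1,) * 4] if d1 == d2 else [(d1, d2), (d2, d1)]
    for order in orders:
        frontier = {(pos, False)}
        for d in order:
            nxt = set()
            for q, off in frontier:
                moves = list(plays_one_die(q, d)) if q else []
                if not moves:
                    nxt.add((q, off))           # die unplayable: position unchanged
                for r, b in moves:
                    nxt.add((r, off or b))
            frontier = nxt
        if any(off for _, off in frontier):
            return True
    return False


def anti_jokers(pos):
    """Rolls (high die first) after which X cannot bear off a checker."""
    return {(a, b) for a in range(1, 7) for b in range(1, a + 1)
            if not can_bear_off(pos, a, b)}


# X's 15 checkers after each of the two legal plays of 33
after_18_6 = (1,) * 5 + (2,) * 3 + (3,) * 5 + (6,) * 2
after_18_9_6_3 = (1,) * 5 + (2,) * 3 + (3,) * 6 + (9,)

print(anti_jokers(after_18_6))       # {(5, 4)}
print(anti_jokers(after_18_9_6_3))   # {(2, 1)}
```

Each play leaves exactly one non-double anti-joker, so both fail to save the gammon on 2 of 36 rolls, which is the 1/18 used in the argument.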

From Timothy Chow@21:1/5 to All on Sat Dec 4 22:03:16 2021

I think your proof is correct. Below is a rollout with XG's
strongest settings, with variance reduction *turned off* and with
36^4 = 1679616 trials. The results are as expected.

XGID=-ECE--A-----------A-aaa---:1:1:1:33:0:0:3:0:10

X:Player 1 O:Player 2
Score is X:0 O:0. Unlimited Game, Jacoby Beaver
+13-14-15-16-17-18------19-20-21-22-23-24-+
|                X |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          X     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   |          X  X  X | | 2 |
|                  |   | X        X  X  X | +---+
+12-11-10--9--8--7-------6--5--4--3--2--1-+
Pip count X: 50 O: 12 X-O: 0-0
Cube: 2, X own cube
X to play 33

1. Rollout¹ 18/9 6/3 eq:-1.1209
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.09% B:0.00%)
Confidence: ±0.0003 (-1.1212..-1.1206) - [53.4%]

2. Rollout¹ 18/6 eq:-1.1209
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.09% B:0.00%)
Confidence: ±0.0003 (-1.1212..-1.1206) - [46.6%]

¹ 1679616 Games rolled.
Dice Seed: 271828
Moves and cube decisions: XG Roller++
Search interval: Gigantic

eXtreme Gammon Version: 2.19.207.pre-release

Interestingly, though, if we turn *on* variance reduction
and repeat the rollout, then XG is very confident that 18/9
6/3 is slightly better. (It actually reaches 100% confidence
very early on in the rollout and never wavers.) So this is
some kind of bug in XG. In my version of XG, I can examine
just how many times X loses a gammon. After 18/9 6/3, it
loses a gammon 203015 times, and after 18/6, it loses a gammon
203045 times. So it doesn't seem to be a bug in the way it
performs the rollout trials, but a bug in the way it does the
variance reduction, or calculates the numbers that it chooses
to display.

XGID=-ECE--A-----------A-aaa---:1:1:1:33:0:0:3:0:10

X:Player 1 O:Player 2
Score is X:0 O:0. Unlimited Game, Jacoby Beaver
+13-14-15-16-17-18------19-20-21-22-23-24-+
|                X |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          X     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   |          X  X  X | | 2 |
|                  |   | X        X  X  X | +---+
+12-11-10--9--8--7-------6--5--4--3--2--1-+
Pip count X: 50 O: 12 X-O: 0-0
Cube: 2, X own cube
X to play 33

1. Rollout¹ 18/9 6/3 eq:-1.1209
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.09% B:0.00%)
Confidence: ±0.0000 (-1.1209..-1.1209) - [100.0%]

2. Rollout¹ 18/6 eq:-1.1211 (-0.0002)
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.11% B:0.00%)
Confidence: ±0.0000 (-1.1211..-1.1211) - [0.0%]

¹ 1679616 Games rolled with Variance Reduction.
Dice Seed: 271828
Moves and cube decisions: XG Roller++
Search interval: Gigantic

eXtreme Gammon Version: 2.19.207.pre-release

---
Tim Chow
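
The raw gammon counts quoted in this post (203015 and 203045 out of 36^4 trials) can be compared with the exact gammon probability 235/1944 that is derived by hand later in the thread. The sketch below treats the trials as independent Bernoulli draws, which is an assumption about the rollout rather than something XG documents here, and expresses each count as a z-score.

```python
from fractions import Fraction
from math import sqrt

trials = 36 ** 4                      # 1679616, the rollout length quoted above
p_gammon = Fraction(235, 1944)        # exact value from the hand calculation in this thread
expected = p_gammon * trials          # expected number of gammon losses

# Standard deviation of a binomial count (assumes independent trials)
sd = sqrt(trials * float(p_gammon) * (1 - float(p_gammon)))

observed = {"18/9 6/3": 203015, "18/6": 203045}
z_scores = {play: (n - float(expected)) / sd for play, n in observed.items()}

print(trials, expected)
for play, z in z_scores.items():
    print(f"{play}: z = {z:+.3f}")
```

Both counts sit well within one standard deviation of the expected value, which is consistent with the view that the trials themselves are sound and that the discrepancy arises elsewhere.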

From pepstein5@gmail.com@21:1/5 to Tim Chow on Sun Dec 5 03:44:13 2021

Thanks for the analysis.
I think this position is best evaluated by multi-ply analysis rather than by rollouts.
It's surprising to me that the 4 ply analysis by XG is (slightly) wrong.
In fact, how can this be??

So each legal play is optimal.
XG correctly understands that each legal play loses 100% of the time.
XG's error is that (for at least one of the plays), the gammon probability is inaccurately calculated.
I'll calculate the gammon probability by hand, and I hope we can see where XG fails.
I lose a gammon if A) XG bears off in one or B) XG bears off in exactly two and I hit my anti-joker.
The probability of A is 1/12.
The probability of my anti-joker is 1/18.
The probability of XG bearing off in exactly 2 is trickier.
I'll go through the first rolls in (an) ascending order -- there are only 21 of them.
For each first roll, I'll give the probability of the next roll bearing off. [If the first roll bears off, I'll assign a probability for the next roll bearing off of 0].
11: 7/18
21: 5/18
31: 7/18
41: 19/36
51: 23/36
61: 23/36
22: 17/18
32: 19/36
42: 23/36
52: 29/36
62: 29/36
33: 1
43: 31/36
53: 17/18
63: 17/18
44: 0
54: 1
64: 1
55: 0
65: 1
66: 0

The probability of bearing off in exactly two rolls is therefore:
7/36 * 1 + 5/36 * 17/18 + 1/18 * 31/36 + 1/9 * 29/36 + 1/6 * 23/36 + 1/9 * 19/36 + 1/12 * 7/18 + 1/18 * 5/18 + 1/12 * 0.
[Check: 7/36 + 5/36 + 1/18 + 1/9 + 1/6 + 1/9 + 1/12 + 1/18 + 1/12 = 1].
Over a common denominator that is 126/648 + 85/648 + 31/648 + 58/648 + 69/648 + 38/648 + 21/648 + 10/648 = 438/648 = 73/108.
The probability of a gammon is therefore 1/12 + 1/18 * 73/108 = 162/1944 + 73/1944 = 235/1944, approximately 0.12088477366, which validates your 12.09% rollout.

So why does XG give us this 11.4% nonsense??
I wasn't using > 4 ply reasoning above so why couldn't XG do what I did?

Paul
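
The arithmetic in this post can be re-checked with exact fractions. The sketch below copies the per-roll table as posted and weights doubles 1/36 and non-doubles 2/36. The final equity line assumes the convention, consistent with the XG output quoted in this thread, that with a certain loss and no backgammons the cube-normalized equity is -(1 + gammon probability).

```python
from fractions import Fraction as F

# P(O clears the rest on the second roll), keyed by O's first roll, as listed above.
# A value of 0 is used where the first roll already bears everything off.
second_roll = {
    (1, 1): F(7, 18), (2, 1): F(5, 18), (3, 1): F(7, 18), (4, 1): F(19, 36),
    (5, 1): F(23, 36), (6, 1): F(23, 36), (2, 2): F(17, 18), (3, 2): F(19, 36),
    (4, 2): F(23, 36), (5, 2): F(29, 36), (6, 2): F(29, 36), (3, 3): F(1),
    (4, 3): F(31, 36), (5, 3): F(17, 18), (6, 3): F(17, 18), (4, 4): F(0),
    (5, 4): F(1), (6, 4): F(1), (5, 5): F(0), (6, 5): F(1), (6, 6): F(0),
}


def weight(roll):
    """Probability of a roll: 1/36 for a double, 2/36 otherwise."""
    return F(1, 36) if roll[0] == roll[1] else F(2, 36)


p_exactly_two = sum(weight(r) * p for r, p in second_roll.items())
p_one = F(1, 12)            # 44, 55 or 66 on the first roll
p_anti_joker = F(1, 18)     # X fails to bear off a checker
p_gammon = p_one + p_anti_joker * p_exactly_two
equity = -(1 + p_gammon)    # assumed convention: certain loss, no backgammons

print(p_exactly_two, p_gammon, float(equity))
```

The exact result agrees with the 12.09% gammon rate and the -1.1209 equity shown in the rollout without variance reduction.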

From Timothy Chow@21:1/5 to peps...@gmail.com on Sun Dec 5 08:33:55 2021

On 12/5/2021 6:44 AM, peps...@gmail.com wrote:

So why does XG give us this 11.4% nonsense??
I wasn't using > 4 ply reasoning above so why couldn't XG do what I did?

I don't know exactly, but I discovered something interesting.
Look what happens if I force XG to perform a 1-ply analysis
of O's decision after each of X's possible plays. For some
reason, after 18/6, if O rolls 31, then all of O's choices are
evaluated equally as 1.000. When this happens, it seems to me
that XG has a chance of misplaying the roll.

This may also partly explain the variance reduction thing, since
I believe that luck is calculated using 1-ply analysis.

--------------
After 18/9 6/3
--------------

XGID=-ECF-----A----------aaa---:1:1:-1:13:0:0:3:0:10

X:Player 1 O:Player 2
Score is X:0 O:0. Unlimited Game, Jacoby Beaver
+12-11-10--9--8--7-------6--5--4--3--2--1-+
|                  |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          6     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   |          X  X  X | | 2 |
|          X       |   |          X  X  X | +---+
+13-14-15-16-17-18------19-20-21-22-23-24-+
Pip count X: 38 O: 12 X-O: 0-0
Cube: 2, X own cube
O to play 13

1. 1-ply 4/Off eq:+1.030
Player: 100.00% (G:2.99% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

2. 1-ply 5/4 3/Off eq:+1.014 (-0.016)
Player: 100.00% (G:1.43% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

3. 1-ply 5/2 3/2 eq:+1.014 (-0.016)
Player: 100.00% (G:1.38% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

4. 1-ply 4/1 3/2 eq:+1.009 (-0.021)
Player: 100.00% (G:0.92% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

5. 1-ply 5/2 4/3 eq:+1.009 (-0.021)
Player: 100.00% (G:0.91% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

6. 1-ply 5/1 eq:+1.009 (-0.021)
Player: 100.00% (G:0.89% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

----------
After 18/6
----------

XGID=-ECE--B-------------aaa---:1:1:-1:13:0:0:3:0:10

X:Player 1 O:Player 2
Score is X:0 O:0. Unlimited Game, Jacoby Beaver
+12-11-10--9--8--7-------6--5--4--3--2--1-+
|                  |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          X     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   | X        X  X  X | | 2 |
|                  |   | X        X  X  X | +---+
+13-14-15-16-17-18------19-20-21-22-23-24-+
Pip count X: 38 O: 12 X-O: 0-0
Cube: 2, X own cube
O to play 13

1. 1-ply 4/Off eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

2. 1-ply 5/4 3/Off eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

3. 1-ply 5/2 3/2 eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

4. 1-ply 5/2 4/3 eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

5. 1-ply 5/1 eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

6. 1-ply 4/1 3/2 eq:+1.000
Player: 100.00% (G:0.00% B:0.00%)
Opponent: 0.00% (G:0.00% B:0.00%)

eXtreme Gammon Version: 2.19.207.pre-release

---
Tim Chow

From pepstein5@gmail.com@21:1/5 to Tim Chow on Sun Dec 5 05:50:04 2021

Thanks.
Yes, I think that resolves it.
After 18/6, XG sometimes misplays (or intends to misplay) 31.
This results in XG underestimating its gammons after 18/6 and therefore
wrongly preferring 18/6.

Surprising (to me) to see such a basic bug in a program that strong.

Paul

From Timothy Chow@21:1/5 to peps...@gmail.com on Sun Dec 5 21:44:39 2021

On 12/5/2021 8:50 AM, peps...@gmail.com wrote:

Yes, I think that resolves it.
After 18/6, XG sometimes misplays (or intends to misplay) 31.
This results in XG underestimating its gammons after 18/6 and therefore wrongly preferring 18/6.

Surprising (to me) to see such a basic bug in a program that strong.

I'm no longer certain that it's a "bug." If it's just a matter of
the neural net misevaluating a position, I don't consider that a bug,
even if it looks to a human like a "simple" position.

On the other hand, with variance reduction, what you're supposed to
do is add a random variable whose expected value is zero. Ensuring
that the expected value is zero doesn't require perfect evaluations
of positions. But somehow it seems that whatever XG is adding does
not converge to zero in the limit of more and more trials. Maybe I
am underestimating the difficulties in ensuring that the expected
value is literally zero, but this part looks like a bug to me.

---
Tim Chow
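
The point about variance reduction in this post can be illustrated with a toy control variate. This is a sketch under stated assumptions and not a description of XG's implementation: the evaluator sloppy_eval and the baseline value 3/100 are hypothetical, and the only position modeled is X's roll after 18/6 with O holding checkers on the 5 and 3 points (where the true gammon chance is 14/36 after a 5-4 and zero otherwise). The luck term is the evaluation after the roll minus a baseline; its expectation is zero exactly when the baseline is the average of that same evaluator over all 36 rolls.

```python
from fractions import Fraction as F

rolls = [(a, b) for a in range(1, 7) for b in range(1, 7)]   # 36 equally likely rolls
p_roll = F(1, 36)


def true_value(roll):
    """True gammon chance after X's roll: only 5-4 fails to bear off a checker."""
    return F(14, 36) if sorted(roll) == [4, 5] else F(0)


def sloppy_eval(roll):
    """Hypothetical, deliberately inaccurate evaluator."""
    return F(1, 2) if sorted(roll) == [4, 5] else F(1, 100)


def expected_adjusted(baseline):
    """Expectation of (outcome - luck), where luck = eval_after_roll - baseline."""
    return sum(p_roll * (true_value(r) - (sloppy_eval(r) - baseline)) for r in rolls)


truth = sum(p_roll * true_value(r) for r in rolls)

# Baseline taken as the mean of the same evaluator: luck has expectation zero
consistent = sum(p_roll * sloppy_eval(r) for r in rolls)

# Hypothetical baseline obtained some other way: luck no longer averages to zero
inconsistent = F(3, 100)

print(truth, expected_adjusted(consistent), expected_adjusted(inconsistent))
```

With the consistent baseline the adjusted estimate is unbiased even though the evaluator is badly wrong. With the inconsistent baseline the estimate is shifted by a fixed amount that no number of trials removes, which is the kind of behavior described in the post.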

From J R@21:1/5 to Tim Chow on Mon Dec 6 10:43:23 2021

I only skimmed the thread so if what I'm saying is wrong ignore it, but XG with variance reduction turned on gets the right answer too? Felt to me like it was suggested that was an issue, it's not. Paul originally posted a 4-ply analysis, not a rollout, if that wasn't clear.

Stick

From pepstein5@gmail.com@21:1/5 to J R on Mon Dec 6 14:52:50 2021

Yes, I would kind of expect a 4 ply analysis to give the right answers, particularly since I didn't find it hard to solve the problem by hand.
Apparently, it goes wrong because its one ply analysis misevaluates the 31 roll.

Paul

From Timothy Chow@21:1/5 to All on Mon Dec 6 23:51:51 2021

On 12/6/2021 11:42 PM, I wrote:

On 12/6/2021 1:43 PM, J R wrote:

I only skimmed the thread so if what I'm saying is wrong ignore it,
but XG with variance reduction turned on gets the right answer too?
Felt to me like it was suggested that was an issue, it's not. Paul
originally posted a 4 ply analysis, not rollout, if that wasn't clear.

I did a rollout with over 1.6 million trials, moves and cube decisions
XGR++, Gigantic move filter, variance reduction turned on. The result
incorrectly suggested that 18/9 6/3 is better than 18/6, when the truth
is they're equal.

I should say that the difference is very slight, and you might miss it
unless you ask XG to give you 4 decimal places.

---
Tim Chow

From Timothy Chow@21:1/5 to J R on Mon Dec 6 23:42:49 2021

On 12/6/2021 1:43 PM, J R wrote:

I only skimmed the thread so if what I'm saying is wrong ignore it, but XG with variance reduction turned on gets the right answer too? Felt to me like it was suggested that was an issue, it's not. Paul originally posted a 4 ply analysis, not rollout, if that wasn't clear.

I did a rollout with over 1.6 million trials, moves and cube decisions
XGR++, Gigantic move filter, variance reduction turned on. The result
incorrectly suggested that 18/9 6/3 is better than 18/6, when the truth
is they're equal.

---
Tim Chow

From MK@21:1/5 to Tim Chow on Thu Dec 9 22:48:03 2021

On December 5, 2021 at 7:44:43 PM UTC-7, Tim Chow wrote:

On 12/5/2021 8:50 AM, peps...@gmail.com wrote:

Surprising (to me) to see such a basic bug in a
program that strong.

I'm no longer certain that it's a "bug." If it's just a
matter of the neural net misevaluating a position,

I'm starting to enjoy reading even position discussion
threads as they seem to be "evolving" in a different
direction. Can I translate the above two paragraphs as:
the misconception "that bg bots are very strong" lives
on because bottkissers deceive themselves to believe
that fundamental errors are just inconsequential bugs.

It looks like the Wicked Witches are slowly melting...

MK

From Timothy Chow@21:1/5 to All on Fri Dec 10 20:56:03 2021

On 12/10/2021 1:48 AM, MK wrote:

Can I translate the above two paragraphs as:

Yes, it's a free country.

---
Tim Chow

From pepstein5@gmail.com@21:1/5 to Tim Chow on Sat Dec 11 05:01:45 2021

On Saturday, December 11, 2021 at 1:56:05 AM UTC, Tim Chow wrote:

On 12/10/2021 1:48 AM, MK wrote:

Can I translate the above two paragraphs as:

Yes, it's a free country.

What country do you assign this forum to?

From Timothy Chow@21:1/5 to peps...@gmail.com on Sat Dec 11 09:24:45 2021

On 12/11/2021 8:01 AM, peps...@gmail.com wrote:

On Saturday, December 11, 2021 at 1:56:05 AM UTC, Tim Chow wrote:

On 12/10/2021 1:48 AM, MK wrote:

Can I translate the above two paragraphs as:

Yes, it's a free country.

What country do you assign this forum to?

The relevant country for assessing MK's right to translate is
MK's country of residence, which he has said is the United States.

---
Tim Chow

From MK@21:1/5 to Tim Chow on Tue Dec 14 01:45:45 2021

On December 10, 2021 at 6:56:05 PM UTC-7, Tim Chow wrote:

On 12/10/2021 1:48 AM, MK wrote:

Can I translate the above two paragraphs as:

Yes, it's a free country.

You can't go more than a few days without feeling
the urge to be a jackass jerk, can you? Too bad. :(
Especially considering how I have been trying to be
nice to you for a while. You need treatment.

MK

From MK@21:1/5 to Tim Chow on Tue Dec 14 01:57:42 2021

On December 11, 2021 at 7:24:48 AM UTC-7, Tim Chow wrote:

The relevant country for assessing MK's right to
translate is MK's country of residence, which he
has said is the United States.

People, please just ignore this senseless jackass...

MK

From MK@21:1/5 to peps...@gmail.com on Tue Dec 14 01:55:27 2021

On December 11, 2021 at 6:01:46 AM UTC-7, peps...@gmail.com wrote:

On Saturday, December 11, 2021 at 1:56:05 AM UTC, Tim Chow wrote:

Yes, it's a free country.

What country do you assign this forum to?

Not related to his comment, this is an interesting
question.

In countries with despotic, fascist regimes like
Turkey, you can actually serve a jail sentence (as
so many tens of thousands have already done)
for expressing your opinion in Cyberspace, for
reasons like being insulting to the king/dictator.
For sure if they identify and catch you in that
country, whether you are a citizen of that country
or not. Be careful where you travel after what you
say about whom. It's getting to be a scarier world.

MK

From Timothy Chow@21:1/5 to All on Tue Dec 14 07:17:23 2021

On 12/14/2021 4:45 AM, MK wrote:

On December 10, 2021 at 6:56:05 PM UTC-7, Tim Chow wrote:

On 12/10/2021 1:48 AM, MK wrote:

Can I translate the above two paragraphs as:

Yes, it's a free country.

You can't go more than a few days without feeling
the urge to be a jackass jerk, can you

No.

---
Tim Chow

From Timothy Chow@21:1/5 to All on Tue Dec 14 07:16:12 2021

On 12/14/2021 4:57 AM, MK wrote:

On December 11, 2021 at 7:24:48 AM UTC-7, Tim Chow wrote:

The relevant country for assessing MK's right to
translate is MK's country of residence, which he
has said is the United States.

People, please just ignore this senseless jackass...

We're trying, but you keep posting...

---
Tim Chow

From Zorba@21:1/5 to peps...@gmail.com on Mon Jan 3 07:49:17 2022

On 5-12-2021 0:17, peps...@gmail.com wrote:

1. 4-ply 18/6 eq:-1.114
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:11.40% B:0.00%)

2. 4-ply 18/9 6/3 eq:-1.121 (-0.007)
Player: 0.00% (G:0.00% B:0.00%)
Opponent: 100.00% (G:12.10% B:0.00%)

But what is wrong with my proof that the choices are equivalent?
Since our opponent is guaranteed to bear off in four rolls, we are simply trying
to max our probability of saving the gammon.

If our opponent takes exactly two rolls to bear off then 18/6 loses a gammon to 54,
and 18/9 6/3 loses a gammon to 21.
If the opponent takes more than two rolls to bear off, both plays save the gammon.

So what am I missing??
What makes this particularly puzzling is that it's hard to see how a miscounting could
drop as little equity as 0.7%. If the plays were between 2% and 5% apart, I'd be less puzzled.

I can't test it right now, but the "weird stuff" might have to do with
play #1 using values from the bear-off database and play #2 using the
neural net (at first).

--
Zorba

From Zorba@21:1/5 to Zorba on Tue Jan 4 05:54:02 2022

On 3-1-2022 7:49, Zorba wrote:

On 5-12-2021 0:17, peps...@gmail.com wrote:

     1. 4-ply       18/6                         eq:-1.114
       Player:   0.00% (G:0.00% B:0.00%)
       Opponent: 100.00% (G:11.40% B:0.00%)

     2. 4-ply       18/9 6/3                     eq:-1.121 (-0.007)
       Player:   0.00% (G:0.00% B:0.00%)
       Opponent: 100.00% (G:12.10% B:0.00%)

I can't test it right now, but the "weird stuff" might have to do with
play #1 using values from the bear-off database and play #2 using the
neural net (at first).

My instincts were right. There is some kind of problem with XG's
bear-off database or the way it is being used.

I got the exact same evaluations as Paul initially, but then I tweaked
my XG a little and... surprise, now I get these results:

XGID=-ECE--A-----------A-aaa---:0:0:1:33:0:0:0:0:10

1. 1-ply 18/6 eq:-1,1209
Player: 0,00% (G:0,00% B:0,00%)
Opponent: 100,00% (G:12,09% B:0,00%)

2. 1-ply 18/9 6/3 eq:-1,1326 (-0,0117)
Player: 0,02% (G:0,00% B:0,00%)
Opponent: 99,98% (G:13,29% B:0,00%)

NOTE: 18/6 is now already correct at 1-ply. As expected: Database
lookup. 18/6 remains exactly the same at all settings now, also as
expected.

18/9 6/3 is some way off, which means the race neural nets have a bit of
trouble with this one. Not uncommon, neural nets are not that well
suited for the technicalities of races it seems, with their often tiny
equity differences. The neural net even "guesses" you can still win this
one 0.02% of the time!

Next:

1. 4-ply 18/6 eq:-1,1209
Player: 0,00% (G:0,00% B:0,00%)
Opponent: 100,00% (G:12,09% B:0,00%)

2. 4-ply 18/9 6/3 eq:-1,1211 (-0,0002)
Player: 0,00% (G:0,00% B:0,00%)
Opponent: 100,00% (G:12,11% B:0,00%)

Note how this is different from Paul's, notably for 18/6 which has the
correct value here. This is because of the tweak.

18/9 6/3 is now almost identical in equity but not quite. Slightly
disappointing perhaps but it makes sense: This move leads to (initial)
evaluations and thus moves, made by the neural net which is imprecise.
Because of the 4-ply lookahead and because you quickly progress to
positions in the database, the bot gets it almost right, but there is
potential for an inferior move somewhere early in the decision tree.
Apparently there is such an inferior move, leading to 0,02% more gammon
losses for 18/9 6/3.

Then:

1. 5-ply 18/9 6/3 eq:-1,1209
Player: 0,00% (G:0,00% B:0,00%)
Opponent: 100,00% (G:12,09% B:0,00%)

2. 5-ply 18/6 eq:-1,1209
Player: 0,00% (G:0,00% B:0,00%)
Opponent: 100,00% (G:12,09% B:0,00%)

Now it fully understands the position. 6-ply and 7-ply are identical to
this. It looks like 5-ply is needed to get 18/9 6/3 right and 4-ply is
not enough.

Now, what did I tweak and why does the original XG mess up 18/6 which
ought to be a straightforward database lookup, regardless of what
settings you use?

In the Analysis settings tab, you can change the bearoff database (by
letting it calculate an extended version, instead of 6 points).

I gave it a try and created the 15 checkers over 7 points database. This
goes reasonably fast and does not need much memory and diskspace. Then I
tried again to evaluate 18/6 and suddenly it was correct!

Success. But also strange behavior, because the 6 points database ought
to be enough to give exact evaluations after 18/6 and should actually
produce the exact same values for that position because it's basically a
subset of the larger 7 points database.

So I switched back to the original 6 points database in the analysis
settings et voila, now that one works as expected just as well.

Which leads me to think that either the original database has errors, or
it gets corrupted in memory somehow, or it is wrongly indexed perhaps.
Creating a new database seems to solve the problem, even if you go back
to using the original 6 points settings.

--
Zorba

From Zorba@21:1/5 to Timothy Chow on Tue Jan 4 16:53:52 2022

On 5-12-2021 14:33, Timothy Chow wrote:

----------
After 18/6
----------

XGID=-ECE--B-------------aaa---:1:1:-1:13:0:0:3:0:10

X:Player 1   O:Player 2
Score is X:0 O:0. Unlimited Game, Jacoby Beaver
+12-11-10--9--8--7-------6--5--4--3--2--1-+
|                  |   |    O  O  O       |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |          X     X |
|                  |   |          X     X |
|                  |   |          X  X  X | +---+
|                  |   | X        X  X  X | | 2 |
|                  |   | X        X  X  X | +---+
+13-14-15-16-17-18------19-20-21-22-23-24-+
Pip count X: 38 O: 12 X-O: 0-0
Cube: 2, X own cube
O to play 13

    1. 1-ply       4/Off                        eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

    2. 1-ply       5/4 3/Off                    eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

    3. 1-ply       5/2 3/2                      eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

    4. 1-ply       5/2 4/3                      eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

    5. 1-ply       5/1                          eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

    6. 1-ply       4/1 3/2                      eq:+1.000
      Player:   100.00% (G:0.00% B:0.00%)
      Opponent: 0.00% (G:0.00% B:0.00%)

eXtreme Gammon Version: 2.19.207.pre-release

It's a problem with XG's bear-off database. See also my other post. Let
XG calculate a 7pts bear-off database and use it and apply the settings
(needs to be done after every program start!) and you'll get correct
results:

1. 1-ply 4/Off eq:+1,0216
Player: 100,00% (G:2,16% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

2. 1-ply 5/4 3/Off eq:+1,0170 (-0,0046)
Player: 100,00% (G:1,70% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

3. 1-ply 5/2 3/2 eq:+1,0077 (-0,0139)
Player: 100,00% (G:0,77% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

4. 1-ply 4/1 3/2 eq:+1,0062 (-0,0154)
Player: 100,00% (G:0,62% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

5. 1-ply 5/2 4/3 eq:+1,0062 (-0,0154)
Player: 100,00% (G:0,62% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

6. 1-ply 5/1 eq:+1,0062 (-0,0154)
Player: 100,00% (G:0,62% B:0,00%)
Opponent: 0,00% (G:0,00% B:0,00%)

XGID=-ECE--B-------------aaa---:1:1:-1:13:0:0:3:0:10

eXtreme Gammon Version: 2.10

--
Zorba
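
The corrected 1-ply figures in this post can be reproduced with a tiny exact bear-off solver. The sketch below is an independent check, not XG's database code. It assumes a pure race with all of O's checkers in the home board, takes X's anti-joker probability of 1/18 from earlier in the thread, and uses hypothetical function names (die_moves, after_roll, p_clear).

```python
from fractions import Fraction as F
from functools import lru_cache


def die_moves(pos, d):
    """Positions reachable by playing die d (pos: sorted tuple of home-board points)."""
    top = max(pos)
    out = set()
    for p in set(pos):
        rest = list(pos)
        rest.remove(p)
        if p > d:
            rest.append(p - d)          # move inside the board
        elif p != d and p != top:
            continue                    # die too large while a higher checker remains
        out.add(tuple(sorted(rest)))    # otherwise the checker was borne off
    return out


def after_roll(pos, d1, d2):
    """All positions reachable with the roll (four plays for a double)."""
    orders = [(d1,) * 4] if d1 == d2 else [(d1, d2), (d2, d1)]
    results = set()
    for order in orders:
        frontier = {pos}
        for d in order:
            frontier = {r for q in frontier
                        for r in (die_moves(q, d) if q else {q})}
        results |= frontier
    return results


@lru_cache(maxsize=None)
def p_clear(pos, n):
    """Probability of bearing off everything within n rolls, with best play."""
    if not pos:
        return F(1)
    if n == 0:
        return F(0)
    total = sum(max(p_clear(q, n - 1) for q in after_roll(pos, a, b))
                for a in range(1, 7) for b in range(1, 7))
    return total / 36


# O's position after each way of playing 31 from checkers on the 5, 4 and 3 points
leaves = {"4/Off": (3, 5), "5/4 3/Off": (4, 4), "5/2 3/2": (2, 2, 4),
          "4/1 3/2": (1, 2, 5), "5/2 4/3": (2, 3, 3), "5/1": (1, 3, 4)}
p_anti_joker = F(1, 18)   # X fails to bear off a checker (taken from the thread)
gammon_pct = {play: round(float(p_anti_joker * p_clear(q, 1)) * 100, 2)
              for play, q in leaves.items()}
print(gammon_pct)

start = (3, 4, 5)
p_exactly_two = p_clear(start, 2) - p_clear(start, 1)
p_gammon = p_clear(start, 1) + p_anti_joker * p_exactly_two
print(p_exactly_two, p_gammon)
```

The percentages match the 2,16 / 1,70 / 0,77 / 0,62 figures XG reports once the bear-off database is rebuilt, and the solver independently arrives at 73/108 and 235/1944 from Paul's hand calculation.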
