• XG's predilection for blotty boards

    From Timothy Chow@21:1/5 to All on Tue Oct 25 21:26:01 2022
    In the position below, I played 8/4 8/2, and was surprised that XGR+
    said that my play was 0.057 worse than its play of 8/2 5/1. (Story
    continues below.)

    XGID=---AABEBBb---B-a-abbcc--a-:0:0:1:64:0:0:0:0:10

    Score is X:0 O:0. Unlimited Game
    +13-14-15-16-17-18------19-20-21-22-23-24-+
    | X     O     O  O |   | O  O  O        O |
    | X              O |   | O  O  O          |
    |                  |   |    O  O          |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |
    |                  |   | X                |
    |                  |   | X                |
    |                  |   | X                |
    |          O  X  X |   | X  X             |
    |          O  X  X |   | X  X  X  X       |
    +12-11-10--9--8--7-------6--5--4--3--2--1-+
    Pip count X: 103 O: 104 X-O: 0-0
    Cube: 1
    X to play 64

    1. XG Roller+ 8/2 5/1 eq:+0.391
    Player: 62.83% (G:0.67% B:0.00%)
    Opponent: 37.17% (G:0.28% B:0.01%)

    2. XG Roller+ 7/1 6/2 eq:+0.340 (-0.052)
    Player: 61.98% (G:0.27% B:0.04%)
    Opponent: 38.02% (G:0.22% B:0.02%)

    3. XG Roller+ 8/4 8/2 eq:+0.334 (-0.057)
    Player: 62.05% (G:0.33% B:0.01%)
    Opponent: 37.95% (G:0.28% B:0.00%)

    4. XG Roller+ 8/2 7/3 eq:+0.333 (-0.058)
    Player: 61.95% (G:0.38% B:0.01%)
    Opponent: 38.05% (G:0.30% B:0.00%)

    5. XG Roller+ 8/4 7/1 eq:+0.332 (-0.059)
    Player: 61.93% (G:0.33% B:0.01%)
    Opponent: 38.07% (G:0.34% B:0.01%)

    6. XG Roller+ 8/2 6/2 eq:+0.331 (-0.060)
    Player: 61.79% (G:0.32% B:0.01%)
    Opponent: 38.21% (G:0.22% B:0.00%)

    7. XG Roller+ 7/3 7/1 eq:+0.327 (-0.064)
    Player: 61.61% (G:0.48% B:0.01%)
    Opponent: 38.39% (G:0.36% B:0.00%)

    8. XG Roller+ 7/1 5/1 eq:+0.308 (-0.084)
    Player: 60.96% (G:0.31% B:0.01%)
    Opponent: 39.04% (G:0.24% B:0.00%)

    eXtreme Gammon Version: 2.19.211.pre-release
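
For readers who want to check the position without XG, here is a minimal Python sketch that decodes the XGID string and recomputes the pip counts. The encoding described in the docstring is an assumption based on how the XGID lines up with the diagram in this post, and the function names `decode_xgid` and `pip_counts` are made up for illustration.

```python
def decode_xgid(xgid):
    """Split an XGID position field into checker counts for X and O.

    Assumption: the first colon-separated field has 26 characters; index 0 is
    the top player's bar, indices 1..24 are points 1..24 as numbered for the
    bottom player, and index 25 is the bottom player's bar. 'A'..'O' stand for
    1..15 bottom-player (X) checkers, 'a'..'o' for 1..15 top-player (O) checkers.
    """
    field = xgid.split("=")[-1].split(":")[0]
    if len(field) != 26:
        raise ValueError("expected a 26-character position field")
    x_checkers, o_checkers = {}, {}
    for index, char in enumerate(field):
        if char == "-":
            continue
        if char.isupper():
            x_checkers[index] = ord(char) - ord("A") + 1
        else:
            o_checkers[index] = ord(char) - ord("a") + 1
    return x_checkers, o_checkers


def pip_counts(xgid):
    """Return (X pips, O pips).

    O moves in the opposite direction, so an O checker stored at index p
    still has 25 - p pips to travel.
    """
    x_checkers, o_checkers = decode_xgid(xgid)
    x_pips = sum(point * count for point, count in x_checkers.items())
    o_pips = sum((25 - point) * count for point, count in o_checkers.items())
    return x_pips, o_pips


if __name__ == "__main__":
    print(pip_counts("XGID=---AABEBBb---B-a-abbcc--a-:0:0:1:64:0:0:0:0:10"))
```

Under that assumed encoding the result agrees with the "Pip count X: 103 O: 104" line in the XG output.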

    I tried appealing to XGR++, and it narrowed the gap between the two
    plays, but still insisted that creating five blots in its board was
    the best play.

    XGID=---AABEBBb---B-a-abbcc--a-:0:0:1:64:0:0:0:0:10

    Score is X:0 O:0. Unlimited Game
    +13-14-15-16-17-18------19-20-21-22-23-24-+
    | X     O     O  O |   | O  O  O        O |
    | X              O |   | O  O  O          |
    |                  |   |    O  O          |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |
    |                  |   | X                |
    |                  |   | X                |
    |                  |   | X                |
    |          O  X  X |   | X  X             |
    |          O  X  X |   | X  X  X  X       |
    +12-11-10--9--8--7-------6--5--4--3--2--1-+
    Pip count X: 103 O: 104 X-O: 0-0
    Cube: 1
    X to play 64

    1. XG Roller++ 8/2 5/1 eq:+0.375
    Player: 62.31% (G:0.51% B:0.01%)
    Opponent: 37.69% (G:0.33% B:0.01%)

    2. XG Roller++ 8/4 8/2 eq:+0.351 (-0.024)
    Player: 62.02% (G:0.39% B:0.00%)
    Opponent: 37.98% (G:0.34% B:0.00%)

    3. XG Roller++ 7/1 6/2 eq:+0.343 (-0.031)
    Player: 61.82% (G:0.29% B:0.00%)
    Opponent: 38.18% (G:0.29% B:0.00%)

    4. XG Roller++ 8/4 7/1 eq:+0.342 (-0.033)
    Player: 61.71% (G:0.35% B:0.00%)
    Opponent: 38.29% (G:0.36% B:0.00%)

    5. XG Roller++ 7/3 7/1 eq:+0.341 (-0.034)
    Player: 61.61% (G:0.38% B:0.01%)
    Opponent: 38.39% (G:0.29% B:0.01%)

    6. XG Roller++ 8/2 6/2 eq:+0.338 (-0.037)
    Player: 61.61% (G:0.27% B:0.01%)
    Opponent: 38.39% (G:0.23% B:0.00%)

    7. XG Roller++ 8/2 7/3 eq:+0.337 (-0.038)
    Player: 61.58% (G:0.28% B:0.00%)
    Opponent: 38.42% (G:0.28% B:0.00%)

    eXtreme Gammon Version: 2.19.211.pre-release

    Finally, I decided to do a rollout, with stronger parameters than
    usual. I was pleased to see that sanity was restored. But this
    position illustrates what I believe is a systematic weakness in XG,
    which is that it doesn't evaluate blotty boards very well. See also
    this old BGOnline post:

    http://timothychow.net/cg/www.bgonline.org/forums/164769.html

    XGID=---AABEBBb---B-a-abbcc--a-:0:0:1:64:0:0:0:0:10

    Score is X:0 O:0. Unlimited Game
    +13-14-15-16-17-18------19-20-21-22-23-24-+
    | X     O     O  O |   | O  O  O        O |
    | X              O |   | O  O  O          |
    |                  |   |    O  O          |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |
    |                  |   | X                |
    |                  |   | X                |
    |                  |   | X                |
    |          O  X  X |   | X  X             |
    |          O  X  X |   | X  X  X  X       |
    +12-11-10--9--8--7-------6--5--4--3--2--1-+
    Pip count X: 103 O: 104 X-O: 0-0
    Cube: 1
    X to play 64

    1. Rollout¹ 8/4 8/2 eq:+0.343
    Player: 61.74% (G:0.31% B:0.01%)
    Opponent: 38.26% (G:0.27% B:0.01%)
    Confidence: ±0.008 (+0.335..+0.352) - [41.7%]

    2. Rollout¹ 8/4 7/1 eq:+0.342 (-0.002)
    Player: 61.70% (G:0.30% B:0.01%)
    Opponent: 38.30% (G:0.28% B:0.01%)
    Confidence: ±0.008 (+0.334..+0.350) - [24.7%]

    3. Rollout¹ 7/3 7/1 eq:+0.342 (-0.002)
    Player: 61.38% (G:0.49% B:0.04%)
    Opponent: 38.62% (G:0.37% B:0.01%)
    Confidence: ±0.008 (+0.334..+0.349) - [23.1%]

    4. Rollout¹ 7/1 6/2 eq:+0.338 (-0.006)
    Player: 61.54% (G:0.32% B:0.01%)
    Opponent: 38.46% (G:0.31% B:0.01%)
    Confidence: ±0.008 (+0.330..+0.346) - [5.2%]

    5. Rollout¹ 8/2 6/2 eq:+0.338 (-0.006)
    Player: 61.49% (G:0.26% B:0.01%)
    Opponent: 38.51% (G:0.22% B:0.01%)
    Confidence: ±0.008 (+0.330..+0.346) - [4.8%]

    6. Rollout¹ 8/2 7/3 eq:+0.334 (-0.010)
    Player: 61.47% (G:0.29% B:0.01%)
    Opponent: 38.53% (G:0.30% B:0.01%)
    Confidence: ±0.007 (+0.327..+0.341) - [0.4%]

    7. Rollout¹ 8/2 5/1 eq:+0.326 (-0.018)
    Player: 61.13% (G:0.44% B:0.02%)
    Opponent: 38.87% (G:0.48% B:0.02%)
    Confidence: ±0.008 (+0.318..+0.334) - [0.0%]

    ¹ 1296 Games rolled with Variance Reduction.
    Dice Seed: 271828
    Moves and cube decisions: XG Roller+
    Search interval: Large

    eXtreme Gammon Version: 2.19.211.pre-release
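
One rough way to read the rollout is to ask which plays have a confidence interval that does not overlap the top play's interval. The sketch below does that in Python with the equities and ± values copied from the rollout above. It assumes the ± figures are symmetric half-widths (the output does not state the confidence level), and the names `ROLLOUT`, `interval`, `overlaps` and `clearly_worse_than` are made up for illustration. It is a crude screen, not a proper significance test.

```python
# Equity and +/- half-width for each play, copied from the rollout above.
ROLLOUT = {
    "8/4 8/2": (0.343, 0.008),
    "8/4 7/1": (0.342, 0.008),
    "7/3 7/1": (0.342, 0.008),
    "7/1 6/2": (0.338, 0.008),
    "8/2 6/2": (0.338, 0.008),
    "8/2 7/3": (0.334, 0.007),
    "8/2 5/1": (0.326, 0.008),
}


def interval(play):
    """Return (low, high) for a play, assuming a symmetric half-width."""
    equity, half_width = ROLLOUT[play]
    return equity - half_width, equity + half_width


def overlaps(play_a, play_b):
    """True when the two plays' intervals share at least one point."""
    low_a, high_a = interval(play_a)
    low_b, high_b = interval(play_b)
    return low_a <= high_b and low_b <= high_a


def clearly_worse_than(best):
    """Plays whose interval lies entirely below the interval of `best`."""
    return [play for play in ROLLOUT if play != best and not overlaps(best, play)]


if __name__ == "__main__":
    print(clearly_worse_than("8/4 8/2"))
```

With these numbers only 8/2 5/1 is separated from 8/4 8/2; the other five plays overlap the top play's interval.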

    ---
    Tim Chow

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From pepstein5@gmail.com@21:1/5 to Tim Chow on Wed Oct 26 00:52:24 2022
    On Wednesday, October 26, 2022 at 2:26:04 AM UTC+1, Tim Chow wrote:
    In the position below, I played 8/4 8/2, and was surprised that XGR+
    said that my play was 0.057 worse than its play of 8/2 5/1. (Story
    continues below.)


    I strongly suspect flawed thinking, on your part.
    I disagree that you've found evidence of a weakness.
    For example, it's perfectly possible that the rollout
    was somehow biased against the blotty play and that
    XG's original play only loses 0.01 instead of 0.018.

    But let's assume that the rollout is correct, and that XG's play does indeed lose 0.018 equity. How bad is this?
    Well, suppose that you were playing O against a world-class human X.
    And suppose that you were able to pay X to make the play of 8/2 5/1 and you were able to negotiate a price for this. The price would normally be much more than just 0.018.
    XG is simply trying to maximise the equity, and the above thought experiment (if correct)
    shows that XG understands the play much better than most humans do, by being wrong about
    the blotty play by only 0.018.

    Your fallacy is to mark out the zero-equity level as being particularly significant.
    The difference between losing zero equity (by the optimal play) and losing 0.018 equity
    is no more significant than the difference between losing 0.1 equity and 0.118 equity.

    Suppose there was some position which everyone (humans and bots) systematically got wrong.
    However, some got it wrong by 0.1 and some got it wrong by 0.118.
    Would you make a big deal out of this discrepancy between the 0.1 errors and the 0.118 errors?
    I bet you wouldn't.

    So your thinking seems flawed and inconsistent to me.

    Paul

  • From Timothy Chow@21:1/5 to peps...@gmail.com on Wed Oct 26 08:26:30 2022
    On 10/26/2022 3:52 AM, peps...@gmail.com wrote:
    I disagree that you've found evidence of a weakness.

    This is just one example out of many similar examples I've
    encountered over the years. When neither player is likely to
    leave a blot in the next couple of rolls, XG frequently makes
    all kinds of nutty plays, making a complete mess of its board
    for no good reason.

    But let's assume that the rollout is correct, and that XG's play does indeed lose 0.018 equity. How bad is this?

    The issue isn't that XG's play loses 0.018 equity. The issue
    is that when we pass from XGR+ to a rollout, there's a swing from
    -0.057 to +0.018, showing that XG doesn't understand the position
    very well. This would be true even if XGR+ is "right" and the rollout
    is "wrong." If Alice says yes and Bob says no, at most one of them
    can be right, and if they disagree strongly then at least one of them
    is misinformed.
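
The swing described here is plain arithmetic on the margins XG reported. The Python sketch below spells it out, using the margin of 8/4 8/2 relative to 8/2 5/1 from the XGR+, XGR++ and rollout results quoted in this thread. The names `MARGINS`, `swings` and `total_swing` are made up for illustration.

```python
# Margin of 8/4 8/2 relative to 8/2 5/1 as reported by each evaluation
# (negative means 8/4 8/2 was judged the worse play).
MARGINS = [("XGR+", -0.057), ("XGR++", -0.024), ("Rollout", +0.018)]


def swings(margins):
    """Change in margin between each pair of successive evaluations."""
    return [
        (f"{earlier} -> {later}", round(later_value - earlier_value, 3))
        for (earlier, earlier_value), (later, later_value) in zip(margins, margins[1:])
    ]


def total_swing(margins):
    """Change in margin from the first evaluation to the last."""
    return round(margins[-1][1] - margins[0][1], 3)


if __name__ == "__main__":
    for label, change in swings(MARGINS):
        print(label, change)
    print("total", total_swing(MARGINS))
```

The total comes to 0.075, which is more than four times the 0.018 that the rollout says separates the two plays.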

    ---
    Tim Chow

  • From Stick Rice@21:1/5 to Tim Chow on Wed Oct 26 12:46:31 2022
    On Wednesday, October 26, 2022 at 8:26:33 AM UTC-4, Tim Chow wrote:
    On 10/26/2022 3:52 AM, peps...@gmail.com wrote:
    I disagree that you've found evidence of a weakness.
    This is just one example out of many similar examples I've
    encountered over the years. When neither player is likely to
    leave a blot in the next couple of rolls, XG frequently makes
    all kind of nutty plays, making a complete mess of its board
    for no good reason.
    But let's assume that the rollout is correct, and that XG's play does indeed
    lose 0.018 equity. How bad is this?
    The issue isn't that XG's play loses 0.018 equity. The issue
    is that when we pass from XGR+ to a rollout, there's a swing from
    -0.057 to +0.018, showing that XG doesn't understand the position
    very well. This would be true even if XGR+ is "right" and the rollout
    is "wrong." If Alice says yes and Bob says no, at most one of them
    can be right, and if they disagree strongly then at least one of them
    is misinformed.

    ---
    Tim Chow

    But there is good reason to make a mess of our board here. By far the most likely scenario is for this game to turn into a race (which we're currently leading semi-comfortably), so I make a racing-driven play. It's a fine line between maintaining some
    timing, so we are able to clear the midpoint without leaving a shot, and distributing perfectly for the race bear-in/bear-off, but that's why I'd have played 7/1 6/2 OtB and maintain it's best.

    Stick

  • From pepstein5@gmail.com@21:1/5 to Tim Chow on Wed Oct 26 14:07:44 2022
    On Wednesday, October 26, 2022 at 1:26:33 PM UTC+1, Tim Chow wrote:
    On 10/26/2022 3:52 AM, peps...@gmail.com wrote:
    I disagree that you've found evidence of a weakness.
    This is just one example out of many similar examples I've
    encountered over the years. When neither player is likely to
    leave a blot in the next couple of rolls, XG frequently makes
    all kind of nutty plays, making a complete mess of its board
    for no good reason.
    But let's assume that the rollout is correct, and that XG's play does indeed
    lose 0.018 equity. How bad is this?
    The issue isn't that XG's play loses 0.018 equity. The issue
    is that when we pass from XGR+ to a rollout, there's a swing from
    -0.057 to +0.018, showing that XG doesn't understand the position
    very well. This would be true even if XGR+ is "right" and the rollout
    is "wrong." If Alice says yes and Bob says no, at most one of them
    can be right, and if they disagree strongly then at least one of them
    is misinformed.

    Assuming Alice and Bob are in a relationship, it's quite likely that when one of them says "yes", the other always says "no" (and vice versa) on principle, regardless of what they really think.

    Paul

  • From Timothy Chow@21:1/5 to Stick Rice on Wed Oct 26 23:44:37 2022
    On 10/26/2022 3:46 PM, Stick Rice wrote:

    But there is good reason to make a mess of our board here. By far the most likely scenario is for this game to turn into a race. (which we're currently leading semi comfortably) So...I make a racing driven play. It's a fine line in maintaining some
    timing so we are able to clear the midpoint without leaving a shot and distributing perfectly for the race bear in/bear off but that's why I'd have played 7/1 6/2 OtB and maintain it's best.

    How is dumping checkers on low points good for the race?
    Generally speaking, we should be trying to avoid wastage,
    and not worry about gaps on the 1pt and 2pt. Suppose we
    remove the four checkers in the outfield that are creating
    contact. Admittedly the resulting position is artificial,
    but it illustrates the point that 7/1 6/2 doesn't seem to
    be good for the race.

    XGID=---AABEBB------a-abbcc--a-:0:0:1:64:0:0:0:0:10

    Score is X:0 O:0. Unlimited Game
    +13-14-15-16-17-18------19-20-21-22-23-24-+
    |       O     O  O |   | O  O  O        O |
    |                O |   | O  O  O          |
    |                  |   |    O  O          |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |
    |                  |   | X                |
    |                  |   | X                |
    |                  |   | X                |
    |             X  X |   | X  X             |
    |             X  X |   | X  X  X  X       |
    +12-11-10--9--8--7-------6--5--4--3--2--1-+
    Pip count X: 77 O: 72 X-O: 0-0
    Cube: 1
    X to play 64

    1. Rollout¹ 8/4 7/1 eq:+0.095
    Player: 53.26% (G:0.00% B:0.00%)
    Opponent: 46.74% (G:0.00% B:0.00%)
    Confidence: ±0.004 (+0.091..+0.098) - [75.0%]

    2. Rollout¹ 8/4 8/2 eq:+0.093 (-0.002)
    Player: 53.22% (G:0.00% B:0.00%)
    Opponent: 46.78% (G:0.00% B:0.00%)
    Confidence: ±0.003 (+0.089..+0.096) - [25.0%]

    3. Rollout¹ 8/2 7/3 eq:+0.080 (-0.015)
    Player: 52.76% (G:0.00% B:0.00%)
    Opponent: 47.24% (G:0.00% B:0.00%)
    Confidence: ±0.004 (+0.076..+0.083) - [0.0%]

    4. Rollout¹ 7/3 7/1 eq:+0.071 (-0.023)
    Player: 52.37% (G:0.00% B:0.00%)
    Opponent: 47.63% (G:0.00% B:0.00%)
    Confidence: ±0.004 (+0.068..+0.075) - [0.0%]

    5. Rollout¹ 7/1 6/2 eq:+0.064 (-0.030)
    Player: 52.11% (G:0.00% B:0.00%)
    Opponent: 47.89% (G:0.00% B:0.00%)
    Confidence: ±0.004 (+0.061..+0.068) - [0.0%]

    ¹ 1296 Games rolled with Variance Reduction.
    Dice Seed: 271828
    Moves: 3-ply, cube decisions: XG Roller


    eXtreme Gammon Version: 2.19.211.pre-release
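
Every legal 64 removes the same 10 pips, so the differences in this contact-free rollout come from where the checkers land, not from the pip count. The Python sketch below applies each of the five plays to X's checkers (read off the XGID above) and reports the resulting pip count, how many checkers end up on the 1pt and 2pt, and how many are still outside the home board. The helper names are made up for illustration, and these two tallies are only a crude stand-in for a real wastage estimate.

```python
from collections import Counter

# X's checkers in the contact-free position (point -> count), read off the XGID above.
START = Counter({3: 1, 4: 1, 5: 2, 6: 5, 7: 2, 8: 2})

# The five plays in the rollout, in the order XG ranked them.
PLAYS = ["8/4 7/1", "8/4 8/2", "8/2 7/3", "7/3 7/1", "7/1 6/2"]


def apply_play(position, play):
    """Apply a play written as 'from/to from/to' and return the new position."""
    result = Counter(position)
    for move in play.split():
        source, target = (int(point) for point in move.split("/"))
        if result[source] < 1:
            raise ValueError(f"no checker on the {source}pt for {move}")
        result[source] -= 1
        result[target] += 1
    return +result  # unary plus drops points that are now empty


def pip_count(position):
    return sum(point * count for point, count in position.items())


def describe(play):
    """Summarize the position X reaches after making `play` from START."""
    after = apply_play(START, play)
    return {
        "pips": pip_count(after),
        "on_1pt_2pt": after[1] + after[2],
        "outside_home": sum(count for point, count in after.items() if point > 6),
    }


if __name__ == "__main__":
    for play in PLAYS:
        print(play, describe(play))
```

All five plays leave X at 67 pips. 7/1 6/2 is the only one that puts two checkers on the 1pt and 2pt and leaves three checkers outside the home board, and it is also the play the rollout ranks last.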

    ---
    Tim Chow

  • From Stick Rice@21:1/5 to Tim Chow on Thu Oct 27 06:53:28 2022
    On Wednesday, October 26, 2022 at 11:44:39 PM UTC-4, Tim Chow wrote:
    On 10/26/2022 3:46 PM, Stick Rice wrote:

    But there is good reason to make a mess of our board here. By far the most likely scenario is for this game to turn into a race. (which we're currently leading semi comfortably) So...I make a racing driven play. It's a fine line in maintaining some
    timing so we are able to clear the midpoint without leaving a shot and distributing perfectly for the race bear in/bear off but that's why I'd have played 7/1 6/2 OtB and maintain it's best.
    How is dumping checkers on low points good for the race?
    Generally speaking, we should be trying to avoid wastage,
    and not worry about gaps on the 1pt and 2pt. Suppose we
    remove the four checkers in the outfield that are creating
    contact. Admittedly the resulting position is artificial,
    but it illustrates the point that 7/1 6/2 doesn't seem to
    be good for the race.


    ---
    Tim Chow

    As I said, it's a fine line between distributing for the race and keeping some timing so we are able to clear the midpoint without leaving a shot. Putting one checker on a lower point does no real harm race-wise.

    Stick

  • From Timothy Chow@21:1/5 to Stick Rice on Sun Oct 30 17:41:23 2022
    On 10/27/2022 9:53 AM, Stick Rice wrote:

    As I said, it's a fine line distributing for the race and keeping some timing so we are able to clear the midpoint without leaving a shot. Putting one checker on a lower point does no real harm race wise.

    Okay. At least it seems we agree that there's no good reason
    for XGR+ to insist that 5/1 is best by a clear margin.

    ---
    Tim Chow

  • From MK@21:1/5 to Tim Chow on Tue Nov 1 14:54:34 2022
    On October 30, 2022 at 3:41:25 PM UTC-6, Tim Chow wrote:

    On 10/27/2022 9:53 AM, Stick Rice wrote:

    As I said, it's a fine line distributing for the race.....

    At least it seems we agree that there's no good reason
    for XGR+ to insist that 5/1 is best by a clear margin.

    I'm glad to see comments like this but also sad to see
    that examples like this don't really do you any lasting
    good or get you anywhere because you all ignore the
    implications of your own acknowledgements.

    After finding so many cases like this, how can you be
    sure that there aren't many thousands more of them
    that you haven't encountered or recognized yet? How
    many straws does it take to break the camel's back??

    MK

  • From MK@21:1/5 to Tim Chow on Tue Nov 1 14:42:03 2022
    On October 26, 2022 at 6:26:33 AM UTC-6, Tim Chow wrote:

    The issue isn't that XG's play loses 0.018 equity.
    The issue is that when we pass from XGR+ to a
    rollout, there's a swing from -0.057 to +0.018...

    This is a very interesting example. It's not a case
    where the top two or three plays trade places but
    the rankings of all plays scramble all over the place
    in XGR+ XGR++ and rollout.

    In addition to your correctly making the important
    point that 8/4 8/2 goes from -0.057 to +0.018, the
    "raw equity" goes from +0.334 in XGR+ to +0.351 in
    XGR++ and back to +0.343 in rollout, while the top
    play 8/2 5/1 goes even more drastically from +0.391
    in XGR+ to +0.375 in XGR++ and then further down
    to +0.326 in rollout, i.e. -0.065 difference across the
    three evaluations.
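
The reshuffling can be tabulated directly from the three result lists posted at the top of the thread. The Python sketch below records, for each play, its rank in XGR+, XGR++ and the rollout, plus the spread between its highest and lowest raw equity. Only the seven plays that appear in all three lists are included (XGR+ also listed 7/1 5/1), ranks follow the order XG printed, and the names `EVALUATIONS` and `summarize` are made up for illustration.

```python
# (play, equity) in the order XG listed them; 7/1 5/1 is omitted because
# it appears only in the XGR+ list.
EVALUATIONS = {
    "XGR+": [
        ("8/2 5/1", 0.391), ("7/1 6/2", 0.340), ("8/4 8/2", 0.334),
        ("8/2 7/3", 0.333), ("8/4 7/1", 0.332), ("8/2 6/2", 0.331),
        ("7/3 7/1", 0.327),
    ],
    "XGR++": [
        ("8/2 5/1", 0.375), ("8/4 8/2", 0.351), ("7/1 6/2", 0.343),
        ("8/4 7/1", 0.342), ("7/3 7/1", 0.341), ("8/2 6/2", 0.338),
        ("8/2 7/3", 0.337),
    ],
    "Rollout": [
        ("8/4 8/2", 0.343), ("8/4 7/1", 0.342), ("7/3 7/1", 0.342),
        ("7/1 6/2", 0.338), ("8/2 6/2", 0.338), ("8/2 7/3", 0.334),
        ("8/2 5/1", 0.326),
    ],
}


def summarize(evaluations):
    """Collect each play's rank and equity per evaluation, plus the equity spread."""
    summary = {}
    for ranked in evaluations.values():
        for rank, (play, equity) in enumerate(ranked, start=1):
            entry = summary.setdefault(play, {"ranks": [], "equities": []})
            entry["ranks"].append(rank)
            entry["equities"].append(equity)
    for entry in summary.values():
        entry["spread"] = round(max(entry["equities"]) - min(entry["equities"]), 3)
    return summary


if __name__ == "__main__":
    for play, entry in summarize(EVALUATIONS).items():
        print(play, entry["ranks"], entry["spread"])
```

On these figures 8/2 5/1 goes from first to last with a spread of 0.065, while 8/4 8/2 moves from third to first with a spread of 0.017.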

    This would be true even if XGR+ is "right" and the
    rollout is "wrong."

    True indeed and this comment adds to the credibility
    of your objectivity on the subject.

    If Alice says yes and Bob says no, at most one of
    them can be right, and if they disagree strongly
    then at least one of them is misinformed.

    Then how would you decide which one is right?

    Let me ask this question first by renaming
    Alice and Bob as Gnubg and XG.

    Second and more importantly, in your example it's
    not two people (or bots) contradicting each other.
    It's the same bot XG contradicting itself. I'm sure
    the same would be true for Gnubg also.

    Thus the question becomes "how do you *trust*"
    that XG or Gnubg is right in any given evaluation?

    Would you go by a ratio? And if so, what would be
    your threshold? Would it be enough for you if the bot
    was 10% right? 20%? 30%?

    And that, of course, assuming that you can decide if
    XGR+ or XGR++ or XG-rollout is "right"...

    I'm just wondering what would it take for you folks
    to some day say enough is enough, these bots are
    just unpredictable, unreliable pieces of shit...??

    MK

  • From Simon Woodhead@21:1/5 to All on Wed Nov 2 08:08:44 2022
    On 2/11/2022 7:42 am, MK wrote:

    I'm just wondering what would it take for you folks
    to some day say enough is enough, these bots are
    just unpredictable, unreliable pieces of shit...??

    It would take a better bot.

  • From pepstein5@gmail.com@21:1/5 to All on Tue Nov 1 15:49:56 2022
    On Tuesday, November 1, 2022 at 9:42:04 PM UTC, MK wrote:
    I'm just wondering what would it take for you folks
    to some day say enough is enough, these bots are
    just unpredictable, unreliable pieces of shit...??


    I think we're impressed by the bots because they're so
    clearly better than the best humans. I think that's what
    commands respect.
    That seemed to be the case with chess computers.
    They were laughed at when all experts could beat all
    chess computers easily, and respected around the time
    that they could compete with the world's strongest grandmasters.

    And I don't think you ever claim to actually be able to beat a bot consistently.

    Paul

  • From Timothy Chow@21:1/5 to All on Tue Nov 1 19:59:43 2022
    On 11/1/2022 5:42 PM, MK wrote:
    Thus the question becomes "how do you *trust*"
    that XG or Gnubg is right in any given evaluation?

    It's a reasonable question.

    I would say that if a bot disagrees with itself then that is a
    good reason *not* to trust it.

    If it mostly agrees with itself when you perform various cross-
    checks, then that doesn't prove that it is trustworthy, just as
    when a lawyer cross-examines a witness and finds no contradictions,
    it doesn't prove the witness is telling the truth. But as Paul
    said, if the bot plays well overall, generally outperforming human
    beings, then that's some evidence that it "knows what it is doing."

    One can of course insist on adopting a skeptical posture under
    all circumstances. This might mean that you avoid getting fooled
    by lies, but it also means that you risk missing the truth. It's
    up to every individual to decide how to make that tradeoff.

    ---
    Tim Chow

  • From pepstein5@gmail.com@21:1/5 to Tim Chow on Tue Nov 1 17:23:36 2022
    On Tuesday, November 1, 2022 at 11:59:48 PM UTC, Tim Chow wrote:
    On 11/1/2022 5:42 PM, MK wrote:
    Thus the question becomes "how do you *trust*"
    that XG or Gnubg is right in any given evaluation?
    It's a reasonable question.

    I would say that if a bot disagrees with itself then that is a
    good reason *not* to trust it.

    If it mostly agrees with itself when you perform various cross-
    checks, then that doesn't prove that it is trustworthy, just as
    when a lawyer cross-examines a witness and finds no contradictions,
    it doesn't prove the witness is telling the truth. But as Paul
    said, if the bot plays well overall, generally outperforming human
    beings, then that's some evidence that it "knows what it is doing."

    One can of course insist on adopting a skeptical posture under
    all circumstances. This might mean that you avoid getting fooled
    by lies, but it also means that you risk missing the truth. It's
    up to every individual to decide how to make that tradeoff.

    And an individual might make that tradeoff very differently, depending
    on the matter that is being evaluated. They might be very skeptical
    about statistical claims about non-randomness of dice, but not at all
    skeptical about beliefs that conform to the religious or philosophical traditions that they identify with.

    Paul

  • From MK@21:1/5 to peps...@gmail.com on Tue Nov 1 18:40:31 2022
    On November 1, 2022 at 6:23:38 PM UTC-6, peps...@gmail.com wrote:

    On November 1, 2022 at 11:59:48 PM UTC, Tim Chow wrote:

    as Paul said, if the bot plays well overall, generally
    outperforming human beings, then that's some
    evidence that it "knows what it is doing."

    If this was tested and proven, I wouldn't object to it.
    One big problem is that the bots' "performance" has
    never been blind-tested. All of the compared human
    gamblegammon players try to play like the few bots
    (which are all descendants of TD-Gammon v.2), so
    much so that lately they have started to compete in
    lowering their PR's (as computed by the same bots)
    instead of achieving more wins against humans or
    bots. This is dog chasing its tail...

    And an individual might make that tradeoff very
    differently, depending on the matter that is being
    evaluated.

    Yes. My example would be people who believe that
    human players would but bot players wouldn't cheat.

    They might be very skeptical about statistical claims
    about non-randomness of dice, but not at all skeptical
    about beliefs that conform to the religious or
    philosophical traditions that they identify with.

    I don't think sceptical is the counterpart of believer. A
    believer can believe in what may be true or what may
    be false. Often, no proof will change one's belief. For
    example, no amount of mutant bot experiments will
    be enough to convince "cube skill theory believers"
    that it's bullshit. If you hear someone say "I can't
    believe my eyes", consider believing that they can't...

    MK

  • From MK@21:1/5 to Simon Woodhead on Tue Nov 1 19:02:39 2022
    On November 1, 2022 at 4:08:47 PM UTC-6, Simon Woodhead wrote:

    On 2/11/2022 7:42 am, MK wrote:

    I'm just wondering what would it take for you folks
    to some day say enough is enough, these bots are
    just unpredictable, unreliable pieces of shit...??

    It would take a better bot.

    A better bot would surely do that with the caveat
    that the better bot can also be merely a better (or
    worse depending on whether shit means positive
    or negative) piece of shit... ;)

    Even so, I've always said that we need and we can
    easily develop better bots with today's computing
    power.

    One way would be to go back to TD-Gammon v.01
    (i.e. prior to Tesauro's bastardizing his own bot
    in seeking validation/recognition from bg
    gamblers), and do it the right way from there on...

    BTW: I'm not asking you to do it. So, don't tell me to
    do it myself.

    MK

  • From MK@21:1/5 to peps...@gmail.com on Tue Nov 1 19:32:59 2022
    On November 1, 2022 at 4:49:57 PM UTC-6, peps...@gmail.com wrote:

    I think we're impressed by the bots because
    they're so clearly better than the best humans.

    There is no proof of this. At least nothing that
    you yourself would call "rigorous". ;)

    And I don't think you ever claim to actually be
    able to beat a bot consistently.

    I have made that claim. I conducted numerous
    experiments and played quite a number of long
    sessions to show that I could achieve it, (which
    I shared at my web site, some accompanied by
    youtube videos recorded in real time), see:

    http://montanaonline.net/backgammon/xg.php

    But you don't need to trust me if you don't want
    to. That's why I urged you all for years to do your
    own experiments. Though half-ass, the one that
    Axel has done showed that even a crude "mutant"
    can do well beyond expectations against a strong
    bot. If you do better, more extensive, "rigorous" ;)
    experiments, the proof will surely become clearer.

    MK
