• Campaign to Protect the Reputation of Racing Algorithm Inventors

    From pepstein5@gmail.com@21:1/5 to All on Tue Apr 19 15:36:36 2022
    As I understand it, a racing algo specialist has two tasks:
    1) Convert game-winning-probability into cube action.
    2) Find the game-winning probability.

    For 1), the exact answer of course varies from position to position.
    There's an optional-take position where the taker's gwc is only 18.75%,
    but that is unusually low. However, assuming that some simplified answer
    is needed for human use, is there a consensus that 68/70/76 are the
    correct boundaries?

    If those boundaries are agreed upon, I think it's unfair to blame the
    Racing Algo Inventor for every wrong cube decision. The algo can say 78%,
    and the figures give 78% as a drop. So, if we happen to have a position
    where 78% is correct but we can take anyway, I don't think the algo can
    be blamed.

    I think that when the algos are wrong, we need to clearly mark whether the boundaries (for example 68/70/76) are wrong, or whether the approximated probability is itself wrong.

    Division of responsibility is needed here for accurate diagnosis.
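
One way to make this division concrete is sketched below in Python. It assumes the 68/70/76 figures are read as doubling point, redoubling point, and take point on a percentage scale, and that a value exactly on a boundary counts as a double or a take. The function names, the boundary handling, and the verdict labels are illustrative assumptions, not part of any published racing algorithm.

```python
from typing import NamedTuple


class Thresholds(NamedTuple):
    double: float    # minimum p (in %) for an initial double
    redouble: float  # minimum p (in %) for a redouble
    take: float      # maximum p (in %) at which the opponent still takes


# Money-game boundaries discussed in the thread (assumed interpretation).
MONEY = Thresholds(double=68.0, redouble=70.0, take=76.0)


def cube_action(p: float, t: Thresholds = MONEY, owns_cube: bool = False) -> str:
    """Map the doubler's winning chances p (in %) to a cube verdict."""
    doubling_point = t.redouble if owns_cube else t.double
    if p < doubling_point:
        return "no double"
    return "double/take" if p <= t.take else "double/pass"


def diagnose(p_est: float, p_true: float, correct: str,
             t: Thresholds = MONEY, owns_cube: bool = False) -> str:
    """Attribute a wrong verdict to the probability estimate or to the thresholds."""
    if cube_action(p_est, t, owns_cube) == correct:
        return "no error"
    if cube_action(p_true, t, owns_cube) == correct:
        # The thresholds give the right verdict when fed the true probability,
        # so the estimate is what went wrong.
        return "probability error"
    # Even the true probability yields the wrong verdict, so the thresholds
    # are at fault (possibly together with the estimate).
    return "threshold error"
```

For the case described above (the estimate of 78% is accurate, yet the position is a take), `diagnose(78.0, 78.0, "double/take")` returns `"threshold error"`, so the blame falls on the boundaries rather than on the probability estimate.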

    Paul

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Timothy Chow@21:1/5 to peps...@gmail.com on Tue Apr 19 20:37:24 2022
    On 4/19/2022 6:36 PM, peps...@gmail.com wrote:
    > As I understand it, a racing algo specialist has two tasks:
    > 1) Convert game-winning-probability into cube action.
    > 2) Find the game-winning probability.

    This isn't how all algorithms work. Some simply ignore match play and
    focus only on delivering a verdict on the cube action. In principle,
    such an algorithm could (for example) correctly judge that Position A
    is a double and Position B is not a double even though Position B has
    higher game-winning chances. Any algorithm that tries to account for
    Jacoby paradoxes will have this property.

    ---
    Tim Chow

  • From pepstein5@gmail.com@21:1/5 to Tim Chow on Wed Apr 20 00:25:21 2022
    On Wednesday, April 20, 2022 at 1:37:30 AM UTC+1, Tim Chow wrote:
    > On 4/19/2022 6:36 PM, peps...@gmail.com wrote:
    >> As I understand it, a racing algo specialist has two tasks:
    >> 1) Convert game-winning-probability into cube action.
    >> 2) Find the game-winning probability.
    > This isn't how all algorithms work. Some simply ignore match play and
    > focus only on delivering a verdict on the cube action. In principle,
    > such an algorithm could (for example) correctly judge that Position A
    > is a double and Position B is not a double even though Position B has
    > higher game-winning chances. Any algorithm that tries to account for
    > Jacoby paradoxes will have this property.

    Which algos don't work like that?
    I think Isight works like that.
    Therefore, when Isight makes errors, we have to distinguish between two types of errors.
    1) Are the thresholds correct? (For example 68/70/76 in money games).
    2) Does the Isight probability accurately capture the real-world probability?

    Failing to separate 1 and 2 and regarding them all as "errors" is too crude (although better than doing nothing).
    I'm surprised at the 76 threshold. That seems too low for races where the recube vig can be considerable.
    I would have expected 78 or so.
    Given that the parameters have been very well-tuned, just using 68/70/78 and making no other changes would be bound to make the model worse.
    The 68/70/76 parameters alone would make me suspect weak passes.
    I'm sure we don't get that, because of the high performance and tuning but it could be that there is some other compensatory
    factor that is preventing the weak passes. For example, it could be that some advantages aren't weighed highly enough.
    So perhaps we get a tendency to pass because of the 76 being too low, but a tendency to take because of the advantages being weighted too low.
    And these flaws cancel to get the action correct.

    Paul

  • From Timothy Chow@21:1/5 to peps...@gmail.com on Wed Apr 20 09:14:26 2022
    On 4/20/2022 3:25 AM, peps...@gmail.com wrote:
    > Which algos don't work like that?
    > I think Isight works like that.

    Most older algorithms don't compute winning chances. They
    just tell you to count pips, make some adjustments, and then
    make a corresponding cube decision, assuming a money game.
    Kleinman was one of the few analysts who actually tried to
    provide a rule for estimating winning chances.

    You're right about Isight, but when it first came out, it
    was exceptional in this regard.

    ---
    Tim Chow

  • From pepstein5@gmail.com@21:1/5 to Tim Chow on Wed Apr 20 08:01:05 2022
    On Wednesday, April 20, 2022 at 2:14:38 PM UTC+1, Tim Chow wrote:
    > On 4/20/2022 3:25 AM, peps...@gmail.com wrote:
    >> Which algos don't work like that?
    >> I think Isight works like that.
    > Most older algorithms don't compute winning chances. They
    > just tell you to count pips, make some adjustments, and then
    > make a corresponding cube decision, assuming a money game.
    > Kleinman was one of the few analysts who actually tried to
    > provide a rule for estimating winning chances.
    >
    > You're right about Isight, but when it first came out, it
    > was exceptional in this regard.

    I think it's exceptional (in a good sense) in every regard.
    The thing is that the number of people who know the relevant
    areas of maths and modelling and software development and backgammon is
    very small.
    So when such a person has time to work on a backgammon project, we can expect great symphonies,
    fine wine, and cheese of the right vintage.

    Paul

  • From Axel Reichert@21:1/5 to peps...@gmail.com on Wed Apr 20 19:26:03 2022
    "peps...@gmail.com" <pepstein5@gmail.com> writes:

    > Therefore, when Isight makes errors, we have to distinguish between
    > two types of errors.
    > 1) Are the thresholds correct? (For example 68/70/76 in money games).
    > 2) Does the Isight probability accurately capture the real-world probability?

    The 80 in the formula

    p = 80 - l/3 + 2 * Delta l

    must be matched with the thresholds 68, 70, and 76 (for money). Using

    p = 70 - l/3 + 2 * Delta l

    together with thresholds 58, 60, and 66 obviously gives exactly the same
    cube decisions. So for money sessions, the choice of 80 is arbitrary, as
    long as matching thresholds are used.
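
The equivalence can be checked mechanically. The sketch below assumes that l is the doubler's adjusted pip count and Delta l is the taker's count minus the doubler's (the thread does not define them), and that a value exactly on a threshold counts as a double or a take. The function names and the sampled ranges are illustrative choices, not taken from the article.

```python
from fractions import Fraction


def metric(length: int, delta: int, constant: int = 80) -> Fraction:
    """p = constant - l/3 + 2 * Delta l, computed exactly with fractions."""
    return constant - Fraction(length, 3) + 2 * delta


def verdict(p: Fraction, double: int, redouble: int, take: int) -> tuple:
    """Return (double?, redouble?, take?) for a given p and set of thresholds."""
    return (p >= double, p >= redouble, p <= take)


def same_decisions(shift: int = 10) -> bool:
    """Compare constant 80 with 68/70/76 against a shifted constant and thresholds."""
    for length in range(30, 121):        # assumed range of pip counts
        for delta in range(-10, 31):     # assumed range of pip differences
            base = verdict(metric(length, delta, 80), 68, 70, 76)
            moved = verdict(metric(length, delta, 80 - shift),
                            68 - shift, 70 - shift, 76 - shift)
            if base != moved:
                return False
    return True
```

Calling `same_decisions(10)` compares the two variants quoted above (80 with 68/70/76 against 70 with 58/60/66) and returns `True`, because subtracting the same amount from p and from every threshold leaves each comparison unchanged.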

    If you want to apply the method in match play, for which it was
    designed, then these fixed thresholds are useless and have to be replaced
    by the correct doubling point, redoubling point, and take point for the
    current match score. This requires that p (which so far could have been
    called just a "metric") should represent the game winning chances as
    accurately as possible.

    So you can shift around the constant such that the error for CPW is
    minimized, and then shift the thresholds accordingly such that you are
    back at the same results for the (money-based) endgame positions. (This
    is historically not how I arrived at my method, but would have worked as
    well. See sections 6.2 and 6.3 of my article, certainly the toughest
    part of it, for the real history.)

    > I'm surprised at the 76 threshold. That seems too low for races where
    > the recube vig can be considerable. I would have expected 78 or so.

    Fair enough, see

    https://www.bkgm.com/articles/CubeHandlingInRaces/#%3Csmall%3ECPW%3C/small%3E_decision_criteria

    > Given that the parameters have been very well-tuned, just using
    > 68/70/78 and making no other changes would be bound to make the model
    > worse.

    Yes. The total equity loss will increase by a whopping 14 %.

    > The 68/70/76 parameters alone would make me suspect weak passes. I'm
    > sure we don't get that, because of the high performance and tuning but
    > it could be that there is some other compensatory factor that is
    > preventing the weak passes.

    To quote Newton: "Hypotheses non fingo". (-:

    Best regards

    Axel
