• bcd32 vs. bcd32_ctr

    From Karl-Uwe Frank@21:1/5 to All on Tue Oct 13 19:24:38 2015
    Simple as that, the whole struggle with the counter is useless because
    the algorithm bcd32 is sufficient as it is.

    The problems occur when I tried to modify bcd32 in a way that it could
    be seeded with all state vars a,b,c,d = zero. My thoughts becoming to complicated as I started to improve the algorithm.

    In fact it is sufficient to change the seeding function of bcd32 like

    //---------------------------------

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];

    t = a + b + c + d;
    }

    // The PRNG Function
    uint32_t bcd32() {
    a = a + (d >> 5);
    b = a + (b ^ c);
    c = a + (b << 13);
    d = a + (d ^ t);
    t = a + t;

    return (b ^ c ^ d);
    }
    //---------------------------------


    If bcd32 is seeded like a=b=c=d=0 the first output will look like

    n d t a b c ==> output
    0) fffffffe fffffffe ffffffff ffffffff ffffdfff ==> ffffdffe | 4294959102
    1) 07fffffe 07fffffc 07fffffe 08001ffe 0bffbffe ==> 04005ffe | 67133438
    2) 083fffff 103ffff9 083ffffd 0c3f9ffd fc3f9ffd ==> f83fffff | 4164943871
    3) 20820002 18c1fff5 0881fffc f881fffc 48817ffc ==> 90828002 | 2424471554
    4) 41ca0ff3 22480ff1 09860ffc b9868ffc db858ffc ==> 23c90ff3 | 600379379
    5) 6f16607d 2ddc706c 0b94607b 6d97607b f7a3c07b ==> f522c07d | 4112695421
    6) 51d7238f 3ce983ea 0f0d137e a941b37e 457cd37e ==> bdea438f | 3186246543
    7) 7eda6cff 4e855084 119bcc9a fdd92c9a 372f0c9a ==> b42c4cff | 3022802175
    8) 45f1dc7c 6417f085 1592a001 e088c001 2d92c001 ==> 88ebdc7c | 2297158780
    9) 39a85bdd 7bda1f69 17c22ee4 e4dc2ee4 9d9eaee4 ==> 40eadbdd | 1089133533
    10) 5c01b676 9569912b 198f71c2 92d1f1c2 57c7b1c2 ==> 9917f676 | 2568484470
    11) e5d7a6d2 b1d910a0 1c6f7f75 e185bf75 d45e1f75 ==> d00c06d2 | 3490449106
    12) 77acf31d d5774d4b 239e3cab 5979dcab 5f339cab ==> 71e6b31d | 1910944541
    13) ca376299 fcd2f18e 275ba443 2da5e443 e3e40443 ==> 04768299 | 74875545
    14) 6492f26e 2a8050e5 2dad5f57 fbef3f57 15983f57 ==> 8ae5f26e | 2330325614

    Therefore bcd32_ctr is not necessary any more and I have removed the
    source code and example implementations.

    Cheers,
    Karl-Uwe


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to All on Tue Oct 13 23:27:09 2015
    Some more randomness test results for bcd32 performed on file size = 512MB



    ./bcd32_BitStat 30 65432912 | ./BitStatistic/bitstat

    stdin contains 536870912 bytes (0 bits)

    BIT ANALYSIS

    Value Count Percentage
    ----- ----- ----------
    one -2147472305 inf
    zero 2147472305 inf

    Bit Count Percentage
    --- ----- ----------
    msb 7 268436892 50.000267
    6 268431308 49.999229
    5 268432117 49.999378
    4 268431718 49.999302
    3 268439362 50.000729
    2 268446212 50.002003
    1 268438473 50.000561
    lsb 0 268438909 50.000645

    1/0 ratio 1.000011
    Deviance from unity 0.000011




    ./bcd32_BitStat 30 65432912 | ./BitStatistic/ent

    Entropy = 8.000000 bits per byte.

    Optimum compression would reduce the size
    of this 536870912 byte file by 0 percent.

    Chi square distribution for 536870912 samples is 242.23, and randomly
    would exceed this value 70.74 percent of the times.

    Arithmetic mean value of data bytes is 127.4997 (127.5 = random).
    Monte Carlo value for Pi is 3.141565171 (error 0.00 percent).
    Serial correlation coefficient is 0.000020 (totally uncorrelated = 0.0).



    ./smallcrush_TestSeries.sh bcd32 65432912 100

    ====================================================
    Tests Runs in Total = 100

    Start time: Di 13 Okt 2015 18:41:16 CEST ---------------------------------------------------

    Test No.: 1 18:41:16

    Seed = 65432912 | a = 4075642575, b = 1108630788, c = 1189532544, d = 1548197085, t = 3627035696

    ---------------------------------------------------
    Test No.: 2 18:41:29

    Seed = 65432913 | a = 4075625768, b = 1391106037, c = 664698970, d =
    385657096, t = 2222120575

    ..................

    Test No.: 98 19:13:56

    Seed = 65433009 | a = 4074012296, b = 591442530, c = 1820283394, d =
    450967796, t = 2641738720

    ---------------------------------------------------
    Test No.: 99 19:14:17

    Seed = 65433010 | a = 4073995489, b = 873917779, c = 1295449820, d = 1435911454, t = 3384307246

    ---------------------------------------------------
    Test No.: 100 19:14:38

    Seed = 65433011 | a = 4073978682, b = 1156393028, c = 770616246, d =
    273371465, t = 1979392125

    ---------------------------------------------------

    Tests passed = 99
    Tests not passed = 1

    ****************************************************
    Files with results that not passed all tests ****************************************************

    ./Test_bcd32_Results_smallcrush/smallcrush_bcd32.result_59.txt

    Start time : Di 13 Okt 2015 18:41:16 CEST
    Finished time: Di 13 Okt 2015 19:14:59 CEST



    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Wed Oct 14 12:22:29 2015
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank wrote:
    In fact it is sufficient to change the seeding function of bcd32 like
    ...
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];
    t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s, tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent {-1,0,0,0} ?

    As a general rule, you can't prevent a bad starting state simply by doing state = g(state), where g is an invertible function in the state space. This only changes which state is bad (from the caller's view).

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to All on Tue Oct 13 20:44:07 2015
    Just made a simple routine that checks whether one of the internal state
    vars (a,b,c,d,t) reach zero. These are some sample results

    ./bcd32_checkstate_for_zero a b c d test_size



    ./bcd32_checkstate_for_zero 0xBC614E 0 0 0 $((1024*3072000))

    n d t a b c | output
    1151951809) 26fbdf73 00000000 7af6abc9 4a2cebc9 186fcbc9 | 74b8ff73
    2930225574) 00000000 7dfbaecf a0cad48d f2ebb48d 175c748d | e5b7c000
    3474537209) 21129b76 00000000 00354521 7a16c521 d8d96521 | 83dd3b76
    3764129993) bf206dac 0897e570 e4000000 9058e000 00000000 | 2f788dac
    4152803910) b531cc67 00000000 f283bebb 6b7bfebb 725b1ebb | ac112c67
    4180213076) 00000000 f1e45086 5679cf7f f979cf7f 9069af7f | 69106000
    5179381638) a2e7a4c1 00000000 95d142b3 c16b02b3 f627a2b3 | 95ab04c1
    6459122892) c7b37c69 7ab07924 00000000 d323e000 7c000000 | 68909c69
    8115701864) 867bd6be ce1eab9a 37f9a000 00000000 37f9a000 | b18276be
    8730239465) 00000000 c3d05fe6 7cff6d65 ac046d65 0aac0d65 | a6a86000
    8745219331) 00000000 c1881ec6 dfdb6447 f4c72447 c4644447 | 30a36000
    11703934411) 9398caa9 75d4a03c 00000000 56e46000 8c000000 | 497caaa9
    12706551343) f2576ea6 9533e9a0 00000000 0b3ca000 94000000 | 6d6bcea6


    ./bcd32_checkstate_for_zero 0xBC614E 0x5464E 0xF1206 0x2126E
    $((1024*3072000))

    n d t a b c | output
    641266947) 00000000 3661b201 3f5947e5 6e12f1e1 9d9567e5 | f3879604
    2316073425) d54afe6b 4bd5c38a 517f7cdc 00000000 517f7cdc | 843582b7
    2714883364) 6c2724fb a20351cc 00000000 11632cf4 659e8000 | 18da880f
    2755312843) a6a87cb6 22e32ac7 13748000 0bff645c 00000000 | ad5718ea
    3494798426) 850df051 4024c1b3 bf7d8000 91ca0414 00000000 | 14c7f445
    3966565173) 055c7b27 8e59849e 00000000 a1664564 c8ac8000 | 6c96be43
    5402999536) 00000000 466a752d 312ce23d 257b3a39 9874023d | bd0f3804
    6402806576) 95e5b555 87fe6dbd 38b240ec 00000000 38b240ec | ad57f5b9
    6471224971) 279dc434 00000000 cfc8112d 11e57a99 7f1b312d | 49638f80
    8386943339) 00000000 d7054d33 d0d8fc62 979eaffe a6d8bc62 | 3146139c
    8392976099) 661aa7fc 00000000 21d49cfc 9f64ece8 bf719cfc | 460fd7e8
    9678528260) 7b1608a1 5fef54a8 7021da8c 00000000 7021da8c | 0b37d22d
    10035166508) 00000000 5ce7c3cf 00b55282 51a06e0e 0e771282 | 5fd77c8c
    11485643189) 318a626b 00000000 e5241beb 15497f87 1514fbeb | 31d7e607
    12043286512) e80a8dc4 8980d690 7ca3b04c 00000000 7ca3b04c | 94a93d88



    ./bcd32_checkstate_for_zero 0 0 0 0 $((1024*3072000))

    n d t a b c | output
    960246790) bad50310 00000000 56bd9ca1 cf675ca1 4251bca1 | 37e3e310
    1580525801) 2c788833 75f01f1a a4000000 a05ae000 00000000 | 8c226833
    1673121719) d70dc807 e9d9ab92 64000000 506ce000 00000000 | 87612807
    4705637618) 2a1135aa 374f5944 f89ee000 00000000 f89ee000 | d28fd5aa
    5591941724) 24e4f868 00000000 f2cc334e 7e1cd34e 8d35f34e | d7cdd868
    6008635647) b8e9cf75 00000000 1fb287d3 4136c7d3 f8ace7d3 | 0173ef75
    6327684656) b7908a29 f13c7644 00000000 fbdb2000 64000000 | 284baa29
    6707171810) a9885446 0259f9d4 d7c22000 00000000 d7c22000 | 7e4a7446
    7099586368) 6b19bc6c 970486b2 34000000 32266000 00000000 | 593fdc6c
    9868569959) 00000000 e3bbc9c4 819cdb24 993fbb24 79015b24 | e03ee000
    11022559484) 00000000 b5c7f338 d8f06bd8 87e88bd8 ea6b6bd8 | 6d83e000
    12475896319) 0481bef0 396cdcc6 2f8f6000 00000000 2f8f6000 | 2b0edef0

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From William Unruh@21:1/5 to rsharris.bx.psu.edu@gmail.com on Wed Oct 14 20:01:28 2015
    On 2015-10-14, rsharris.bx.psu.edu@gmail.com <rsharris.bx.psu.edu@gmail.com> wrote:
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank wrote:
    In fact it is sufficient to change the seeding function of bcd32 like
    ...
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];
    t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s, tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent {-1,0,0,0} ?

    As a general rule, you can't prevent a bad starting state simply by doing state = g(state), where g is an invertible function in the state space. This only changes which state is bad (from the caller's view).

    But if g is sufficiently weird (Ie it is sufficiently hard to invert it)
    and the bad states are sufficiently rare, then the caller may have no
    way of finding our using that state. And if the state space is
    sufficiently large, the probablity of hitting it by accident can be
    extremely small ( eg such that the computer being hit by an asteroid
    while using it for that problem is higher).



    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Wed Oct 14 23:45:16 2015
    On 14.10.15 21:22, rsharris.bx.psu.edu@gmail.com wrote:
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank wrote:
    In fact it is sufficient to change the seeding function of bcd32 like
    ...
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];
    t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s,
    tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent {-1,0,0,0}
    ?

    As a general rule, you can't prevent a bad starting state simply by
    doing state = g(state), where g is an invertible function in the
    state space. This only changes which state is bad (from the caller's
    view).

    Bob H

    Many thanks again for pointing out my permanent mistakes about seeding
    bcd32. Yes you're right and stupid as I am I clearly overlooked the
    possibility of a seeding of the PRNG as you mention above, which will
    crash the output right away. Somehow meanwhile I think that I'm a
    laughing stock on that issue.

    So obviously a seeding of bcd32 with all state vars = zero will produce
    zero as output all the way. But for example xorshift must not be seeded
    with all state vars=zero. And this hold also, if I'm right, for the Mersenne-Twister. The question is now, why would a user seed a PRNG with
    zero?

    And of course if there would be a check for improper seeding in the
    seeding function like

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    if ((seed[0] < 0) || (seed[0] == 0xffffffff)) seed[0] = 1;

    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];

    t = a + b + c + d;
    }

    it would prevent the problematic case of all vars=0.

    Just found this interesting article about seeding a PRNG, with reads in
    the beginning

    "Properly seeding random number generators doesn't always get the
    attention it deserves. Quite often, people do a terrible job, supplying low-quality seed data (such as the system time or process id) or
    combining multiple poor sources in a half-baked way. C++11 provides std::seed_seq as a way to encourage the use of better seeds, but if you
    haven't thought about what's really going on when you use it, you may be
    in for a few surprises.

    In contrast to C++11, some languages, such as popular scripting
    languages like JavaScript, Python, or Perl, take care of good seeding
    for you (provided you're using their built-in RNGs)."

    http://www.pcg-random.org/posts/cpp-seeding-surprises.html

    So, apart from the very good PRNG function of bcd32, it is of some
    importance that the whole PRNG is properly implemented by using the
    adequate seeding function.

    Cheers,
    Karl-Uwe


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Thu Oct 15 15:54:14 2015
    On 14.10.15 21:22, rsharris.bx.psu.edu@gmail.com wrote:
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank wrote:
    In fact it is sufficient to change the seeding function of bcd32 like
    ...
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];
    t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s,
    tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent {-1,0,0,0}
    ?

    As a general rule, you can't prevent a bad starting state simply by
    doing state = g(state), where g is an invertible function in the
    state space. This only changes which state is bad (from the caller's
    view).

    Bob H

    Reflecting again on your response I agree with you that the user should
    decide how he seeds the PRNG. But to my current understanding there is
    only *one* bad seed state for bcd16/bcd32 which is a=b=c=d=0. The PRNG
    produce a very good pseudo-random output if seeded with 1 up to 128bit
    (even the 16bit version). But perhaps the user may like to seed it with a=b=d=0, c=1, or a=b=c=0, d=1 or any other such combination. Therefore
    the check for improper seeding should only be done like

    // The Seeding Function
    void seed_bcd16(uint16_t seed[]) {
    if ((seed[0]+seed[1]+seed[2]+seed[3]) == 0) {
    fprintf(stderr, "Please seed the PRNG with at least 1 bit.\n");
    exit(0);
    }

    a = seed[0] ^ 0xffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];

    t = a + b + c + d;
    }

    Clearly the PRNG consist of a seeding *and* a output function what I
    consider a proper and legitimate approach.

    For example seeding xorshift with all state vars=0 will crash the PRNG.
    Should it hence be considered useless? No.

    Or for example the Mersenne-Twister. If its huge internal state array is
    seeded with all zero the PRNG will crash.
    Should it hence be considered useless? No.

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to William Unruh on Thu Oct 15 02:24:13 2015
    On 14.10.15 22:01, William Unruh wrote:
    On 2015-10-14,
    rsharris.bx.psu.edu@gmail.com<rsharris.bx.psu.edu@gmail.com> wrote:
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank
    wrote:
    In fact it is sufficient to change the seeding function of bcd32
    like ... a = seed[0] ^ 0xffffffff; b = seed[1]; c = seed[2]; d =
    seed[3]; t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s,
    tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent
    {-1,0,0,0} ?

    As a general rule, you can't prevent a bad starting state simply by
    doing state = g(state), where g is an invertible function in the
    state space. This only changes which state is bad (from the
    caller's view).

    But if g is sufficiently weird (Ie it is sufficiently hard to invert
    it) and the bad states are sufficiently rare, then the caller may
    have no way of finding our using that state. And if the state space
    is sufficiently large, the probablity of hitting it by accident can
    be extremely small ( eg such that the computer being hit by an
    asteroid while using it for that problem is higher).


    Maybe I'm also a crank as AOB because I struggle with the proper seeding.

    But the PRNG function works very well and so I decided to give this
    16bit version of bcd32 a try

    //----------------------------------
    // The Seeding Function
    void seed_bcd16(uint16_t seed[]) {
    if (a > 0xfffe) a = 1;

    a = seed[0] ^ 0xffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];

    t = a + b + c + d;
    }

    // The PRNG Function
    uint16_t bcd16() {
    a = a + (d >> 3);
    b = a + (b ^ c);
    c = a + (b << 7);
    d = a + (d ^ t);
    t = a + t;

    return (b ^ c ^ d);
    }
    //----------------------------------

    and this are two test results



    a = 24271, b = 23812, c = 54144, d = 40157, t = 11312

    Entropy = 7.999999 bits per byte.

    Optimum compression would reduce the size
    of this 268435456 byte file by 0 percent.

    Chi square distribution for 268435456 samples is 255.08, and randomly
    would exceed this value 48.67 percent of the times.

    Arithmetic mean value of data bytes is 127.4955 (127.5 = random).
    Monte Carlo value for Pi is 3.141679513 (error 0.00 percent).
    Serial correlation coefficient is 0.000005 (totally uncorrelated = 0.0).



    a = 24271, b = 23812, c = 54144, d = 40157, t = 11312

    stdin contains 268435456 bytes (-2147483648 bits)

    BIT ANALYSIS

    Value Count Percentage
    ----- ----- ----------
    one 1073732828 49.999580
    zero 1073750820 50.000416

    Bit Count Percentage
    --- ----- ----------
    msb 7 134203087 49.994545
    6 134227367 50.003590
    5 134217173 49.999794
    4 134227511 50.003643
    3 134210568 49.997334
    2 134212595 49.998085
    1 134219111 50.000511
    lsb 0 134215416 49.999138

    1/0 ratio 0.999983
    Deviance from unity -0.000017



    Do you consider the algorithm weak, wrong, biased or useless?

    If interested you can find my simplistic and faulty source code of the
    test routine over here
    http://www.freecx.co.uk/bcd32/bcd32_BitStat_16bit.c


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to William Unruh on Wed Oct 14 23:55:54 2015
    On 14.10.15 22:01, William Unruh wrote:
    On 2015-10-14, rsharris.bx.psu.edu@gmail.com<rsharris.bx.psu.edu@gmail.com> wrote:
    On Tuesday, October 13, 2015 at 1:24:40 PM UTC-4, Karl-Uwe Frank wrote:
    In fact it is sufficient to change the seeding function of bcd32 like
    ...
    a = seed[0] ^ 0xffffffff;
    b = seed[1];
    c = seed[2];
    d = seed[3];
    t = a + b + c + d;

    What happens if the user, in an effort to force in a bunch of 1s,
    tries the seed array {0xFFFFFFFF,0,0,0} or the equivalent {-1,0,0,0} ?

    As a general rule, you can't prevent a bad starting state simply
    by
    doing state = g(state), where g is an invertible function in the state
    space. This only changes which state is bad (from the caller's view).

    But if g is sufficiently weird (Ie it is sufficiently hard to invert it)
    and the bad states are sufficiently rare, then the caller may have no
    way of finding our using that state. And if the state space is
    sufficiently large, the probablity of hitting it by accident can be
    extremely small ( eg such that the computer being hit by an asteroid
    while using it for that problem is higher).


    Well than, share your idea on how to prevent the PRNG seeded with all
    vars = 0, even if weird. Or perhaps you're aware of another approach on
    how to keep the user from improper seeding.



    Bob H


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Thu Oct 15 11:03:11 2015
    On Wednesday, October 14, 2015 at 5:45:18 PM UTC-4, Karl-Uwe Frank wrote:
    Somehow meanwhile I think that I'm a laughing stock on that issue.

    I don't believe anyone thinks of you as a laughing stock.

    The reason I've been posting the things I have is to try to give you a gentle nudge toward considering analysis techniques that might be beneficial in the design of a PRNG.

    When I did the loop analysis of your ctr variable, I actually thought the variable performed pretty well. The shorter loops I observed occurred with low probability, the loops that it usually ended up in were on the order of what would be expected from
    a truly random mapping. My analysis was brief, though, only the tip of the iceberg.

    The point I've been trying to make about seeding is not that there is this one initial state (all zeros) that's a problem. A more thorough analysis would try to characterize the degenerate states, estimate what fraction of the state space they occupy,
    etc, and how the output is affected in those states.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Thu Oct 15 23:20:46 2015
    On 15.10.15 20:03, rsharris.bx.psu.edu@gmail.com wrote:
    On Wednesday, October 14, 2015 at 5:45:18 PM UTC-4, Karl-Uwe Frank
    wrote:
    Somehow meanwhile I think that I'm a laughing stock on that issue.

    I don't believe anyone thinks of you as a laughing stock.

    The reason I've been posting the things I have is to try to give you
    a gentle nudge toward considering analysis techniques that might be beneficial in the design of a PRNG.

    When I did the loop analysis of your ctr variable, I actually thought
    the variable performed pretty well. The shorter loops I observed
    occurred with low probability, the loops that it usually ended up in
    were on the order of what would be expected from a truly random
    mapping. My analysis was brief, though, only the tip of the
    iceberg.

    The point I've been trying to make about seeding is not that there is
    this one initial state (all zeros) that's a problem. A more thorough analysis would try to characterize the degenerate states, estimate
    what fraction of the state space they occupy, etc, and how the output
    is affected in those states.

    Bob H

    Thanks, that's very kind of you. Yes, your postings let my start
    thinking a lot and also I'm trying to verify the quality of my algorithm differently. As you may have noticed, I dropped the idea of a counter completely because I really believe that the PRNG formulae itself is
    stable and does not run into any bias. As you may have noticed also, I
    can't deliver any formal mathematical proof of the formulae. What I'm
    currently doing is to verify it empirically be long test runs, having
    faith in those tools available to measure the quality of the random output.

    Really, I would like to know how to characterise any degenerate state in
    a more formal way. Well, what I have observed so far, is that on certain
    seed combinations the output of the 16bit version has a very small
    tendency towards some values, if the output after seeding is about
    several 100th of KB only. On the larger scale, like output in the 10th,
    100th of MB this tendency vanish.


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Fri Oct 16 11:40:22 2015
    On 15.10.15 20:03, rsharris.bx.psu.edu@gmail.com wrote:

    When I did the loop analysis of your ctr variable, I actually thought
    the variable performed pretty well. The shorter loops I observed
    occurred with low probability, the loops that it usually ended up in
    were on the order of what would be expected from a truly random
    mapping. My analysis was brief, though, only the tip of the
    iceberg.

    I like to get back to this especially. Currently I'm using mostly
    stochastic and empirical tests to verify the randomness output. The most advanced test suite is TestU01 from Pierre L'Ecuyer, Université de
    Montréal (http://simul.iro.umontreal.ca/testu01/tu01.html). Are you
    aware of this test very interesting test battery?

    In his documentation it reads

    http://simul.iro.umontreal.ca/testu01/guideshorttestu01.pdf (Page 2)

    "There is no universal test or battery of tests that can guarantee, when passed, that a given generator is fully reliable for all kinds of
    simulations. Passing many tests improves one’s confidence in the RNG,
    although it never proves that the RNG is foolproof. In fact, no RNG can
    pass every conceivable statistical test. One could say that a bad RNG is
    one that fails simple tests, and a good RNG is one that fails only very complicated tests that are extremely hard to find or impractical to run."

    The interesting point though is, that my algorithm *with* counter passed
    all tests and therefore could be considered a good and useful PRNG.

    But in heavy contrast, the test you made by just "tipping the iceberg"
    revealed a crave design error, because the counter running very soon
    into a loop.

    This is in fact really fascinating. Perhaps you are using some new
    analysis techniques which can reveal design mistakes that TestU01 and
    other test batteries don't recognise as such.

    So my question now is, how did you manage to find the looping counter in bcd32_ctr?

    Cheers,
    Karl-Uwe



    The point I've been trying to make about seeding is not that there is
    this one initial state (all zeros) that's a problem. A more thorough analysis would try to characterize the degenerate states, estimate
    what fraction of the state space they occupy, etc, and how the output
    is affected in those states.

    Bob H


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to All on Fri Oct 16 08:58:20 2015
    But in heavy contrast, the test you made by just "tipping the iceberg" revealed a grave design error, because the counter running very soon
    into a loop.

    I don't know if it is a "grave" error. The two main loops had period ≈112K and something like 99% of the time these were the loops it got to. 112K is about 17 bits, so in some sense you could say your ctr update function will behave like a 17 bit
    counter 99% of the time (i.e. for 99% of randomly chosen initial values).

    The other loops were shorter-- about 7K and 5K I think. There might be other shorter loops, but in 1000 attempts (1000 randomly chosen initial values) I did not get into any other loops. You could work out algebraically whether there are any period-1
    loops, maybe even for period-2 loops.

    One thing missing from my analysis is whether those loops, even the short ones, cause a problem for the rest of your generator. They might not.

    This is in fact really fascinating. Perhaps you are using some new
    analysis techniques which can reveal design mistakes that TestU01 and
    other test batteries don't recognise as such.

    The analysis isn't a new technique. I think you are only applying the test batteries to the output, while I'm looking at certain components of your generator.

    So my question now is, how did you manage to find the looping counter in bcd32_ctr?

    Floyd's cycle-finding algorithm. https://en.wikipedia.org/wiki/Cycle_detection. From initial state s, do
    x = y = s
    repeat { x = f(x); y=f(f(x)); } until (x==y)
    at this point x is guaranteed to be in a cycle and you can just run over the whole cycle again
    y = x; n = 0;
    repeat { y = f(y); n = n+1; } until (y==x)
    to find the cycle length. f() is your ctr update function.

    I did that from some number of starting states. As I counted the loop length I also kept track of the smallest value in the loop, and for each different loop I recorded the length and smallest value.

    So one thing you might try is to do something like that to identify loops in your ctr. Then for each different loop, do some number of tests that start with ctr in that loop and run the output of your generator through the test battery to see if any ctr
    loop induces bias in your output.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Mon Oct 19 21:41:56 2015
    On 16.10.15 17:58, rsharris.bx.psu.edu@gmail.com wrote:

    So my question now is, how did you manage to find the looping
    counter in bcd32_ctr?

    Floyd's cycle-finding algorithm. https://en.wikipedia.org/wiki/Cycle_detection.
    So one thing you might try is to do something like that to identify
    loops in your ctr. Then for each different loop, do some number of
    tests that start with ctr in that loop and run the output of your
    generator through the test battery to see if any ctr loop induces
    bias in your output.

    Bob H

    Thanks for sharing. I think I will not use a counter currently but the
    function might also help finding potential problem with a,b,c or d.

    Furthermore it my perhaps help estimating the period length of the PRNG.

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Tue Oct 20 11:41:19 2015
    On Monday, October 19, 2015 at 3:41:58 PM UTC-4, Karl-Uwe Frank wrote:
    Thanks for sharing. I think I will not use a counter currently but the function might also help finding potential problem with a,b,c or d.

    For a large state space you may want to stop the cycle-detector's iteration after some sensible value, e.g. corresponding to a minute.

    If it were me, I'd try it on your a,d,t update function. I already identified a couple of short loops in that, but there might be others.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Wed Oct 21 09:58:02 2015
    On 20.10.15 20:41, I wrote:
    If it were me, I'd try it on your a,d,t update function.

    and On Wednesday, October 21, 2015 at 4:05:37 AM UTC-4, Karl-Uwe Frank replied:
    One additional question: did you initialise a,d,t through the seeding function or did you set them to arbitrary selected values directly?

    "Did I"? I didn't do anything.

    Any possible combination of a,d,t can be achieved by *some* call to the seeding function (and with equal probability), so it wouldn't make any difference.

    Also, since your a,d,t update function is invertible, cycle-detection can be simplified. The first loop is unnecessary because every state is part of a cycle.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Wed Oct 21 01:06:52 2015
    On 20.10.15 20:41, rsharris.bx.psu.edu@gmail.com wrote:
    On Monday, October 19, 2015 at 3:41:58 PM UTC-4, Karl-Uwe Frank
    wrote:
    Thanks for sharing. I think I will not use a counter currently but
    the function might also help finding potential problem with a,b,c
    or d.

    For a large state space you may want to stop the cycle-detector's
    iteration after some sensible value, e.g. corresponding to a minute.

    If it were me, I'd try it on your a,d,t update function. I already identified a couple of short loops in that, but there might be
    others.

    Bob H

    Many thanks for testing and your advice, I will try to verify that -
    even though it sounds not that promising.

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Wed Nov 4 05:56:19 2015
    On Tuesday, November 3, 2015 at 10:25:55 AM UTC-5, Karl-Uwe Frank wrote:
    Did you find a short loop inside of 300,000,000, 1,500,000,000 or 4,294,967,295 outputs or on a higher volume?

    Very short loops.

    Did you just test one of the three vars (a,d,t) when searching for a
    loop or a combination of all three at once?

    The state space is the combination of all three at once. So I naturally was looking for loops in the combination of all three at once.

    Perhaps you can name me just one initial setup for a,b,c,d which produce such a short loop you found.

    I previously posted this in your original thread, on 6 October. What follows is just more detail.

    (a,d,t) will repeat with period 2 if a=d=0 and t<32. Hence one initial setup for a,b,c,d which produces such a short loop is a=0 b=0xB5A52342 c=0x4A5ADCD1 d=0. This gives the initial state a=d=0, t=0x00000013.

    When a=d=0 and t<32, it is easy to see that the next state has a and t unchanged while d takes the initial value of t. The following state just reverses that. So d just flip flops between two values while a and t are stuck.

    Obviously a=0, d=t<32 is in the same loops.

    Similarly, if we start with a=0, d<32 and t<32, we'll also be stuck in a period-2 loop, as d bounces back and forth between two values.

    As also indicated in earlier discussions, when a remains zero the c,d state just loops with period 4. I can't say for certain that is true for all initial values of b,c (when a is zero) but it has been true for all b,c that I have tried.

    If that last conjecture is true, all of these states -- something like 2^10 a,d,t states and 2^42 initial a,b,c,d settings -- will produce repeated output (of the entire a,b,c,d,t machine) every 4th value.

    The reason for all this, of course, is that if a gets stuck at zero t doesn't change, and on successive calls d gets xored by t then gets xored by t again so it goes back to its initial state. And the least significant five bits of d have no effect on
    future state -- d>>5 throws them away -- which is why a stays at zero.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Wed Nov 4 06:04:42 2015
    On Tuesday, November 3, 2015 at 10:25:55 AM UTC-5, Karl-Uwe Frank wrote:
    I need to get back on this issue because I would like to ask you for a
    more specific explanation on how you found those loops.

    I forgot to answer that part in my recent post.

    I found those pretty quickly after your original post a month ago. I looked at how a single bit might disperse on successive calls, so I tried initial states that were all zero except for one bit. I think the first or second thing I tried revealed a
    period-2 loop in a,d,t. (from one of my early posts: "IIRC, I tried a=b=c=d=0 and t=1, and it did something interesting"). Then I looked more closely at your update function and realized why and what other conditions would produce the same kind of loop.

    There are probably other loops, the question would be how long are they.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Wed Oct 21 10:05:36 2015
    On 20.10.15 20:41, rsharris.bx.psu.edu@gmail.com wrote:

    If it were me, I'd try it on your a,d,t update function. I already identified a couple of short loops in that, but there might be
    others.

    One additional question: did you initialise a,d,t through the seeding
    function or did you set them to arbitrary selected values directly?



    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Tue Nov 3 16:25:53 2015
    On 20.10.15 20:41, rsharris.bx.psu.edu@gmail.com wrote:
    On Monday, October 19, 2015 at 3:41:58 PM UTC-4, Karl-Uwe Frank
    wrote:
    Thanks for sharing. I think I will not use a counter currently but
    the function might also help finding potential problem with a,b,c
    or d.

    For a large state space you may want to stop the cycle-detector's
    iteration after some sensible value, e.g. corresponding to a minute.

    If it were me, I'd try it on your a,d,t update function. I already identified a couple of short loops in that, but there might be
    others.

    Bob H

    I need to get back on this issue because I would like to ask you for a
    more specific explanation on how you found those loops.

    Did you find a short loop inside of 300,000,000, 1,500,000,000 or
    4,294,967,295 outputs or on a higher volume?

    Did you just test one of the three vars (a,d,t) when searching for a
    loop or a combination of all three at once?

    Currently I have tested several thousand seeds generating a bcd32 output
    of minimum 100,000,000 and 300,000,000 cycles for each seed, but can't
    find any loop (periodic recurrence) for "a" (currently just searching
    for "a" only).

    There are recurrences of the search value of "a", but there is no loop
    as far as I have tested.

    The routine I use checks first for any recurrence of "a" in the given
    range (100,000,000 or 300,000,000 for example) and if a recurrence is
    found it extend the test length by adding 4294967295 plus eight times
    the first recurrence distance.

    The source code is over here as well as some test results

    http://www.freecx.co.uk/bcd32/loop_finder/

    Whenever a recurrence is found, the resulting distance values as well as
    the initial seed values for a,b,c,d,t are written to the test result
    file accordingly. What my function find and record may look like this


    a=0xf6a0195d; b=0x771e9a7c; c=0x7a307ffc; d=0x0622b810; t=0xee11ebe5;

    Testing for a=f6a0195d

    n d t a b c | output | distance
    228869896) 726fe22d 94b63b22 f6a0195d 5181dfdd 329bb95d | 117584ad |
    228869896
    817026155) 95adc37d 4f6596c4 f6a0195d 1ad9afdd 2c9bb95d | a3efd5fd |
    588156259
    1528911457) 557b491d adff40ba f6a0195d c86425dd 7b5bb95d | e644d59d |
    711885302
    2266774020) 2f35ec2c c35f7723 f6a0195d eb60dbdd 121bb95d | d64e8eac |
    737862563


    With an initial search range of 300,000,000 more recurrences are found
    than with 100,000,000, but still no loop.

    Perhaps you can name me just one initial setup for a,b,c,d which produce
    such a short loop you found.

    Cheers,
    Karl-Uwe





    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Thu Nov 5 21:55:06 2015
    On 04.11.15 17:39, rsharris.bx.psu.edu@gmail.com wrote:

    And when it sees the same interval, for a, twice in a row, it would
    make the unsubstantiated claim that there *is* a loop, without
    bothering to check whether the other variables have repeated.

    By the way, if a variable has a recurrence with the same distance more
    than twice it must considered a potential loop and the initial seed
    combination should be verified more deeply. The function currently bail
    out if a potential loop is found, but could be easily modified to extend
    the search. But I will more likely extend the whole search algorithm as
    you have recommended in an earlier post.

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Wed Nov 4 08:39:08 2015
    On Tuesday, November 3, 2015 at 10:25:55 AM UTC-5, Karl-Uwe Frank wrote:
    The routine I use checks first for any recurrence of "a" in the given
    range (100,000,000 or 300,000,000 for example) and if a recurrence is
    found it extend the test length by adding 4294967295 plus eight times
    the first recurrence distance.

    The source code is over here as well as some test results

    http://www.freecx.co.uk/bcd32/loop_finder/

    Looking at your source code, if I understand it correctly, I think it won't report most loops even if it is in one. And what loops it does report probably aren't loops.

    Consider this hypothetical loop, which repeats with period 4006.

    n d t a
    0000 11111111 22222222 33333333
    1000 00000000 22222222 33333333
    2001 11111111 00000000 33333333
    3003 11111111 22222222 00000000
    4006 11111111 22222222 33333333

    Your program, looking at the intervals between re-occurrences of a, would see intervals of 1000, 1001, and 2005. These intervals would repeat over and over and the program, not seeing the same interval twice in a row, would make the unsubstantiated claim
    that there is no loop.

    The only loop I think it will report is one where the value of a only occurs once within the loop.

    The same would be true if you were looking at re-occurrences of d or t instead of a, obviously.

    And when it sees the same interval, for a, twice in a row, it would make the unsubstantiated claim that there *is* a loop, without bothering to check whether the other variables have repeated.

    I suppose there could be some way to work that into a valid loop detector, but it would be much simpler to just check whether the whole state repeats (a,d,t or a,b,c,d,t depending on viewpoint). The cycle detection algorithm I posted earlier is simple to
    implement, easily proved correct, and (possibly with variations) widely used.

    But since your a,d,t operation is invertible you can simply just run until you see your initial a,d,t state again. You are 100% guaranteed to get back to the initial state (a,d,t not necessarily a,b,c,d,t). The only question is how soon.

    On an unrelated topic; from the seeding part of that program:
    if ((a + b + c + d) == 0) raise exception
    a = a ^ 0xffffffff;
    t = a + b + c + d;

    The test for the exception is in the wrong place. And is incomplete. The test does not prevent the initial state of a=b=c=d=t=0, because you modify a after making the test.

    And it doesn't matter what b and c are. The initial state to avoid would be a=0, d<32, t<32, regardless of b and c, because you'll end up in a loop with period 4.

    The simplest correction may be to just do away with the test/exception and set a to some value you know is good. In other words, take the 128 bits of psuedo-randomness you're currently sticking into a,b,c,d and instead stick it into b,c,d,t and set a=
    0x15BADB04 (or some other non-zero value). This assumes states with a=15BADB04 aren't in short loops, of course, which is something you'd want to test/verify.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Thu Nov 5 20:44:06 2015
    On 04.11.15 17:39, rsharris.bx.psu.edu@gmail.com wrote:
    On Tuesday, November 3, 2015 at 10:25:55 AM UTC-5, Karl-Uwe Frank
    wrote:
    The routine I use checks first for any recurrence of "a" in the
    given range (100,000,000 or 300,000,000 for example) and if a
    recurrence is found it extend the test length by adding 4294967295
    plus eight times the first recurrence distance.

    The source code is over here as well as some test results

    http://www.freecx.co.uk/bcd32/loop_finder/

    Looking at your source code, if I understand it correctly, I think it
    won't report most loops even if it is in one. And what loops it does
    report probably aren't loops.

    Consider this hypothetical loop, which repeats with period 4006.

    n d t a
    0000 11111111 22222222 33333333
    1000 00000000 22222222 33333333
    2001 11111111 00000000 33333333
    3003 11111111 22222222 00000000
    4006 11111111 22222222 33333333

    Your program, looking at the intervals between re-occurrences of a,
    would see intervals of 1000, 1001, and 2005. These intervals would
    repeat over and over and the program, not seeing the same interval
    twice in a row, would make the unsubstantiated claim that there is no
    loop.

    The only loop I think it will report is one where the value of a only
    occurs once within the loop.

    The same would be true if you were looking at re-occurrences of d or
    t instead of a, obviously.

    Yes that's true, my function is currently only checking for a loop in
    one defined variable, in the current case "a". In your hypothetical
    example there is no obvious simple direct distant loop of "a". I have
    placed a "force a loop" example in the source to demonstrate what I
    consider a loop that should be found. On the other hand, I've used this function to find loops for the version bcd32_ctr and it found a lot
    loops in the counter variable (ctr).

    Perhaps I need to extend the function in keeping track of more than the
    simple direct distance between the last and the current re-occurrence of
    the defined test variable in order to find those potential more distant
    loops, as described in your example.


    And when it sees the same interval, for a, twice in a row, it would
    make the unsubstantiated claim that there *is* a loop, without
    bothering to check whether the other variables have repeated.

    Of course there is a loop in large parts of the internal state in your hypothetical example which my current function definitively not detect.


    I suppose there could be some way to work that into a valid loop
    detector, but it would be much simpler to just check whether the
    whole state repeats (a,d,t or a,b,c,d,t depending on viewpoint). The
    cycle detection algorithm I posted earlier is simple to implement,
    easily proved correct, and (possibly with variations) widely used.

    A whole or partly state loop detection might really be more useful, I
    will try to extend my function that way.


    But since your a,d,t operation is invertible you can simply just run
    until you see your initial a,d,t state again. You are 100% guaranteed
    to get back to the initial state (a,d,t not necessarily a,b,c,d,t).
    The only question is how soon.

    That might be very interesting and somehow easy to implement.


    On an unrelated topic; from the seeding part of that program:
    if ((a + b + c + d) == 0) raise exception
    a = a ^ 0xffffffff;
    t = a + b + c + d;

    The test for the exception is in the wrong place. And is incomplete.
    The test does not prevent the initial state of a=b=c=d=t=0, because
    you modify a after making the test.

    Oh, huh, you're absolutly right! This test is useless :-(


    And it doesn't matter what b and c are. The initial state to avoid
    would be a=0, d<32, t<32, regardless of b and c, because you'll end
    up in a loop with period 4.

    The simplest correction may be to just do away with the
    test/exception and set a to some value you know is good. In other
    words, take the 128 bits of psuedo-randomness you're currently
    sticking into a,b,c,d and instead stick it into b,c,d,t and set
    a=0x15BADB04 (or some other non-zero value). This assumes states with a=15BADB04 aren't in short loops, of course, which is something you'd
    want to test/verify.

    Bob H

    This idea of yours is most brilliant as it is simple and effective!

    Never thought of such an easy solution without reducing the initial
    entropy of 128bit. This way there is no need for the unpleasant "if" and
    "xor a with some value" in the seeding function. Also it's possible to
    omit the "t=a+b+c+d" calculation.

    I'm currently running tests for random quality using your proposed
    change of the seeding function as well as you initial value a=0x15BADB04.

    In comparison I'm testing against a=0x9E3779B9 - lets wait and see, but
    so far the results are very promising in general.

    Cheers,
    Karl-Uwe



    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Fri Nov 6 08:31:57 2015
    On Thursday, November 5, 2015 at 2:44:08 PM UTC-5, Karl-Uwe Frank wrote:
    I've used this function to find loops for the version bcd32_ctr and it
    found a lot loops in the counter variable (ctr).

    And missed many.

    Your ctr variable does not depend on any other variables. So when you see a re-occurence of a value, you are in a cycle. You will see the same interval between re-occurences of that value, every time. That eliminates the problem of your program falsely
    reporting a loop, in this special case where the one variable you are looking at doesn't depend on any other state variables.

    But your program will miss loops in ctr because it is only testing for a re-occurence of the initial value. Since your ctr update is not invertible, ctr can (and does) go through non-cyclic values before it enters a cycle. Unless your program happens to
    start within a cycle, it won't report one.

    For example, consider what happens when you start with ctr=DD56D3A3. On the next step (n=2) it enters a cycle that has period 250. Of course, you can't know that until you see n=252.

    n ctr
    1 DD56D3A3
    2 7901AE18
    …
    251 88017E4E
    252 7901AE18

    Your program is never going to see DD56D3A3 again, so it's not going to report a loop, even though it spent all but one step firmly entrenched in a cycle.

    Many other initial values outside the cycle -- at least 1900 but probably many many more -- will get you into that same short loop. Even if those 2150 (1900 outside, 250 inside) are the only ones that get into that cycle, you only have a ≈12% chance
    of realizing that you are in that loop when you are spinning your wheels in it.

    That assumes I implemented your ctr update correctly -- [(ctr+1) rotated left 29] + (ctr+1). But even if I got it wrong, the same logic will be true for whatever the update function is, as long as it can't be inverted.

    In your hypothetical example there is no obvious simple direct distant
    loop of "a".

    I disagree. The whole d,t,a state repeats every 4006 steps. Obviously "a" will also repeat every 4006 steps.

    If you would like a non-hypothetical example, make a PRNG that has two single bit state variables called a and b. The update is temp=b; b=a; a=(a+temp)&1. This will generate the fibonacci sequence modulo 2 (assuming you don't start with a=b=0) -- the
    sequence will be 0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1 … and obviously has a period of 3 (this is true for both a and b). Your program will detect the loop only if it sees 0 as the initial value.

    … if a variable has a recurrence with the same distance more than twice it must considered a potential loop and the initial seed combination should
    be verified more deeply. The function currently bail out if a potential loop is found ...

    The confusion was mine. I misinterpreted what the program means when it announces LOOP FOUND as it "bails out".

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Sat Nov 7 03:53:28 2015
    On 07.11.15 00:31, rsharris.bx.psu.edu@gmail.com wrote:
    On Friday, November 6, 2015 at 5:00:21 PM UTC-5, Karl-Uwe Frank wrote:
    For example, consider what happens when you start with ctr=DD56D3A3.
    On the next step (n=2) it enters a cycle that has period 250. Of
    course, you can't know that until you see n=252.

    n ctr
    1 DD56D3A3
    2 7901AE18
    ...
    251 88017E4E
    252 7901AE18

    Your program is never going to see DD56D3A3 again, so it's not going
    to report a loop, even though it spent all but one step firmly
    entrenched in a cycle.

    Well, would your recommended cycle-detector find any loop?

    Yes! That's how I found the example. My actual starting value was something like 1200 steps before the one listed there as n=1. As I posted earlier, it is simple to implement and easy to prove correct. Hardly takes any memory. Runs about 1/3 the
    rate of the generator by itself. It's well enough known that it has a wikipedia page.

    I used that cycle detector again today, to look at your ctr function. I realize you are no longer using that, but it's a nice example of a function that can run for a while before it enters a loop.

    I tried 10,000 random starting values and let it run until it got into a cycle. The cycles it got into are shown below, along with their cycle length.

    02F6FD77 period 250
    007BA0F7 period 853
    008A1BDA period 853
    008C3FC9 period 1812
    0006B6E1 period 2244
    000F5672 period 3069
    0024B858 period 3069
    000B95A9 period 4480
    00061FEE period 9730
    001A84BD period 9730
    00011A83 period 29566

    Among the things that stood out are the cycles that have the same period. This suggests they are isomorphic. So looked for a reason. To make a long story short it looks like your ctr update function, which I will write as f(x) = rotl(x+1,29) + x+1 is
    such that if x+y == 0xFFFFFFFD then there is a much-better-than-expected chance that f(x)+f(y) == 0xFFFFFFFD. So for example the two period-3069 cycles are essentially the same, the value in one just being the negative of the value in the other, offset
    by 5.

    I also found a fixed point, using algebra. f(0xFFFFFFFE) = 0xFFFFFFFE. Fortunately there is no other x s.t. f(x) = 0xFFFFFFFE. So unless you start in that state, you won't enter it. But if you start in it you won't leave it.

    It should be possible to to work out by hand whether there are any period-2 loops and whether they can be entered.

    Bob H

    It really works well as you explained in your example. Interesting
    though the loop value 29566 which I have found in every test so far.

    And yes, I have omitted the counter (ctr) since the first mentioning of
    those loops you found. Clearly the counter was a destructive addition.

    Meanwhile I have run several thousand tests for randomness quality using
    your proposed change of the seeding function


    // The Seeding Function
    void seed_bcd32(uint32_t seed) {
    srand( seed );

    a = 0x15BADB04;
    b = <some arbirary value>;
    c = <some arbirary value>;
    d = <some arbirary value>;
    t = <some arbirary value>;
    }

    // The PRNG Function
    uint32_t bcd32() {
    a = a + (d >> 5);
    b = a + (b ^ c);
    c = a + (b << 13);
    d = a + (d ^ t);
    t = a + t;

    return (b ^ c ^ d);
    }


    And the results are extremely good. Interesting in this case, that using a=0x9E3779B9 given the same seeds produces more failures - not that they
    are grave, but they appear more often. It seems that you have chosen the
    value a=0x15BADB04 very carefully.


    Please allow me to summarise

    1) changing the seeding function as you proposed does produce the same
    good quality pseudo-randomness - as expected

    2) the change of the seeding function eliminates the deadly
    initialisation of "a" with zero

    3) omitting the counter does eliminate another potential weakness

    4) if any state variable (a,b,c,d,t) reaching zero it does not weaken
    the algorithm because they do not stay at zero (in fact every state var
    will reach zero from time to time)


    Question remaining

    1) are there any "deadly" periodic recurrences of the internal state
    that will force the whole output into a loop?

    2) does the 100% recurrence of the initial state for (a,d,t) affect the
    output quality in a negative way?

    3) will the cycle detector function you are using find loops in one of
    the vars (a,d,t)?


    Meanwhile I have found that "The gap test is used to determine the
    significance of the interval between recurrence of the same digit."

    http://www.eg.bucknell.edu/~xmeng/Course/CS6337/Note/master/node46.html

    Clearly it is not the same as looking for a loop of parts of the
    internal state, but it measures the output and my algorithm survive the
    gap test perfectly.

    Perhaps it might be worth a try in passing the internal state vars
    (a,d,t) somehow to a gap test - or a modified version of the loop
    detector you're using.

    Cheers,
    Karl-Uwe








    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Fri Nov 6 23:00:19 2015
    On 06.11.15 17:31, rsharris.bx.psu.edu@gmail.com wrote:
    On Thursday, November 5, 2015 at 2:44:08 PM UTC-5, Karl-Uwe Frank
    wrote:
    I've used this function to find loops for the version bcd32_ctr and
    it found a lot loops in the counter variable (ctr).

    And missed many.

    Yes, that's clearly the case due to the nature on just only checking for
    a re-occurrence of the initial value - as you also mention later on.


    Your ctr variable does not depend on any other variables. So when you
    see a re-occurence of a value, you are in a cycle. You will see the
    same interval between re-occurences of that value, every time. That eliminates the problem of your program falsely reporting a loop, in
    this special case where the one variable you are looking at doesn't
    depend on any other state variables.

    But your program will miss loops in ctr because it is only testing
    for a re-occurence of the initial value. Since your ctr update is not invertible, ctr can (and does) go through non-cyclic values before it
    enters a cycle. Unless your program happens to start within a cycle,
    it won't report one.

    Sorry for my mistake, but I forgot that the loop finder of the
    ctr-version was slightly different. Just uploaded the source including
    some examples of found loops. And it seems that it will always find a
    loop of the ctr, sometimes it take a whole 4294967295 outputs.

    http://www.freecx.co.uk/bcd32/loop_finder/bcd32_ctr_check_loop.c


    For example, consider what happens when you start with ctr=DD56D3A3.
    On the next step (n=2) it enters a cycle that has period 250. Of
    course, you can't know that until you see n=252.

    n ctr
    1 DD56D3A3
    2 7901AE18
    …
    251 88017E4E
    252 7901AE18

    Your program is never going to see DD56D3A3 again, so it's not going
    to report a loop, even though it spent all but one step firmly
    entrenched in a cycle.

    Many other initial values outside the cycle -- at least 1900 but
    probably many many more -- will get you into that same short loop.
    Even if those 2150 (1900 outside, 250 inside) are the only ones that
    get into that cycle, you only have a ≈12% chance of realizing that
    you are in that loop when you are spinning your wheels in it.

    That assumes I implemented your ctr update correctly -- [(ctr+1)
    rotated left 29] + (ctr+1). But even if I got it wrong, the same
    logic will be true for whatever the update function is, as long as it
    can't be inverted.

    Well, would your recommended cycle-detector find any loop? If yes it
    might be worth that I modify my function acordingly. Clearly I need to
    improve it notheless if I can spare some time.


    In your hypothetical example there is no obvious simple direct
    distant loop of "a".

    I disagree. The whole d,t,a state repeats every 4006 steps. Obviously
    "a" will also repeat every 4006 steps.

    Right, that would my simple function detect.


    If you would like a non-hypothetical example, make a PRNG that has
    two single bit state variables called a and b. The update is temp=b;
    b=a; a=(a+temp)&1. This will generate the fibonacci sequence modulo
    2 (assuming you don't start with a=b=0) -- the sequence will be 0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1 … and obviously has a period of 3
    (this is true for both a and b). Your program will detect the loop
    only if it sees 0 as the initial value.

    … if a variable has a recurrence with the same distance more than
    twice it must considered a potential loop and the initial seed
    combination should be verified more deeply. The function currently
    bail out if a potential loop is found ...

    The confusion was mine. I misinterpreted what the program means when
    it announces LOOP FOUND as it "bails out".

    Sorry, should read "Potential Loop found", I've uploaded a updated version.

    Bob H

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Sat Nov 7 04:13:00 2015
    On Friday, November 6, 2015 at 9:53:30 PM UTC-5, Karl-Uwe Frank wrote:
    Interesting though the loop value 29566 which I have found in every test so far.

    There is a period-29566 cycle which most randomly chosen starting points will end up in. It's not the only cycle though. I think if you have run more than 100 tests and they all ended up in that cycle, something is wrong.

    It seems that you have chosen the value a=0x15BADB04 very carefully.

    I didn't. It is just my favorite constant because it looks is "is bad boy".

    2) does the 100% recurrence of the initial state for (a,d,t) affect the output quality in a negative way?

    This is a consequence of the a,d,t update being invertible. For PRNG this is often considered a desirable property.

    3) will the cycle detector function you are using find loops in one of the vars (a,d,t)?

    I have no idea what that means. Are you thinking that, for example, a will repeat with one period while d and t repeat with a longer period? I guess that is what happens when a=0, d<32, t<32, since a and t repeat with period=1 and d repeats with period
    2. I don't know if Floyd's cycle detector can be modified to look for that.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to Karl-Uwe Frank on Fri Nov 6 15:31:57 2015
    On Friday, November 6, 2015 at 5:00:21 PM UTC-5, Karl-Uwe Frank wrote:
    For example, consider what happens when you start with ctr=DD56D3A3.
    On the next step (n=2) it enters a cycle that has period 250. Of
    course, you can't know that until you see n=252.

    n ctr
    1 DD56D3A3
    2 7901AE18
    ...
    251 88017E4E
    252 7901AE18

    Your program is never going to see DD56D3A3 again, so it's not going
    to report a loop, even though it spent all but one step firmly
    entrenched in a cycle.

    Well, would your recommended cycle-detector find any loop?

    Yes! That's how I found the example. My actual starting value was something like 1200 steps before the one listed there as n=1. As I posted earlier, it is simple to implement and easy to prove correct. Hardly takes any memory. Runs about 1/3 the rate
    of the generator by itself. It's well enough known that it has a wikipedia page.

    I used that cycle detector again today, to look at your ctr function. I realize you are no longer using that, but it's a nice example of a function that can run for a while before it enters a loop.

    I tried 10,000 random starting values and let it run until it got into a cycle. The cycles it got into are shown below, along with their cycle length.

    02F6FD77 period 250
    007BA0F7 period 853
    008A1BDA period 853
    008C3FC9 period 1812
    0006B6E1 period 2244
    000F5672 period 3069
    0024B858 period 3069
    000B95A9 period 4480
    00061FEE period 9730
    001A84BD period 9730
    00011A83 period 29566

    Among the things that stood out are the cycles that have the same period. This suggests they are isomorphic. So looked for a reason. To make a long story short it looks like your ctr update function, which I will write as f(x) = rotl(x+1,29) + x+1 is
    such that if x+y == 0xFFFFFFFD then there is a much-better-than-expected chance that f(x)+f(y) == 0xFFFFFFFD. So for example the two period-3069 cycles are essentially the same, the value in one just being the negative of the value in the other, offset
    by 5.

    I also found a fixed point, using algebra. f(0xFFFFFFFE) = 0xFFFFFFFE. Fortunately there is no other x s.t. f(x) = 0xFFFFFFFE. So unless you start in that state, you won't enter it. But if you start in it you won't leave it.

    It should be possible to to work out by hand whether there are any period-2 loops and whether they can be entered.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From rsharris.bx.psu.edu@gmail.com@21:1/5 to All on Sat Nov 7 14:18:51 2015
    There is a period-29566 cycle which most randomly chosen starting
    points will end up in. It's not the only cycle though. I think if
    you have run more than 100 tests and they all ended up in that cycle,
    something is wrong.

    Not that much but the results mostly look like this one

    4294967295) 2021ee5f 85873670 0a11d53c 4fff3c6a f19f153c | 5cbbdc5f |
    4294967295
    4294996861) 78cf9471 ce40f024 d12d7ef9 8cf873cf dfa75ef9 | 5cbbdc5f |
    29566
    4295026427) 58aba37a 2181b9fd c7e035e9 240a2e43 0da895e9 | 5cbbdc5f |
    29566
    4295055993) fce64fe5 534ebea6 48fe8951 0d6e245b 0d89e951 | 5cbbdc5f |
    29566

    Does the first column mean you ran 4 billion steps of your generator every time?

    The cycle detection should run pretty fast since you're working in C. My implementation is pure python, which I expect to be at least a factor of 4 slower than C. Yet, I ran the test from 10 thousand randomly chosen starting points and it took only a
    few hours. But I stopped, each time, when it detected a cycle.

    Of those 10K, about 80% ended up in the period-29566 cycle. Four other cycles -- with periods 9730, 9730, 3069 and 3069 -- were each discovered about 400-500 times. Some other rarer ones were encountered.

    There are also other cycles but they were observed more rarely. And a different program looking only for short periods identified periods of 6, 6, 8, 8, 21, 21, 66, and 66 in addition to the period-1 and period-250 mentioned earlier.

    didn't know ... what invertible meant.

    Invertible means the process can theoretically be run in reverse, because for any state s there is only one state s' that you can give your update function and get s.

    That your t,d,a update is invertible is obvious and easily computed, like this:

    t = t - a
    d = (d - a) ^ t
    a = a - (d >> 5)

    That your ctr update is not invertible is obvious because (among other reasons) there are two states, DD56D3A3 and 88017E4E, that produce the same result.

    Sometimes you may be able to prove the update is invertible even though you can't easily write code to run it in reverse.

    Bob H

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Sat Nov 7 22:44:17 2015
    On 07.11.15 13:13, rsharris.bx.psu.edu@gmail.com wrote:
    On Friday, November 6, 2015 at 9:53:30 PM UTC-5, Karl-Uwe Frank
    wrote:
    Interesting though the loop value 29566 which I have found in every
    test so far.

    There is a period-29566 cycle which most randomly chosen starting
    points will end up in. It's not the only cycle though. I think if
    you have run more than 100 tests and they all ended up in that cycle, something is wrong.

    Not that much but the results mostly look like this one

    4294967295) 2021ee5f 85873670 0a11d53c 4fff3c6a f19f153c | 5cbbdc5f |
    4294967295
    4294996861) 78cf9471 ce40f024 d12d7ef9 8cf873cf dfa75ef9 | 5cbbdc5f |
    29566
    4295026427) 58aba37a 2181b9fd c7e035e9 240a2e43 0da895e9 | 5cbbdc5f |
    29566
    4295055993) fce64fe5 534ebea6 48fe8951 0d6e245b 0d89e951 | 5cbbdc5f |
    29566


    It seems that you have chosen the value a=0x15BADB04 very
    carefully.

    I didn't. It is just my favorite constant because it looks is "is bad
    boy".

    The simplest correction may be to just do away with the
    test/exception and ==> set a to some value you know is good <== ...
    and set a=0x15BADB04

    Ha, I really like that :-))


    2) does the 100% recurrence of the initial state for (a,d,t) affect
    the output quality in a negative way?

    This is a consequence of the a,d,t update being invertible. For PRNG
    this is often considered a desirable property.

    Okay, that's a good signed then. Really didn't know about this, not even
    what invertible meant.


    3) will the cycle detector function you are using find loops in one
    of the vars (a,d,t)?

    I have no idea what that means. Are you thinking that, for example,
    a will repeat with one period while d and t repeat with a longer
    period? I guess that is what happens when a=0, d<32, t<32, since a
    and t repeat with period=1 and d repeats with period 2. I don't know
    if Floyd's cycle detector can be modified to look for that.

    Due to the new seeding function a=b=c=d=0 will never occur from now on,
    thanks to your correction.

    I'm only undecided if a loop of a,d or t will harm the output quality.
    My conjecture is however that it will not have any negative affect,
    since even the version with the heavy looping ctr survive the very
    stringed tests of TestU01. And the chances are low for a=0, d<32, t<32,
    I suppose now.

    Bob H

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl.Frank@Freecx.co.uk@21:1/5 to Karl-Uwe Frank on Sun Nov 8 23:02:13 2015
    On 08.11.15 21:39, Karl-Uwe Frank wrote:
    On 08.11.15 18:57, Karl-Uwe Frank wrote:

    This means that the seeding function with

    a = 0x15BADB04;
    b = 0x9E3779B9;
    c = 0;
    d = 0;
    t = a + b + c + d;

    represent the very basic setup.

    This initial setup passed the Crush-Test


    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:14:08.00

    All tests were passed


    Changing the seeding function this way

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = seed[0];
    b = seed[1];
    c = seed[2];
    d = seed[3];

    if (a==0) a = 0x15BADB04;
    if (b==0) b = 0x9E3779B9;

    t = a + b + c + d;
    }

    will keep the known good basic state (which hopefully will survive the BigCrush-Test) *and* allow the passing of an arbitrary seed with 128bit entropy.


    OMG - tired, need to sleep and think with a clear mind - because
    obviously this seeding will not prevent d<32 and a=1 running into a
    "dead" loop ...

    Will

    a = 0x15BADB04;
    b = 0;
    c = 0;
    d = 0;
    t = 0x9E3779B9;

    giving a try with TestU01 tomorow ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to rsharris.bx.psu.edu@gmail.com on Sun Nov 8 18:57:47 2015
    On 07.11.15 23:18, rsharris.bx.psu.edu@gmail.com wrote:
    There is a period-29566 cycle which most randomly chosen
    starting points will end up in. It's not the only cycle though.
    I think if you have run more than 100 tests and they all ended up
    in that cycle, something is wrong.

    Not that much but the results mostly look like this one

    4294967295) 2021ee5f 85873670 0a11d53c 4fff3c6a f19f153c | 5cbbdc5f
    | 4294967295 4294996861) 78cf9471 ce40f024 d12d7ef9 8cf873cf
    dfa75ef9 | 5cbbdc5f | 29566 4295026427) 58aba37a 2181b9fd c7e035e9
    240a2e43 0da895e9 | 5cbbdc5f | 29566 4295055993) fce64fe5 534ebea6
    48fe8951 0d6e245b 0d89e951 | 5cbbdc5f | 29566

    Does the first column mean you ran 4 billion steps of your generator
    every time?

    No not on every test, only the last 100 I did. Before I found other
    short cycles after a few outputs, but mostly due to the nature of my too
    simple loop finder it has to do all 4.2 billion first before finding a
    loop. But the counter version (bcd32_ctr) is history no, just a mindless "panic-fix" - but nonetheless a very good opportunity to learn something
    about how to analyse a PRNG.


    The cycle detection should run pretty fast since you're working in C.
    My implementation is pure python, which I expect to be at least a
    factor of 4 slower than C. Yet, I ran the test from 10 thousand
    randomly chosen starting points and it took only a few hours. But I
    stopped, each time, when it detected a cycle.

    I figured out that Python is more than 10 times slower and pypy 4 times
    slower than C, but for those 4 billion output and tests my computer need
    only round about 20 seconds, so it doesn't really hurt. ;-)


    Of those 10K, about 80% ended up in the period-29566 cycle. Four
    other cycles -- with periods 9730, 9730, 3069 and 3069 -- were each discovered about 400-500 times. Some other rarer ones were
    encountered.

    There are also other cycles but they were observed more rarely. And
    a different program looking only for short periods identified periods
    of 6, 6, 8, 8, 21, 21, 66, and 66 in addition to the period-1 and
    period-250 mentioned earlier.

    Obviously your implementation of Floyd's cycle detector is able to find
    a loop pretty fast.


    didn't know ... what invertible meant.

    Invertible means the process can theoretically be run in reverse,
    because for any state s there is only one state s' that you can give
    your update function and get s.

    That your t,d,a update is invertible is obvious and easily computed,
    like this:

    t = t - a
    d = (d - a) ^ t
    a = a - (d>> 5)

    If I understand it correctly, using this formulae you may be able to
    reverse calculate the initial seeding values once you know how many
    times there was an output and the current values of a,d,t at that
    certain point?


    That your ctr update is not invertible is obvious because (among
    other reasons) there are two states, DD56D3A3 and 88017E4E, that
    produce the same result.

    Sometimes you may be able to prove the update is invertible even
    though you can't easily write code to run it in reverse.

    Thanks for the explanation and the example.


    I like to summarize again what I think I have understood so far

    1) the PRNG function consist of two parts

    a) the invertible keystream generator a,d,t
    b) the non-invertible bit mixer b,c


    2) the initial state of a,d,t does always run into a loop of yet unknown
    length


    Maybe my primitive loop finder (which should be called "initial state
    finder" perhaps) might be useful finding inevitable a,d,t re-occurrence
    because it will be the initial value set. I will try that later tonight
    and let the computer run until it's found.


    Still for me the question remain if there might be a certain combination
    of a,d,t that will force them into a short loop?

    And furthermore, unfortunately it does not pass the Crush-Test of
    TestU01 if we set the initial values to a=0x15BADB04,b=c=d=t=0 for
    example. As it seems the accumulation of a,b,c,d into t within the
    seeding has to be done.

    Therefore it is necessary changing the seeding function like

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = 0x15BADB04;
    b = 0x9E3779B9;
    c = seed[0];
    d = seed[1];
    t = a + b + c + d;
    }

    because otherwise it will not survive under the Crush-Test from TestU01.
    Now the seed can hold a max. of 64 bit only.

    When I run the Crush-Test earlier in the days of first publishing my
    algorithm the values of a,b,c,d were always seeded with proper 32bit
    values all the way and therefore it passes all stringed tests. Clearly I
    was too optimistic in the beginning about seeding the PRNG with any
    arbitrary 128bit value. If only initialising "a" with a fixed value and
    all the others zero the resulting output is not random enough. So at
    least "a" and "b" have to be pre-set with fixed start value, and also
    the accumulator "t" should be build in the seeding function.

    This means that the seeding function with

    a = 0x15BADB04;
    b = 0x9E3779B9;
    c = 0;
    d = 0;
    t = a + b + c + d;

    represent the very basic setup.

    This initial setup passed the Crush-Test


    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:14:08.00

    All tests were passed



    I will run a BigCrush-Test on the basic setup and publish the result
    later as it take more than 8 hours to complete.

    Bob H

    Cheers,
    Karl-Uwe

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to Karl-Uwe Frank on Sun Nov 8 21:39:36 2015
    On 08.11.15 18:57, Karl-Uwe Frank wrote:

    This means that the seeding function with

    a = 0x15BADB04;
    b = 0x9E3779B9;
    c = 0;
    d = 0;
    t = a + b + c + d;

    represent the very basic setup.

    This initial setup passed the Crush-Test


    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:14:08.00

    All tests were passed


    Changing the seeding function this way

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = seed[0];
    b = seed[1];
    c = seed[2];
    d = seed[3];

    if (a==0) a = 0x15BADB04;
    if (b==0) b = 0x9E3779B9;

    t = a + b + c + d;
    }

    will keep the known good basic state (which hopefully will survive the BigCrush-Test) *and* allow the passing of an arbitrary seed with 128bit entropy.



    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to Karl.Frank@Freecx.co.uk on Tue Nov 10 11:31:12 2015
    On 08.11.15 23:02, Karl.Frank@Freecx.co.uk wrote:
    On 08.11.15 21:39, Karl-Uwe Frank wrote:
    On 08.11.15 18:57, Karl-Uwe Frank wrote:

    This means that the seeding function with

    a = 0x15BADB04;
    b = 0x9E3779B9;
    c = 0;
    d = 0;
    t = a + b + c + d;

    represent the very basic setup.

    This initial setup passed the Crush-Test


    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:14:08.00

    All tests were passed


    Changing the seeding function this way

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = seed[0];
    b = seed[1];
    c = seed[2];
    d = seed[3];

    if (a==0) a = 0x15BADB04;
    if (b==0) b = 0x9E3779B9;

    t = a + b + c + d;
    }

    will keep the known good basic state (which hopefully will survive the
    BigCrush-Test) *and* allow the passing of an arbitrary seed with 128bit
    entropy.


    OMG - tired, need to sleep and think with a clear mind - because
    obviously this seeding will not prevent d<32 and a=1 running into a
    "dead" loop ...

    Will

    a = 0x15BADB04;
    b = 0;
    c = 0;
    d = 0;
    t = 0x9E3779B9;

    giving a try with TestU01 tomorow ...

    ...and it produced to many grave failures just in the Crush-Test...

    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:10:06.64
    The following tests gave p-values outside [0.001, 0.9990]:
    (eps means a value < 1.0e-300):
    (eps1 means a value < 1.0e-15):

    Test p-value
    ----------------------------------------------
    6 CollisionOver, t = 4 eps
    17 BirthdaySpacings, t = 8 5.4e-8
    86 HammingIndep, L = 30 4.7e-5
    ----------------------------------------------
    All other tests were passed



    And also this seeding failed the Crush-Test with grave errors

    a = 0x15BADB04;
    b = 0;
    c = 0;
    d = 0;

    t = a + b + c + d;

    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:19:20.69
    The following tests gave p-values outside [0.001, 0.9990]:
    (eps means a value < 1.0e-300):
    (eps1 means a value < 1.0e-15):

    Test p-value
    ----------------------------------------------
    6 CollisionOver, t = 4 eps
    17 BirthdaySpacings, t = 8 7.1e-8
    ----------------------------------------------
    All other tests were passed




    However, handling the seeding function this way

    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = 0x15BADB04;
    b = seed[0] + 0x9E3779B9;
    c = seed[1];
    d = seed[2];

    if (b==0) b = 0x9E3779B9;

    t = a + b + c + d;
    }

    with

    seed[0]=0 // ==> b=0x9E3779B9
    seed[1]=0
    seed[2]=0

    did pass TestU01, even the BigCrush.

    ========= Summary results of BigCrush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 160
    Total CPU time: 08:05:33.38

    All tests were passed

    ------------------------------------------------

    Additionally I've tested with

    seed[0]=0x61C88648 // ==> b=1
    seed[1]=0
    seed[2]=0

    and it passed the Crush-Test and the BigCrush too.

    ========= Summary results of Crush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 144
    Total CPU time: 01:15:20.61

    All tests were passed


    ========= Summary results of BigCrush =========

    Version: TestU01 1.2.3
    Generator: bcd32 32-bit
    Number of statistics: 160
    Total CPU time: 08:35:21.49

    All tests were passed



    Therefore it become evident that at least two of the internal state
    variables a,b,c,d *must* be initialised with some appropriate values in
    order to survive all the stringed tests of the battery TestU01. And t
    has to be the accumulation of a,b,c,d whatever the result may be. A
    fixed value does not work as the Crush and BigCrush test has revealed.

    Now the new seeding function will allow only 96bit of entropy in the max
    being seeded into b,c,d - which by the way fit the name of the PRNG
    nicely :-)

    Cheers,
    Karl-Uwe



    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to Karl-Uwe Frank on Tue Nov 10 12:51:52 2015
    On 10.11.15 11:31, Karl-Uwe Frank wrote:

    ------------------------------------------------

    Additionally I've tested with

    seed[0]=0x61C88648 // ==> b=1
    seed[1]=0
    seed[2]=0

    and it passed the Crush-Test and the BigCrush too.


    Bloody copy+paste error ....

    seed[0]=0x61C88647 // ==> b=1

    of course ...

    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Karl-Uwe Frank@21:1/5 to All on Thu Nov 19 18:03:28 2015
    After a second thought, I agree with Mok-Kong Shen who pointed out in a different forum that adding the constant value to *b* is useless and the
    user should decide which values he like to seed into *b,c,d*. Therefore
    I have have dropped adding the constant value 0x9E3779B9 to *b* and the
    seeding function is sufficient like below


    // The Seeding Function
    void seed_bcd32(uint32_t seed[]) {
    a = 0x15BADB04;
    b = seed[0];
    c = seed[1];
    d = seed[2];

    if (b==0) b = 0x9E3779B9;

    t = a + b + c + d;
    }

    // The PRNG Function
    uint32_t bcd32() {
    a = a + (d >> 5);
    b = a + (b ^ c);
    c = a + (b << 13);
    d = a + (d ^ t);
    t = a + t;

    return (b ^ c ^ d);
    }


    Cheers,
    Karl-Uwe


    --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)