• Re: Dictionary order?

    From Skip Montanaro@21:1/5 to All on Mon Aug 1 15:48:45 2022

    So I decided to write a little test program to run on a variety of
    CPythons, to confirm what I was thinking.

    And instead I got a surprise.

    On 1.4 through 2.1 I got descending key order. I expected the keys to be scattered, but they weren't.

    On 2.2 through 3.5 I got ascending key order. I expected the keys to be scattered, but they weren't.

    On 3.6 through 3.10 I got insertion order, as expected.

    But why are 1.4 through 3.5 ordering so much?


    That's long in the past, but I seem to recall that key order was
    unspecified. That would give the implementer (likely Tim Peters much of the time) the freedom to do whatever worked best for performance or simplicity
    of implementation.

    Skip



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Stromberg@21:1/5 to All on Mon Aug 1 13:41:11 2022
    Hi folks.

    I'm still porting some code from Python 2.7 to 3.10.

    As part of that, I saw a list being extended with a dict.values(), and
    thought perhaps it wasn't ordered as intended on Python 2.7, even though
    the problem would conceivably just disappear on 3.10.

    So I decided to write a little test program to run on a variety of
    CPythons, to confirm what I was thinking.

    And instead I got a surprise.

    On 1.4 through 2.1 I got descending key order. I expected the keys to be scattered, but they weren't.

    On 2.2 through 3.5 I got ascending key order. I expected the keys to be scattered, but they weren't.

    On 3.6 through 3.10 I got insertion order, as expected.

    But why are 1.4 through 3.5 ordering so much? It's like they're a treap or red-black tree or something. I'm pretty sure dict's used to be ordered in
    a mostly-arbitrary way.

    What am I missing?

    Here's the little test program:

    #!/usr/local/cpython-2.7/bin/python2

    import sys

    keys = [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7, 12, 11]

    dict_ = {}
    for key in keys:
    dict_[key] = 1

    if list(dict_.keys()) == keys:
    # The order matches
    print('compact')
    sys.exit(0)
    else:
    # The order does not match
    print('list(dict_): %s, keys: %s' % (list(dict_.keys()), keys))
    sys.exit(1)

    Here's some output (irrelevant python's deleted) when run under https://stromberg.dnsalias.org/~strombrg/pythons/

    /usr/local/cpython-1.4/bin/python (1.4) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-1.5/bin/python (1.5.2) bad list(dict_): [15, 14, 12,
    11, 10, 9, 8, 7, 6, 5, 4, 2, 1], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-1.6/bin/python (1.6.1) bad list(dict_): [15, 14, 12,
    11, 10, 9, 8, 7, 6, 5, 4, 2, 1], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.0/bin/python (2.0.1) bad list(dict_): [15, 14, 12,
    11, 10, 9, 8, 7, 6, 5, 4, 2, 1], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.1/bin/python (2.1.0) bad list(dict_): [15, 14, 12,
    11, 10, 9, 8, 7, 6, 5, 4, 2, 1], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.2/bin/python (2.2.0) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.3/bin/python (2.3.0) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.4/bin/python (2.4.0) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.5/bin/python (2.5.6) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.6/bin/python (2.6.9) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-2.7/bin/python (2.7.16) bad list(dict_): [1, 2, 4, 5,
    6, 7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.0/bin/python (3.0.1) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.1/bin/python (3.1.5) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.2/bin/python (3.2.5) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.3/bin/python (3.3.7) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.4/bin/python (3.4.8) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.5/bin/python (3.5.5) bad list(dict_): [1, 2, 4, 5, 6,
    7, 8, 9, 10, 11, 12, 14, 15], keys: [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7,
    12, 11]
    /usr/local/cpython-3.6/bin/python (3.6.13) good compact /usr/local/cpython-3.7/bin/python (3.7.0) good compact /usr/local/cpython-3.8/bin/python (3.8.0) good compact /usr/local/cpython-3.9/bin/python (3.9.0) good compact /usr/local/cpython-3.10/bin/python (3.10.0) good compact

    BTW, usually with pythons (the script which can be found at the URL above),
    a little test program will be written to exit shell-true for success or shell-false for failure. But in this case I'm using the exit code not as success+failure but as compact+notcompact.

    Why are those keys so ordered?

    Also, I realize that the keys could come up ordered somehow by accident,
    but I tried with 30 values (not just 12), and still got the same
    weirdness. Naturally, as the number of key-value pairs goes up, the chance
    of accidental ordering goes way down.

    Thanks for reading!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to Skip Montanaro on Tue Aug 2 07:09:03 2022
    On Tue, 2 Aug 2022 at 06:50, Skip Montanaro <skip.montanaro@gmail.com> wrote:


    So I decided to write a little test program to run on a variety of CPythons, to confirm what I was thinking.

    And instead I got a surprise.

    On 1.4 through 2.1 I got descending key order. I expected the keys to be scattered, but they weren't.

    On 2.2 through 3.5 I got ascending key order. I expected the keys to be scattered, but they weren't.

    On 3.6 through 3.10 I got insertion order, as expected.

    But why are 1.4 through 3.5 ordering so much?


    That's long in the past, but I seem to recall that key order was
    unspecified. That would give the implementer (likely Tim Peters much of the time) the freedom to do whatever worked best for performance or simplicity
    of implementation.


    One thing that you might notice also is that using strings as keys
    will begin randomizing them with Python 3.3. But other than strings,
    it's always been "arbitrary" rather than "random".

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From 2QdxY4RzWzUUiLuE@potatochowder.com@21:1/5 to Dan Stromberg on Mon Aug 1 16:46:54 2022
    On 2022-08-01 at 13:41:11 -0700,
    Dan Stromberg <drsalists@gmail.com> wrote:

    keys = [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7, 12, 11]

    dict_ = {}
    for key in keys:
    dict_[key] = 1

    $ python
    Python 3.10.5 (main, Jun 6 2022, 18:49:26) [GCC 12.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    [hash(x) for x in range(20)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

    Just sayin'. :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Stromberg@21:1/5 to drsalists@gmail.com on Mon Aug 1 14:24:09 2022
    On Mon, Aug 1, 2022 at 1:41 PM Dan Stromberg <drsalists@gmail.com> wrote:

    On 1.4 through 2.1 I got descending key order. I expected the keys to be scattered, but they weren't.

    I just noticed that 1.4 was ascending order too - so it was closer to 2.2
    than 1.5.

    I guess that's kind of beside the point though - it's still more ordered
    than I'd've expected.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to 2QdxY4RzWzUUiLuE@potatochowder.com on Tue Aug 2 07:50:52 2022
    On Tue, 2 Aug 2022 at 07:48, <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:

    On 2022-08-01 at 13:41:11 -0700,
    Dan Stromberg <drsalists@gmail.com> wrote:

    keys = [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7, 12, 11]

    dict_ = {}
    for key in keys:
    dict_[key] = 1

    $ python
    Python 3.10.5 (main, Jun 6 2022, 18:49:26) [GCC 12.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    [hash(x) for x in range(20)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

    Just sayin'. :-)

    Yes, but I'm pretty sure that's been true for a LONG time. The hashes
    for small integers have been themselves for as long as I can remember.
    But the behaviour of the dictionary, when fed such keys, is what's
    changed.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From 2QdxY4RzWzUUiLuE@potatochowder.com@21:1/5 to Chris Angelico on Mon Aug 1 17:24:17 2022
    On 2022-08-02 at 07:50:52 +1000,
    Chris Angelico <rosuav@gmail.com> wrote:

    On Tue, 2 Aug 2022 at 07:48, <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:

    On 2022-08-01 at 13:41:11 -0700,
    Dan Stromberg <drsalists@gmail.com> wrote:

    keys = [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7, 12, 11]

    dict_ = {}
    for key in keys:
    dict_[key] = 1

    $ python
    Python 3.10.5 (main, Jun 6 2022, 18:49:26) [GCC 12.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    [hash(x) for x in range(20)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

    Just sayin'. :-)

    Yes, but I'm pretty sure that's been true for a LONG time. The hashes
    for small integers have been themselves for as long as I can remember.
    But the behaviour of the dictionary, when fed such keys, is what's
    changed.

    I'm not disputing either of those facts. I'm pointing out that the
    apparently arbitrary order of a mapping's keys becomes obvious when you
    look at the hashes of those keys.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Stromberg@21:1/5 to 2QdxY4RzWzUUiLuE@potatochowder.com on Mon Aug 1 16:42:58 2022
    On Mon, Aug 1, 2022 at 3:25 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:

    On 2022-08-02 at 07:50:52 +1000,
    Chris Angelico <rosuav@gmail.com> wrote:

    On Tue, 2 Aug 2022 at 07:48, <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:

    On 2022-08-01 at 13:41:11 -0700,
    Dan Stromberg <drsalists@gmail.com> wrote:

    keys = [5, 10, 15, 14, 9, 4, 1, 2, 8, 6, 7, 12, 11]

    dict_ = {}
    for key in keys:
    dict_[key] = 1

    $ python
    Python 3.10.5 (main, Jun 6 2022, 18:49:26) [GCC 12.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information. >>> [hash(x) for x in range(20)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

    Just sayin'. :-)

    Yes, but I'm pretty sure that's been true for a LONG time. The hashes
    for small integers have been themselves for as long as I can remember.
    But the behaviour of the dictionary, when fed such keys, is what's
    changed.

    I'm not disputing either of those facts. I'm pointing out that the apparently arbitrary order of a mapping's keys becomes obvious when you
    look at the hashes of those keys.


    It looks like the relationship no longer holds at around keys =
    list(range(250, 260))

    But i == hash(i) holds for the first million values at least.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Stromberg@21:1/5 to drsalists@gmail.com on Mon Aug 1 16:46:51 2022
    On Mon, Aug 1, 2022 at 4:42 PM Dan Stromberg <drsalists@gmail.com> wrote:


    Yes, but I'm pretty sure that's been true for a LONG time. The hashes
    for small integers have been themselves for as long as I can remember.
    But the behaviour of the dictionary, when fed such keys, is what's
    changed.

    I'm not disputing either of those facts. I'm pointing out that the
    apparently arbitrary order of a mapping's keys becomes obvious when you
    look at the hashes of those keys.


    It looks like the relationship no longer holds at around keys = list(range(250, 260))

    But i == hash(i) holds for the first million values at least.


    I could've been more clear. int dict keys stop being stored-in-order at
    near 256.

    But i == hash(i) holds for the first million values, and probably more.

    This suggests to me that there's something more than i == hash(i) going on inside dict's - but it doesn't much matter what it is for my purposes.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Weatherby,Gerard@21:1/5 to All on Tue Aug 2 01:14:43 2022
    SSBkb27igJl0IHNlZSB3aGF0IGlzIHN1cnByaXNpbmcuIFRoZSBpbnRlcmZhY2UgZm9yIGEgZGlj dGlvbmFyeSBuZXZlciBzcGVjaWZpZWQgdGhlIG9yZGVyaW5nIG9mIHRoZSBrZXlzLCBzbyBJIHdv dWxkIG5vdCBiZSBzdXJwcmlzZWQgdG8gc2VlIGl0IHZhcnkgYmFzZWQgb24gcmVsZWFzZSwgcGxh dGZvcm0sIHZhbHVlcyBvZiBrZXlzIGluc2VydGVkLCBudW1iZXIgb2YgaXRlbXMgaW4gdGhlIGRp Y3Rpb25hcnksIGV0Yy4NCg0KDQoNCg0KDQrigJQNCkdlcmFyZCBXZWF0aGVyYnkgfCBBcHBsaWNh dGlvbiBBcmNoaXRlY3QgTk1SYm94IHwgTkFOIHwgRGVwYXJ0bWVudCBvZiBNb2xlY3VsYXIgQmlv bG9neSBhbmQgQmlvcGh5c2ljcw0KIFVDb25uIEhlYWx0aCAyNjMgRmFybWluZ3RvbiBBdmVudWUs IEZhcm1pbmd0b24sIENUIDA2MDMwLTY0MDYgdWNoYy5lZHUNCk9uIEF1ZyAxLCAyMDIyLCA3OjQ4 IFBNIC0wNDAwLCBEYW4gU3Ryb21iZXJnIDxkcnNhbGlzdHNAZ21haWwuY29tPiwgd3JvdGU6DQoq KiogQXR0ZW50aW9uOiBUaGlzIGlzIGFuIGV4dGVybmFsIGVtYWlsLiBVc2UgY2F1dGlvbiByZXNw b25kaW5nLCBvcGVuaW5nIGF0dGFjaG1lbnRzIG9yIGNsaWNraW5nIG9uIGxpbmtzLiAqKioNCg0K T24gTW9uLCBBdWcgMSwgMjAyMiBhdCA0OjQyIFBNIERhbiBTdHJvbWJlcmcgPGRyc2FsaXN0c0Bn bWFpbC5jb20+IHdyb3RlOg0KDQoNClllcywgYnV0IEknbSBwcmV0dHkgc3VyZSB0aGF0J3MgYmVl biB0cnVlIGZvciBhIExPTkcgdGltZS4gVGhlIGhhc2hlcw0KZm9yIHNtYWxsIGludGVnZXJzIGhh dmUgYmVlbiB0aGVtc2VsdmVzIGZvciBhcyBsb25nIGFzIEkgY2FuIHJlbWVtYmVyLg0KQnV0IHRo ZSBiZWhhdmlvdXIgb2YgdGhlIGRpY3Rpb25hcnksIHdoZW4gZmVkIHN1Y2gga2V5cywgaXMgd2hh dCdzDQpjaGFuZ2VkLg0KDQpJJ20gbm90IGRpc3B1dGluZyBlaXRoZXIgb2YgdGhvc2UgZmFjdHMu IEknbSBwb2ludGluZyBvdXQgdGhhdCB0aGUNCmFwcGFyZW50bHkgYXJiaXRyYXJ5IG9yZGVyIG9m IGEgbWFwcGluZydzIGtleXMgYmVjb21lcyBvYnZpb3VzIHdoZW4geW91DQpsb29rIGF0IHRoZSBo YXNoZXMgb2YgdGhvc2Uga2V5cy4NCg0KDQpJdCBsb29rcyBsaWtlIHRoZSByZWxhdGlvbnNoaXAg bm8gbG9uZ2VyIGhvbGRzIGF0IGFyb3VuZCBrZXlzID0NCmxpc3QocmFuZ2UoMjUwLCAyNjApKQ0K DQpCdXQgaSA9PSBoYXNoKGkpIGhvbGRzIGZvciB0aGUgZmlyc3QgbWlsbGlvbiB2YWx1ZXMgYXQg bGVhc3QuDQoNCg0KSSBjb3VsZCd2ZSBiZWVuIG1vcmUgY2xlYXIuIGludCBkaWN0IGtleXMgc3Rv cCBiZWluZyBzdG9yZWQtaW4tb3JkZXIgYXQNCm5lYXIgMjU2Lg0KDQpCdXQgaSA9PSBoYXNoKGkp IGhvbGRzIGZvciB0aGUgZmlyc3QgbWlsbGlvbiB2YWx1ZXMsIGFuZCBwcm9iYWJseSBtb3JlLg0K DQpUaGlzIHN1Z2dlc3RzIHRvIG1lIHRoYXQgdGhlcmUncyBzb21ldGhpbmcgbW9yZSB0aGFuIGkg PT0gaGFzaChpKSBnb2luZyBvbg0KaW5zaWRlIGRpY3QncyAtIGJ1dCBpdCBkb2Vzbid0IG11Y2gg bWF0dGVyIHdoYXQgaXQgaXMgZm9yIG15IHB1cnBvc2VzLg0KLS0NCmh0dHBzOi8vdXJsZGVmZW5z ZS5jb20vdjMvX19odHRwczovL21haWwucHl0aG9uLm9yZy9tYWlsbWFuL2xpc3RpbmZvL3B5dGhv bi1saXN0X187ISFDbl9VWF9wMyFqM3BfQXE1TW9HcURrNVhNc0tiNFNLczNVMW5mdU1PeDB3VmtT YV9oYlVSSjIydzZsUDhOckNPY19QWUFmRUxZT2RWbEM5eDZKekxmSU1JdzVzTGUkDQo=

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Dan Stromberg on Tue Aug 2 10:16:50 2022
    Dan Stromberg <drsalists@gmail.com> writes:
    What am I missing?

    Modern Python Dictionaries A confluence of a dozen great ideas
    - Raymond Hettinger (2017)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)