• random.SystemRandom().randint() inefficient

    From Cecil Westerhof@21:1/5 to All on Tue Jul 26 16:38:38 2022
    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Cecil Westerhof on Tue Jul 26 15:33:46 2022
    Cecil Westerhof <Cecil@decebal.nl> writes:
    I need to get a random integer.

    There are all kinds of trade-offs. When you need entropy,
    the function sometimes has to collect data from hardware,
    which takes some time. Pseudo-random integers sometimes can
    be calculated faster. As a compromise, you can get some
    entropy and use it to initialize ("seed") a pseudo-random
    number generator and then get some pseudo-random numbers
    until you seed it again. A linear-congruential generator
    should be fast enough, but in some cases might not be random
    enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Weatherby,Gerard@21:1/5 to All on Tue Jul 26 17:30:32 2022
    QWJzb2x1dGVseS4gVGhlIHRhc2sgKOKAnGdlbmVyYXRlIGEgcmFuZG9tIG51bWJlcuKAnSkgaXMg aWxsLWRlZmluZWQuDQoNCldhbnQgYSBmYXN0IHJhbmRvbSBudW1iZXI/IFVzZSA1NTIwMTU5MzMg KEkganVzdCByZXRyaWV2ZWQgaXQgZnJvbSByYW5kb20ub3JnKS4NCg0KV2FudCBhIHRydWUgcmFu ZG9tIG51bWJlciwgdXNlIHJhbmRvbS5vcmcgQVBJLiAoaHR0cHM6Ly9hcGkucmFuZG9tLm9yZy9w cmljaW5nKS4NCg0KU29tZXRoaW5nIGluIGJldHdlZW4sIGZvbGxvdyBhcHByb2FjaGVzIFN0ZWZh biBzdWdnZXN0cy4NCg0KDQoNCuKAlA0KR2VyYXJkIFdlYXRoZXJieSB8IEFwcGxpY2F0aW9uIEFy Y2hpdGVjdCBOTVJib3ggfCBOQU4gfCBEZXBhcnRtZW50IG9mIE1vbGVjdWxhciBCaW9sb2d5IGFu ZCBCaW9waHlzaWNzDQogVUNvbm4gSGVhbHRoIDI2MyBGYXJtaW5ndG9uIEF2ZW51ZSwgRmFybWlu Z3RvbiwgQ1QgMDYwMzAtNjQwNiB1Y2hjLmVkdQ0KT24gSnVsIDI2LCAyMDIyLCAxMTo0NSBBTSAt MDQwMCwgU3RlZmFuIFJhbSA8cmFtQHplZGF0LmZ1LWJlcmxpbi5kZT4sIHdyb3RlOg0KKioqIEF0 dGVudGlvbjogVGhpcyBpcyBhbiBleHRlcm5hbCBlbWFpbC4gVXNlIGNhdXRpb24gcmVzcG9uZGlu Zywgb3BlbmluZyBhdHRhY2htZW50cyBvciBjbGlja2luZyBvbiBsaW5rcy4gKioqDQoNCkNlY2ls IFdlc3RlcmhvZiA8Q2VjaWxAZGVjZWJhbC5ubD4gd3JpdGVzOg0KSSBuZWVkIHRvIGdldCBhIHJh bmRvbSBpbnRlZ2VyLg0KDQpUaGVyZSBhcmUgYWxsIGtpbmRzIG9mIHRyYWRlLW9mZnMuIFdoZW4g eW91IG5lZWQgZW50cm9weSwNCnRoZSBmdW5jdGlvbiBzb21ldGltZXMgaGFzIHRvIGNvbGxlY3Qg ZGF0YSBmcm9tIGhhcmR3YXJlLA0Kd2hpY2ggdGFrZXMgc29tZSB0aW1lLiBQc2V1ZG8tcmFuZG9t IGludGVnZXJzIHNvbWV0aW1lcyBjYW4NCmJlIGNhbGN1bGF0ZWQgZmFzdGVyLiBBcyBhIGNvbXBy b21pc2UsIHlvdSBjYW4gZ2V0IHNvbWUNCmVudHJvcHkgYW5kIHVzZSBpdCB0byBpbml0aWFsaXpl ICgic2VlZCIpIGEgcHNldWRvLXJhbmRvbQ0KbnVtYmVyIGdlbmVyYXRvciBhbmQgdGhlbiBnZXQg c29tZSBwc2V1ZG8tcmFuZG9tIG51bWJlcnMNCnVudGlsIHlvdSBzZWVkIGl0IGFnYWluLiBBIGxp bmVhci1jb25ncnVlbnRpYWwgZ2VuZXJhdG9yDQpzaG91bGQgYmUgZmFzdCBlbm91Z2gsIGJ1dCBp biBzb21lIGNhc2VzIG1pZ2h0IG5vdCBiZSByYW5kb20NCmVub3VnaC4NCg0KDQotLQ0KaHR0cHM6 Ly91cmxkZWZlbnNlLmNvbS92My9fX2h0dHBzOi8vbWFpbC5weXRob24ub3JnL21haWxtYW4vbGlz dGluZm8vcHl0aG9uLWxpc3RfXzshIUNuX1VYX3AzIWtqSjdVNmtrU1c4dU9UV0pDbkhlWG4wQWNj emQ2MTlhc21JclZNQV9OT3F2UF9KTXFxUTFScHA2MXNwdFF1Y2JPdUVGR2I2b3Q0OTlfRnFhUmFq aVM5cyQNCg==

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to python-list@python.org on Wed Jul 27 04:01:20 2022
    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?


    You're setting up a brand new SystemRandom instance just for a single
    random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to python-list@python.org on Wed Jul 27 06:13:24 2022
    On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?


    You're setting up a brand new SystemRandom instance just for a single random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    Thanks! Don't feel bad; performance testing is *hard*, getting
    meaningful results takes a lot of of fiddling with parameters, and
    getting interesting AND meaningful results can sometimes seem about
    impossible.

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    In normal production work? Yes (the SystemRandom object doesn't have
    any significant state - a seeded RNG could have a lot more overhead
    here). But for performance testing? The work of instantiating the
    class could be completely irrelevant, or it could be dominating your
    results. It's hard to say, hence the suggestion to try it without reinstantiating.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Chris Angelico on Tue Jul 26 21:26:11 2022
    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?


    You're setting up a brand new SystemRandom instance just for a single
    random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Chris Angelico on Tue Jul 26 23:47:59 2022
    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?


    You're setting up a brand new SystemRandom instance just for a single
    random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    Thanks! Don't feel bad; performance testing is *hard*, getting
    meaningful results takes a lot of of fiddling with parameters, and
    getting interesting AND meaningful results can sometimes seem about impossible.

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    In normal production work? Yes (the SystemRandom object doesn't have
    any significant state - a seeded RNG could have a lot more overhead
    here). But for performance testing? The work of instantiating the
    class could be completely irrelevant, or it could be dominating your
    results. It's hard to say, hence the suggestion to try it without reinstantiating.

    It had a very big influence. Original it took about three times more
    time to run my program. (The program was still running when I posted
    the original post and the difference was higher as I anticipated.)
    Removing that did cut about 45% of the execution time of the program.
    (So the initiation is quit expensive.)
    But it still takes about 50% more time. So I am still a bit
    flabbergasted.

    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    The first two statements are executed once.
    The last statement I think about 75 * 10 ** 6.

    So it seems that my first idea of using randbelow was the correct one.
    But if anyone could explain why SystemRandom is so much more
    expensive, I would be interested to know it.
    (Or am I still doing something wrong?)

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to python-list@python.org on Wed Jul 27 08:44:01 2022
    On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should >> >> this not be happening?


    You're setting up a brand new SystemRandom instance just for a single
    random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    Thanks! Don't feel bad; performance testing is *hard*, getting
    meaningful results takes a lot of of fiddling with parameters, and
    getting interesting AND meaningful results can sometimes seem about impossible.

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    In normal production work? Yes (the SystemRandom object doesn't have
    any significant state - a seeded RNG could have a lot more overhead
    here). But for performance testing? The work of instantiating the
    class could be completely irrelevant, or it could be dominating your results. It's hard to say, hence the suggestion to try it without reinstantiating.

    It had a very big influence. Original it took about three times more
    time to run my program. (The program was still running when I posted
    the original post and the difference was higher as I anticipated.)
    Removing that did cut about 45% of the execution time of the program.
    (So the initiation is quit expensive.)
    But it still takes about 50% more time. So I am still a bit
    flabbergasted.

    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    The first two statements are executed once.
    The last statement I think about 75 * 10 ** 6.

    So it seems that my first idea of using randbelow was the correct one.
    But if anyone could explain why SystemRandom is so much more
    expensive, I would be interested to know it.
    (Or am I still doing something wrong?)

    Hmm. There are still a lot of differences here. Are you able to make
    use of randrange() instead, to make them more consistent?

    According to the source code, secrets.randbelow is calling on an
    internal method _randbelow of the SystemRandom object, but randrange
    (if called with only one arg) will go straight into that same method.
    Here's my results:

    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange' 'randrange(10000)'
    1000000 loops, best of 5: 322 nsec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; r = SystemRandom()' 'r.randint(0, 10000)'
    200000 loops, best of 5: 1.92 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; r = SystemRandom()' 'r.randrange(10000)'
    200000 loops, best of 5: 1.87 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import
    randbelow' 'randbelow(10000)'
    200000 loops, best of 5: 1.64 usec per loop

    (The difference with the first one is that it isn't using the system
    RNG, so it has the limitations of an internal PRNG.)

    When you call randint, what happens is (1) the endpoint is incremented
    to transform it from inclusive-inclusive to inclusive-exclusive; (2)
    randrange is called with two args; (3) fast path 1 is skipped, fast
    path 2 is taken, and _randbelow gets called to get an actual random
    number, which gets zero added to it before returning.

    If, instead, you use randrange(len(to_try)), what would happen is (1)
    fast path 1 is used, and (2) _randbelow is called to get the random
    number.

    In secrets.randbelow, it's even better: (1) _randbelow is called to
    get the random number.

    I think that might be what's going on here. You're adding some tiny
    amounts of extra work, but if you were to make them more similar, the distinctions would evaporate. Here's a couple of other variants that
    use SystemRandom:

    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; randrange = SystemRandom().randrange' 'randrange(10000)'
    200000 loops, best of 5: 1.81 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; randrange = SystemRandom()._randbelow'
    'randrange(10000)'
    200000 loops, best of 5: 1.61 usec per loop

    Note that we shave a few usec by avoiding the attribute lookup, but
    even more by directly calling _randbelow, which is why
    secrets.randbelow is able to save a bit. But don't use internal
    methods; you can probably get pretty much equivalent performance from randrange, which is public.

    Hope that's of some value!

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dennis Lee Bieber@21:1/5 to All on Tue Jul 26 18:58:37 2022
    On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    Well, off the top of my head...

    For one generation of "index" you are first creating an instance of SystemRandom(), using it to generate your random integer, and then
    disposing of the instance.

    If you only need ONE random integer, the time difference probably doesn't matter. OTOH, if you need many during the run, using

    sr = SystemRandom()
    #stuff in some loop that generates multiple ints
    index = sr.randint(...)

    Hmmm, wonder if there is a speed difference between
    .randint(0, len(to_try) - 1)
    and
    .randint(1, len(to_try)) - 1


    --
    Wulfraed Dennis Lee Bieber AF6VN
    wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dennis Lee Bieber@21:1/5 to All on Tue Jul 26 19:02:21 2022
    On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:


    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    The first two statements are executed once.
    The last statement I think about 75 * 10 ** 6.

    So it seems that my first idea of using randbelow was the correct one.
    But if anyone could explain why SystemRandom is so much more
    expensive, I would be interested to know it.
    (Or am I still doing something wrong?)

    What happens with

    system_randint = SystemRandom().randint #no parens

    index = system_randint(...)

    which may remove the method lookup from the repetition.


    --
    Wulfraed Dennis Lee Bieber AF6VN
    wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to Dennis Lee Bieber on Wed Jul 27 09:43:20 2022
    On Wed, 27 Jul 2022 at 09:28, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:

    On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    Well, off the top of my head...

    For one generation of "index" you are first creating an instance of SystemRandom(), using it to generate your random integer, and then
    disposing of the instance.

    If you only need ONE random integer, the time difference probably doesn't matter. OTOH, if you need many during the run, using

    sr = SystemRandom()
    #stuff in some loop that generates multiple ints
    index = sr.randint(...)

    Hmmm, wonder if there is a speed difference between
    .randint(0, len(to_try) - 1)
    and
    .randint(1, len(to_try)) - 1


    Probably not significant, since the same amount of arithmetic gets
    done either way. But switching to single-arg randrange(len(to_try))
    will definitely help, and IMO is clearer as well (since the
    implication is selecting one from a group of items).

    Incidentally - if you are actually trying to select a specific item,
    you may want to consider random.choice.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Barry@21:1/5 to All on Wed Jul 27 07:38:34 2022
    On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    What is the OS that you are running on and its version?
    If it’s linux what is the kernel version?
    What version of python and where from?

    Barry


    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof
    --
    https://mail.python.org/mailman/listinfo/python-list


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Barry on Wed Jul 27 10:45:47 2022
    Barry <barry@barrys-emacs.org> writes:

    On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    What is the OS that you are running on and its version?
    If it’s linux what is the kernel version?
    What version of python and where from?

    That is always good information of-course.
    Debian 11.3
    5.10.0-13-amd64
    3.9.2

    What do you mean with where the python version is from?

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Chris Angelico on Wed Jul 27 11:13:08 2022
    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should >> >> >> this not be happening?


    You're setting up a brand new SystemRandom instance just for a single >> >> > random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    Thanks! Don't feel bad; performance testing is *hard*, getting
    meaningful results takes a lot of of fiddling with parameters, and
    getting interesting AND meaningful results can sometimes seem about
    impossible.

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    In normal production work? Yes (the SystemRandom object doesn't have
    any significant state - a seeded RNG could have a lot more overhead
    here). But for performance testing? The work of instantiating the
    class could be completely irrelevant, or it could be dominating your
    results. It's hard to say, hence the suggestion to try it without
    reinstantiating.

    It had a very big influence. Original it took about three times more
    time to run my program. (The program was still running when I posted
    the original post and the difference was higher as I anticipated.)
    Removing that did cut about 45% of the execution time of the program.
    (So the initiation is quit expensive.)
    But it still takes about 50% more time. So I am still a bit
    flabbergasted.

    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    The first two statements are executed once.
    The last statement I think about 75 * 10 ** 6.

    So it seems that my first idea of using randbelow was the correct one.
    But if anyone could explain why SystemRandom is so much more
    expensive, I would be interested to know it.
    (Or am I still doing something wrong?)

    Hmm. There are still a lot of differences here. Are you able to make
    use of randrange() instead, to make them more consistent?

    According to the source code, secrets.randbelow is calling on an
    internal method _randbelow of the SystemRandom object, but randrange
    (if called with only one arg) will go straight into that same method.
    Here's my results:

    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange' 'randrange(10000)'
    1000000 loops, best of 5: 322 nsec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; r = SystemRandom()' 'r.randint(0, 10000)'
    200000 loops, best of 5: 1.92 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; r = SystemRandom()' 'r.randrange(10000)'
    200000 loops, best of 5: 1.87 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import
    randbelow' 'randbelow(10000)'
    200000 loops, best of 5: 1.64 usec per loop

    (The difference with the first one is that it isn't using the system
    RNG, so it has the limitations of an internal PRNG.)

    When you call randint, what happens is (1) the endpoint is incremented
    to transform it from inclusive-inclusive to inclusive-exclusive; (2) randrange is called with two args; (3) fast path 1 is skipped, fast
    path 2 is taken, and _randbelow gets called to get an actual random
    number, which gets zero added to it before returning.

    If, instead, you use randrange(len(to_try)), what would happen is (1)
    fast path 1 is used, and (2) _randbelow is called to get the random
    number.

    In secrets.randbelow, it's even better: (1) _randbelow is called to
    get the random number.

    I think that might be what's going on here. You're adding some tiny
    amounts of extra work, but if you were to make them more similar, the distinctions would evaporate. Here's a couple of other variants that
    use SystemRandom:

    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; randrange = SystemRandom().randrange' 'randrange(10000)'
    200000 loops, best of 5: 1.81 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
    SystemRandom; randrange = SystemRandom()._randbelow'
    'randrange(10000)'
    200000 loops, best of 5: 1.61 usec per loop

    Note that we shave a few usec by avoiding the attribute lookup, but
    even more by directly calling _randbelow, which is why
    secrets.randbelow is able to save a bit. But don't use internal
    methods; you can probably get pretty much equivalent performance from randrange, which is public.

    The times I use are user and sys added. Real can widely differ, but
    user and sys added should be reasonable constant. (But is not always.)
    The SystemRandom versions without initiation again and again.
    This are my results until now for running the complete program:
    randbelow: 13 min
    randint: 21 min
    randrange: 15 min
    randrange2: 14 min

    randrange is:
    system_random = SystemRandom()
    index = system_random.randrange(len(to_try))

    randrange2 is:
    randrange = SystemRandom().randrange
    index = randrange(len(to_try))

    The strange thing is that the first time I executed randrange2 the
    time jumped to 20 minutes. But maybe that was because the 'entropy'
    was used up?

    I think I stick with the randbelow version.
    Or is there a reason that it is, or could be better to stay with the
    randrange version? It was introduced in python 3.6. Should I take into
    account that people can work with an older version?

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Dennis Lee Bieber on Wed Jul 27 11:17:05 2022
    Dennis Lee Bieber <wlfraed@ix.netcom.com> writes:

    On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:


    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    The first two statements are executed once.
    The last statement I think about 75 * 10 ** 6.

    So it seems that my first idea of using randbelow was the correct one.
    But if anyone could explain why SystemRandom is so much more
    expensive, I would be interested to know it.
    (Or am I still doing something wrong?)

    What happens with

    system_randint = SystemRandom().randint #no parens

    index = system_randint(...)

    which may remove the method lookup from the repetition.

    I had already switched to randrange. This went to 15 minutes from 21
    minutes.
    By removing the method lookup I could shave off another minute. So
    certainly noteworthy. (Should have thought about it myself.)

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Chris Angelico on Wed Jul 27 11:27:59 2022
    Chris Angelico <rosuav@gmail.com> writes:

    Incidentally - if you are actually trying to select a specific item,
    you may want to consider random.choice.

    Yes, I try to select a random element, but it has also to be removed.
    An element should be used at most once. This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael F. Stemper@21:1/5 to Cecil Westerhof on Wed Jul 27 08:19:35 2022
    On 26/07/2022 16.47, Cecil Westerhof wrote:
    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    Chris Angelico <rosuav@gmail.com> writes:

    On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
    <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should >>>>> this not be happening?


    You're setting up a brand new SystemRandom instance just for a single
    random number. For a fairer comparison, set up the instance, then
    generate far more than just a single number, and see how that goes.

    Thanks. I thought I did something wrong and I did.
    I will try to implement like you said and look what the result will
    be. (And share it.)

    Thanks! Don't feel bad; performance testing is *hard*, getting
    meaningful results takes a lot of of fiddling with parameters, and
    getting interesting AND meaningful results can sometimes seem about
    impossible.

    (As I understand it both do more, or less the same and should have
    comparable performance.)

    In normal production work? Yes (the SystemRandom object doesn't have
    any significant state - a seeded RNG could have a lot more overhead
    here). But for performance testing? The work of instantiating the
    class could be completely irrelevant, or it could be dominating your
    results. It's hard to say, hence the suggestion to try it without
    reinstantiating.

    It had a very big influence. Original it took about three times more
    time to run my program. (The program was still running when I posted
    the original post and the difference was higher as I anticipated.)
    Removing that did cut about 45% of the execution time of the program.
    (So the initiation is quit expensive.)
    But it still takes about 50% more time. So I am still a bit
    flabbergasted.

    The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    --
    Michael F. Stemper
    This sentence no verb.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Michael F. Stemper on Wed Jul 27 17:43:27 2022
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dennis Lee Bieber@21:1/5 to All on Wed Jul 27 12:13:38 2022
    On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:


    What do you mean with where the python version is from?

    Base Python.org download, ActiveState package download, Anaconda package download, native OS install/extra install via OS repository
    download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS Python)


    --
    Wulfraed Dennis Lee Bieber AF6VN
    wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MRAB@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 17:30:48 2022
    On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]


    When you pop an element from the last, the elements after it need to be
    moved down, which takes time.

    Try shuffling the list and then popping the now randomly-ordered
    elements off the end.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to MRAB on Wed Jul 27 19:24:14 2022
    MRAB <python@mrabarnett.plus.com> writes:

    On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>
    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]


    When you pop an element from the last, the elements after it need to be
    moved down, which takes time.

    Try shuffling the list and then popping the now randomly-ordered
    elements off the end.

    Would shuffling not be a lot more expensive? Especially because I do
    not eat the whole list.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Dennis Lee Bieber on Wed Jul 27 19:21:53 2022
    Dennis Lee Bieber <wlfraed@ix.netcom.com> writes:

    On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:


    What do you mean with where the python version is from?

    Base Python.org download, ActiveState package download, Anaconda package download, native OS install/extra install via OS repository
    download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS Python)

    Just the default Debian install.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mats Wichmann@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 11:37:44 2022
    On 7/27/22 02:45, Cecil Westerhof via Python-list wrote:
    Barry <barry@barrys-emacs.org> writes:

    What version of python and where from?

    That is always good information of-course.
    Debian 11.3
    5.10.0-13-amd64
    3.9.2

    What do you mean with where the python version is from?

    On Windows, the platform of a large proportion of people asking
    questions here, there are many possible Python builds (python.org,
    Microsoft Store, Conda, ActiveState, etc.) and it sometimes matters,
    thus that tends to be a standard question that gets asked here. In your
    case the implied answer is "distro package".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Barry@21:1/5 to All on Wed Jul 27 19:00:42 2022
    On 27 Jul 2022, at 17:09, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Barry <barry@barrys-emacs.org> writes:

    On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    I need to get a random integer. At first I tried it with:
    from secrets import randbelow
    index = randbelow(len(to_try))

    This works perfectly, but it took some time. So I thought I try:
    from random import SystemRandom
    index = SystemRandom().randint(0, len(to_try) - 1)

    A first indication is that the second version would take about two
    times as much time as the first. Is there a reason for this, or should
    this not be happening?

    What is the OS that you are running on and its version?
    If it’s linux what is the kernel version?
    What version of python and where from?

    That is always good information of-course.
    Debian 11.3
    5.10.0-13-amd64
    3.9.2

    What do you mean with where the python version is from?

    Because random number generator implementation depends on the
    OS since it is SystemRandom() that you are comparing with python’s Implementation.

    Barry

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof
    --
    https://mail.python.org/mailman/listinfo/python-list

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MRAB@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 19:55:26 2022
    On 27/07/2022 18:24, Cecil Westerhof via Python-list wrote:
    MRAB <python@mrabarnett.plus.com> writes:

    On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>
    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]


    When you pop an element from the last, the elements after it need to be
    moved down, which takes time.

    Try shuffling the list and then popping the now randomly-ordered
    elements off the end.

    Would shuffling not be a lot more expensive? Especially because I do
    not eat the whole list.

    You won't know until you time it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Roel Schroeven@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 20:44:45 2022
    Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]
    Do you know in advance how many items you'll need, or maybe an upper
    limit on the amount? In that case it might be more efficient to use random.sample(to_try, k=nr_items_needed).

    --
    "Honest criticism is hard to take, particularly from a relative, a friend,
    an acquaintance, or a stranger."
    -- Franklin P. Jones

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Alan Bawden@21:1/5 to All on Wed Jul 27 14:42:55 2022
    Cecil Westerhof <Cecil@decebal.nl> writes:

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.

    Instead of using pop to do that why not something like:

    def lazy_shuffle(seq):
    """
    Generate the elements of the given sequence in a random order.
    """
    # Delete the next line if you want to use a different version of
    # randrange:
    from random import randrange
    # Delete the next line if SEQ is already a mutable sequence and you
    # are willing to have it destroyed by this process:
    seq = list(seq)
    n = len(seq)
    while n:
    i = randrange(n)
    yield seq[i]
    n -= 1
    if i < n:
    seq[i] = seq[n]

    --
    Alan Bawden

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to MRAB on Wed Jul 27 21:17:41 2022
    MRAB <python@mrabarnett.plus.com> writes:

    When you pop an element from the last, the elements after it need to be
    moved down, which takes time.

    Try shuffling the list and then popping the now randomly-ordered
    elements off the end.
    Would shuffling not be a lot more expensive? Especially because I do
    not eat the whole list.

    You won't know until you time it.

    A first indication is that it doubles the needed time.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Alan Bawden on Wed Jul 27 21:14:29 2022
    Alan Bawden <alan@csail.mit.edu> writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.

    Instead of using pop to do that why not something like:

    def lazy_shuffle(seq):
    """
    Generate the elements of the given sequence in a random order.
    """
    # Delete the next line if you want to use a different version of
    # randrange:
    from random import randrange
    # Delete the next line if SEQ is already a mutable sequence and you
    # are willing to have it destroyed by this process:
    seq = list(seq)
    n = len(seq)
    while n:
    i = randrange(n)
    yield seq[i]
    n -= 1
    if i < n:
    seq[i] = seq[n]

    That looks interesting.
    But there is on problem. (I think.)
    The list is only partly eaten and I will eat a lot of sequences. So a
    lot of sequences will be left in the runtime.
    Or is there a way to destroy the lazy_shuffle when it is not needed
    anymore?

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cecil Westerhof@21:1/5 to Roel Schroeven on Wed Jul 27 21:16:23 2022
    Roel Schroeven <roel@roelschroeven.net> writes:

    Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]
    Do you know in advance how many items you'll need, or maybe an upper
    limit on the amount? In that case it might be more efficient to use random.sample(to_try, k=nr_items_needed).

    Something else to try. :-)
    And yes: I will be using half of the list.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Alan Bawden@21:1/5 to Alan Bawden on Wed Jul 27 15:42:25 2022
    Cecil Westerhof <Cecil@decebal.nl> writes:

    Alan Bawden <alan@csail.mit.edu> writes:

    > Cecil Westerhof <Cecil@decebal.nl> writes:
    >
    > Yes, I try to select a random element, but it has also to be removed,
    > because an element should not be used more as once.
    >
    > Instead of using pop to do that why not something like:
    >
    > def lazy_shuffle(seq):
    > """
    > Generate the elements of the given sequence in a random order.
    > """
    > # Delete the next line if you want to use a different version of
    > # randrange:
    > from random import randrange
    > # Delete the next line if SEQ is already a mutable sequence and you
    > # are willing to have it destroyed by this process:
    > seq = list(seq)
    > n = len(seq)
    > while n:
    > i = randrange(n)
    > yield seq[i]
    > n -= 1
    > if i < n:
    > seq[i] = seq[n]

    That looks interesting.
    But there is on problem. (I think.)
    The list is only partly eaten and I will eat a lot of sequences. So a
    lot of sequences will be left in the runtime.
    Or is there a way to destroy the lazy_shuffle when it is not needed
    anymore?

    You don't have to worry about that. That's what Garbage Collection is
    for. After you drop the last reference to the generator, the GC will
    destroy it for you. Welcome to Python.

    BTW, what I wrote might be slightly faster if you replace it with:

    while n:
    i = randrange(n)
    yield seq[i]
    n -= 1
    seq[i] = seq[n]

    That will save you some time spent testing and branching.

    --
    Alan Bawden

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to python-list@python.org on Thu Jul 28 05:57:22 2022
    On Thu, 28 Jul 2022 at 05:36, Cecil Westerhof via Python-list <python-list@python.org> wrote:

    Roel Schroeven <roel@roelschroeven.net> writes:

    Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
    "Michael F. Stemper" <michael.stemper@gmail.com> writes:

    This is orthogonal to your question, but might be of some use to you:

    The combination of using len(to_try) as an argument to randint() and
    saving the output to a variable named "index" suggests that you might
    be setting up to select a random element from to_try, as in:
    something = to_try[index]

    If that is the case, you might want to consider using random.choice() instead:

    >>> from random import choice
    >>> to_try = [2,3,5,7,11,13,"seventeen",19]
    >>> choice(to_try)
    2
    >>> choice(to_try)
    'seventeen'
    >>> choice(to_try)
    13
    >>> choice(to_try)
    5
    >>>

    Yes, I try to select a random element, but it has also to be removed,
    because an element should not be used more as once.
    This is the code I use:
    # index = randbelow(len(to_try))
    index = randrange(len(to_try))
    found = permutation[to_try.pop(index)]
    Do you know in advance how many items you'll need, or maybe an upper
    limit on the amount? In that case it might be more efficient to use random.sample(to_try, k=nr_items_needed).

    Something else to try. :-)
    And yes: I will be using half of the list.


    A perfect job for random.sample() then, if that's *exactly* half the list.

    But if you don't know for sure how many you'll need, an easy way to
    make it more efficient would be to take the n'th element, then move
    the last element down to the n'th position, and finally pop off the
    last element. That way, you only ever pop the last element away. The
    list will get reordered arbitrarily while you do this, but if the only
    purpose of it is to remove elements from random consideration, that
    won't be a problem.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)