• More efficient code, but slower program

    From Cecil Westerhof@21:1/5 to All on Wed Jul 27 14:58:02 2022
    It is not very important, but I am just curious.

    Originally I had in a program:
    values = [*range(100)]

    But because it is done quite often I expected that initialising:
    range_list = [*range(100)]

    and then use:
    values = range_list.copy()

    Would be more efficient. So I tried:
    timeit('values = [*range(100)]')
    1.6964535564184189

    and:
    timeit('new_values = values.copy()', 'values = [*range(100)]')
    0.6457642465829849

    Those timings suggested the copy should make a positive difference.
    But when I changed the program, it actually took a little bit more time.
    I find the code with the copy a little bit cleaner, so I kept it.
    But I am curious why the effect is the opposite of what I expected.
    It does not hurt to understand optimisation better, so I can do a
    better job when I need it.
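    For reference, here is a sketch of how both approaches can be compared in a single session (this is an added illustration, not the original measurement; absolute numbers vary by machine and Python version, only the ratio is interesting):

```python
from timeit import timeit

# Compare building a fresh list each time against copying a
# pre-built one, within one interpreter session.
range_list = [*range(100)]

t_build = timeit('values = [*range(100)]', number=10_000)
t_copy = timeit('values = range_list.copy()',
                globals={'range_list': range_list}, number=10_000)
print(f'build: {t_build:.4f}s  copy: {t_copy:.4f}s')
```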

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Cecil Westerhof on Wed Jul 27 13:29:25 2022
    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    I am using code such as the following for microbenchmarks:

    import timeit
    range_list = [*range(100)]
    sum_of_all_quotients = 0
    for i in range( 10 ):

        start_time = timeit.default_timer()
        for _ in range( 1 ):
            values = [*range(100)]
        delta = timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        for _ in range( 1 ):
            values = range_list.copy()
        delta1 = timeit.default_timer() - start_time

        quotient = delta1/delta
        sum_of_all_quotients += quotient
        average_of_all_quotients = sum_of_all_quotients/( i + 1 )

        print( "average of all quotients of copy to range = ", average_of_all_quotients )

    Here, in one case, the first average printed is near 2 and
    the last one near .5, so the result is actually inverted.

    This can happen when some piece of code is slower on first
    use but faster when used repeatedly. Reasons for such
    behavior might include cache usage or runtime optimization.
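    One common way to damp such first-use effects (a sketch added here, not part of the original post) is timeit.repeat: run each measurement several times and take the minimum, which is the run least disturbed by warm-up and background load:

```python
import timeit

range_list = [*range(100)]

build_times = timeit.repeat('values = [*range(100)]',
                            number=1_000, repeat=5)
copy_times = timeit.repeat('values = range_list.copy()',
                           globals={'range_list': range_list},
                           number=1_000, repeat=5)

# min() of several runs is the usual summary statistic for
# microbenchmarks, being least contaminated by interference.
print(min(build_times), min(copy_times))
```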

    Usually one wants to write code for readability, and thinking
    too much about runtime efficiency optimizations is in vain,
    because one might get different results with a different
    version of Python or on a different machine.

  • From Cecil Westerhof@21:1/5 to Stefan Ram on Wed Jul 27 17:48:47 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are needed. I must select (several times) a random
    element from the list, so I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.
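    Picking one random element needs only random.choice, which requires a non-empty sequence — one reason a materialised list, rather than a bare iterable, is needed here (a minimal sketch, not Cecil's actual code):

```python
import random

values = [*range(100)]

# random.choice needs a sequence supporting len() and indexing,
# so a list works but a bare iterator would not.
picked = random.choice(values)
print(picked)
```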


    Usually one wants to write code for readability, and thinking
    too much about runtime efficiency optimizations is in vain,
    because one might get different results with a different
    version of Python or on a different machine.

    That is why I went for the less efficient code. ;-)

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

  • From Dennis Lee Bieber@21:1/5 to All on Wed Jul 27 12:35:23 2022
    On Wed, 27 Jul 2022 14:58:02 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:

    It is not very important, but I am just curious.

    Originally I had in a program:
    values = [*range(100)]

    But because it is done quite often I expected that initialising:
    range_list = [*range(100)]

    and then use:
    values = range_list.copy()

    Would be more efficient. So I tried:
    timeit('values = [*range(100)]')
    1.6964535564184189

    and:
    timeit('new_values = values.copy()', 'values = [*range(100)]')
    0.6457642465829849

    Were these done in the same program/session? If so, the first invocation may be initializing/caching the first 100 integers (Python tends
    to keep some number of integers in a permanent cache to speed later access
    to common values).
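    The cache Dennis alludes to is CPython's small-integer cache: values from -5 to 256 are allocated once at interpreter start-up and reused (a CPython implementation detail, not a language guarantee). A quick way to observe it:

```python
# Constructing a small int from a string reuses the cached object;
# a value outside the -5..256 cache is typically a fresh object.
small_a = int('100')
small_b = int('100')
print(small_a is small_b)   # True: both names refer to the cached 100

big_a = int('10000')
big_b = int('10000')
print(big_a is big_b)       # typically False: two distinct objects
```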

    Also, rather than *-unpacking the range iterator into a [] list... just...

    list(range(100))
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
    78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
    97, 98, 99]


    ... will do it.

    Also, if your goal is to /remove/ an entry from the list via some index, you might consider whether this is more effective than copying the list
    and THEN popping a value.

    full = list(range(100))
    import random
    idx = random.randint(0, len(full) - 1)
    idx
    74
    trim = full[:idx] + full[idx+1:]
    trim
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]
    trim == full
    False


    or

    trim2 = full[:]
    del trim2[idx]
    trim2
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]


    The first does two partial copies skipping the item to be removed, and joins the results into a new list. The second does a full copy and DELETES
    the element to be removed from the copy.

    trim3 = full[:]
    trim3.remove(idx)
    trim3
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]


    This option works because the list is sequential integers and the index matches the values in the list (.remove() removes the first MATCHING
    element, so if the list can have duplicates it may not remove the one AT
    the index position).
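    To make the caveat concrete (an illustration added here, not from the original post): with duplicates, del by index and .remove() by value can delete different elements:

```python
data = [5, 1, 5, 2]

by_index = data[:]
del by_index[2]          # deletes the element AT position 2 (the second 5)

by_value = data[:]
by_value.remove(5)       # deletes the FIRST element equal to 5 (position 0)

print(by_index)  # [5, 1, 2]
print(by_value)  # [1, 5, 2]
```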



    --
    Wulfraed Dennis Lee Bieber AF6VN
    wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

  • From 2QdxY4RzWzUUiLuE@potatochowder.com@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 11:24:12 2022
    On 2022-07-27 at 17:48:47 +0200,
    Regarding "Re: More efficient code, but slower program,"
    Cecil Westerhof via Python-list <python-list@python.org> wrote:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are. I must select (several times) a random element
    from the list. So I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.

    "[A]s good as possible" for simulations and tests may not require the cryptographic-quality numbers from SystemRandom. Many/most pseudo-random
    number generators are optimized for statistically normalized
    outputs, and are repeatable as a bonus (again, often a requirement for
    certain types of simulations; YMMV).
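    A repeatable pseudo-random stream, as mentioned, comes from seeding a generator (a sketch; random.Random is the standard library's Mersenne Twister — statistically solid, but not cryptographic):

```python
import random

rng1 = random.Random(42)    # same seed ...
rng2 = random.Random(42)

draws1 = [rng1.randrange(100) for _ in range(5)]
draws2 = [rng2.randrange(100) for _ in range(5)]
print(draws1 == draws2)     # ... same sequence: the run is reproducible
```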

    Also, what if you shuffled the list first (e.g., with random.shuffle)
    and then iterated through it directly? Repeatedly removing arbitrary
    elements from the middle of a list is potentially expensive.
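    The shuffle-then-iterate idea sketched out (assuming the elements are drawn without replacement; this is added here, not from the original post): one O(n) shuffle up front, after which each pop() from the end is O(1), versus O(n) for deleting from the middle of the list.

```python
import random

values = [*range(100)]
random.shuffle(values)          # one O(n) shuffle up front

drawn = []
while values:
    drawn.append(values.pop())  # O(1) per draw; no mid-list deletion

# Every element is drawn exactly once, in random order.
print(len(drawn))
```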

  • From Cecil Westerhof@21:1/5 to 2QdxY4RzWzUUiLuE@potatochowder.com on Wed Jul 27 19:33:07 2022
    2QdxY4RzWzUUiLuE@potatochowder.com writes:

    On 2022-07-27 at 17:48:47 +0200,
    Regarding "Re: More efficient code, but slower program,"
    Cecil Westerhof via Python-list <python-list@python.org> wrote:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are. I must select (several times) a random element
    from the list. So I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.

    "[A]s good as possible" for simulations and tests may not require the cryptographic-quality numbers from SystemRandom. Many/most pseudo-random
    number generators are optimized for statistically normalized
    outputs, and are repeatable as a bonus (again, often a requirement for certain types of simulations; YMMV).

    Also, what if you shuffled the list first (e.g., with random.shuffle)
    and then iterated through it directly? Repeatedly removing arbitrary elements from the middle of a list is potentially expensive.

    Someone else also mentioned that. I think I should put it on my list
    of things to do/try.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof
