• More efficient code, but slower program

    From Cecil Westerhof@21:1/5 to All on Wed Jul 27 14:58:02 2022
    It is not very important, but I am just curious.

    Originally I had in a program:
    values = [*range(100)]

    But because it is done quite often I expected that initialising:
    range_list = [*range(100)]

    and then use:
    values = range_list.copy()

    Would be more efficient. So I tried:
    timeit('values = [*range(100)]')
    1.6964535564184189

    and:
    timeit('new_values = values.copy()', 'values = [*range(100)]')
    0.6457642465829849

    Those timings suggested the copy should make a positive difference.
    But when I changed the program, it actually took a little bit more time.
    I find the code with the copy a little bit cleaner, so I kept it.
    But I am curious why the effect is the opposite of what I expected.
    It does not hurt to understand optimisation better, so I can do a
    better job when I need it.
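    For reference, here is a sketch of how both approaches can be compared in a single session (this is an added illustration, not the original measurement; absolute numbers vary by machine and Python version, only the ratio is interesting):

```python
from timeit import timeit

# Compare building a fresh list each time against copying a
# pre-built one, within one interpreter session.
range_list = [*range(100)]

t_build = timeit('values = [*range(100)]', number=10_000)
t_copy = timeit('values = range_list.copy()',
                globals={'range_list': range_list}, number=10_000)
print(f'build: {t_build:.4f}s  copy: {t_copy:.4f}s')
```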

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Cecil Westerhof on Wed Jul 27 13:29:25 2022
    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    I am using code such as the following for microbenchmarks:

    import timeit
    range_list = [*range(100)]
    sum_of_all_quotients = 0
    for i in range( 10 ):

        start_time = timeit.default_timer()
        for _ in range( 1 ):
            values = [*range(100)]
        delta = timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        for _ in range( 1 ):
            values = range_list.copy()
        delta1 = timeit.default_timer() - start_time

        quotient = delta1/delta
        sum_of_all_quotients += quotient
        average_of_all_quotients = sum_of_all_quotients/( i + 1 )

        print( "average of all quotients of copy to range = ", average_of_all_quotients )

    Here, in one case, the first average printed is near 2 and
    the last one near .5, so the result is actually inverted.

    This can happen when some piece of code is slower on first
    use but faster when used repeatedly. Reasons for such
    behavior might include cache usage or runtime optimization.
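    One common way to damp such first-use effects (a sketch added here, not part of the original post) is timeit.repeat: run each measurement several times and take the minimum, which is the run least disturbed by warm-up and background load:

```python
import timeit

range_list = [*range(100)]

build_times = timeit.repeat('values = [*range(100)]',
                            number=1_000, repeat=5)
copy_times = timeit.repeat('values = range_list.copy()',
                           globals={'range_list': range_list},
                           number=1_000, repeat=5)

# min() of several runs is the usual summary statistic for
# microbenchmarks, being least contaminated by interference.
print(min(build_times), min(copy_times))
```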

    Usually one wants to write code for readability, and thinking
    too much about runtime efficiency optimizations is in vain,
    because one might get different results with a different
    version of Python or on a different machine.

  • From Cecil Westerhof@21:1/5 to Stefan Ram on Wed Jul 27 17:48:47 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are needed. I must select (several times) a random
    element from the list, so I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.
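    Picking one random element needs only random.choice, which requires a non-empty sequence — one reason a materialised list, rather than a bare iterable, is needed here (a minimal sketch, not Cecil's actual code):

```python
import random

values = [*range(100)]

# random.choice needs a sequence supporting len() and indexing,
# so a list works but a bare iterator would not.
picked = random.choice(values)
print(picked)
```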


    Usually one wants to write code for readability, and thinking
    too much about runtime efficiency optimizations is in vain,
    because one might get different results with a different
    version of Python or on a different machine.

    That is why I went for the less efficient code. ;-)

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof

  • From Dennis Lee Bieber@21:1/5 to All on Wed Jul 27 12:35:23 2022
    On Wed, 27 Jul 2022 14:58:02 +0200, Cecil Westerhof <Cecil@decebal.nl> declaimed the following:

    It is not very important, but I am just curious.

    Originally I had in a program:
    values = [*range(100)]

    But because it is done quite often I expected that initialising:
    range_list = [*range(100)]

    and then use:
    values = range_list.copy()

    Would be more efficient. So I tried:
    timeit('values = [*range(100)]')
    1.6964535564184189

    and:
    timeit('new_values = values.copy()', 'values = [*range(100)]')
    0.6457642465829849

    Were these done in the same program/session? If so, the first invocation may be initializing/caching the first 100 integers (Python tends
    to keep some number of integers in a permanent cache to speed later access
    to common values).
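    The cache Dennis alludes to is CPython's small-integer cache: values from -5 to 256 are allocated once at interpreter start-up and reused (a CPython implementation detail, not a language guarantee). A quick way to observe it:

```python
# Constructing a small int from a string reuses the cached object;
# a value outside the -5..256 cache is typically a fresh object.
small_a = int('100')
small_b = int('100')
print(small_a is small_b)   # True: both names refer to the cached 100

big_a = int('10000')
big_b = int('10000')
print(big_a is big_b)       # typically False: two distinct objects
```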

    Also, rather than *-unpacking the range iterator into a [] list... just...

    list(range(100))
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
    78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
    97, 98, 99]


    ... will do it.

    Also, if your goal is to /remove/ an entry from the list via some index, you might consider whether this is more effective than copying the list
    and THEN popping a value.

    full = list(range(100))
    import random
    idx = random.randint(0, len(full) - 1)
    idx
    74
    trim = full[:idx] + full[idx+1:]
    trim
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]
    trim == full
    False


    or

    trim2 = full[:]
    del trim2[idx]
    trim2
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]


    The first does two partial copies skipping the item to be removed, and joins the results into a new list. The second does a full copy and DELETES
    the element to be removed from the copy.

    trim3 = full[:]
    trim3.remove(idx)
    trim3
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
    59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
    79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99]


    This option works because the list is sequential integers and the index matches the values in the list (.remove() removes the first MATCHING
    element, so if the list can have duplicates it may not remove the one AT
    the index position).
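    To make the caveat concrete (an illustration added here, not from the original post): with duplicates, del by index and .remove() by value can delete different elements:

```python
data = [5, 1, 5, 2]

by_index = data[:]
del by_index[2]          # deletes the element AT position 2 (the second 5)

by_value = data[:]
by_value.remove(5)       # deletes the FIRST element equal to 5 (position 0)

print(by_index)  # [5, 1, 2]
print(by_value)  # [1, 5, 2]
```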



    --
    Wulfraed Dennis Lee Bieber AF6VN
    wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

  • From 2QdxY4RzWzUUiLuE@potatochowder.com@21:1/5 to Cecil Westerhof via Python-list on Wed Jul 27 11:24:12 2022
    On 2022-07-27 at 17:48:47 +0200,
    Regarding "Re: More efficient code, but slower program,"
    Cecil Westerhof via Python-list <python-list@python.org> wrote:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are. I must select (several times) a random element
    from the list. So I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.

    "[A]s good as possible" for simulations and tests may not require the cryptographic-quality numbers from SystemRandom. Many/most pseudo-random
    number generators are optimized for statistically normalized
    outputs, and are repeatable as a bonus (again, often a requirement for
    certain types of simulations; YMMV).
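    A repeatable pseudo-random stream, as mentioned, comes from seeding a generator (a sketch; random.Random is the standard library's Mersenne Twister — statistically solid, but not cryptographic):

```python
import random

rng1 = random.Random(42)    # same seed ...
rng2 = random.Random(42)

draws1 = [rng1.randrange(100) for _ in range(5)]
draws2 = [rng2.randrange(100) for _ in range(5)]
print(draws1 == draws2)     # ... same sequence: the run is reproducible
```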

    Also, what if you shuffled the list first (e.g., with random.shuffle)
    and then iterated through it directly? Repeatedly removing arbitrary
    elements from the middle of a list is potentially expensive.
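    The shuffle-then-iterate idea sketched out (assuming the elements are drawn without replacement; this is added here, not from the original post): one O(n) shuffle up front, after which each pop() from the end is O(1), versus O(n) for deleting from the middle of the list.

```python
import random

values = [*range(100)]
random.shuffle(values)          # one O(n) shuffle up front

drawn = []
while values:
    drawn.append(values.pop())  # O(1) per draw; no mid-list deletion

# Every element is drawn exactly once, in random order.
print(len(drawn))
```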

  • From Cecil Westerhof@21:1/5 to 2QdxY4RzWzUUiLuE@potatochowder.com on Wed Jul 27 19:33:07 2022
    2QdxY4RzWzUUiLuE@potatochowder.com writes:

    On 2022-07-27 at 17:48:47 +0200,
    Regarding "Re: More efficient code, but slower program,"
    Cecil Westerhof via Python-list <python-list@python.org> wrote:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Cecil Westerhof <Cecil@decebal.nl> writes:
    values = [*range(100)]

    In many cases, any iterable is just fine and a list is not
    required, just as pseudo-random numbers often are just fine and
    real-world entropy is not required.

    In this case both are. I must select (several times) a random element
    from the list. So I need the list.
    I also want the randomness to be as good as possible to make the
    'simulation' as good as possible.

    "[A]s good as possible" for simulations and tests may not require the cryptographic-quality numbers from SystemRandom. Many/most pseudo-random
    number generators are optimized for statistically normalized
    outputs, and are repeatable as a bonus (again, often a requirement for certain types of simulations; YMMV).

    Also, what if you shuffled the list first (e.g., with random.shuffle)
    and then iterated through it directly? Repeatedly removing arbitrary elements from the middle of a list is potentially expensive.

    Someone else also mentioned that. I think I should put it on my list
    of things to do/try.

    --
    Cecil Westerhof
    Senior Software Engineer
    LinkedIn: http://www.linkedin.com/in/cecilwesterhof
