Forum: >>> Magnum BBS <<<

Quick question about CPython interpreter

From DFS@21:1/5 to All on Fri Oct 14 18:25:23 2022

---------------------------------------------------------------------
this does a str() conversion in the loop
---------------------------------------------------------------------
for i in range(cells.count()):
    if text == str(ID):
        break

---------------------------------------------------------
this does one str() conversion before the loop
---------------------------------------------------------
strID = str(ID)
for i in range(cells.count()):
    if text == strID:
        break

But does CPython interpret the str() conversion away and essentially do
it for me in the first example?

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Michael Torrie@21:1/5 to DFS on Mon Oct 17 09:38:56 2022

On 10/14/22 16:25, DFS wrote:

>But does CPython interpret the str() conversion away and essentially do
>it for me in the first example?

No.

You can use the dis module to show you what CPython is doing under the hood.

From David Lowry-Duda@21:1/5 to All on Mon Oct 17 11:43:59 2022

One can use the `dis` module to investigate the generated Python
bytecode. For example, given

# file "dis1.py"
thing = 123
for i in range(10):
    if "hi" == str(thing):
        print("found")
        break

The bytecode is then

  1           0 LOAD_CONST               0 (123)
              2 STORE_NAME               0 (thing)

  2           4 LOAD_NAME                1 (range)
              6 LOAD_CONST               1 (10)
              8 CALL_FUNCTION            1
             10 GET_ITER
        >>   12 FOR_ITER                28 (to 42)
             14 STORE_NAME               2 (i)

  3          16 LOAD_CONST               2 ('hi')
             18 LOAD_NAME                3 (str)
             20 LOAD_NAME                0 (thing)
             22 CALL_FUNCTION            1
             24 COMPARE_OP               2 (==)
             26 POP_JUMP_IF_FALSE       12

  4          28 LOAD_NAME                4 (print)
             30 LOAD_CONST               3 ('found')
             32 CALL_FUNCTION            1
             34 POP_TOP

  5          36 POP_TOP
             38 JUMP_ABSOLUTE           42
             40 JUMP_ABSOLUTE           12
        >>   42 LOAD_CONST               4 (None)
             44 RETURN_VALUE

I note that the instruction at offset 22 calls the function str on
every iteration of the loop, and no optimization is done here.

# file "dis2.py"
thing = 123
strthing = str(thing)
for i in range(10):
    if "hi" == strthing:
        print("found")
        break

This generates bytecode

  1           0 LOAD_CONST               0 (123)
              2 STORE_NAME               0 (thing)

  2           4 LOAD_NAME                1 (str)
              6 LOAD_NAME                0 (thing)
              8 CALL_FUNCTION            1
             10 STORE_NAME               2 (strthing)

  3          12 LOAD_NAME                3 (range)
             14 LOAD_CONST               1 (10)
             16 CALL_FUNCTION            1
             18 GET_ITER
        >>   20 FOR_ITER                24 (to 46)
             22 STORE_NAME               4 (i)

  4          24 LOAD_CONST               2 ('hi')
             26 LOAD_NAME                2 (strthing)
             28 COMPARE_OP               2 (==)
             30 POP_JUMP_IF_FALSE       20

  5          32 LOAD_NAME                5 (print)
             34 LOAD_CONST               3 ('found')
             36 CALL_FUNCTION            1
             38 POP_TOP

  6          40 POP_TOP
             42 JUMP_ABSOLUTE           46
             44 JUMP_ABSOLUTE           20
        >>   46 LOAD_CONST               4 (None)
             48 RETURN_VALUE

In short, it seems the CPython interpreter doesn't (currently) perform
this sort of optimization.
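
The same check can be scripted instead of read off the listing. What
follows is only a sketch for illustration: it compiles both snippets
(without running them) and compares the bytecode offset at which the
name str is loaded with the offset of the loop's FOR_ITER instruction.
The helper name str_load_is_inside_loop is made up for this example.
Most opcode names differ between CPython versions, so the sketch relies
only on FOR_ITER and LOAD_NAME, which I believe are stable for
module-level code.

```python
import dis

IN_LOOP = '''
thing = 123
for i in range(10):
    if "hi" == str(thing):
        print("found")
        break
'''

HOISTED = '''
thing = 123
strthing = str(thing)
for i in range(10):
    if "hi" == strthing:
        print("found")
        break
'''


def str_load_is_inside_loop(source):
    """Return True if the name 'str' is loaded after the loop's FOR_ITER."""
    code = compile(source, "<example>", "exec")  # compile only, never executed
    instructions = list(dis.get_instructions(code))
    loop_start = next(
        ins.offset for ins in instructions if ins.opname == "FOR_ITER"
    )
    str_load = next(
        ins.offset
        for ins in instructions
        if ins.opname == "LOAD_NAME" and ins.argval == "str"
    )
    return str_load > loop_start


print(str_load_is_inside_loop(IN_LOOP))   # True: str is looked up in the loop body
print(str_load_is_inside_loop(HOISTED))   # False: str is looked up before the loop
```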

- DLD

From MRAB@21:1/5 to David Lowry-Duda on Mon Oct 17 17:25:34 2022

On 2022-10-17 16:43, David Lowry-Duda wrote:

>In short, it seems the CPython interpreter doesn't (currently) perform
>this sort of optimization.

It can't optimise that because, say, 'print' could've been bound to a
function that rebinds 'str'.
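
A small sketch of that situation, for illustration only. The function
fake_print is hypothetical and stands in for a rebound print: as a side
effect it rebinds the module-level name str, so str(thing) gives a
different result from the second iteration on. Hoisting the call out of
the loop would therefore change what the program does.

```python
def fake_print(message):
    # Hypothetical stand-in for a rebound print(): as a side effect it
    # rebinds the module-level name str, shadowing the builtin.
    global str
    str = lambda obj: "hi"


thing = 123
results = []
for i in range(3):
    results.append("hi" == str(thing))  # str is looked up afresh on every pass
    fake_print("checked")

# Drop the shadowing global so the builtin str is visible again.
globals().pop("str")

print(results)  # [False, True, True]
```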

From Stefan Ram@21:1/5 to MRAB on Mon Oct 17 16:41:40 2022

MRAB <python@mrabarnett.plus.com> writes:

>It can't optimise that because, say, 'print' could've been bound to a
>function that rebinds 'str'.

It would be possible to find out whether a call of a function
named "print" is to the standard function, but the overhead
of doing so might in the end slow down the execution.

In general, there could be optimizer stages after compilation,
so one might write a small microbenchmark to be sure.

import timeit

s = "string text"
dt0 = dt1 = 0

for i in range( 100 ):

    # variant 0: call len() on every iteration of the inner loop
    start_time = timeit.default_timer()
    for _ in range( 1000 ):
        l = len( s )
        pass
    dt0 += timeit.default_timer() - start_time

    # variant 1: call len() once, before the inner loop
    start_time = timeit.default_timer()
    l = len( s )
    for _ in range( 1000 ):
        pass
    dt1 += timeit.default_timer() - start_time

print( f'{dt0 = :.10f} # with "len" in the loop' )
print( f'{dt1 = :.10f} # with "len" outside the loop' )
print( f'{dt0/dt1 = :.2f}' )
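
The same kind of measurement can be made for the str() case from the
original question. This is only a sketch under assumptions: the values
of ID and text are invented (the original post gives none), text is
chosen so that it never matches and the loop always runs to the end,
and the loop count of 1000 is arbitrary. The timings depend on the
machine and the Python version, so no particular ratio is claimed.

```python
import timeit

ID = 12345          # assumed sample value; the original post does not give one
text = "no match"   # assumed value that never equals str(ID)


def convert_in_loop():
    # str() is called on every pass through the loop.
    for _ in range(1000):
        if text == str(ID):
            break


def convert_once():
    # str() is called a single time, before the loop.
    str_id = str(ID)
    for _ in range(1000):
        if text == str_id:
            break


# Take the best of several runs to reduce noise from other processes.
t_in_loop = min(timeit.repeat(convert_in_loop, number=200, repeat=5))
t_once = min(timeit.repeat(convert_once, number=200, repeat=5))

print(f"str() inside the loop: {t_in_loop:.6f} s")
print(f"str() before the loop: {t_once:.6f} s")
```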

From Chris Angelico@21:1/5 to Stefan Ram on Tue Oct 18 04:59:05 2022

On Tue, 18 Oct 2022 at 03:51, Stefan Ram <ram@zedat.fu-berlin.de> wrote:

>It would be possible to find out whether a call of a function
>named "print" is to the standard function, but the overhead
>to do this in the end might slow down the execution.

You'd also have to ensure that the stringification of the ID doesn't
change (which it can if it isn't a core data type), and the easiest
way to do THAT is to call str() on the ID every time and see if it's
the same...
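
To make that concrete, here is a sketch with a made-up class. Ticket is
hypothetical: its __str__ returns a different string on every call, so
the loop that converts on each pass and the loop that converts once do
not behave the same. The helper names find_in_loop and find_hoisted are
also invented for the example.

```python
class Ticket:
    """Hypothetical ID type whose string form changes on every str() call."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return f"ticket-{self.calls}"


def find_in_loop(text, ident, limit=5):
    # str() is evaluated on every pass, as in the first version of the loop.
    for i in range(limit):
        if text == str(ident):
            return i
    return None


def find_hoisted(text, ident, limit=5):
    # str() is evaluated once, as in the second version of the loop.
    str_id = str(ident)
    for i in range(limit):
        if text == str_id:
            return i
    return None


print(find_in_loop("ticket-3", Ticket()))   # 2: the third conversion matches
print(find_hoisted("ticket-3", Ticket()))   # None: only "ticket-1" is ever compared
```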

ChrisA
