• Why do I fail so badly to check for memory leaks with this code?

    From Marco Sulla@21:1/5 to All on Thu Jul 21 21:47:13 2022
    I tried to check for memory leaks in a bunch of functions of mine using a simple decorator. It works, but it fails with this code, reporting a different count_diff on every run. Why?

    import tracemalloc
    import gc
    import functools
    from uuid import uuid4
    import pickle

    def getUuid():
        return str(uuid4())

    def trace(func):
        @functools.wraps(func)
        def inner():
            tracemalloc.start()

            snapshot1 = tracemalloc.take_snapshot().filter_traces(
                (tracemalloc.Filter(True, __file__), )
            )

            for i in range(100):
                func()

            gc.collect()

            snapshot2 = tracemalloc.take_snapshot().filter_traces(
                (tracemalloc.Filter(True, __file__), )
            )

            top_stats = snapshot2.compare_to(snapshot1, 'lineno')
            tracemalloc.stop()

            for stat in top_stats:
                if stat.count_diff > 3:
                    raise ValueError(f"count_diff: {stat.count_diff}")

        return inner

    dict_1 = {getUuid(): i for i in range(1000)}

    @trace
    def func_76():
        pickle.dumps(iter(dict_1))

    func_76()
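
    For readers skimming the thread: compare_to() returns tracemalloc.StatisticDiff
    objects, and the decorator's check hinges on their count_diff field. A minimal,
    purely illustrative sketch of what those objects expose:

    import tracemalloc

    tracemalloc.start()
    snapshot1 = tracemalloc.take_snapshot()
    data = [str(i) for i in range(1000)]  # allocate something measurable
    snapshot2 = tracemalloc.take_snapshot()
    tracemalloc.stop()

    for stat in snapshot2.compare_to(snapshot1, 'lineno')[:3]:
        # each StatisticDiff carries count_diff (change in allocated blocks)
        # and size_diff (change in allocated bytes) since snapshot1
        print(stat, "->", stat.count_diff, stat.size_diff)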

  • From MRAB@21:1/5 to Marco Sulla on Thu Jul 21 21:23:25 2022
    On 21/07/2022 20:47, Marco Sulla wrote:
    I tried to check for memory leaks in a bunch of functions of mine using a simple decorator. It works, but it fails with this code, reporting a different count_diff on every run. Why?

    [snip: decorator code quoted from the original message]

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))
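
    For context, pickling an iterator is legal in CPython: list_iterator
    implements __reduce__, so the reduced case above is a valid round trip.
    A quick illustrative check:

    import pickle

    it = iter([1, 2, 3])
    next(it)  # advance past the first element
    restored = pickle.loads(pickle.dumps(it))
    print(list(restored))  # [2, 3] - the copy resumes where the original stopped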

  • From Marco Sulla@21:1/5 to MRAB on Thu Jul 21 22:51:17 2022
    On Thu, 21 Jul 2022 at 22:28, MRAB <python@mrabarnett.plus.com> wrote:

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))

    It's very strange. I found a bunch of real memory leaks with this
    decorator, so it seems reliable. It works with pickle alone and with
    iter alone, but not when pickling iterators.

  • From Marco Sulla@21:1/5 to All on Thu Jul 21 22:58:57 2022
    This naive code shows no leak:

    import resource
    import pickle

    c = 0

    while True:
        pickle.dumps(iter([]))

        if (c % 10000) == 0:
            max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            print(f"iteration: {c}, max rss: {max_rss} kb")

        c += 1
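
    A comparable spot check can be done with tracemalloc itself, which counts
    Python-level allocations rather than whole-process RSS; a minimal sketch
    along the same lines:

    import tracemalloc
    import pickle

    tracemalloc.start()

    for c in range(100000):
        pickle.dumps(iter([]))

        if (c % 10000) == 0:
            current, peak = tracemalloc.get_traced_memory()
            print(f"iteration: {c}, current: {current} B, peak: {peak} B")

    tracemalloc.stop()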

  • From Marco Sulla@21:1/5 to All on Fri Jul 22 00:39:26 2022
    I've done this other simple test:

    #!/usr/bin/env python3

    import tracemalloc
    import gc
    import pickle

    tracemalloc.start()

    snapshot1 = tracemalloc.take_snapshot().filter_traces(
        (tracemalloc.Filter(True, __file__), )
    )

    for i in range(10000000):
        pickle.dumps(iter([]))

    gc.collect()

    snapshot2 = tracemalloc.take_snapshot().filter_traces(
        (tracemalloc.Filter(True, __file__), )
    )

    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    tracemalloc.stop()

    for stat in top_stats:
        print(stat)

    The result is:

    /home/marco/sources/test.py:14: size=3339 B (+3339 B), count=63 (+63), average=53 B
    /home/marco/sources/test.py:9: size=464 B (+464 B), count=1 (+1), average=464 B
    /home/marco/sources/test.py:10: size=456 B (+456 B), count=1 (+1), average=456 B
    /home/marco/sources/test.py:13: size=28 B (+28 B), count=1 (+1), average=28 B

    It seems that, after 10 million loops, only 63 allocations leaked, for a
    total of only ~3 KB. It seems to me that we can't call this a leak, no?
    Probably pickle needs a lot more cycles before we can be sure there's a
    real leak.

  • From MRAB@21:1/5 to Marco Sulla on Fri Jul 22 00:37:21 2022
    On 21/07/2022 23:39, Marco Sulla wrote:
    I've done this other simple test:

    [snip: test script and its output, quoted from the previous message]

    It seems that, after 10 million loops, only 63 allocations leaked, for a
    total of only ~3 KB. It seems to me that we can't call this a leak, no?
    Probably pickle needs a lot more cycles before we can be sure there's a
    real leak.

    If it were a leak, then the amount of memory used or the counts would
    increase with the number of iterations. If that's not happening, and the
    memory used and the counts stay roughly the same, then it's probably not
    a leak, unless it's something that is allocated only once, such as a
    cache or buffer created on first use.
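
    A minimal sketch of that scaling test, reusing the pickle.dumps(iter([]))
    workload from earlier in the thread:

    import gc
    import pickle
    import tracemalloc

    def leaked_blocks(iterations):
        tracemalloc.start()
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            pickle.dumps(iter([]))
        gc.collect()
        after = tracemalloc.take_snapshot()
        tracemalloc.stop()
        return sum(s.count_diff for s in after.compare_to(before, 'lineno'))

    # A one-time cache shows a roughly flat block count as iterations grow;
    # a genuine leak scales with the loop.
    for n in (1000, 10000, 100000):
        print(n, leaked_blocks(n))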

  • From Barry@21:1/5 to All on Fri Jul 22 08:00:01 2022
    On 21 Jul 2022, at 21:54, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
    On Thu, 21 Jul 2022 at 22:28, MRAB <python@mrabarnett.plus.com> wrote:

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))

    It's very strange. I found a bunch of real memory leaks with this
    decorator, so it seems reliable. It works with pickle alone and with
    iter alone, but not when pickling iterators.

    With code as complex as Python's, there will be memory allocations that
    are not directly related to the Python code you test.

    To put it another way: there is noise in your memory allocation signal.

    Usually the signal of a memory leak is very clear, as you noticed.

    For rare leaks I would use a tool like valgrind.

    Barry
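
    One common way to reduce that noise, sketched here as a variation on the
    decorator from earlier in the thread: call the function once before taking
    the first snapshot, so that first-use caches and buffers are allocated
    outside the measured window.

    import gc
    import tracemalloc

    def measure(func, iterations=100):
        func()  # warm-up: absorb one-time, first-use allocations
        gc.collect()
        tracemalloc.start()
        snapshot1 = tracemalloc.take_snapshot()
        for _ in range(iterations):
            func()
        gc.collect()
        snapshot2 = tracemalloc.take_snapshot()
        tracemalloc.stop()
        return snapshot2.compare_to(snapshot1, 'lineno')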


  • From Marco Sulla@21:1/5 to Barry on Fri Jul 22 21:40:47 2022
    On Fri, 22 Jul 2022 at 09:00, Barry <barry@barrys-emacs.org> wrote:
    With code as complex as Python's, there will be memory allocations that
    are not directly related to the Python code you test.

    To put it another way: there is noise in your memory allocation signal.

    Usually the signal of a memory leak is very clear, as you noticed.

    For rare leaks I would use a tool like valgrind.

    Thank you all, but I needed a simple decorator to automate the memory
    leak (and segfault) tests. I think this version is good enough; I hope
    it can be useful to someone:

    import gc
    import tracemalloc

    def trace(iterations=100):
        def decorator(func):
            def wrapper():
                print(
                    f"Loops: {iterations} - Evaluating: {func.__name__}",
                    flush=True
                )

                tracemalloc.start()

                snapshot1 = tracemalloc.take_snapshot().filter_traces(
                    (tracemalloc.Filter(True, __file__), )
                )

                for i in range(iterations):
                    func()

                gc.collect()

                snapshot2 = tracemalloc.take_snapshot().filter_traces(
                    (tracemalloc.Filter(True, __file__), )
                )

                top_stats = snapshot2.compare_to(snapshot1, 'lineno')
                tracemalloc.stop()

                for stat in top_stats:
                    if stat.count_diff * 100 > iterations:
                        raise ValueError(f"stat: {stat}")

            return wrapper

        return decorator


    If the decorated function fails, you can try increasing the iterations
    parameter. I found that in my cases I sometimes needed a value of 200
    or 300.
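
    For example, usage might look like this (the test function name is
    illustrative):

    @trace(iterations=300)
    def test_pickle_iter():
        pickle.dumps(iter([]))

    test_pickle_iter()  # raises if any line's count_diff exceeds iterations / 100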
