• Why do I fail so badly to check for memory leaks with this code?

    From Marco Sulla@21:1/5 to All on Thu Jul 21 21:47:13 2022
    I tried to check for memory leaks in a bunch of functions of mine using a simple decorator. It works, but it fails with this code, reporting a different count_diff on every run. Why?

    import tracemalloc
    import gc
    import functools
    from uuid import uuid4
    import pickle

    def getUuid():
        return str(uuid4())

    def trace(func):
        @functools.wraps(func)
        def inner():
            tracemalloc.start()

            snapshot1 = tracemalloc.take_snapshot().filter_traces(
                (tracemalloc.Filter(True, __file__), )
            )

            for i in range(100):
                func()

            gc.collect()

            snapshot2 = tracemalloc.take_snapshot().filter_traces(
                (tracemalloc.Filter(True, __file__), )
            )

            top_stats = snapshot2.compare_to(snapshot1, 'lineno')
            tracemalloc.stop()

            for stat in top_stats:
                if stat.count_diff > 3:
                    raise ValueError(f"count_diff: {stat.count_diff}")

        return inner

    dict_1 = {getUuid(): i for i in range(1000)}

    @trace
    def func_76():
        pickle.dumps(iter(dict_1))

    func_76()
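
    For readers skimming the thread: compare_to() returns tracemalloc.StatisticDiff
    objects, and the decorator's check hinges on their count_diff field. A minimal,
    purely illustrative sketch of what those objects expose:

    import tracemalloc

    tracemalloc.start()
    snapshot1 = tracemalloc.take_snapshot()
    data = [str(i) for i in range(1000)]  # allocate something measurable
    snapshot2 = tracemalloc.take_snapshot()
    tracemalloc.stop()

    for stat in snapshot2.compare_to(snapshot1, 'lineno')[:3]:
        # each StatisticDiff carries count_diff (change in allocated blocks)
        # and size_diff (change in allocated bytes) since snapshot1
        print(stat, "->", stat.count_diff, stat.size_diff)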

  • From MRAB@21:1/5 to Marco Sulla on Thu Jul 21 21:23:25 2022
    On 21/07/2022 20:47, Marco Sulla wrote:
    I tried to check for memory leaks in a bunch of functions of mine using a simple decorator. It works, but it fails with this code, reporting a different count_diff on every run. Why?

    [snip: decorator code quoted from the original message]

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))
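
    For context, pickling an iterator is legal in CPython: list_iterator
    implements __reduce__, so the reduced case above is a valid round trip.
    A quick illustrative check:

    import pickle

    it = iter([1, 2, 3])
    next(it)  # advance past the first element
    restored = pickle.loads(pickle.dumps(it))
    print(list(restored))  # [2, 3] - the copy resumes where the original stopped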

  • From Marco Sulla@21:1/5 to MRAB on Thu Jul 21 22:51:17 2022
    On Thu, 21 Jul 2022 at 22:28, MRAB <python@mrabarnett.plus.com> wrote:

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))

    It's very strange. I found a bunch of real memory leaks with this
    decorator, so it seems reliable. It works with pickle alone and with
    iter alone, but not when pickling iterators.

  • From Marco Sulla@21:1/5 to All on Thu Jul 21 22:58:57 2022
    This naive code shows no leak:

    import resource
    import pickle

    c = 0

    while True:
        pickle.dumps(iter([]))

        if (c % 10000) == 0:
            max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            print(f"iteration: {c}, max rss: {max_rss} kb")

        c += 1
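
    A comparable spot check can be done with tracemalloc itself, which counts
    Python-level allocations rather than whole-process RSS; a minimal sketch
    along the same lines:

    import tracemalloc
    import pickle

    tracemalloc.start()

    for c in range(100000):
        pickle.dumps(iter([]))

        if (c % 10000) == 0:
            current, peak = tracemalloc.get_traced_memory()
            print(f"iteration: {c}, current: {current} B, peak: {peak} B")

    tracemalloc.stop()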

  • From Marco Sulla@21:1/5 to All on Fri Jul 22 00:39:26 2022
    I've done this other simple test:

    #!/usr/bin/env python3

    import tracemalloc
    import gc
    import pickle

    tracemalloc.start()

    snapshot1 = tracemalloc.take_snapshot().filter_traces(
        (tracemalloc.Filter(True, __file__), )
    )

    for i in range(10000000):
        pickle.dumps(iter([]))

    gc.collect()

    snapshot2 = tracemalloc.take_snapshot().filter_traces(
        (tracemalloc.Filter(True, __file__), )
    )

    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    tracemalloc.stop()

    for stat in top_stats:
        print(stat)

    The result is:

    /home/marco/sources/test.py:14: size=3339 B (+3339 B), count=63 (+63), average=53 B
    /home/marco/sources/test.py:9: size=464 B (+464 B), count=1 (+1), average=464 B
    /home/marco/sources/test.py:10: size=456 B (+456 B), count=1 (+1), average=456 B
    /home/marco/sources/test.py:13: size=28 B (+28 B), count=1 (+1), average=28 B

    It seems that, after 10 million loops, only 63 allocations leaked, for a
    total of only ~3 KB. It seems to me that we can't call this a leak, no?
    Probably pickle needs a lot more cycles before we can be sure there's a
    real leak.

  • From MRAB@21:1/5 to Marco Sulla on Fri Jul 22 00:37:21 2022
    On 21/07/2022 23:39, Marco Sulla wrote:
    I've done this other simple test:

    [snip: test script and its output, quoted from the previous message]

    It seems that, after 10 million loops, only 63 allocations leaked, for a
    total of only ~3 KB. It seems to me that we can't call this a leak, no?
    Probably pickle needs a lot more cycles before we can be sure there's a
    real leak.

    If it were a leak, then the amount of memory used or the counts would
    increase with the number of iterations. If that's not happening, and the
    memory used and the counts stay roughly the same, then it's probably not
    a leak, unless it's something that is allocated only once, such as a
    cache or buffer created on first use.
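
    A minimal sketch of that scaling test, reusing the pickle.dumps(iter([]))
    workload from earlier in the thread:

    import gc
    import pickle
    import tracemalloc

    def leaked_blocks(iterations):
        tracemalloc.start()
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            pickle.dumps(iter([]))
        gc.collect()
        after = tracemalloc.take_snapshot()
        tracemalloc.stop()
        return sum(s.count_diff for s in after.compare_to(before, 'lineno'))

    # A one-time cache shows a roughly flat block count as iterations grow;
    # a genuine leak scales with the loop.
    for n in (1000, 10000, 100000):
        print(n, leaked_blocks(n))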

  • From Barry@21:1/5 to All on Fri Jul 22 08:00:01 2022
    On 21 Jul 2022, at 21:54, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
    On Thu, 21 Jul 2022 at 22:28, MRAB <python@mrabarnett.plus.com> wrote:

    It's something to do with pickling iterators because it still occurs
    when I reduce func_76 to:

    @trace
    def func_76():
        pickle.dumps(iter([]))

    It's very strange. I found a bunch of real memory leaks with this
    decorator, so it seems reliable. It works with pickle alone and with
    iter alone, but not when pickling iterators.

    With code as complex as Python's, there will be memory allocations that
    are not directly related to the Python code you test.

    To put it another way: there is noise in your memory allocation signal.

    Usually the signal of a memory leak is very clear, as you noticed.

    For rare leaks I would use a tool like valgrind.

    Barry
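
    One common way to reduce that noise, sketched here as a variation on the
    decorator from earlier in the thread: call the function once before taking
    the first snapshot, so that first-use caches and buffers are allocated
    outside the measured window.

    import gc
    import tracemalloc

    def measure(func, iterations=100):
        func()  # warm-up: absorb one-time, first-use allocations
        gc.collect()
        tracemalloc.start()
        snapshot1 = tracemalloc.take_snapshot()
        for _ in range(iterations):
            func()
        gc.collect()
        snapshot2 = tracemalloc.take_snapshot()
        tracemalloc.stop()
        return snapshot2.compare_to(snapshot1, 'lineno')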


  • From Marco Sulla@21:1/5 to Barry on Fri Jul 22 21:40:47 2022
    On Fri, 22 Jul 2022 at 09:00, Barry <barry@barrys-emacs.org> wrote:
    With code as complex as Python's, there will be memory allocations that
    are not directly related to the Python code you test.

    To put it another way: there is noise in your memory allocation signal.

    Usually the signal of a memory leak is very clear, as you noticed.

    For rare leaks I would use a tool like valgrind.

    Thank you all, but I needed a simple decorator to automate the memory
    leak (and segfault) tests. I think this version is good enough; I hope
    it can be useful to someone:

    import gc
    import tracemalloc

    def trace(iterations=100):
        def decorator(func):
            def wrapper():
                print(
                    f"Loops: {iterations} - Evaluating: {func.__name__}",
                    flush=True
                )

                tracemalloc.start()

                snapshot1 = tracemalloc.take_snapshot().filter_traces(
                    (tracemalloc.Filter(True, __file__), )
                )

                for i in range(iterations):
                    func()

                gc.collect()

                snapshot2 = tracemalloc.take_snapshot().filter_traces(
                    (tracemalloc.Filter(True, __file__), )
                )

                top_stats = snapshot2.compare_to(snapshot1, 'lineno')
                tracemalloc.stop()

                for stat in top_stats:
                    if stat.count_diff * 100 > iterations:
                        raise ValueError(f"stat: {stat}")

            return wrapper

        return decorator


    If the decorated function fails, you can try increasing the iterations
    parameter. I found that in my cases I sometimes needed a value of 200
    or 300.
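
    For example, usage might look like this (the test function name is
    illustrative):

    @trace(iterations=300)
    def test_pickle_iter():
        pickle.dumps(iter([]))

    test_pickle_iter()  # raises if any line's count_diff exceeds iterations / 100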
