• [PATCH 5/6] lightnvm: pblk: free up mempool allocation for erases correctly

    From Javier González@21:1/5 to All on Mon Oct 2 14:10:02 2017
    On 1 Oct 2017, at 15.25, Rakesh Pandit <rakesh@tuxera.com> wrote:

    While separating the read and erase mempools in 22da65a1b,
    pblk_g_rq_cache was used twice to set aside memory for both erase and
    read requests. Because the same kmem cache backs both mempools, a
    single call to kmem_cache_destroy wouldn't deallocate everything.
    Repeatedly loading and unloading the pblk module would eventually
    result in a leak.

    The fix is to use truly separate kmem caches and track them
    appropriately.

    Fixes: 22da65a1b ("lightnvm: pblk: decouple read/erase mempools")
    Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
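
    As a rough illustration of what the patch proposes, here is a minimal
    sketch with one kmem cache per request type, each backing its own
    mempool and each destroyed explicitly on teardown. All names and sizes
    below are invented for the sketch; this is not the actual pblk code.

    #include <linux/mempool.h>
    #include <linux/slab.h>

    /* Hypothetical per-type caches (the patch's proposal). */
    static struct kmem_cache *demo_read_cache;
    static struct kmem_cache *demo_erase_cache;

    static int demo_create_pools(mempool_t **read_pool, mempool_t **erase_pool,
                                 int min_read, int min_erase, size_t rq_size)
    {
            demo_read_cache = kmem_cache_create("demo_r_rq", rq_size, 0, 0, NULL);
            if (!demo_read_cache)
                    return -ENOMEM;

            demo_erase_cache = kmem_cache_create("demo_e_rq", rq_size, 0, 0, NULL);
            if (!demo_erase_cache)
                    goto free_read_cache;

            *read_pool = mempool_create_slab_pool(min_read, demo_read_cache);
            if (!*read_pool)
                    goto free_erase_cache;

            *erase_pool = mempool_create_slab_pool(min_erase, demo_erase_cache);
            if (!*erase_pool)
                    goto free_read_pool;

            return 0;

    free_read_pool:
            mempool_destroy(*read_pool);
    free_erase_cache:
            kmem_cache_destroy(demo_erase_cache);
    free_read_cache:
            kmem_cache_destroy(demo_read_cache);
            return -ENOMEM;
    }

    static void demo_destroy_pools(mempool_t *read_pool, mempool_t *erase_pool)
    {
            /* Mempools are destroyed before the caches that back them. */
            mempool_destroy(read_pool);
            mempool_destroy(erase_pool);
            /* With two caches, each needs its own kmem_cache_destroy(). */
            kmem_cache_destroy(demo_read_cache);
            kmem_cache_destroy(demo_erase_cache);
    }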


    I'm not sure I follow this logic. I assume that you're thinking of the
    refcount on the kmem_cache. During cache creation, all is good; if
    creating another cache fails, destruction is guaranteed, since the
    refcount is 0. On tear down (pblk_core_free), we destroy the mempools
    associated with the caches. In this case, the refcount goes to 0 too,
    as we destroy the 2 mempools. So I don't see where the leak can
    happen. Am I missing something?
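
    For reference, a minimal sketch of the shared-cache arrangement being
    discussed, with invented names (this is not pblk_core_init/free): two
    mempools drawing from one slab cache, and a teardown that destroys
    both mempools and then the single cache, which is why no allocation
    should be left behind.

    #include <linux/mempool.h>
    #include <linux/slab.h>

    static struct kmem_cache *demo_g_rq_cache;  /* shared by read and erase */
    static mempool_t *demo_r_rq_pool;
    static mempool_t *demo_e_rq_pool;

    static int demo_core_init(int min_read, int min_erase, size_t rq_size)
    {
            demo_g_rq_cache = kmem_cache_create("demo_g_rq", rq_size, 0, 0, NULL);
            if (!demo_g_rq_cache)
                    return -ENOMEM;

            /* Two mempools backed by the same slab cache. */
            demo_r_rq_pool = mempool_create_slab_pool(min_read, demo_g_rq_cache);
            demo_e_rq_pool = mempool_create_slab_pool(min_erase, demo_g_rq_cache);
            if (!demo_r_rq_pool || !demo_e_rq_pool)
                    goto fail;

            return 0;

    fail:
            if (demo_r_rq_pool)
                    mempool_destroy(demo_r_rq_pool);
            if (demo_e_rq_pool)
                    mempool_destroy(demo_e_rq_pool);
            kmem_cache_destroy(demo_g_rq_cache);
            return -ENOMEM;
    }

    static void demo_core_free(void)
    {
            /*
             * Destroying each mempool returns its preallocated elements to
             * the shared cache; one kmem_cache_destroy() then frees the
             * cache itself, so nothing is leaked.
             */
            mempool_destroy(demo_r_rq_pool);
            mempool_destroy(demo_e_rq_pool);
            kmem_cache_destroy(demo_g_rq_cache);
    }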

    In any case, Jens reported some bugs in the mempools, where we did not
    guarantee forward progress. You can find the original discussion and
    the mempool audit at [1]. It would be good if you reviewed these.

    [1] https://www.spinics.net/lists/kernel/msg2602274.html
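
    (For context on the forward-progress point: a mempool only guarantees
    progress if its preallocated reserve covers the worst-case number of
    elements a single code path can hold at once; otherwise a blocking
    mempool_alloc() can wait for an element that is never returned. A tiny
    sketch of that sizing rule, with an invented reserve size:)

    #include <linux/mempool.h>
    #include <linux/slab.h>

    /* Invented worst-case number of requests one user may hold in flight. */
    #define DEMO_MAX_INFLIGHT_RQS   64

    static mempool_t *demo_pool;

    static int demo_pool_init(struct kmem_cache *cache)
    {
            /*
             * The reserve must be at least as large as the maximum number
             * of elements held concurrently by the path relying on this
             * pool, or forward progress is not guaranteed.
             */
            demo_pool = mempool_create_slab_pool(DEMO_MAX_INFLIGHT_RQS, cache);
            return demo_pool ? 0 : -ENOMEM;
    }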

    Thanks,
    Javier


  • From Rakesh Pandit@21:1/5 to All on Mon Oct 2 14:30:02 2017
    On Mon, Oct 02, 2017 at 02:09:35PM +0200, Javier González wrote:
    On 1 Oct 2017, at 15.25, Rakesh Pandit <rakesh@tuxera.com> wrote:

    While separating the read and erase mempools in 22da65a1b,
    pblk_g_rq_cache was used twice to set aside memory for both erase and
    read requests. Because the same kmem cache backs both mempools, a
    single call to kmem_cache_destroy wouldn't deallocate everything.
    Repeatedly loading and unloading the pblk module would eventually
    result in a leak.

    The fix is to use truly separate kmem caches and track them
    appropriately.

    Fixes: 22da65a1b ("lightnvm: pblk: decouple read/erase mempools")
    Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>


    I'm not sure I follow this logic. I assume that you're thinking of the
    refcount on the kmem_cache. During cache creation, all is good; if
    creating another cache fails, destruction is guaranteed, since the
    refcount is 0. On tear down (pblk_core_free), we destroy the mempools
    associated with the caches. In this case, the refcount goes to 0 too,
    as we destroy the 2 mempools. So I don't see where the leak can
    happen. Am I missing something?

    In any case, Jens reported some bugs in the mempools, where we did not
    guarantee forward progress. You can find the original discussion and
    the mempool audit at [1]. It would be good if you reviewed these.

    [1] https://www.spinics.net/lists/kernel/msg2602274.html


    Thanks, yes, it makes sense to follow up in the patch thread. I will
    respond to the above questions there later today.

  • From Rakesh Pandit@21:1/5 to Rakesh Pandit on Mon Oct 2 22:50:06 2017
    On Mon, Oct 02, 2017 at 03:25:10PM +0300, Rakesh Pandit wrote:
    On Mon, Oct 02, 2017 at 02:09:35PM +0200, Javier González wrote:
    On 1 Oct 2017, at 15.25, Rakesh Pandit <rakesh@tuxera.com> wrote:

    While separating the read and erase mempools in 22da65a1b,
    pblk_g_rq_cache was used twice to set aside memory for both erase and
    read requests. Because the same kmem cache backs both mempools, a
    single call to kmem_cache_destroy wouldn't deallocate everything.
    Repeatedly loading and unloading the pblk module would eventually
    result in a leak.

    The fix is to use truly separate kmem caches and track them
    appropriately.

    Fixes: 22da65a1b ("lightnvm: pblk: decouple read/erase mempools")
    Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>


    I'm not sure I follow this logic. I assume that you're thinking of the
    refcount on the kmem_cache. During cache creation, all is good; if
    creating another cache fails, destruction is guaranteed, since the
    refcount is 0. On tear down (pblk_core_free), we destroy the mempools
    associated with the caches. In this case, the refcount goes to 0 too,
    as we destroy the 2 mempools. So I don't see where the leak can
    happen. Am I missing something?

    In any case, Jens reported some bugs in the mempools, where we did not
    guarantee forward progress. You can find the original discussion and
    the mempool audit at [1]. It would be good if you reviewed these.

    [1] https://www.spinics.net/lists/kernel/msg2602274.html


    Thanks, yes, it makes sense to follow up in the patch thread. I will
    respond to the above questions there later today.


    I wasn't thinking about it correctly, and on top of that I was looking
    at test results from an incorrectly instrumented debug build.

    I went through the series you pointed to, and it all seems okay to me
    now.

    Please drop this patch.

    Regards,
