• [RFC v2 PATCH] ksm: add offset arg to memcmp_pages() to speedup compari

    From Timofey Titovets@21:1/5 to All on Mon Oct 2 15:00:02 2017
    Currently, while searching or inserting in the RB tree,
    memcmp() is used to compare out-of-tree pages with in-tree pages.

    But on each comparison step, memcmp() on the pages starts at
    offset zero, i.e. it ignores the progress made by earlier comparisons.

    That adds overhead when searching a deep RB tree and/or with
    big pages (4KiB+), so store the last offset up to which the page contents did not differ.

    Added: memcmpe()
    iter = 1024 - a somewhat magic value
    max_offset_error = 8 - acceptable error level for the offset.
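
The idea of resuming a comparison from a remembered offset can be sketched in plain userspace C. This is only an illustration, not the memcmpe() from the patch: the name memcmp_resume() is hypothetical, it leaves out iter and max_offset_error entirely, and it assumes (without checking) that the caller knows the two buffers are equal in the bytes before *offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT the memcmpe() from the patch.
 *
 * Assumption (not verified here): p and q are equal in bytes [0, *offset).
 * Under that assumption the return value has the same sign as
 * memcmp(p, q, len). On return, *offset holds the index of the first
 * differing byte, or len if no difference was found.
 */
static int memcmp_resume(const void *p, const void *q, size_t len,
			 size_t *offset)
{
	const unsigned char *a = p, *b = q;
	size_t i;

	assert(p != NULL && q != NULL);

	/* Without a hint there is nothing to resume from. */
	if (offset == NULL)
		return memcmp(p, q, len);

	/* An out-of-range hint is ignored and the scan restarts at zero. */
	i = (*offset < len) ? *offset : 0;

	for (; i < len; i++) {
		if (a[i] != b[i]) {
			*offset = i;	/* remember how far we got */
			return a[i] < b[i] ? -1 : 1;
		}
	}

	*offset = len;
	return 0;
}
```

A caller would keep one size_t hint per lookup, initialise it to 0 and pass its address on every comparison. If the assumption about the equal prefix does not hold, the result can differ from a full memcmp(), so a real implementation needs some way to bound or correct that error.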

    With this patch I get about the same performance in the bad case (where the offset is useless)
    on a tiny tree with default 4KiB pages.

    So this is just an RFC, i.e. does this kind of optimization make sense?

    Thanks.

    Changes:
    v1 -> v2:
    Add: configurable max_offset_error
    Move logic to memcmpe()

    Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
    ---
    mm/ksm.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
    1 file changed, 55 insertions(+), 6 deletions(-)

    diff --git a/mm/ksm.c b/mm/ksm.c
    index 15dd7415f7b3..780630498de8 100644
    --- a/mm/ksm.c
    +++ b/mm/ksm.c
    @@ -991,14 +991,58 @@ static u32 calc_checksum(struct page *page)
    return checksum;
    }

    -static int memcmp_pages(struct page *page1, struct page *page2)
    +
    +/*
    + * memcmp() is used to compare pages in the RB tree,
    + * but on every step down the tree the forward progress
    + * was simply ignored, which hurts performance
    + * on deep trees and/or big pages (e.g. 4KiB+).
    + *
    + * Fix that by adding a memcmp() wrapper that tries to guess
    + * where the difference occurs, so that only the bytes from that
    + * offset onward are scanned against the next pages.
    + */
    +
    +static int memcmpe(const void *p, const void *q, const u32 len,
    + u32 *offset)
    +{
    + const u32 max_offset_error = 8;
    + u32 iter = 1024, i = 0;
    + int ret;
    +
    + if (offset == NULL)
    + return memcmp(p, q, len);
    +
    + if (*offset < len)
    + i =