• Zlib, XZ, Zstd, what to choose?

    From Helm@21:1/5 to All on Thu Apr 23 14:41:33 2020
    Hello everyone,

    I am working on an application and would like to know what you think is
    the best compression algorithm.

    "Best" in terms of decompression time and compression ratio. I don't
    really care how long it takes to compress. Zstd is good but it's owned
    by Facebook.

    Thank you.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Elhana@21:1/5 to All on Sat Apr 25 05:00:19 2020
    Helm:

    I am working on an application and would like to know what you think is
    the best compression algorithm.

    Zstd, by far.

  • From BGB@21:1/5 to Helm on Sat Apr 25 13:41:01 2020
    On 4/23/2020 1:41 PM, Helm wrote:
    Hello everyone,

    I am working on an application and would like to know what you think is
    the best compression algorithm.

    "Best" in terms of decompression time and compression ratio. I don't
    really care how long it takes to compress. Zstd is good but it's owned
    by Facebook.


    Tradeoffs are:
    Zlib / Deflate: older, works reasonably well, neither the fastest nor
    best compression, reasonably well supported;
    XZ / LZMA: Good compression, but a lot slower than the others.
    Zstd: Fast decoding, decent compression (falls between Deflate and
    LZMA), slightly less open and relies on FSE.

    Will add here:
    LZ4: not the best compression, but decoding can be really fast, memory
    overhead for decoding is fairly low, and the decoder can be
    small/simple. LZ4 is one of the few options here viable for use on
    small microcontrollers.


    My experience with FSE is that it does well on newer PC-style hardware,
    but in past tests (years ago) its speed seemed to take a big hit on
    older or lower-end hardware (vs something like Deflate).

    The memory overheads of Zstd and FSE are also too large to really be
    viable for small embedded applications (such as microcontrollers).
    In these cases, something like Deflate or LZ4 is preferable because they
    have much smaller overheads.


    Typical ranking in terms of compression (best to worst):
    LZMA
    Zstd
    Deflate
    LZ4

    Typical ranking in terms of speed (best to worst, PC-style hardware):
    LZ4
    Zstd
    Deflate
    LZMA

    Typical ranking in terms of speed (best to worst, low-end hardware):
    LZ4
    Deflate
    Zstd
    LZMA
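
    These rankings are easy to check on one's own data. Below is a minimal
    sketch using only Python's standard library, so it covers just Deflate
    (zlib) and LZMA (xz); Zstd and LZ4 would need third-party bindings and
    are left out. The helper name measure and the synthetic sample are
    made up for this example, and the numbers will vary a lot with the
    data and the machine.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data, repeats=5):
    """Return compression ratio and best-of-N decode time for one codec."""
    packed = compress(data)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        unpacked = decompress(packed)
        best = min(best, time.perf_counter() - start)
    if unpacked != data:
        raise ValueError(f"{name}: round trip failed")
    return {
        "name": name,
        "ratio": len(data) / len(packed),
        "decode_seconds": best,
    }


# Synthetic, highly repetitive sample; real data will behave differently.
sample = b"".join(b"record %d: the quick brown fox\n" % i for i in range(20000))

results = [
    measure("zlib", lambda d: zlib.compress(d, 9), zlib.decompress, sample),
    measure("xz", lambda d: lzma.compress(d, preset=6), lzma.decompress, sample),
]

for r in results:
    print(f'{r["name"]:5s} ratio={r["ratio"]:.1f} decode={r["decode_seconds"] * 1e3:.2f} ms')
```

    Since the original question only cares about decode time and ratio,
    the sketch times decompression only and ignores how long compression
    takes.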


    Typical ranking in terms of memory overhead (least to most, *):
    LZ4
    Deflate
    LZMA | Zstd

    *: LZ4, as noted, has minimal necessary overheads beyond a source and
    destination buffer. No tables or other structures are necessary in this
    case (and "typical" use is buffer-to-buffer decoding).
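
    To illustrate how little state an LZ4 decoder needs, here is a minimal
    sketch of a decoder for the raw LZ4 block format in Python. The
    function name lz4_block_decode is made up for this example; it does
    not handle the LZ4 frame format (magic number, checksums), and it does
    only minimal input validation. The only working state is the input
    position and the output buffer itself.

```python
def lz4_block_decode(src: bytes) -> bytes:
    """Decode one raw LZ4 block (sketch, minimal validation)."""
    out = bytearray()
    pos = 0
    while pos < len(src):
        token = src[pos]
        pos += 1

        # High nibble: literal count; 15 means more length bytes follow.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[pos]
                pos += 1
                lit_len += extra
                if extra != 255:
                    break
        out += src[pos:pos + lit_len]
        pos += lit_len

        # The last sequence in a block carries literals only.
        if pos >= len(src):
            break

        # Two-byte little-endian offset back into the output so far.
        offset = src[pos] | (src[pos + 1] << 8)
        pos += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4; 15 means more length bytes follow.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[pos]
                pos += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4

        # Byte-by-byte copy so that overlapping matches repeat correctly.
        start = len(out) - offset
        for i in range(match_len):
            out.append(out[start + i])
    return bytes(out)


# Hand-built block: 3 literals "abc", a 6-byte match at offset 3, then 1 literal "d".
block = bytes([0x32]) + b"abc" + bytes([0x03, 0x00, 0x10]) + b"d"
decoded = lz4_block_decode(block)
```

    The match copy reads from the output buffer itself, which is the
    "buffer as sliding window" arrangement: no separate window or lookup
    table is allocated.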

    Deflate generally needs lookup tables for Huffman decoding (multiple
    kB), and (in some implementations) another 32kB for a copy of the
    sliding window (though, it is possible to map this onto a buffer
    holding the decoded contents). Depending on implementation specifics,
    likely memory overhead is somewhere between 5kB and 128kB.
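
    With zlib specifically, the window size is a tunable: the wbits
    parameter sets it to 2**wbits bytes (9 to 15 for zlib-wrapped
    streams), and the zlib documentation puts inflate memory at roughly
    the window size plus a few kB. A minimal sketch using Python's zlib
    module follows; the helper names and the test data are made up for
    this example. The data is a 4kB pseudo-random block repeated 16 times,
    so the repeats are only reachable with a window larger than 4kB.

```python
import random
import zlib


def deflate_with_window(data: bytes, wbits: int) -> bytes:
    """Compress with a sliding window of 2**wbits bytes."""
    comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


def inflate_with_window(packed: bytes, wbits: int) -> bytes:
    """Decompress; wbits must be at least the value used to compress."""
    decomp = zlib.decompressobj(wbits)
    return decomp.decompress(packed) + decomp.flush()


rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
data = block * 16  # repeats sit 4096 bytes apart

small = deflate_with_window(data, 9)   # 512-byte window: repeats out of reach
large = deflate_with_window(data, 15)  # 32kB window: repeats are found

print(len(data), len(small), len(large))
```

    A smaller window lowers decoder memory at the cost of compression, so
    it is a trade one would make on the encoder side for a
    memory-constrained target.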


    Both LZMA and Zstd need some tables for entropy coding, but much of
    their memory overhead comes from (generally) keeping a full copy of the
    sliding window for decoding. In these encoders/decoders, the window may
    reach into the MB range.
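
    For LZMA, the window (dictionary) size is chosen by the encoder and
    the decoder has to allocate for it. A minimal sketch with Python's
    lzma module: dict_size in the filter chain sets the dictionary, and
    memlimit on the decoding side makes decompression fail with LZMAError
    when the stream would need more memory than allowed. The helper names
    and the specific sizes (64kB and 8MB dictionaries, a 2MB limit) are
    assumptions picked for illustration.

```python
import lzma


def xz_with_dict(data: bytes, dict_size: int) -> bytes:
    """Compress to .xz using an LZMA2 dictionary of dict_size bytes."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def decodes_within(packed: bytes, memlimit: int) -> bool:
    """True if the stream can be decoded without exceeding memlimit bytes."""
    try:
        lzma.decompress(packed, memlimit=memlimit)
        return True
    except lzma.LZMAError:
        return False


data = b"some moderately repetitive payload\n" * 2000

small_dict = xz_with_dict(data, 64 * 1024)        # 64kB dictionary
big_dict = xz_with_dict(data, 8 * 1024 * 1024)    # 8MB dictionary

limit = 2 * 1024 * 1024  # pretend the target can spare 2MB for decoding
print(decodes_within(small_dict, limit), decodes_within(big_dict, limit))
```

    The decoder's requirement follows from the dictionary size declared in
    the stream, not from how much data is actually in it, so even a small
    file compressed with a large dictionary can be rejected by a tight
    limit.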


    Part of the difference here is due to whether the contents are decoded
    all at once to a destination buffer, vs "streamed". Stream decoding is
    more common if the uncompressed data may be arbitrarily large, in which
    case things are typically processed in terms of smaller buffers, each
    representing a small piece of the input or output. This is the common
    case for file compression, and in this case it makes sense to keep the
    sliding window in its own memory buffer.


    If decoding the buffer all at once, it generally makes more sense to use
    the buffer itself as the sliding window. This is usually the better fit
    if the compressor is being used as part of something else (such as part
    of a format like PNG, or part of a loader, ...), as opposed to using it
    for something like compressing or archiving files.
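
    The two usage patterns look roughly like this with Python's zlib
    module. This is a sketch only: the function names and the 4kB chunk
    size are made up for the example. Python hides the window management
    either way, so this shows the calling pattern rather than the memory
    layout described above.

```python
import io
import zlib


def decode_all_at_once(packed: bytes) -> bytes:
    """Buffer-to-buffer: the whole output ends up in one buffer."""
    return zlib.decompress(packed)


def decode_streamed(src, dst, chunk_size: int = 4096) -> int:
    """Streamed: read compressed input in chunks, write output as it appears.

    The decompressor object carries its own state (including the sliding
    window) between calls, so the full output never has to be in memory.
    Returns the number of decoded bytes written.
    """
    decomp = zlib.decompressobj()
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        piece = decomp.decompress(chunk)
        dst.write(piece)
        written += len(piece)
    tail = decomp.flush()
    dst.write(tail)
    return written + len(tail)


original = b"line of text that repeats\n" * 10000
packed = zlib.compress(original)

whole = decode_all_at_once(packed)

sink = io.BytesIO()
count = decode_streamed(io.BytesIO(packed), sink)
```

    In the streamed version one compressed chunk can still expand to a
    large piece of output; passing a max_length to decompress() would
    bound that, at the cost of a slightly more involved loop.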



    In some cases, it may also make sense to go "full custom".


    Depends mostly on what one wants to do with it.

  • From Eli the Bearded@21:1/5 to cr88192@hotmail.com on Sat Apr 25 20:32:24 2020
    In comp.compression, BGB <cr88192@hotmail.com> wrote:
    Tradeoffs are:
    Zlib / Deflate: older, works reasonably well, neither the fastest nor
    best compression, reasonably well supported;
    XZ / LZMA: Good compression, but a lot slower than the others.
    Zstd: Fast decoding, decent compression (falls between Deflate and
    LZMA), slightly less open and relies on FSE.

    Will add here:
    LZ4: not the best compression, but decoding can be really fast, memory
    overhead for decoding is fairly low, and the decoder can be
    small/simple. LZ4 is one of the few options here viable for use on
    small microcontrollers.

    [snip remainder]

    I'd like to thank you for the most informative post this group has seen
    in a couple of years.

    Elijah
    ------
    came here for JPEG discussion some years ago and never stopped reading
