• Basics of LZ77 Algorithm

    From yasar11732@gmail.com@21:1/5 to All on Tue Jun 11 08:11:11 2019
    Hello,

    I am trying to understand how LZ77 algorithm work. From what I read from various sources, I have come to following conclusion:

    According to LZ77 compression algorithm, if I encode jump and length using 4 bits, and character as 8 bits, I will use 16bits for each token. If the text I am compressing doesn't have any repetition, I will actually double the size of my input.

    I was wondering if I had arrived to correct conclusion, because it doesn't sound right.

    Thanks in advance,

    Yaşar Arabacı

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott@21:1/5 to yasar11732@gmail.com on Tue Jun 11 22:05:38 2019
    On Tue, 11 Jun 2019 08:11:11 -0700 (PDT), yasar11732@gmail.com wrote:

    I am trying to understand how LZ77 algorithm work. From what I read from va= >rious sources, I have come to following conclusion:

    According to LZ77 compression algorithm, if I encode jump and length using = >4 bits, and character as 8 bits, I will use 16bits for each token. If the t= >ext I am compressing doesn't have any repetition, I will actually double th= >e size of my input.

    I was wondering if I had arrived to correct conclusion, because it doesn't = >sound right.

    It does sound odd at first, but that's more or less right. It's a
    fundamental fact of information theory and applies to all (lossless) compression methods. Basically, over the set of all possible messages
    of length N, the average length of the corresponding compressed
    messages is also N.

    So yes, every method will have certain inputs that produce larger
    outputs. The trick to practical compression is to find algorithms that
    work well with patterns that you find in certain useful inputs, which
    AIUI is how LZ was designed. Then you check as you go, and if you have
    a chunk that is anti-compressible, you just store it without
    compression.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From flemingsarah015@gmail.com@21:1/5 to yasar...@gmail.com on Mon Jul 1 10:49:13 2019
    On Tuesday, June 11, 2019 at 4:11:12 PM UTC+1, yasar...@gmail.com wrote:
    Hello,

    I am trying to understand how LZ77 algorithm work. From what I read from various sources, I have come to following conclusion:

    According to LZ77 compression algorithm, if I encode jump and length using 4 bits, and character as 8 bits, I will use 16bits for each token. If the text I am compressing doesn't have any repetition, I will actually double the size of my input.

    I was wondering if I had arrived to correct conclusion, because it doesn't sound right.

    Thanks in advance,

    Yaşar Arabacı

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)