• #### Basics of LZ77 Algorithm

From yasar11732@gmail.com@21:1/5 to All on Tue Jun 11 08:11:11 2019
Hello,

I am trying to understand how LZ77 algorithm work. From what I read from various sources, I have come to following conclusion:

According to LZ77 compression algorithm, if I encode jump and length using 4 bits, and character as 8 bits, I will use 16bits for each token. If the text I am compressing doesn't have any repetition, I will actually double the size of my input.

I was wondering if I had arrived to correct conclusion, because it doesn't sound right.

Yaşar Arabacı

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
• From Scott@21:1/5 to yasar11732@gmail.com on Tue Jun 11 22:05:38 2019
On Tue, 11 Jun 2019 08:11:11 -0700 (PDT), yasar11732@gmail.com wrote:

I am trying to understand how LZ77 algorithm work. From what I read from va= >rious sources, I have come to following conclusion:

According to LZ77 compression algorithm, if I encode jump and length using = >4 bits, and character as 8 bits, I will use 16bits for each token. If the t= >ext I am compressing doesn't have any repetition, I will actually double th= >e size of my input.

I was wondering if I had arrived to correct conclusion, because it doesn't = >sound right.

It does sound odd at first, but that's more or less right. It's a
fundamental fact of information theory and applies to all (lossless) compression methods. Basically, over the set of all possible messages
of length N, the average length of the corresponding compressed
messages is also N.

So yes, every method will have certain inputs that produce larger
outputs. The trick to practical compression is to find algorithms that
work well with patterns that you find in certain useful inputs, which
AIUI is how LZ was designed. Then you check as you go, and if you have
a chunk that is anti-compressible, you just store it without
compression.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
• From flemingsarah015@gmail.com@21:1/5 to yasar...@gmail.com on Mon Jul 1 10:49:13 2019
On Tuesday, June 11, 2019 at 4:11:12 PM UTC+1, yasar...@gmail.com wrote:
Hello,

I am trying to understand how LZ77 algorithm work. From what I read from various sources, I have come to following conclusion:

According to LZ77 compression algorithm, if I encode jump and length using 4 bits, and character as 8 bits, I will use 16bits for each token. If the text I am compressing doesn't have any repetition, I will actually double the size of my input.

I was wondering if I had arrived to correct conclusion, because it doesn't sound right.