
I have a question: I was introduced to a version of Ziv-Lempel that encodes only repetitions of length 3 or longer (repetitions of 1 or 2 characters are not encoded; the characters themselves are placed in the coded string instead of an (m, k) pair). I was asked whether it is possible to improve the efficiency of Ziv-Lempel encoding (in terms of the length of the coded string, not time complexity).

I think that, in terms of the length of the coded string, there can be a scenario where not encoding a length-3 repetition at location p, and instead encoding a repetition that starts at location p+1 or p+2, yields better results. This also appears in the theory I read: I have added a photo of the relevant paragraph, which says exactly that (but gives no examples). Every example I have managed to find so far is one that a coder that greedily encodes repetitions of length 3 handles just as well.

Here is the paragraph that speaks of the fact that such a text exists:

Our compression algorithm as described so far is greedy: Any repeat of length 3 or more is reported and employed right away. Sometimes this is not optimal: We could have an [m1, k1] repeat in position p, and an [m2, k2] repeat in position p+1 or p+2, with k1 << k2. Thus a non-greedy algorithm may result in improved compression.
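To see that such a text exists, here is a minimal sketch (my own toy parser, not the textbook's algorithm) of greedy vs. non-greedy LZ parsing, with a hypothetical input constructed so that deferring a length-3 match by one position pays off:

```python
MIN_MATCH = 3  # repeats shorter than 3 are emitted as raw characters

def longest_match(text, pos):
    """Longest repeat for text[pos:] whose source starts before pos."""
    best_src, best_len = 0, 0
    for src in range(pos):
        n = 0
        while pos + n < len(text) and text[src + n] == text[pos + n]:
            n += 1
        if n > best_len:
            best_src, best_len = src, n
    return best_src, best_len

def parse(text, lazy=False):
    """Return a token list of raw characters and (m, k) = (source, length) pairs."""
    tokens, p = [], 0
    while p < len(text):
        src, length = longest_match(text, p)
        if length >= MIN_MATCH:
            if lazy and p + 1 < len(text):
                _, next_len = longest_match(text, p + 1)
                if next_len > length:       # a longer repeat starts at p+1:
                    tokens.append(text[p])  # emit one raw char and defer
                    p += 1
                    continue
            tokens.append((src, length))
            p += length
        else:
            tokens.append(text[p])
            p += 1
    return tokens

# "abcx bcde x abcde" (spaces only for reading): at the final "abcde" the
# greedy parser takes the length-3 repeat "abc" and is then left with "de",
# too short to encode; the lazy parser emits 'a' raw and takes "bcde".
text = "abcx" + "bcde" + "x" + "abcde"
print(len(parse(text)))             # greedy: 12 tokens
print(len(parse(text, lazy=True)))  # lazy:   11 tokens
```

On this input both parsers agree until the final "abcde": greedy spends one (m, k) pair plus two raw characters on it, while the non-greedy parser spends one raw character plus one pair.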


John Kugelman
user3167141

2 Answers


Yes. gzip's and zlib's deflate algorithm uses "lazy" matching, which defers the decision to emit a match until the next character has been examined, in order to see whether a longer match starts there. That is definitely a win.

Mark Adler

This is technically called the LZ non-greedy parsing approach. As Mark mentioned, gzip uses it, but only with a one-position (p+1) lookahead.

In this document, you'll find a more general encoder, with many details of this feature of the LZ algorithm.

Triostrong