
I wrote a Java compressor for a super obscure compression format (it was mainly used on Amiga computers in the 1990s).

There is a fair amount of documentation on how to decompress the file format, but none on how to actually compress it.

So, I tried to make it myself. It works, but there's one issue. It takes me 42 seconds to compress all the files I want to compress, on "low-intensity settings". It takes roughly 10x that time on higher intensity settings.

I believe it can be much faster than that.

It's basically an LZ77 sliding-window variant.

The real bottleneck is searching for an existing occurrence to compress with. Right now, I'm using a Map<Byte, List<Integer>> (the List<Integer> holds all the indices at which that byte occurs).

To find a potential match, what it does is:

It takes the current index into the file being compressed and looks up the List<Integer> in the Map using the byte at that index.

It then finds the longest run of bytes that already occurs in the file by walking that list and checking how long each candidate matches.

I think a better data structure could speed this up significantly, but I'm stuck at this point.

One of the restrictions I have to work with is that I need to strictly adhere to this ancient compression format due to the nature of what this program is for.

How could I optimize the compression without making it less effective at packing data?

Main Bottleneck code:

private static int search(byte[] data, int bufferEnd, List<Byte> target, Map<Byte, List<Integer>> dictionary) {
    int minIndex = Math.max(0, bufferEnd - getMaximumOffset(target.size())); // There's a certain point at which data will not be compressed. Calculating it here saves a lot of overhead and prevents this from becoming O(n^2).

    byte test = target.get(0);
    if (!dictionary.containsKey(test))
        return -1; // No results found.

    List<Integer> possibleResults = dictionary.get(test);

    for (int i = possibleResults.size() - 1; i >= 0; i--) {
        int testIndex = possibleResults.get(i);
        if (minIndex > testIndex)
            break; // We've gone too far.

        // Check whether the candidate at testIndex matches the rest of the target.
        boolean pass = true;
        for (int j = 1; j < target.size(); j++) {
            if (target.get(j) != data[j + testIndex]) {
                pass = false;
                break; // Break from the j for loop.
            }
        }

        if (pass) // A match has been found. Return it.
            return testIndex;
    }

    return -1;
}

Which is called by:

while ((tempIndex = search(data, i, searchList, dictionary)) >= 0) { // Find the longest compressable bunch of characters.
    if (data.length - 1 == readIndex) // If we've reached the end of the data, exit.
        break;

    searchList.add(data[++readIndex]);
}

Full Code here for anyone who needs it.

Petersnow2
  • This might be better posed at the [sister site codereview](https://codereview.stackexchange.com/). – KevinO Oct 01 '18 at 03:43
  • Good point. I was not aware this site existed, thank you. I will post it there instead. – Petersnow2 Oct 01 '18 at 03:55
  • A `Map<Byte, List<Integer>>` could be written as an `int[][]` (with some mapping of bytes to integers for the first index, unless you need null keys or elements in the value lists), thus avoiding the autoboxing. – Andy Turner Oct 01 '18 at 05:39
  • Please, see my edit for how to embed inline code. – maaartinus Oct 01 '18 at 09:09
  • What you probably want (speed/memory tradeoff) is to not keep a simple flat lookup table, but rather build a prefix tree (trie) so you have a fast lookup of the longest prefix. – Hiery Nomus Oct 01 '18 at 09:24
  • @HieryNomus Thank you for the suggestion! I did some reading into Trie, but either I'm not understanding it right or it's not quite what I'm looking for. The issue with it is that I need to be able to take any byte subList that's been previously done, and I need to keep track of the index of each of these characters. It's very possible I'm not understanding how to do this right though. – Petersnow2 Oct 02 '18 at 03:57
  • Update: I have optimized it quite a bit. While I'm 90% sure that there's a tree-like data structure I could use to make this even more performant, I'm going to talk to some of my college professors to see what they think. I've made it roughly 4x faster (regardless of data-size), so I think I'm at least getting somewhere. – Petersnow2 Oct 02 '18 at 04:02
  • @Kneesnap Do you have a testsuite with that class? Then it is easier to tinker with it. – Hiery Nomus Oct 02 '18 at 07:15
  • Yeah, I probably should make one :P Since this isn't a professional project I figured I could be lazy. It wouldn't be a bad idea to start now though. – Petersnow2 Oct 02 '18 at 21:02

1 Answer


You're missing a bunch of optimizations, especially low-level ones.

Map<Byte, List<Integer>>

That's very inefficient.

Actually, a Map is pretty fast, but much slower than an array. Instead of map.get(someByte), which does autoboxing and a map lookup (some index computation and a few array accesses), you can do a single array access using array[someByte & 0xFF], getting about one order of magnitude speedup.
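
For illustration, a minimal sketch of such an array-indexed table, assuming 256 buckets keyed by the unsigned byte value (the class and method names are made up for this example, not taken from the original code):

import java.util.ArrayList;
import java.util.List;

class ByteIndexTable {
    // One bucket per possible byte value; the unchecked generic array is suppressed deliberately.
    @SuppressWarnings("unchecked")
    private final List<Integer>[] table = new List[256];

    void record(byte b, int index) {
        int key = b & 0xFF;                 // map the signed byte to 0..255
        List<Integer> indices = table[key];
        if (indices == null) {
            indices = new ArrayList<>();
            table[key] = indices;
        }
        indices.add(index);
    }

    List<Integer> lookup(byte b) {
        return table[b & 0xFF];             // single array access: no key boxing, no hashing
    }
}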

Similarly, List<Integer> implies autoboxing, as you start with ints. Autoboxing is usually acceptable, but not in the core of a demanding algorithm. You can write your own class that behaves like a List<int>, or use one of the existing primitive-collection libraries.
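
A bare-bones version of such a class might look like this (the name and API here are made up; libraries such as fastutil or Trove ship polished equivalents):

import java.util.Arrays;

class IntList {
    private int[] elements = new int[16];
    private int size;

    void add(int value) {
        if (size == elements.length)
            elements = Arrays.copyOf(elements, size * 2); // grow by doubling
        elements[size++] = value;
    }

    int get(int index) {
        return elements[index];   // plain array read, no unboxing
    }

    int size() {
        return size;
    }
}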


if (!dictionary.containsKey(test))
    return -1; // No results found.

List<Integer> possibleResults = dictionary.get(test);

This is a needless double lookup. Unless you're using null values, it can be written as

List<Integer> possibleResults = dictionary.get(test);

if (possibleResults == null)
    return -1; // No results found.

This is twice as fast, but as I wrote, you should use an array here.


Concerning the high-level optimizations, I don't really know how to compress efficiently, but I'm sure there are quite a few tricks. If there were no resources on compression, I'd start with a rolling hash. But read about compression in general first.
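
As a sketch of the rolling-hash idea (constants and names below are assumptions, not part of your code): a polynomial rolling hash over a short window, say 3 bytes, lets you key candidate positions by a prefix instead of a single byte, so the inner comparison loop only runs on positions that already share that prefix.

class RollingHash {
    private static final int BASE = 257;

    private final int window;        // window length in bytes
    private final int dropFactor;    // BASE^(window - 1), used when sliding
    private int hash;

    RollingHash(int window) {
        this.window = window;
        int f = 1;
        for (int i = 1; i < window; i++)
            f *= BASE;
        this.dropFactor = f;
    }

    // Hash the first `window` bytes starting at `offset`.
    int init(byte[] data, int offset) {
        hash = 0;
        for (int i = 0; i < window; i++)
            hash = hash * BASE + (data[offset + i] & 0xFF);
        return hash;
    }

    // Slide the window one byte to the right: drop `outgoing`, append `incoming`.
    int roll(byte outgoing, byte incoming) {
        hash = (hash - (outgoing & 0xFF) * dropFactor) * BASE + (incoming & 0xFF);
        return hash;
    }
}

The hash values would then index a table of positions sharing the same few bytes, replacing the single-byte dictionary with something far more selective.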

maaartinus
  • Yeah, unfortunately what I'm looking for is a more high-level thing. I care more here about time complexity than things like that. But! I will implement those changes. – Petersnow2 Oct 02 '18 at 04:00
  • The suggestions from @maaartinus improve your big O for sure. Any unnecessary operation in a tight loop adds to your time complexity. Especially things like a map lookup (logN) or list walk (N) – Hiery Nomus Oct 02 '18 at 07:17