I'm interested in compressing many versions of the same file. The files are PDFs with (often minor) differences between them.
My question is: can the zip or gzip algorithms exploit the similarity between these files to improve compression, or do they handle each file individually?
I've looked at http://www.infinitepartitions.com/art001.html (linked from "How does the GZip algorithm work?"), which covers the algorithms themselves but doesn't say whether the implementations handle files individually or not.
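To make the question concrete, here is a minimal sketch in Python (with hypothetical file names standing in for my PDFs) of the comparison I have in mind: compressing the files as separate streams, the way zip stores each archive entry, versus compressing their concatenation as a single stream, the way tar followed by gzip would:

```python
import gzip
from pathlib import Path

# Hypothetical file names; substitute the actual PDF versions.
paths = [Path("report_v1.pdf"), Path("report_v2.pdf")]
data = [p.read_bytes() for p in paths]

# zip-style: each file gets its own independent DEFLATE stream,
# so similarity between files cannot be exploited.
separate = sum(len(gzip.compress(d)) for d in data)

# tar.gz-style: one DEFLATE stream over the concatenated bytes.
# Note: DEFLATE back-references only reach 32 KB, so for large files
# the shared content in the first file may be too far back to help.
combined = len(gzip.compress(b"".join(data)))

print(f"separate streams: {separate:,} bytes")
print(f"single stream:    {combined:,} bytes")
```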
Follow-up question: if not, are there compression algorithms that can leverage the similarity between files to improve compression?
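For instance, one mechanism I've come across that seems to be in this spirit is DEFLATE's preset-dictionary feature, exposed in Python's zlib module. A sketch (again with hypothetical file names) that compresses the second version using the first as a dictionary:

```python
import zlib
from pathlib import Path

# Hypothetical file names; substitute the actual PDF versions.
base = Path("report_v1.pdf").read_bytes()
new = Path("report_v2.pdf").read_bytes()

# Use (at most the last 32 KB of) the first version as a preset
# dictionary, so matches against it become cheap back-references.
comp = zlib.compressobj(zdict=base)
delta = comp.compress(new) + comp.flush()

# Decompression needs the same dictionary available.
decomp = zlib.decompressobj(zdict=base)
assert decomp.decompress(delta) == new

print(f"plain:      {len(zlib.compress(new)):,} bytes")
print(f"with zdict: {len(delta):,} bytes")
```

Is this the kind of approach I should be looking at, or are there tools/algorithms designed specifically for sets of similar files?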