Context: I am attempting to automate the inspection of EPS files to detect a list of attributes, such as whether the file contains locked layers, embedded bitmap images, etc.
So far we have found that some of these things can be detected by inspecting the raw EPS file data and its accompanying metadata (similar to the information returned by ImageMagick). However, it seems that in files created by Illustrator 9 and above, the vast majority of this information is encoded within the "AI9_DataStream" portion of the file. This data is ASCII85-encoded and compressed. We have had some success getting at this data by using https://github.com/huandu/node-ascii85 to decode it and Node's zlib library to decompress it. (Our project is written in Node / JavaScript.) However, in roughly half of our test files the decompression step fails, throwing Z_DATA_ERROR / "incorrect data check".
Our method responsible for trying to decode:
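// Imports assumed from the description above (node-ascii85 is published on npm as "ascii85").
import ascii85 from 'ascii85';
import zlib from 'zlib';

export const decode = eps =>
  new Promise((resolve, reject) => {
    // Each line of the data stream is prefixed with "%"; strip the line break and the prefix.
    const lineDelimiters = /\r\n%|\r%|\n%/g;
    const internal = eps.match(
      /(%AI9_DataStream)([\s\S]*?)(AI9_PrivateDataEnd)/
    );
    const hasDataStream = internal && internal.length >= 2;
    // Return early so we never index into a null match below.
    if (!hasDataStream) return resolve('');
    const encoded = internal[2].replace(lineDelimiters, '');
    const decoded = ascii85.decode(encoded);
    try {
      zlib.unzip(decoded, (err, buffer) => {
        // Some files fail to decompress; for now we tolerate that and resolve with ''.
        if (err) resolve('');
        else resolve(buffer.toString('utf8'));
      });
    } catch (err) {
      reject(err);
    }
  });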
I am wondering if anyone has experience with this issue and some insight into what might be causing it, and whether there is an alternative avenue to explore for reliably decoding this data. Information on this topic seems sparse, so anything that could point us in the right direction would be very much appreciated.
Note: The buffers produced by the ASCII85 decoding all start with the same 78 9c header, which should indicate standard zlib compression (and the data does in fact decompress into parsable data about half the time without error).
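
To narrow down where the failure comes from, a diagnostic along the following lines might help. It is only a sketch, not a fix: `describeZlibHeader` and `inflateLenient` are hypothetical helper names, and the approach assumes the decoded buffer really is a single zlib stream. The first helper interprets the two-byte header (the 78 9c mentioned above), and the second falls back to inflating the raw deflate data, which skips the Adler-32 check that produces "incorrect data check".

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: interprets the two-byte zlib header (RFC 1950).
function describeZlibHeader(buf) {
  if (buf.length < 2) return { valid: false };
  const cmf = buf[0];
  const flg = buf[1];
  return {
    // The 16-bit header must be a multiple of 31; method 8 means deflate.
    valid: ((cmf << 8) | flg) % 31 === 0 && (cmf & 0x0f) === 8,
    windowBits: (cmf >> 4) + 8,
    hasPresetDict: (flg & 0x20) !== 0,
  };
}

// Hypothetical helper: tries a normal inflate first. If that throws, it skips
// the two-byte header and inflates the raw deflate stream, which does not
// verify the trailing Adler-32 checksum. It still throws if the deflate data
// itself is corrupt.
function inflateLenient(buf) {
  try {
    return { data: zlib.inflateSync(buf), checksumVerified: true };
  } catch (err) {
    const data = zlib.inflateRawSync(buf.subarray(2), {
      // With Z_SYNC_FLUSH, a truncated input yields the partial output
      // instead of an "unexpected end of file" error.
      finishFlush: zlib.constants.Z_SYNC_FLUSH,
    });
    return { data, checksumVerified: false, error: err.message };
  }
}
```

If the lenient path succeeds and the output looks sane, that would suggest the deflate data is intact and only the trailing checksum bytes are off (for example, bytes lost or added at the end during extraction or ASCII85 decoding). If it throws as well, the corruption more likely happens earlier in the stream.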