I'm trying to write a collection of YARA signatures that will tag ZIP files based on artifacts of their creation.
I understand the EOCD (end of central directory record) has a magic number of 0x06054b50 (bytes 50 4B 05 06 on disk), and that it is located at the end of the archive structure. It has a variable-length comment field with a maximum length of 0xFFFF, and the fixed part of the record is 22 bytes, so the EOCD can be up to 0xFFFF + 22 bytes long. However, there could be data after the ZIP structure, which would throw off any offset-dependent scanning.
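To make the layout I'm working from concrete, here is a minimal Python sketch of how I read the record. The field order is my understanding of the ZIP format; `parse_eocd` and the dictionary keys are just names I made up for illustration, and this is only a way to inspect a candidate offset, not a proper validator.

```python
import struct

EOCD_MAGIC = b"PK\x05\x06"  # 0x06054b50 as stored on disk (little-endian)

# Fixed part of the EOCD, 22 bytes in total:
# magic, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset,
# comment length.
EOCD_FIXED = struct.Struct("<4sHHHHIIH")


def parse_eocd(data: bytes, offset: int):
    """Parse an EOCD candidate at `offset`; return a dict, or None if it cannot be one."""
    if offset < 0 or offset + EOCD_FIXED.size > len(data):
        return None
    (magic, _disk_no, _cd_disk, _entries_here, entries_total,
     cd_size, cd_offset, comment_len) = EOCD_FIXED.unpack_from(data, offset)
    if magic != EOCD_MAGIC:
        return None
    record_len = EOCD_FIXED.size + comment_len
    return {
        "entries_total": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
        "record_len": record_len,
        # Bytes left after the record; negative means the declared
        # comment would run past the end of the data.
        "trailing_bytes": len(data) - (offset + record_len),
    }
```

For an archive with nothing appended, I would expect `trailing_bytes` to be 0 for the real record; with appended data it is positive, which is exactly the case I don't know how to handle from a fixed offset.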
Is there any way to locate the record without scanning the whole file for the magic bytes? How do you validate that the magic bytes aren't there by coincidence if there can be data after the EOCD?