
I'm not sure this board is the right place for this question, but I really couldn't find a better one, so let me apologize in advance.

I'm trying to read a third party database for interoperability purposes and I'm having a very hard time with one specific table. This table has two columns: blobSize and blob. Blob size is an integer and blob is a byte array.

I'm guessing this field is zipped based on two assumptions:

1) blobSize does not match the actual size of the blob field. For example, the blob posted at the end of this question is 294 bytes long, while blobSize reports 2560.

2) The blob starts with 0x50 0x4B 0x01 0x02 (P K 1 2), which is the signature of a central directory file header in a zip file (https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html#datadescriptor). But zip files have the compressed data at the beginning of the file and the central directory at the end. The blob starts with something that looks like a zip central directory header and is then followed by a lot of data, which is the reverse order.

I tried to decompress the data with SevenZipSharp and XCeed Zip libraries without success. Since this data is generated in the application (and not zipping a file), there wouldn't be any information about filename, size, modification date etc in the blob, and these libraries expect that the data is from a file.

I also tried to locate each field of the central directory header in the bytes, and they seem to follow the zip file format specification. One notable field in the central directory header is the compression method, which in this database field is '0x09 0x00'; that should be enhanced deflate (Deflate64).
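To make the field-by-field check above concrete, here is a minimal sketch that decodes the first few header fields as little-endian integers, following the layout of a zip central directory file header. It uses only the first 12 bytes of the sample blob posted below; the variable names are illustrative, and `Convert.FromHexString` assumes .NET 5 or later.

```csharp
using System;
using System.Buffers.Binary;

// First 12 bytes of the sample blob (signature through compression method).
byte[] head = Convert.FromHexString("504B01021500150004000900");

// Zip header fields are stored little-endian.
uint signature = BinaryPrimitives.ReadUInt32LittleEndian(head.AsSpan(0, 4));
ushort versionMadeBy = BinaryPrimitives.ReadUInt16LittleEndian(head.AsSpan(4, 2));
ushort versionNeeded = BinaryPrimitives.ReadUInt16LittleEndian(head.AsSpan(6, 2));
ushort flags = BinaryPrimitives.ReadUInt16LittleEndian(head.AsSpan(8, 2));
ushort method = BinaryPrimitives.ReadUInt16LittleEndian(head.AsSpan(10, 2));

Console.WriteLine($"signature:       0x{signature:X8}"); // 0x02014B50 marks a central directory header
Console.WriteLine($"version made by: {versionMadeBy}");
Console.WriteLine($"version needed:  {versionNeeded}");
Console.WriteLine($"flags:           0x{flags:X4}");
Console.WriteLine($"method:          {method}");         // 9 is Deflate64 in the zip specification
```

A matching signature and a plausible method value only show that the bytes are laid out like a zip header; they do not prove that the data following the header is a standard zip payload.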

Maybe I don't know how to decompress this data with these libraries, or maybe it isn't a compressed field at all. Perhaps someone more experienced with compressed data or zip files can point me in the right direction.

This data should contain geometric information about some database elements. I also don't think it is an encrypted field, because all the other data in the database is in binary format but unencrypted, and I managed to read all of it. This is the only field that is giving me headaches.

As an example, here are the contents of one row:

  • blobSize: 2560

  • blob:


string hexa = "504B01021500150004000900C0480E470000C048FFFFFFFF000000000000000000000000FFFF0000000000000000BB705EF0C1C28D520F19D0801D0333C3BFFF9C0C6C48E28C4036088381000303139001E2FFFBFFFF3F44908101C81C550D007F81B1058A3F181E550D883C2A445610433E1096302830B832E401E922864A5856268A16636085E77978D9804367C164959D9DD72E303283E4A18A5D80F6BA31C433043384300401D98E0CBE3874631716636062440E06ECAA30456F610A912D428EFD645B86452325F683A201548E83E20454068CB6070037D552A6591CB3F7E3A0EDA1E2705A80C119585EE4321400C962864C608991CAE00EC4203103206460C06112CC06B849309365817AF0800FF6782491A43ED82FE3021B09564FA82842D238B29900";

byte[] bytes = Enumerable.Range(0, hexa.Length)
                 .Where(x => x % 2 == 0)
                 .Select(x => Convert.ToByte(hexa.Substring(x, 2), 16))
                 .ToArray();
rbasniak
  • _"I'm trying to read a third party database for interoperability purposes"_ - do they publish an _API_ instead? Also, there are many compression techniques, it may not be zip. –  Sep 04 '15 at 13:02
  • @mickyduncan Unfortunately there's no API. I researched many compression techniques and each one has a header like this, but the only one which it begins with PKxx is the PKZIP format – rbasniak Sep 04 '15 at 16:35

1 Answer

But zip files have the zipped data in the beginning of the file and the central directory is in the end of the file.

That's true, but from the page you linked, each zip file header starts with the letters "PK" or 0x50 0x4B. This indicates it at least looks like a zip file, and you could try to read it as such.

See "Unzip a memorystream (Contains the zip file) and get the files" for examples.
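As a rough sketch of what "try to read it as a zip" could look like, the snippet below wraps a byte array in a `MemoryStream` and hands it to `System.IO.Compression.ZipArchive`. The helper name `TryListZipEntries` is made up for illustration, and the bytes used here are only the first 12 bytes of the sample from the question; substitute the full column value. Since the blob starts with a central directory header rather than a local file header, this attempt may well fail with an `InvalidDataException`, which the helper reports instead of crashing.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: returns true if the bytes open as a zip archive.
static bool TryListZipEntries(byte[] blob)
{
    try
    {
        using var stream = new MemoryStream(blob);
        using var archive = new ZipArchive(stream, ZipArchiveMode.Read);
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            Console.WriteLine($"{entry.FullName}: {entry.Length} bytes uncompressed");
        }
        return true;
    }
    catch (InvalidDataException ex)
    {
        // Thrown when the stream is not a well-formed zip archive.
        Console.WriteLine($"Not readable as a zip archive: {ex.Message}");
        return false;
    }
}

// Truncated prefix of the sample blob; replace with the full blob bytes.
byte[] blob = Convert.FromHexString("504B01021500150004000900");
bool opened = TryListZipEntries(blob);
Console.WriteLine($"Opened as zip: {opened}");
```

If this fails on the real data too, that is still useful information: it suggests the field reuses zip-style headers without being a complete zip archive.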

CodeCaster
  • _"indicates it's a PKZIP file"_ - Are you sure? _[This is always '\x50\x4b\x03\x04'.](https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html)_ –  Sep 04 '15 at 13:23
  • @Micky you're right I guess, the `PK 0x01 0x02` is a "central directory" header, `PK 0x03 0x04` is a header for a "local file". Still can't hurt to at least try to open the data as a zip file. – CodeCaster Sep 04 '15 at 13:28