
I have a c-treeAce database that is storing its data in compressed form. I believe it is using zlib, but I can't tell for certain. Once I get past the initial header information, these appear to be the first couple of records in compressed form:

80 32 E4 96 48 59 0B BD 6F 21 D3 85 37 E9 FA AC 5E 10 C2 74 26 D8 8A 9B 
4D FF B1 63 15 C7 79 2B 3C EE A0 52 04 B6 68 1A CC DD 8F 41 F3 A5 57 09 BB 
6D 7E 30 E2 94 46 F8 AA 5C 0E 1F D1 83 35 E7 99 4B FD AF C0 72 24 D6 88 3A 
EC 9E 50 61 13 C5 77 29 DB 8D 3F F1 02 B4 66 18 CA 7C 2E E0 92 A3 55 07 B9 
6B 1D CF 81 33 44 F6 A8 5A 0C BE 70 22 D4 E5 97 49 FB AD 5F 11 C3 75 86 38 
EA 9C 4E 00 B2 64 16 27 D9 8B 3D EF A1 53 05 B7 C8 7A 2C DE 90 42 F4 A6 58 
69 1B CD 7F 31 E3 95 47 F9 0A BC 6E 20 D2 84 36 E8 9A AB 5D 0F C1 73 25 D7 
89 3B 4C FE B0 62 14 C6 78 2A DC ED 9F 51 03 B5 67 19 CB 7D 8E 40 F2 A4 56 
08 BA 6C 1E 2F E1 93 45 F7 A9 5B 0D BF D0 82 34 E6 98 4A FC AE 60 71 23 D5 
87 39 EB 9D 4F 01 12 C4 76 28 DA 8C 3E F0 A2 B3 65 17 C9 7B 2D DF 91 43 54 
06 B8 6A 1C CE 80 32 E4 F5 A7 59 0B BD 6F 21 D3 85 96 48 FA AC 5E 10 C2 74 
26 37 E9 9B 4D FF B1 63 15 C7 D8 8A 3C EE A0 52 04 B6 68 79 2B DD 8F 41 F3 
A5 57 09 1A CC 7E 30 E2 94 46 F8 AA BB 6D 1F D1 83 35 E7 99 4B 5C 0E C0 72 
24 D6 88 3A EC FD AF 61 13 C5 77 29 DB 8D 9E 50 02 B4 66 18 CA 7C 2E 3F F1 
A3 55 07 B9 6B 1D CF E0 92 44 F6 A8 5A 0C BE 70 81 33 E5 97 49 FB AD 5F 11 
22 D4 86 38 EA 9C 4E 00 B2 C3 75 27 D9 8B 3D EF A1 53 64 16 C8 7A 2C DE 90 
42 F4 05 B7 69 1B CD 7F 31 E3 95 A6 58 0A BC 6E 20 D2 84 36 47 F9 AB 5D 0F 
C1 73 25 D7 E8 9A 4C FE B0 62 14 C6 78 89 3B ED 9F 51 03 B5 67 19 2A DC 8E 
40 F2 A4 56 08 BA CB 7D 2F E1 93 45 F7 A9 5B 6C 1E D0 82 34 E6 98 4A FC 0D 
BF 71 23 D5 87 39 EB 9D AE 60 12 C4 76 28 DA 8C 3E 4F 01 B3 65 17 C9 7B 2D 
DF F0 A2 A2 54 B8 8A 1C CE 68 91 43 F5 A7 59 0B BD F5 22 32 E4 ED 41 FA AC 
5E 10 C2 D3 4D 37 E9 9B A3 CA AA BE F1 55 1D 8A 3C EE A0 52 04 15 C7 79 2B 
DD 8F 41 F3 A5 B6 68 1A CC 7E 30 E2 94 46 57 30 E2 94 46 F8 AA 5C 0E 1F D1 
83 35 E7 99 4B FD AF C0 72 24 D6 88 3A EC 9E 50 61 13 C5 77 29 DB 8D 3F F1 
02 B4 66 18 CA 7C 2E E0 92 A3 55 07 B9 6B 1D CF 81 33 44 F6 A8 5A 0C BE 70 
22 D4 E5 97 49 FB AD 5F 11 C3 75 86 38 EA 9C 4E 00 B2 64 16 27 D9 8B 3D EF 
A1 53 05 B7 C8 7A 2C DE 90 42 F4 A6 58 69 1B CD 7F 31 E3 95 47 F9 0A BC 6E 
20 D2 84 36 E8 9A AB 5D 0F C1 73 25 D7 89 3B 4C FE B0 62 14 C6 78 2A DC ED 
9F 51 03 B5 67 19 CB 7D 8E 40 F2 A4 56 08 BA 6C 1E 2F E1 93 45 F7 A9 5B 0D 
BF D0 82 34 E6 98 4A FC AE 60 71 23 D5 87 39 EB 9D 4F 01 12 C4 76 28 DA 8C 
3E F0 A2 B3 65 17 C9 7B 2D DF 91 43 54 06 B8 6A 1C CE
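To at least narrow down whether zlib is involved, one check I can think of is scanning the raw file for plausible two-byte zlib headers: the low nibble of the first byte (CM) must be 8 for deflate, and CMF*256 + FLG must be divisible by 31. A rough sketch of such a scan (FindZlibCandidates is just a name I made up, and a hit only means "candidate" until it actually inflates):

using System.Collections.Generic;

static class ZlibScan
{
    // Return every offset whose two bytes pass the zlib CMF/FLG check.
    public static IEnumerable<int> FindZlibCandidates(byte[] buffer)
    {
        for (int i = 0; i + 1 < buffer.Length; i++)
        {
            byte cmf = buffer[i];
            byte flg = buffer[i + 1];
            bool methodIsDeflate = (cmf & 0x0F) == 8;       // CM == 8 means deflate
            bool fcheckOk = (cmf * 256 + flg) % 31 == 0;    // zlib header checksum rule
            if (methodIsDeflate && fcheckOk)
                yield return i;
        }
    }
}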

I know that the data saved has the following fields:

timestamp   8
character   50
character   50
varchar 4096
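If a record does decompress, I would expect to pull the fields apart roughly like this (the on-disk field order, the timestamp representation, and the text encoding are all assumptions on my part, not anything I've confirmed):

using System;
using System.Text;

static class RecordParser
{
    // Sketch: split one decompressed record, assuming the fields appear in
    // declaration order: 8-byte timestamp, two 50-byte character fields, varchar.
    public static void ParseRecord(byte[] record)
    {
        long rawTimestamp = BitConverter.ToInt64(record, 0);   // actual representation unknown
        string field1 = Encoding.ASCII.GetString(record, 8, 50).TrimEnd('\0', ' ');
        string field2 = Encoding.ASCII.GetString(record, 58, 50).TrimEnd('\0', ' ');
        string text = Encoding.ASCII.GetString(record, 108, record.Length - 108);
        Console.WriteLine($"{rawTimestamp}: {field1} | {field2} | {text}");
    }
}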

I have the following basic uncompress method:

[DllImport("zlib1.dll", CallingConvention = CallingConvention.Cdecl)]  //here I am using the same dll as the application in case it was modified
static extern int uncompress(ref byte[] dest, ref uint destLen, byte[] source, uint sourceLen);

public static byte[] DeCompressToString(byte[] data)
{
    uint _dLen = 8192;
    byte[] _d = new byte[_dLen];
    if (uncompress(ref _d, ref _dLen, data, (uint)data.Length) != 0)
        return null;
    byte[] result = new byte[_dLen];
    Array.Copy(_d, 0, result, 0, result.Length);
    return result;
}
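In case the P/Invoke marshalling itself is the problem, my understanding is that a zlib stream is just a two-byte CMF/FLG header, a raw deflate stream, and a trailing Adler-32 checksum, so the managed DeflateStream should be able to inflate the same data if the first two bytes are skipped. A sketch of that fallback (TryInflateZlib is my own hypothetical helper):

using System.IO;
using System.IO.Compression;

static class ManagedInflate
{
    // Try to inflate a zlib-wrapped buffer starting at 'offset' with DeflateStream.
    // The 2-byte CMF/FLG header is skipped; the trailing Adler-32 is simply ignored.
    public static byte[] TryInflateZlib(byte[] data, int offset)
    {
        try
        {
            using (var input = new MemoryStream(data, offset + 2, data.Length - offset - 2))
            using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                inflater.CopyTo(output);
                return output.ToArray();
            }
        }
        catch (InvalidDataException)
        {
            return null;    // not a valid deflate stream at this offset
        }
    }
}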

I then tried iterating through the file, attempting every possible combination of starting offset and length to decompress with the following, but I got nothing back:

FileNamePath = @"C:\TestFile.dat";
FileStream WorkingFile = new FileStream(FileNamePath, FileMode.Open, FileAccess.Read);
int GrabLength = 10000;
byte[] ByteArray = new byte[GrabLength];
WorkingFile.Position = 16384; //skip past header and null data
WorkingFile.Read(ByteArray, 0, GrabLength);

for (int i = 0; i < GrabLength; i++)
{
    byte[] NewByteArray = ByteArray.Skip(i).Take(GrabLength-i).ToArray();
    for (int t = 0; t < GrabLength - i; t++)
    {
        byte[] PartialByteArray = NewByteArray.Skip(0).Take(t).ToArray();
        try
        {
            var testDecompress = MainWindow.DeCompressToString(PartialByteArray);
            if (testDecompress != null)
            {
                Console.WriteLine(testDecompress); //breakpoint here was never reached.
            }
        }
        catch { }
    }
}

Assuming that the compressed length of a record would be less than the roughly 4,200 bytes of an uncompressed record, it seems the loop should have found at least one byte array that could be decompressed.
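Combining the header scan and the managed inflate sketches above would also narrow the brute force considerably, since DeflateStream stops by itself at the end of a deflate stream and the exact compressed length never has to be guessed (this driver uses the hypothetical helpers from the sketches above):

using System;
using System.IO;

class Scanner
{
    static void Main()
    {
        byte[] raw = File.ReadAllBytes(@"C:\TestFile.dat");
        foreach (int offset in ZlibScan.FindZlibCandidates(raw))
        {
            byte[] inflated = ManagedInflate.TryInflateZlib(raw, offset);
            if (inflated != null && inflated.Length > 0)
                Console.WriteLine($"Possible zlib stream at offset {offset}: {inflated.Length} bytes inflated");
        }
    }
}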

Alan
  • Investigate whatever is storing this data and find out for certain. – Dour High Arch Sep 09 '19 at 18:04
  • I have; the only thing that makes me uncertain is that the ZLIB headers listed here don't match the starting bytes: https://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like/17176881#17176881 – Alan Sep 09 '19 at 18:06
  • Why are you trying to directly access raw DB information in the file system? You should never normally need to do that. Can't you load the DB and query it through an appropriate interface instead? There should be a NuGet package for it, unless this is a completely new product and you're the one building the package... – TheAtomicOption Sep 09 '19 at 18:17
  • Maybe it is just binary serialization. – jdweng Sep 09 '19 at 18:18
  • @TheAtomicOption I don't have access to the full DB through ODBC. In past versions of the DB the files were all flat data files and I manually parsed the records out. Trying now to do the same with records that are compressed. – Alan Sep 09 '19 at 18:24
  • @jdweng I guess it could be serialized, the binary file I am looking at is 125k, exporting to a csv the data is 91k. I will look into deserializing – Alan Sep 09 '19 at 18:25
  • If it's a DB permissions thing, that's something that needs to be fixed at that point instead. ODBC is old. Are you stuck with ODBC as your only access method? Also, if it is zlib, you should be able to use something like the zlib.net nuget to avoid trying to handle and correctly decompress raw byte information yourself. But if the zlib headers aren't there, then it's either not zlib, or they're using their own custom modified version of it for some reason (unlikely, but if that's what they're doing you've got a big hill to climb). – TheAtomicOption Sep 09 '19 at 18:29
  • It may be a list/array of class objects, so approximately 25 to 50 objects (125K / 5K), the 5K being around 4096 bytes; if it's compressed, the number could be larger. – jdweng Sep 09 '19 at 18:31
  • I am working with the binary files because ODBC isn't a great option here. I am bypassing vendor authentication that limits access to certain tables that I need, and that I have permission from the data owner to access. There are 675 rows stored, but the last column is usually only 15 bytes of data. In the past, the binary timestamp field I parsed was 5 bytes of data. – Alan Sep 09 '19 at 19:06
  • If you have to bypass db vendor authentication because the DB vendor doesn't allow you to query through ODBC, then you're probably breaking a license somewhere. If this isn't simply compressed, it's possible they've started encrypting it to prevent exactly this sort of work around. – TheAtomicOption Sep 09 '19 at 20:09

0 Answers