How can I get a _raw stream_ of bytes being stored in a database?

Question

I have the following database, written in some variant of MS Access (it's an MDB file -- UNIX file confirms this)

I'm trying to read the contents of from Javascript in Windows. Though, really, any tools would be fine. For example, Python would be fine as well... but I am trying to avoid VB6. Of course, I can cut some of the code there and run it in the IDE and I can extract the values. But I want to understand the underlying mechanisms

I am using a query from node-adobd to select all the maps "Data" column. It's a string that was compressed via zlib (compression level unknown) and then shoved into a table.

In Node, using the following function I was able to make a query and get the underlying string stream...

75 0a 00 00 **78 01** 53 d5 20 ...

That's a ZLIB magic number which tipped me off. I tried using zlib from Node to decompress it but it's failing with various errors. Those first 4 bytes are a header the game engine seems to append; but it has nothing to do with the payload from what I can tell (you can read the compression routine here.

Since it seems to read it in VB6 land without a problem without any other post processing from what I can tell (you can find it modDatabase if you want to read it) all I can think of is: something is happening in the database connector or when I get the data. Something is being encoded without my consent, perhaps.

I query like such:

  connection.query('Select Number, Data from [Maps]')
   .then((results) => {
     const result = results[0]; // first one to experiment with
     // use the above code to construct a byte array from 
     // result.Data, but I have also tried things like 
     // Buffer.from ... but it asks about encodings
     // which I am unsure of!
   });

And I get the above signature in the case of getting raw bytes. Getting a buffer with another encoding has varying degrees of success. Viewing the raw data in DBeaver gives similar results depending on the encoding. However, to view it I have to pick an encoding.

My understanding is strings are stored simply as sequences of bytes in memory at the end of the day -- can I not get access to this from a low level? If not, how do you handle converting the encoding?

According to this page, Memo is Unicode (but it does not really say what) but it mentions two bytes, so probably UTF-16 assuming this is a relatively modern version of access...

Have you looked at https://stackoverflow.com/questions/21426935/accessing-mdb-files-through-nodejs or https://stackoverflow.com/questions/19807081/how-to-connect-php-with-microsoft-access-database? One is Node.JS and the other is PHP, if you're willing to go that route. I'm sure there's other options for reading an MDB file w/o VB. I, too, shudder to think about using VB. — computercarguy, May 24 '19 at 02:46
@computercarguy See above -- I am connecting to them. You can see the code block. To clarify -- I can _read data_ but it seems like the adapter is either doing some "encoding on my behalf" or on "VBs behalf"... so I am trying to understand how database acess APIs normally do this. Obviously, if the DAO Library in VB6 is is converting to something before decompressing the string, I have to do that in Node as well. I'm trying to ensure A. I have the "raw stream" and B. understand "what VB is doing under the hood" — Vaughan Hilts, May 24 '19 at 02:49
I recommended those links because they don't say anything about having to decode the results after getting them. Your encoded stream might just be an error message, since we can't see the actual result, or it might be some random chunk of the file itself. I'd guess using the correct ADODB adapter would get you results you can use without having to decode it first. If that's just the zipped file you're querying for, then try a JavaScript Unzip routine. https://stackoverflow.com/questions/294297/javascript-implementation-of-gzip — computercarguy, May 24 '19 at 02:59
The ZIP contains the MDB. It's suffixed .DAT, but it's definitely an MDB. The result of the query is in the question. Of course, using the right ADODB adapter in _VB_ would give me the results I'm looking for. My question is: how can I find the encoding VB6 was using and query THAT in another language? — Vaughan Hilts, May 24 '19 at 03:02
If the ADODB component you are using from VB6 which you know does what you want is COM (which it almost certainly is) then you should be able to use exactly the same component from any language which understands COM. — StayOnTarget, May 24 '19 at 11:44
If it is an access database, another option would be use a .NET language (vb.net, c#, etc) with the ADO.NET drivers. — MarkL, May 24 '19 at 13:32
In reference to your link about the Unicode and Memo fields, I think that's actually talking about how the Access UI handles data and stores data. I believe that whatever you would programatically write to a memo column is what would be stored there. If the original VB6 app wrote a vb6 string to that column, then it would be unicode data. If it, however, did some sort of conversion to it, for instance a call to StrConv, then what's in that column is whatever the app writes out to it. More of an app implementation question than an mdb question. — MarkL, May 24 '19 at 16:20
@MarkL That's for the clarification. Though, I imagine an ADODB adapter would have to convert the raw stream into some kind of string on the destination language -- it must be interpreting it _some way_, no? In that case, is the function I posted above (https://stackoverflow.com/questions/1240408/reading-bytes-from-a-javascript-string/1242596#1242596) enough to "remove the encoding" from the string? Since it's _binary_ data masquerading as a string, I merely just want the "binary data out of it" I am happy to apply a bounty if you wanted to spend time to write an example. — Vaughan Hilts, May 25 '19 at 16:50
@deblocker Could you explain why you think that converting to this code page would work? — Vaughan Hilts, Jun 02 '19 at 23:49

How can I get a _raw stream_ of bytes being stored in a database?

0 Answers0