
I support an older C++ v100 security system. Most of the XML stored in the database as a byte array (blob data) is easy enough to decrypt with UTF-8. However, there are a couple of places where I have been unable to figure out how to decrypt the byte array into anything useful using C# 4.7. After studying the data yesterday, we determined that it uses an encoding called 'UCS-2 LE BOM'.

I almost have to assume this is some form of Unicode... but I have yet to get that binary array to decrypt to anything decipherable.

If you have any example code that would give me the readable string, it would be most appreciated.

Thus far I have tried all forms of encoding offered in C#, but the results are all cryptic.

The XML stored in each record looks like this:

<COMMAND_BLOCK> 
   <COMMAND Workstation='{Workstation}' Source="ServiceManager" Target="{Unit:|SELECT DISTINCT [Name],[Address] FROM [MasterControllers] WHERE [Address] LIKE '{Workstation}.R%'}" Command="50001"/>
</COMMAND_BLOCK>

I am currently only trying to get the stored byte array to be a string. Once it is a string, I'll be fine with it. I've tried the following:

var xml = Encoding.UTF8.GetString(_commandObjectList[i].ImageData);
var xml2 = Encoding.UTF32.GetString(_commandObjectList[i].ImageData);
var xml3 = Encoding.Unicode.GetString(_commandObjectList[i].ImageData);
var xml4 = Encoding.ASCII.GetString(_commandObjectList[i].ImageData);
var xml5 = Encoding.BigEndianUnicode.GetString(_commandObjectList[i].ImageData);
var xml6 = Encoding.Default.GetString(_commandObjectList[i].ImageData);
var xml7 = Encoding.UTF7.GetString(_commandObjectList[i].ImageData);

Returned values

Raw blob data stored in Sql Server Table: 0xFFFE3C0043004F004D004D0041004E0044005F0042004C004F0043004B003E0020000D000A003C0043004F004D004D0041004E004400200057006F0072006B00730074006100740069006F006E003D0027007B0057006F0072006B00730074006100740069006F006E007D002700200053006F0075007200630065003D00220053006500720076006900630065004D0061006E006100670065007200220020005400610072006700650074003D0022007B0055006E00690074003A007C00530045004C004500430054002000440049005300540049004E004300540020005B004E0061006D0065005D002C005B0041006400640072006500730073005D002000460052004F004D0020005B004D006100730074006500720043006F006E00740072006F006C006C006500720073005D0020005700480045005200450020005B0041006400640072006500730073005D0020004C0049004B004500200027007B0057006F0072006B00730074006100740069006F006E007D002E005200250027007D002200200043006F006D006D0061006E0064003D0022003500300030003200370022002F003E000D000A003C002F0043004F004D004D0041004E0044005F0042004C004F0043004B003E000D000A000000
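
For reference, here is a minimal C# sketch of decoding a blob like the one above. It assumes the bytes are UTF-16 little-endian with a leading byte order mark (`FF FE`) and a trailing null terminator, which is what the hex dump suggests. `BlobDecoder`, `HexToBytes`, and `DecodeUtf16Le` are hypothetical names introduced only for illustration; they are not part of the existing system.

```csharp
using System;
using System.Text;

public static class BlobDecoder
{
    // Hypothetical helper: converts a hex string such as "0xFFFE3C00..." to bytes.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
            hex = hex.Substring(2);
        if (hex.Length % 2 != 0)
            throw new FormatException("Hex string must have an even number of digits.");

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // Assumption: the data is UTF-16 LE, optionally prefixed with the FF FE BOM.
    public static string DecodeUtf16Le(byte[] data)
    {
        // Encoding.Unicode.GetString does not strip the BOM, so skip it here.
        int offset = (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE) ? 2 : 0;

        int count = data.Length - offset;
        count -= count % 2; // ignore a dangling odd byte, if any

        string text = Encoding.Unicode.GetString(data, offset, count);

        // Drop the C-style null terminator(s) at the end of the stored string.
        return text.TrimEnd('\0');
    }
}
```

Under those assumptions, calling `BlobDecoder.DecodeUtf16Le(BlobDecoder.HexToBytes("0xFFFE3C0043004F004D004D0041004E0044005F0042004C004F0043004B003E000000"))` on the first few bytes of the blob returns `<COMMAND_BLOCK>`. If the same call on the real `ImageData` array returns gibberish, the array probably does not contain the bytes shown in the hex dump.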

The C++ code that encodes and decodes the XML into/from the database uses CFile, declared in 'C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\atlmfc\include\afx.h'. The C++ code used for encode/decode:

            CFile fileXML;
            if(fileXML.Open(dlg.m_strFileName, CFile::modeCreate|CFile::modeWrite))
            {
                fileXML.Write(dlg.m_strCommand,dlg.m_strCommand.GetLength());
                fileXML.Close();
            }
Jonathan Hansen
  • UTF-8 isn't an encryption format, it's a [character encoding format](https://en.wikipedia.org/wiki/UTF-8), as is UCS-2 Little Endian with Byte Order Mark. Are you sure your `ImageData` really contains XML? Might you be able to share a [mcve] showing the original, raw binary data as stored in the blob data without any transforms having been applied? – dbc Oct 10 '19 at 17:16
  • Added requested additional info to the question – Jonathan Hansen Oct 10 '19 at 19:23
  • The real problem seems to have nothing to do with encodings. UCS-2 LE BOM has a Byte Order Mark at the beginning, so if you read the binary with a `StreamReader`, the `StreamReader` will automatically pick up the BOM and choose a correct encoding. The real problem seems to be that the raw blob data was converted to binary wrongly. When I converted it manually as shown in [this answer](https://stackoverflow.com/a/321404) and trimmed trailing zeros, the XML could be parsed successfully. Demo fiddle: https://dotnetfiddle.net/VX60cD. – dbc Oct 10 '19 at 20:22
  • The reason I think this has nothing to do with encodings is that if I manually convert the blob data to binary and do `Encoding.Unicode.GetString(binary)`, I get the expected BOM, and then well-formed XML. I certainly don't get the odd sequence of Kanji characters shown in your question. – dbc Oct 10 '19 at 20:34
  • Does the above answer your question, or do you need additional help? – dbc Oct 10 '19 at 22:19
  • It appears I need some hand-holding. With Encoding.Unicode.GetString(), I get gibberish. How did you get well-formed XML? – Jonathan Hansen Oct 21 '19 at 19:49
  • By the time the data makes it to my method, it is already in a byte array. If I needed to convert it back to binary, trim off trailing zeros, then back to the byte array using your method, how would I do that? – Jonathan Hansen Oct 21 '19 at 19:52
  • I manually decoded it from the HEX string "Raw blob data" you included in your question: `0xFFFE3C004...`. Specifically I used [this answer](https://stackoverflow.com/a/321404/3744182) to [How can I convert a hex string to a byte array?](https://stackoverflow.com/q/321370/3744182) as shown in my [fiddle](https://dotnetfiddle.net/VX60cD). So the real problem must be that the code you're using to convert that raw blob data HEX string into the `_commandObjectList[i].ImageData` byte array isn't working. But you don't show that code so we don't know where you're going wrong. – dbc Oct 21 '19 at 19:54
  • *If I needed to convert it back to binary, trim off trailing zeros, then back to the byte array using your method, how would I do that?* -- how could we possibly know? We could only answer that if we knew how it was converted to `byte []` to begin with (and which we now suspect was done wrongly). Can you share a [mcve] showing how that was done? Or share the current (mysterious) contents of `commandObjectList[i].ImageData` by converting them back to a HEX string as shown in [How do you convert a byte array to a hexadecimal string, and vice versa?](https://stackoverflow.com/q/311165)? – dbc Oct 21 '19 at 19:58
  • OK, I'm back on this (I was put on other projects for a while). The C++ code that encodes and decodes the XML into/from the database is CFile, located in 'C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\atlmfc\include\afx.h'. I added the code above to the question. – Jonathan Hansen Nov 12 '19 at 18:34
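
The two diagnostics suggested in the comments can be sketched in C# roughly as follows: dumping the byte array back to a hex string so it can be compared with the raw blob, and letting a `StreamReader` pick the encoding from the byte order mark. This is only an illustration; `BlobDiagnostics`, `ToHex`, and `ReadWithBomDetection` are hypothetical names, and the UTF-8 fallback for data without a BOM is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class BlobDiagnostics
{
    // Renders the bytes as "0xFFFE3C00..." for comparison with the value in SQL Server.
    public static string ToHex(byte[] data)
    {
        return "0x" + BitConverter.ToString(data).Replace("-", "");
    }

    // Lets StreamReader detect the encoding from a leading BOM.
    // Assumption: UTF-8 is used as the fallback when no BOM is present.
    public static string ReadWithBomDetection(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            // The BOM is consumed by the reader; trailing null terminators are trimmed.
            return reader.ReadToEnd().TrimEnd('\0');
        }
    }
}
```

If `ToHex(_commandObjectList[i].ImageData)` does not start with `0xFFFE3C00`, the bytes were likely altered somewhere between the database and the method that receives them, which would be consistent with the comments above.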
