
Hi folks,
I have been working on a Python module that converts a binary string into a CSV record. A third-party application usually does this, but I'm trying to build the logic into my own code. The records before and after conversion are as follows:

CSV Record After Conversion

0029.6,000.87,002.06,0029.2,0010.6,0010.0,0002.1,0002.3,00120,00168,00054,00111,00130,00000,00034,00000,00000,00039,00000,0313.1,11:09:01,06-06-2015,00000169

I'm trying to figure out the conversion logic used by the third-party tool. If anyone can give me some clues about this, it would be great!
One thing I have worked out is that each CSV value corresponds to an unsigned short in the byte stream.
TIA, cheers!

Nachiketh
  • possible duplicate of [Python - Converting Hex to INT/CHAR](http://stackoverflow.com/questions/7595148/python-converting-hex-to-int-char) –  Jun 10 '15 at 08:27
  • It is not a duplicate, because this is not a simple hex-to-int/char conversion. I'm trying to identify the logic used for this specific conversion. I tried the standard conversion approaches from the Python libraries without much success, which is why I decided to post here. – Nachiketh Jun 10 '15 at 09:23

3 Answers


As already mentioned, without knowing the binary protocol it will be difficult to guess the exact encoding that is being used. There may be special case logic that is not apparent in the given data.

It would be useful to know the name of the third-party application, or at least what field this relates to: anything that gives an idea of what the encoding could be.

The following are clues as you requested on how to proceed:

  1. The end of the CSV shows a date (06-06-2015); this can be seen at the start of the data (`06 06 15`):

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

  2. The end value 169 (hex A9) is suspiciously in between the next two hex values (`AA` and `A8`):

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

  3. "00039," (39 is hex 27) could refer to the last 4 digits:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

or:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00 ....or 27 00 00 00

...you suspect that two bytes are used per value, so perhaps the remaining bytes are separate zero-valued fields.

  4. "00034," (34 is hex 22) could refer to:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

And so on: simply convert some of the decimal numbers into hex and search for possible locations in the data. Keep in mind that fields might be big-endian, little-endian, or a combination of the two.
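The search described above can be sketched in Python. This is only an illustration: `find_u16` is a hypothetical helper name, and `DUMP` is simply the byte sequence quoted in this answer. The helper packs a candidate decimal value as a 16-bit unsigned integer in both byte orders and reports every offset where either pattern occurs.

```python
import struct

# Bytes quoted in this answer (copied as-is from the hex dump above).
DUMP = bytes.fromhex(
    "31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 "
    "6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 "
    "22 00 00 00 00 00 27 00 00 00"
)


def find_u16(data, value):
    """Return (offset, byte_order) pairs where value occurs as a 16-bit unsigned int."""
    hits = []
    for label, fmt in (("little", "<H"), ("big", ">H")):
        needle = struct.pack(fmt, value)
        start = data.find(needle)
        while start != -1:
            hits.append((start, label))
            # Continue one byte further so overlapping matches are not missed.
            start = data.find(needle, start + 1)
    return hits


# Try a few distinctive values taken from the CSV record.
for candidate in (296, 39, 34):
    print(candidate, find_u16(DUMP, candidate))
```

A hit is only a candidate location. Small or common values such as 0 will match in many places, and a big-endian hit can be a byproduct of a neighbouring little-endian field, so distinctive values such as 296 narrow things down fastest.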

You should take a look at the Python `struct` library, which is useful for this kind of conversion once you know the format being used.
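As a rough sketch of how `struct` could be applied here, the snippet below assumes that the values are consecutive little-endian unsigned 16-bit integers and that the first one starts 12 bytes into the quoted dump. Both are guesses from this single sample, not a known specification.

```python
import struct

# Bytes quoted in this answer (copied as-is from the hex dump above).
DUMP = bytes.fromhex(
    "31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 "
    "6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 "
    "22 00 00 00 00 00 27 00 00 00"
)

# Assumption: the value fields begin at offset 12, after the leading 12 bytes.
VALUES_OFFSET = 12

# "<" forces little-endian with no padding; "19H" reads 19 unsigned shorts.
values = struct.unpack_from("<19H", DUMP, VALUES_OFFSET)
print(values)
# (296, 87, 206, 292, 106, 100, 21, 23, 120, 168, 54, 111, 130, 0, 34, 0, 0, 39, 0)
```

The digits line up with the first 19 CSV fields (0029.6, 000.87, 002.06, ...), which suggests the decimal point is applied afterwards rather than stored in the bytes.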

With more data examples the above theories could then be tested.

Martin Evans
  • Wow! That helped me understand where each record starts and ends in the binary. I have corrected this; the hex record corresponding to the converted values is actually `28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00 3B 0C 01 09 11 06 06 15 20 AA A9`. Cheers! I will continue decoding this; hopefully I will be able to work out the logic. – Nachiketh Jun 10 '15 at 10:35
  • And here is some output from the Python `struct` library: `>>> struct.unpack('h','\x28\x01') (296,) >>> struct.unpack('h','\x57\x00') (87,)` Each field is packed as an unsigned short, so the conversion returns the right digits but not the floating-point values. – Nachiketh Jun 10 '15 at 11:00
  • The struct format string can be used to unpack all the items at once with some suitable slicing, or at least in suitable groups. – Martin Evans Jun 10 '15 at 11:14
  • You might want to look at the following [Wikipedia article](http://en.wikipedia.org/wiki/Fixed-point_arithmetic) on fixed-point numbers. What you have is probably not floating point. – Martin Evans Jun 10 '15 at 11:31
  • Thanks for your help! I was able to convert the binary data into ascii using the struct unpack function and also figure out the way to represent the floating point numbers. cheers! :-) – Nachiketh Jun 16 '15 at 07:56
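For readers following the thread, here is a minimal sketch of how the `struct` and fixed-point suggestions could fit together for the corrected record quoted in the comments. It is not the asker's actual solution, which was not posted. Everything about the layout is an assumption inferred from this one sample: 20 little-endian unsigned shorts, the per-field decimal places in `DECIMALS`, a BCD-style time and date (seconds, minutes, hours, then day, month, year, century), and the final byte as the record counter. The names `DECIMALS`, `fixed`, `bcd` and `decode` are hypothetical.

```python
import struct

# Corrected record as quoted in the comment above.
RECORD = bytes.fromhex(
    "28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 "
    "36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00 3B 0C "
    "01 09 11 06 06 15 20 AA A9"
)

# Assumed decimal places for each of the 20 fields (guessed from one CSV row).
DECIMALS = [1, 2, 2, 1, 1, 1, 1, 1] + [0] * 11 + [1]


def fixed(raw, decimals):
    """Format a raw integer as a zero-padded fixed-point string."""
    if decimals == 0:
        return f"{raw:05d}"
    # Integer arithmetic only, so no floating-point rounding is involved.
    whole, frac = divmod(raw, 10 ** decimals)
    return f"{whole:0{5 - decimals}d}.{frac:0{decimals}d}"


def bcd(byte):
    """Render one byte as two digits, assuming it is binary-coded decimal."""
    return f"{byte:02X}"


def decode(record):
    values = struct.unpack_from("<20H", record, 0)
    fields = [fixed(v, d) for v, d in zip(values, DECIMALS)]
    # Assumed order: seconds, minutes, hours, day, month, year, century.
    sec, minute, hour, day, month, year, century = (bcd(b) for b in record[40:47])
    fields.append(f"{hour}:{minute}:{sec}")
    fields.append(f"{day}-{month}-{century}{year}")
    # Guess: the byte after the 0xAA marker is the (low byte of the) counter.
    fields.append(f"{record[48]:08d}")
    return ",".join(fields)


print(decode(RECORD))
```

Under these assumptions the output matches the CSV record in the question. A single record cannot confirm the layout, though: day and month are both 06 here, and the counter may well be wider than one byte, so more samples would be needed to settle those points.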

To turn the binary into meaningful strings, we must know the binary protocol. We can't decode the binary out of thin air.

Eason Ren
  • Yes, I understand that is the conventional way. Unfortunately, a legacy application currently does the conversion and we do not have access to its code, only the executable. That is why I'm trying all combinations on each value to see whether something yields the value that shows up. – Nachiketh Jun 10 '15 at 09:18

Take a look at my Python script which converts a binary file to a CSV or BSV file given a C header file and a C struct name defining that binary record. https://github.com/SShabtai/MsgGini. Although not complete, it might give you some hints...

SShabtai