I am very new to Python and I am trying to read a file that partially contains binary data. There is a header with some information about the data, and the binary data follows the header. Opened in a text editor, the file looks like this:

>>> Begin of header <<<
value1: 5
value2: 7
...
value65: 9
>>> End of header <<<
���ÄI›C¿���†¨¨v@���ÄW]c¿��� U⁄z@���@¬P\¿����∂:q@���@Ò˚U¿���†÷Us@���`ªw4¿��� :‘m@���@À›9@���ÄAs@���¿‹ ¿����ır@���¿@&%@���†„bq@����*˙-@��� [q@����ÚN8@����
Òo@���@√·T@���†‰zm@����9\@����ÃÜq@����€dZ@���`Ëäs@���†∏8I@���¿¬Ot@���†�6

An additional problem is that I did not create the file myself and do not know whether the data are doubles or floats.

So how can I interpret those data?

Glostas
  • Err all I see is `?` – Universal Electricity Jul 09 '15 at 14:06
  • @UnicornsAreVeryVeryYummy Well, yes, it's binary data, not text. – Colonel Thirty Two Jul 09 '15 at 14:06
  • possible duplicate of [Reading some binary file in Python](http://stackoverflow.com/questions/1035340/reading-some-binary-file-in-python) –  Jul 09 '15 at 14:07
  • `f = open("myfile", "rb")` –  Jul 09 '15 at 14:08
  • There's no good way to tell what binary data represents; it might be n doubles or n*2 floats, or n 64-bit integers, or just some binary data. Reverse engineering is a broad field. – Colonel Thirty Two Jul 09 '15 at 14:08
  • "So how can I interpret those data?" – ask the person that created the file. – mkrieger1 Jul 09 '15 at 14:10
  • So first, thanks to all for the help. Basically the problem is the header. I can read the data quite well when I remove the header from the file. This can be done quite easily with `x = numpy.fromfile(f, dtype=numpy.complex128, count=-1)`. The problem is that I cannot find any option of `fromfile` that skips lines (one can skip bytes, but the header size may differ from file to file). I did not manage to find the data with the other numpy functions. – Glostas Jul 10 '15 at 11:41

1 Answer

So first, thanks to all for the help. Basically the problem is the header. I can read the data quite well when I remove the header from the file. This can be done with

    x = numpy.fromfile(f, dtype=numpy.complex128, count=-1)

quite easily. The problem is that I cannot find any option of `fromfile` that skips lines (one can skip bytes, but the header size may differ from file to file).

In this great thread I found out how to convert a binary string to a numpy array:

convert binary string to numpy array

With this I could overcome the problem by reading the data file line by line and then joining every line after the end-of-header line into one string. This string was then converted into an array, exactly as I wanted.
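The approach described above could be sketched roughly as follows. This is only a minimal sketch, not the exact code I used: the function name `read_header_and_data` is made up for illustration, the end marker text is copied from the sample file in the question, and it assumes that the binary data starts directly after the newline that ends the marker line. The file is opened in binary mode, and everything after the marker is read in one go instead of being merged line by line, which gives the same bytes.

```python
import numpy as np

# Marker text copied from the sample file in the question.
END_MARKER = b">>> End of header <<<"


def read_header_and_data(path, dtype=np.complex128):
    """Return (header_lines, data) for a file with a text header and a binary payload.

    Hypothetical helper: assumes the payload begins right after the
    newline that terminates the end-of-header line.
    """
    header = []
    with open(path, "rb") as f:  # binary mode, so no bytes get decoded or altered
        for raw in iter(f.readline, b""):
            line = raw.rstrip(b"\r\n")
            if line.startswith(END_MARKER):
                break
            header.append(line.decode("ascii", errors="replace"))
        else:
            # The loop ran to end of file without finding the marker.
            raise ValueError("end-of-header marker not found")
        payload = f.read()  # everything after the header, as one bytes object
    # dtype is a guess; complex128 is what worked for my files.
    return header, np.frombuffer(payload, dtype=dtype)
```

It would be called as `header, data = read_header_and_data("myfile")`. `numpy.frombuffer` raises a `ValueError` if the number of payload bytes is not a multiple of the element size, which is a cheap hint that the guessed `dtype` or the assumed start of the data is wrong. The returned array is read-only; call `.copy()` on it if it needs to be modified.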

Glostas