I'm writing a small parser, and after some trial and error it seems the file's byte order is big endian (which I was told isn't common, but it's there).

I don't think the original devs included anything about endianness, since the byte order may depend only on the hardware that wrote the file. Please correct me if this is wrong (is it possible for the developers to specify the endianness in the C code?).

So I don't really see how I would parse those files when there is no actual way to determine the byte order, say, for an Int32 number. I've read this similar post, but that's for a system that both writes and reads the binary files, so you can just use a system-endianness reader.
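To make the problem concrete: once the file's byte order is known, the reader does not have to follow the host machine. Here is a minimal sketch in Python, using the standard `struct` module with an explicit byte-order prefix. The helper name `read_int32` and the sample bytes are made up for illustration; they are not from the instrument's format.

```python
import struct

def read_int32(buf: bytes, offset: int = 0, big_endian: bool = True) -> int:
    """Decode a signed 32-bit integer with an explicit byte order.

    '>' forces big endian and '<' forces little endian, so the result
    does not depend on the machine running the parser.
    """
    fmt = ">i" if big_endian else "<i"
    return struct.unpack_from(fmt, buf, offset)[0]

raw = bytes([0x00, 0x00, 0x01, 0x00])  # hypothetical 4-byte field
as_big = read_int32(raw, big_endian=True)      # 0x00000100 = 256
as_little = read_int32(raw, big_endian=False)  # 0x00010000 = 65536
```

The same four bytes give two very different numbers, which is why I need to know which order the file actually uses.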

In my case, the code parses instrument output that was gathered and written in binary by potentially any type of computer with any OS (though I guess endianness depends on the system architecture and not the OS).

Do you have any ideas or pointers on how to deal with this problem?

Wikipedia was very informative, but as far as I have read it only gives general information.

Minsky
  • Just to get it right. You are trying to read a binary formatted file, which was created by a C program (but you don't have access to that program)? If you don't have the source of the C program AND can't ask the developers of that program AND the binary format specification does not specify endianness, I think you will have a hard time figuring it out – Jakob Stark Jan 29 '22 at 11:14
  • "*is it possible that the developers specify in the C code the endianness*". Yes it is. *If* a dev has defined the data to be big endian then they can certainly ensure it is written in big endian. But you can't determine endianness from arbitrary data without it being specified or unless there is other info, such as known markers, that you know about in the data – kaylum Jan 29 '22 at 11:16
  • @JakobStark partially yes. I only have access to the C structures described in a manual, in a vague way. But that is quite useful. – Minsky Jan 29 '22 at 11:47
  • @kaylum thanks so much. Then, once I figure out (interpreting the numbers) this is big endian, is that it? Or maybe they didn't define it and the system defines it? – Minsky Jan 29 '22 at 11:48
  • @Minsky it is either LE or BE. If you don't know what the developers did, it does not matter if the endianness is defined by the system or the C program. – Jakob Stark Jan 29 '22 at 11:54
  • @JakobStark it does matter: if the devs didn't define it and the system does, someone running the parser on a different computer could get the wrong output. – Minsky Jan 29 '22 at 12:41
  • @Minsky sorry if that was not clear enough. I meant it in the sense, that if you do not know what the developers did, you need to deduce the endianness from the file anyway and it does not matter if it was defined by the system or the C program – Jakob Stark Jan 29 '22 at 12:55
  • @JakobStark you're right, thanks for clarifying. So far there aren't many fields with fixed values that I could use to deduce endianness. But I will probably resort to that. – Minsky Jan 29 '22 at 13:46
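
The approach discussed in the comments, deducing the byte order from a field whose value is known or at least bounded, could look roughly like this. It is only a sketch under assumptions: the function name `guess_byte_order`, the offset, and the "channel count between 1 and 1024" field are hypothetical and would have to be replaced by something the manual actually documents.

```python
import struct

def guess_byte_order(buf: bytes, offset: int, lo: int, hi: int) -> str:
    """Guess the byte order from one int32 field whose plausible range is known.

    Returns '>' (big endian) or '<' (little endian). Raises ValueError
    if neither or both interpretations fall inside [lo, hi].
    """
    candidates = [
        prefix for prefix in (">", "<")
        if lo <= struct.unpack_from(prefix + "i", buf, offset)[0] <= hi
    ]
    if len(candidates) != 1:
        raise ValueError("byte order is ambiguous for this field")
    return candidates[0]

# Hypothetical header: a channel count expected to be between 1 and 1024.
header = bytes([0x00, 0x00, 0x00, 0x10])
order = guess_byte_order(header, 0, 1, 1024)
```

The returned prefix can then be reused for every other `struct` format string in the parser. A single field can be ambiguous (for example when the bytes read the same in both directions), so checking several fields is safer than trusting one.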

0 Answers