
My problem is that I have a string in hex representation, say 036e, and I want to write it to a binary file in the same hex representation. I first convert the string to an integer using the strtoul() function, then write that integer to a file with ofstream's write() function. After running the code below, I get the following output in my file:

6e03 0000

#include <cstdlib>   // strtoul
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main() {
  ofstream fileofs("binary.bin", ios::binary | ios::out);
  string s = "036e";
  int x = strtoul(s.c_str(), NULL, 16);
  fileofs.write((char*)&x, sizeof(int));
  fileofs.close();
}

While the result that I expect is something like this:

036e

Can anybody explain exactly what I'm doing wrong over here?

Rahul Sharma
  • Adjust your expectations so that they coincide with reality. Apparently you are working on a little-endian platform, where `sizeof(int)` is 4. Expecting 2 bytes defies logic, as much as expecting big-endian layout. – IInspectable Oct 09 '16 at 18:26
  • I tried using `unsigned short` instead. But it outputs some garbage results. – Rahul Sharma Oct 09 '16 at 18:30

2 Answers

Your problem has to do with endianness and also with the size of an int.

The bytes come out reversed because you are running on a little-endian system, which stores the least significant byte first.

The two extra zero bytes appear because your compiler uses a 4-byte int, so write() emits all four bytes of x.

There is nothing wrong with your code as long as it is only ever used on little-endian systems where int is 4 bytes.

As long as you stay on the same kind of system, reading your integers back from the file with similar code gives the right values (the first integer read will be 0x36E).

To write the data the way you want, you can use almost the same code, with the minor changes noted below:

// htons() is declared in <arpa/inet.h> on POSIX systems (<winsock2.h> on Windows)
unsigned short x = htons(strtoul(s.c_str(), NULL, 16));
fileofs.write((char*)&x, sizeof(x));

But be aware that when you read the data back you must convert it to host order using ntohs(). Written this way, the code produces the same file on any system that provides these functions, because network byte order is always big-endian and the conversion functions only swap bytes when necessary.
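A minimal sketch of that write/read round trip, assuming a POSIX system where htons() and ntohs() come from <arpa/inet.h>. The helper names write_u16_network and read_u16_network are made up for illustration and are not part of any library:

```cpp
#include <arpa/inet.h>  // htons, ntohs (POSIX; on Windows these live in <winsock2.h>)
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: store a 16-bit value in network (big-endian) byte order.
bool write_u16_network(const std::string& path, std::uint16_t value) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const std::uint16_t wire = htons(value);  // host order -> network order
    out.write(reinterpret_cast<const char*>(&wire), sizeof wire);
    return static_cast<bool>(out);
}

// Hypothetical helper: read the value back and convert it to host order.
bool read_u16_network(const std::string& path, std::uint16_t& value) {
    std::ifstream in(path, std::ios::binary);
    std::uint16_t wire = 0;
    if (!in.read(reinterpret_cast<char*>(&wire), sizeof wire)) return false;
    value = ntohs(wire);  // network order -> host order
    return true;
}
```

Calling write_u16_network("binary.bin", 0x036e) should leave the two bytes 03 6e in the file, and read_u16_network() should hand back 0x036e regardless of the host's byte order.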

You can find more information on another thread here and in the linux.com man page for those functions.

João Amaral
  • Thanks. Got it. But I'd still like to know if there is any way I can write the string "036e" as hex to the binary file (maybe without converting it into an int)? – Rahul Sharma Oct 09 '16 at 19:03
  • If you have just 2 bytes you could use a short instead of an int and use htons() function on top of the converted data. I'll change my answer with the complete example. – João Amaral Oct 09 '16 at 19:47
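On the question in the comments about skipping the int altogether: one possible approach is to convert the string two hex digits at a time and write the resulting bytes in the order they appear, so byte order never comes into play. This is only a sketch; hex_to_bytes and write_hex_string are hypothetical names, and the code assumes an even number of valid hex digits with no error checking:

```cpp
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: turn "036e" into the bytes {0x03, 0x6e}.
// Assumes an even number of valid hex digits; no validation is done here.
std::vector<unsigned char> hex_to_bytes(const std::string& hex) {
    std::vector<unsigned char> bytes;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        const std::string pair = hex.substr(i, 2);  // two digits = one byte
        bytes.push_back(
            static_cast<unsigned char>(std::strtoul(pair.c_str(), nullptr, 16)));
    }
    return bytes;
}

// Hypothetical helper: write those bytes to a file in the order they appear.
bool write_hex_string(const std::string& path, const std::string& hex) {
    const std::vector<unsigned char> bytes = hex_to_bytes(hex);
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

Because each byte is written on its own, this also works for strings longer than any integer type could hold.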

If you want 16 bits, use a data type that is guaranteed to be 16 bits. uint16_t from cstdint should do the trick.

Next, endianness.

This is described in detail in many places. The TL;DR version is that some systems, including virtually every desktop PC you are likely to write code for, store their integers with the bytes BACKWARD. Explaining why is way out of scope here, but once you get down into the common usage patterns it does make sense.

So what you see as 036e is two bytes, 03 and 6e, stored with the least significant byte, 6e, first. What the computer holds in memory is therefore the two-byte sequence 6e 03. This is what gets written to the output file unless you take steps to force an ordering on the output.

There are tons of different ways to force an ordering, but let's focus on one that both always works (even when porting to a system that is already big-endian) and is easy to read.
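To see this layout for yourself, here is a small sketch that dumps the bytes of a value in the order they sit in memory. memory_bytes is a hypothetical helper written for this illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: hex dump of an object's bytes in memory order.
template <typename T>
std::string memory_bytes(const T& value) {
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &value, sizeof(T));  // copy the object representation
    std::string text;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(raw[i]));
        if (i != 0) text += ' ';
        text += buf;
    }
    return text;
}
```

On a little-endian machine, memory_bytes(std::uint16_t{0x036e}) should give "6e 03", and with a 4-byte int, memory_bytes(int{0x036e}) should give "6e 03 00 00", which matches the file contents in the question. A big-endian machine would show the bytes in the opposite order.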

uint16_t in = 0x036e; // value to store, e.g. the result of strtoul()
uint8_t out[2];

out[0] = (in >> 8) & 0xFF; // put the most significant byte of in first
out[1] = in & 0xFF;        // put the least significant byte of in second

out is then written to the file.
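Put together with the file I/O, the shifting approach might look like the sketch below. The function names write_u16_big_endian and read_u16_big_endian are invented for this example; the reader simply reverses the shifts:

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: write a 16-bit value most significant byte first.
bool write_u16_big_endian(const std::string& path, std::uint16_t in) {
    std::uint8_t out[2];
    out[0] = static_cast<std::uint8_t>((in >> 8) & 0xFF);  // high byte
    out[1] = static_cast<std::uint8_t>(in & 0xFF);         // low byte
    std::ofstream file(path, std::ios::binary);
    file.write(reinterpret_cast<const char*>(out), sizeof out);
    return static_cast<bool>(file);
}

// Hypothetical helper: rebuild the value from the two bytes in the file.
bool read_u16_big_endian(const std::string& path, std::uint16_t& value) {
    std::ifstream file(path, std::ios::binary);
    char raw[2];
    if (!file.read(raw, sizeof raw)) return false;
    value = static_cast<std::uint16_t>(
        (static_cast<unsigned char>(raw[0]) << 8) |
        static_cast<unsigned char>(raw[1]));
    return true;
}
```

Since the shifts operate on the value rather than on its memory layout, the file should contain 03 6e for an input of 0x036e on both little-endian and big-endian hosts.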

Recommended supplementary reading: Serialization. This will help explain the common next problem: "Why do my strings crash my program after I read them back in?"

user4581301