What I want to do: read a series of 4 bytes e.g. 00000000 00000011 00000001 00000011 (this is a random example) from a binary file, and represent it as an integer in my program. What is the best way to do this?

EDIT (SOLUTION): I overlooked the part of the PNG specification stating that multi-byte integers are stored in network (big-endian) byte order. Hopefully this is useful to anyone who finds this question.

I am experimenting with the PNG image format and am having trouble extracting a 4-byte number. I have succeeded in opening the file and printing its binary representation, so I know that the data I am working with isn't corrupted or malformed.

I have reviewed questions like "Reading 16-bit integers from binary file c++" and the 32-bit equivalent(s), but I cannot tell whether they are reading integers that are stored in a binary file (e.g. 00000000 72 00000000) or reading raw bytes as integers, which is my goal.

As an example, the first four bytes of the first chunk are 00000000 00000000 00000000 00001101 or 13.

Following the example of questions like the one above, this should yield 13:

int test;
img.read( (char*) &test, sizeof(test));

Yet it outputs 218103808.

I also tried using a union with a character array and an integer data member, and got the same output of 218103808.

Also, on my system sizeof(int) is equal to 4.

Lastly, just to be sure that it wasn't a malformed PNG (which I am fairly sure it wasn't), I used GIMP to import it and then export it as a new file, so that it was natively created on my system.

EDIT

As I mentioned, after seekg(8) the next four bytes are 00000000 00000000 00000000 00001101, but when I decided to test the read function using

bitset<32> num;
img.read( (char*) &num, sizeof(int) );

it outputs 00001101 00000000 00000000 00000000. I am simply confused by this part. It's as if the bytes are reversed, and this string of bytes equates to 218103808.

Any insight would be appreciated

Trés DuBiel
  • For a general notion of how to get started, you might want to look at one of my older answers. http://stackoverflow.com/a/5762648/179910 – Jerry Coffin Nov 04 '15 at 00:14
  • Thanks, that is quite helpful. I am at a bit of a loss as to why I am getting the value `218103808` mentioned in my edit. I see it in your header verification function. I know it signifies the length, which is 13, but why is it mangled to that other number? Thanks again – Trés DuBiel Nov 04 '15 at 00:32
  • I think you are confused about the little-endian notation. First paragraph of https://en.wikipedia.org/wiki/Endianness should get you on track. – Andrea Gilmozzi Nov 04 '15 at 00:35
  • @TrésDuBiel: You can find a clue to the answer in the name: "portable NETWORK graphics". The numbers in a PNG file are stored in what's often referred to as "network order"--that is, big-endian. Intel's processors are little endian, so you need to byte-swap the numbers to get the correct values. – Jerry Coffin Nov 04 '15 at 16:12

1 Answer

Notice that 218103808 is 0x0D000000 in hex. You might want to read about endianness.

That means the data you are reading is in big endian format, while your platform uses little endian.
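To make the two readings of the same four bytes concrete, here is a minimal sketch. The helper names from_big_endian and from_little_endian are made up for illustration and are not part of any library; the sketch only assumes that unsigned char holds one 8-bit byte.

```cpp
#include <cstdint>

// Hypothetical helper: treat the FIRST byte as the most significant one
// (big-endian, the order PNG uses on disk).
std::uint32_t from_big_endian(const unsigned char b[4]) {
    return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
           (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
}

// Hypothetical helper: treat the FIRST byte as the least significant one
// (little-endian, which is how an x86 CPU reads the bytes when they are
// copied straight into an int).
std::uint32_t from_little_endian(const unsigned char b[4]) {
    return  std::uint32_t(b[0])        | (std::uint32_t(b[1]) << 8) |
           (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}
```

Fed the bytes 00 00 00 0D from the question, the first function gives 13 and the second gives 218103808, which matches what the raw read() produced.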

Basically you need to reverse the 4 bytes (and you likely want to use unsigned integers) so that you get 0x0000000D (13 decimal), which you can do like this:

#define BSWAPUINT(x)  ((((x) & 0x000000ff) << 24) |\
                       (((x) & 0x0000ff00) << 8)  |\
                       (((x) & 0x00ff0000) >> 8)  |\
                       (((x) & 0xff000000) >> 24))
unsigned int test;
img.read( (char*) &test, sizeof(test));
test = BSWAPUINT(test);

The above code will only work if it runs on a little-endian platform, though.
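If you do want to keep the swap-based approach, one option is to check the host byte order at run time and swap only when needed. This is just a sketch: host_is_little_endian, bswap32 and big_endian_to_host are illustrative names, and it assumes the host is either plain little endian or plain big endian.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);  // look at the first byte in memory
    return first == 1;
}

// Same byte reversal as the BSWAPUINT macro, written as a function.
std::uint32_t bswap32(std::uint32_t x) {
    return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
           ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}

// Convert a value whose bytes were copied verbatim from big-endian data
// into the host's native representation.
std::uint32_t big_endian_to_host(std::uint32_t raw) {
    return host_is_little_endian() ? bswap32(raw) : raw;
}
```

After img.read() into a std::uint32_t, passing the result through big_endian_to_host should give 13 for the chunk length in the question on either kind of machine.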

To make your code independent of whether your platform is big or little endian, you can assemble the bytes into an integer yourself. Given that you know the data format is big endian, you can do:

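unsigned char buf[4];
unsigned int test;
// Read the raw bytes into buf (not into test), then assemble them.
img.read( (char*) buf, sizeof(buf));
test  = (unsigned int)buf[0] << 24;  // first byte is the most significant
test |= (unsigned int)buf[1] << 16;
test |= (unsigned int)buf[2] << 8;
test |= (unsigned int)buf[3];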

Or, on Unix systems, you can #include <arpa/inet.h> and use ntohl() on the value you read:

test = ntohl(test);
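For completeness, a small sketch of the ntohl() route wrapped in a function. It assumes a POSIX system where <arpa/inet.h> is available; the name be32_via_ntohl is made up for this example.

```cpp
#include <arpa/inet.h>  // ntohl(); POSIX only, not available on every platform
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy four big-endian bytes into a 32-bit value
// unchanged, then let ntohl() convert from network (big-endian) order to
// host order. On a big-endian host ntohl() leaves the value as it is.
std::uint32_t be32_via_ntohl(const unsigned char bytes[4]) {
    std::uint32_t raw = 0;
    std::memcpy(&raw, bytes, sizeof raw);
    return ntohl(raw);
}
```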

(When dealing with data in this manner, you are also better off using fixed-width types such as uint32_t from stdint.h instead of int/unsigned int.)

nos
  • This is certainly the case. I cannot believe that I overlooked the endian-ness of PNG when I researched the file format. Thanks a ton – Trés DuBiel Nov 04 '15 at 00:49