4

I have a file that I want to read in C++. The first thing I need to read and check is the file's magic number. In my case it is the hex value 0xABCDEF00.

I read and compare the number this way:

ifstream input("C:/Desktop/myfile", ios::binary);
if (input.is_open()) {
    input.seekg(0, ios::beg);
    unsigned char magic[4] = {0};
    input.read((char*)magic, sizeof(magic));

    if (magic[0] == 0xAB &&
        magic[1] == 0xCD &&
        magic[2] == 0xEF &&
        magic[3] == 0x00) {
        cout << "It's my File!" << endl;
    } else {
        cout << "Unknown File!" << endl;
    }
}

This works very well, but is there a way to compare the whole char[] array that was read in one go? Something like this:

unsigned int magicNumber = 0xABCDEF00;
... same code for reading file as above ...
Instead of checking each array entry, I would like to write something like this:

if(magic == magicNumber) {
    do something ...
}

It would be nice to know if there is such a way. If not, thanks for telling me that there is no such way :)

Opa114
  • You could use memcpy to copy the contents of the char array into an unsigned int: `memcpy(&anUnsignedInt, magic, sizeof(unsigned int));` – Christoph Jun 14 '16 at 21:48
  • Is the magic number in the range of `long`? Then consider `std::stol` or a similar function (`stoi` for `int`, `stoll` for `long long`, `stoul`/`stoull` for unsigned)... – Lapshin Dmitry Jun 14 '16 at 21:48
  • That is not going to work with a binary file, @LapshinDmitry – user4581301 Jun 14 '16 at 21:48
  • @user4581301 Oh, my fault. Then Christoph's comment is the way to go! – Lapshin Dmitry Jun 14 '16 at 21:49
  • With endian issues you'll need to know how this will show up in memory. What you've got here isn't totally mad, but comparing a 4-byte buffer using [`memcmp`](http://en.cppreference.com/w/cpp/string/byte/memcmp) could be an improvement. – tadman Jun 14 '16 at 21:50
  • The file is big-endian. `memcmp` seems to be a good alternative. – Opa114 Jun 14 '16 at 21:51
  • By the time the compiler's done with that if, assuming optimization is on, you aren't going to do much better. Converting to unsigned int could send you down an endian rabbit hole, so @Christoph 's solution is a bit of a risk. If you want to give it a shot, save yourself the memcpy and do this: `uint32_t temp; input.read((char*)&temp, sizeof(temp));` – user4581301 Jun 14 '16 at 21:54

4 Answers

4

Good old memcmp could help here. Once you have read the unsigned char magic[4] you can do the comparison as simply as:

const unsigned char magicref[4] = {0xAB, 0xCD, 0xEF, 0};
if (memcmp(magic, magicref, sizeof(magic)) == 0) {
    // do something ...
}

This is endianness independent, because the bytes are compared in the order in which they appear in the file.

If you know what your platform will give you for the magic number and do not care about portability to other platforms, you can process everything directly as uint32_t:

uint32_t magic, refmagic = 0xABCDEF00;  // big endian here...
input.read(reinterpret_cast<char *>(&magic), sizeof(magic)); // directly load bytes into uint32_t
if (magic == refmagic) {
    //do something...
}

This is not portable across platforms, but it can be used in simple cases, provided you add a comment in bold red flashing font saying BEWARE: use only on a big endian system.
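
If a comparison against the literal 0xABCDEF00 is still wanted on every platform, one possible approach (not taken from the answer above) is to assemble the number from the four bytes with shifts. This sketch assumes the file stores the magic number in big-endian order, as the byte sequence AB CD EF 00 implies; `load_be32` is a hypothetical helper name:

```cpp
#include <cstdint>

// Hypothetical helper: combine four bytes stored in big-endian order
// (most significant byte first) into a 32-bit value. The shifts operate
// on values, not on memory layout, so the result is the same on
// little-endian and big-endian hosts.
std::uint32_t load_be32(const unsigned char* b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}
```

With the `unsigned char magic[4]` buffer from the question, the check would then read `if (load_be32(magic) == 0xABCDEF00u) { ... }`.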

Serge Ballesta
  • I think the first part would be the best solution for me. It is very good that it is endianness independent. But `const char magicref[4] = {0xAB, 0xCD, 0xEF, 0}` should be `const unsigned char magicref[4] = {0xAB, 0xCD, 0xEF, 0}`. Otherwise I get compiler warnings like: warning: narrowing conversion of '237' from 'int' to 'const char' inside – Opa114 Jun 14 '16 at 22:42
  • To be perfect, @Opa114, it should actually be uint8_t instead, as those are not semantically characters. – olivecoder Jun 14 '16 at 22:44
  • @olivecoder thanks for the hint, I will try using `uint8_t` instead of `unsigned char` – Opa114 Jun 14 '16 at 22:48
4

You can do:

union magic_t {
    uint8_t bytes[4];
    uint32_t number;
};

then as you originally wanted:

magic_t my_magic = {0xAB, 0xCD, 0xEF, 0};
magic_t file_magic;
input.read((char *) file_magic.bytes, sizeof(file_magic));
if ( file_magic.number == my_magic.number )...

and you don't need to care about endianness at all.

Depending on the endianness, the value of number can differ, but that doesn't matter: both unions hold the same sequence of bytes, so the comparison succeeds even if number isn't 0xABCDEF00 on your platform.

Or, optionally, we can just use casting (but I think that's ugly).
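
A pointer cast from a char buffer to uint32_t* runs into the aliasing and alignment rules, and reading a union member other than the one last written is formally undefined in standard C++ (some compilers support it as an extension). A rough sketch of the same idea using std::memcpy instead avoids both problems. The helper names `bytes_to_u32` and `same_magic` are made up for illustration:

```cpp
#include <cstdint>
#include <cstring>   // std::memcpy

// Hypothetical helper: copy four raw bytes into a uint32_t using the
// host's own byte order. memcpy has no aliasing or alignment pitfalls.
std::uint32_t bytes_to_u32(const unsigned char* bytes) {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}

// Both sides are converted the same way, so the numbers are equal
// exactly when the byte sequences are equal, whatever the endianness.
bool same_magic(const unsigned char* file_bytes) {
    static const unsigned char expected[4] = {0xAB, 0xCD, 0xEF, 0x00};
    return bytes_to_u32(file_bytes) == bytes_to_u32(expected);
}
```

As with the union, the numeric value itself may differ between platforms; only the equality test is meaningful.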

olivecoder
  • An alternative solution, too, but I think it is a little bit too complex for such a small comparison. Still, good to know that I can do it this way :) – Opa114 Jun 14 '16 at 22:47
  • *Or, optionally, we can just use casting* No, in general you can't use casting as that violates [strict aliasing](http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) and is therefore undefined behavior. It's also likely to raise SIGBUS on hardware that has alignment restrictions. – Andrew Henle Jun 15 '16 at 09:13
  • Thanks for pointing that out, but I haven't recommended or elaborated on the casting solution. The code was written directly on Stack Overflow and doesn't even work. Please check one of the answers above for a casting solution; you can also help to improve them. – olivecoder Jun 15 '16 at 09:24
3

If you know the endianness of your platform, you can use a uint32_t variable to do that.

For a little endian system, use:

uint32_t number;
input.read(reinterpret_cast<char*>(&number), 4);
if ( number == 0x00EFCDAB )
{
   cout << "It's my File!" << endl;
}

For a big endian system, use:

uint32_t number;
input.read(reinterpret_cast<char*>(&number), 4);
if ( number == 0xABCDEF00 )
{
   cout << "It's my File!" << endl;
}
incarnadine
R Sahu
1

There are already very good answers here! For the record, here is a variant using equal() from the standard <algorithm> library:

unsigned char magic[4] = {0};
input.read((char*)magic, sizeof(magic));

const unsigned char code[4] = { 0xab, 0xcd, 0xef, 0x00 };
if(equal(code, code+sizeof(code), magic)) 
    cout << "It's my File!" << endl;
else 
   cout << "Unknown File!" << endl;

It's very similar to the memcmp() version, but it works on any iterator range and therefore with any container, not only arrays of char.

Christophe