
I'm having some problems reading a binary file and converting its bytes to a hex representation.

What I've tried so far:

ifstream::pos_type size;
char * memblock;

ifstream file (toread, ios::in|ios::binary|ios::ate);
if (file.is_open())
{
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory" << endl;

    std::string tohexed = ToHex(memblock, true);

    std::cout << tohexed << std::endl;
}

Converting to hex:

string ToHex(const string& s, bool upper_case)
{
    ostringstream ret;

    for (string::size_type i = 0; i < s.length(); ++i)
        ret << std::hex << std::setfill('0') << std::setw(2) << (upper_case ? std::uppercase : std::nouppercase) << (int)s[i];

    return ret.str();
}

Result: 53514C69746520666F726D61742033.

When I open the original file with a hex editor, this is what it shows:

53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00
04 00 01 01 00 40 20 20 00 00 05 A3 00 00 00 47
00 00 00 2E 00 00 00 3B 00 00 00 04 00 00 00 01
00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 05 A3
00 2D E2 1E 0D 03 FC 00 06 01 80 00 03 6C 03 D3

Is there a way to get the same output as the hex editor using C++?

Working solution (by Rob):

...

std::string tohexed = ToHex(std::string(memblock, size), true);

...
string ToHex(const string& s, bool upper_case)
{
    ostringstream ret;

    for (string::size_type i = 0; i < s.length(); ++i)
    {
        int z = s[i]&0xff;
        ret << std::hex << std::setfill('0') << std::setw(2) << (upper_case ? std::uppercase : std::nouppercase) << z;
    }

    return ret.str();
}
  • "memblock contains only first 15 bytes, stopping at the null byte (16th)" What makes you say that? I don't see where you are printing out the contents of `memblock`. I suspect that `memblock` contains the entire file, but that the code you aren't showing us misinterprets its contents. Please reduce your program to the smallest complete program that demonstrates the error, and post that program in the question. http://sscce.org – Robᵩ Mar 08 '12 at 17:20
  • @Rob okay shall i repost the first 15 bytes for you to be more clear? –  Mar 08 '12 at 17:21
  • Assuming that this is a homework or a learning assignment of some sort, here's a couple of hints: (1) you are missing a `while` loop, (2) calling `tellg()` on a stream that you have just opened is premature. – Sergey Kalinichenko Mar 08 '12 at 17:22
  • @develroot - No, please just tell us how you concluded that it only contained the first fifteen bytes. – Robᵩ Mar 08 '12 at 17:22
  • that's what it shows when I run the program. –  Mar 08 '12 at 17:24
  • Have you checked that the value of `size` isn't 15 bytes, just by pure coincidence being the count before the first null byte? `ifstream` in binary mode certainly isn't supposed to do anything with null bytes... – Matthew Walton Mar 08 '12 at 17:24
  • @develroot - "that's what it shows when I run the program". What program? Please post a complete minimal program that demonstrates the error, so that we can help you find the error. http://sscce.org – Robᵩ Mar 08 '12 at 17:25
  • @sscce.org what should I post? the includes? the return 0; or what? THIS IS THE WHOLE PROGRAM –  Mar 08 '12 at 17:26
  • @develroot - That is by no means the entire program, but it was sufficient to find the error. *Please* read http://sscce.org to understand why I asked you for a complete program. – Robᵩ Mar 08 '12 at 17:32

1 Answer

char *memblock;
… 
std::string tohexed = ToHex(memblock, true);
…

string ToHex(const string& s, bool upper_case)

There's your problem, right there. The constructor std::string::string(const char*) interprets its input as a nul-terminated string. So, only the characters leading up to '\0' are even passed to ToHex. Try one of these instead:

std::string tohexed = ToHex(std::string(memblock, memblock+size), true);
std::string tohexed = ToHex(std::string(memblock, size), true);
Robᵩ
  • okay...that worked for the hex conversion...but i still have a problem: the result isn't exactly the same. `47` is now `00`, or `2D` is now `05`..and so are all the non-ASCII characters. –  Mar 08 '12 at 17:38
  • @develroot - if it helps, you have a sign-extension bug in your `ToHex` routine. Try `(s[i]&0xff)` instead of `(int)s[i]`. – Robᵩ Mar 08 '12 at 17:53
  • This is amazing! :) – ReinstateMonica3167040 Sep 15 '18 at 23:48