3

I want to understand how reading binary files works in C++. My code:

int main() {
    ifstream ifd("input.png",ios::binary |ios::ate);
    int size = ifd.tellg();
    ifd.seekg(0,  ios::beg);
    vector<char> buffer;
    buffer.reserve(size);
    ifd.read(buffer.data(), size);

    cout << buffer.data();
    return 0;
}

I thought that if I sent my buffer to cout I would get the result in binary, but that is not the case.

My output is: ˙Ř˙á6Exif

And if I read a text file, it displays the text in normal form, not in binary. Obviously my logic is not right here. How can I read files into a buffer so that it contains binary values? P.S. I'm doing this to implement a Shannon-Fano algorithm, so if anyone has any advice on reading a binary file I would be grateful.

mrNobody
  • Your console won't print binary; it will try to make sense of the binary data as if it were text plus a few control codes – Galik May 03 '17 at 16:56
  • You opened a picture file `input.png` which (presumably) contains binary encoded image data (i.e. an array of red/green/blue intensity levels or something similar). – Galik May 03 '17 at 17:01
  • @MarkoMlakar Yes, the data is in binary. C++ itself makes no real distinction between binary and text data. Opening the file with `ios::binary` ensures that no text conversion happens. – Pavel P May 03 '17 at 17:02
  • It sounds like you are misunderstanding what "binary" means here. An example of your [broken] assumption/expectation would allow us to tailor our answers better. – Lightness Races in Orbit May 03 '17 at 17:13

3 Answers

5

You need to resize your vector, not reserve it:

int main()
{
    ifstream ifd("input.png", ios::binary | ios::ate);
    int size = ifd.tellg();
    ifd.seekg(0, ios::beg);
    vector<char> buffer;
    buffer.resize(size); // << resize not reserve
    ifd.read(buffer.data(), size);

    // buffer has no '\0' end-of-string terminator, so pass an explicit byte count
    cout.write(buffer.data(), buffer.size());
}

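Otherwise your code tries to read size characters into an empty buffer: reserve only allocates capacity and leaves the vector's size at 0, so writing through buffer.data() is undefined behavior. You can also use the vector constructor that sets the size directly: vector<char> buffer(size);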
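To make the difference concrete, here is a minimal sketch. The helper names `size_after_reserve` and `size_after_resize` are made up for illustration; they only report what `size()` returns after each call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: element count of a fresh vector after reserve(n).
std::size_t size_after_reserve(std::size_t n)
{
    std::vector<char> v;
    v.reserve(n);      // allocates capacity only, creates no elements
    return v.size();   // still 0
}

// Hypothetical helper: element count of a fresh vector after resize(n).
std::size_t size_after_resize(std::size_t n)
{
    std::vector<char> v;
    v.resize(n);       // creates n elements, each initialized to '\0'
    return v.size();   // n
}
```

Only the resized vector has elements that `read` is allowed to overwrite.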

You can output byte values of your buffer this way:

void dumpbytes(const vector<char>& v)
{
    for (size_t i = 0; i < v.size(); ++i) // size_t avoids a signed/unsigned comparison
    {
        printf("%u ", (unsigned char)v[i]);
        if ((i+1) % 16 == 0)
            printf("\n");
    }
    printf("\n");
}

Or something like common hex editors do for hex output:

void dumphex(const vector<char>& v)
{
    const int N = 16;                        // bytes per row
    const char hex[] = "0123456789ABCDEF";
    char buf[N*4+5+2];                       // hex columns, gap, ASCII column, '\n', '\0'
    if (v.empty())
        return;                              // buf would otherwise be printed uninitialized
    for (size_t i = 0; i < v.size(); ++i)
    {
        int n = i % N;
        if (n == 0)
        {
            if (i)
                fputs(buf, stdout);          // buf already ends in '\n'; puts would add a second one
            memset(buf, 0x20, sizeof(buf));
            buf[sizeof(buf) - 2] = '\n';
            buf[sizeof(buf) - 1] = '\0';
        }
        unsigned char c = (unsigned char)v[i];
        buf[n*3+0] = hex[c / 16];
        buf[n*3+1] = hex[c % 16];
        buf[3*N+5+n] = (c>=' ' && c<='~') ? c : '.';
    }
    fputs(buf, stdout);
}

A buffer containing "Hello World!" would be printed as follows:

48 65 6C 6C 6F 20 57 6F 72 6C 64 21                  Hello World!
Pavel P
  • Thank you for the answer, but the result is the same – mrNobody May 03 '17 at 16:58
  • @MarkoMlakar if your file contains binary data, why would you try to output that data to `cout`? Try reading a text file and you'll see your text. – Pavel P May 03 '17 at 17:00
  • I thought that if I input an image it will cout the bits that make that image – mrNobody May 03 '17 at 17:03
  • Here's a nice little hack you could use. [here](http://stackoverflow.com/questions/111928/is-there-a-printf-converter-to-print-in-binary-format) – fingaz May 03 '17 at 17:06
  • @MarkoMlakar and it does just that! If your file contained 4 bytes with values `[0x20, 0x20, 0x20, 0x20]` then your code would output four spaces, as `' ' == 0x20`. `cout the bits that make that image` - do you mean hex or binary? Either way, you need to convert your data manually. The point is, with my change your code should read your file correctly. – Pavel P May 03 '17 at 17:12
  • And how can I output ASCII values of my buffer? – mrNobody May 03 '17 at 17:24
  • @MarkoMlakar `dumphex` that I added outputs your buffer in hex. `ASCII values of my buffer` - there is no such thing for a buffer. ASCII is for text. Do you mean the decimal values of your buffer's bytes? Then loop through your buffer and print each byte (don't forget to cast to an int, otherwise you will print ASCII chars, the same as what you started with). – Pavel P May 03 '17 at 17:46
2

Based on Pavel's answer, you can also add this to see the data in real binary, namely 0s and 1s. Do not forget to include the bitset header.

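void dumpbin(const vector<char>& v)
{
    for (size_t i = 0; i < v.size(); ++i)
    {
        // bitset<8> prints the byte as eight 0/1 digits, most significant bit first
        cout << bitset<8>((unsigned char)(v[i])) << " ";
        if ((i + 1) % 8 == 0)
            cout << "\n"; // stay on cout rather than mixing in printf
    }
}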
Shadi
1

Opening a file in binary mode means that your operating system won't transparently translate line endings between the CR/LF/CRLF formats.
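As a rough illustration of what binary mode guarantees, the sketch below writes some bytes (including CR, LF and 0x1A) in binary mode and reads them back. `binary_round_trip` is a hypothetical helper name, and the file path is whatever the caller chooses; error handling is left out to keep it short.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes `bytes` to `path` in binary mode, reads them
// back in binary mode, and returns what was read.
std::vector<char> binary_round_trip(const std::string& path,
                                    const std::vector<char>& bytes)
{
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // out is flushed and closed at the end of this scope

    std::ifstream in(path, std::ios::binary);
    std::vector<char> result(bytes.size());
    in.read(result.data(), static_cast<std::streamsize>(result.size()));
    result.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read
    return result;
}
```

With both streams in binary mode the returned bytes should match the input exactly; in text mode some platforms may translate line endings on the way in or out.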

It doesn't have any effect at all on how your computer prints a string, seven lines later. I don't know what "get the result in binary" means, but I suggest rendering the contents of your vector<char> by printing its constituent bytes, one at a time, in their hex-pair representation:

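// requires <iostream> and <iomanip>
std::cout << std::hex << std::setfill('0');
// cast through unsigned char so each byte prints as a number, not as a character
for (const auto byte : buffer)
   std::cout << std::setw(2) << static_cast<int>(static_cast<unsigned char>(byte));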

The output will look something like:

0123456789abcdef0123456789abcdef

Each pair of characters represents the value (0-255) of one byte in your data, written in the base-16 (or "hex") numeral system. This is a common representation of non-text information.
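If you would rather build the hex text as a string than stream it, something like the following sketch could work. `to_hex` is a hypothetical name, not a standard function; it splits each byte into its high and low 4-bit halves and looks up one hex digit for each.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: formats each byte as two lowercase hex digits.
std::string to_hex(const std::vector<char>& bytes)
{
    const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (char ch : bytes)
    {
        unsigned char b = static_cast<unsigned char>(ch);  // avoid negative values
        out += digits[b >> 4];     // high nibble
        out += digits[b & 0x0F];   // low nibble
    }
    return out;
}
```

For example, the two bytes 'H' (0x48) and 'i' (0x69) come out as the string "4869".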

Alternatively, you could output the data in base-2 (literally "binary").

It's up to you how to present the information. The file open mode has nothing to do with your vector.

You also need to fix your vector's size; at the moment you call .reserve when you meant .resize.
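Putting the pieces together, a whole-file read might look like the sketch below. `read_file` is a hypothetical helper, and throwing `std::runtime_error` on failure is just one possible design choice; the original code does no error checking at all.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at `path` into a vector of bytes.
std::vector<char> read_file(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    const std::streamoff size = in.tellg();   // opened at the end, so this is the file size
    if (size < 0)
        throw std::runtime_error("cannot determine size of " + path);

    std::vector<char> buffer(static_cast<std::size_t>(size));  // sized, not just reserved
    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(buffer.data(), size))
        throw std::runtime_error("short read on " + path);
    return buffer;
}
```

The returned vector can then be handed to whichever dump routine you prefer, or to your own byte-counting code.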

Lightness Races in Orbit