I am parsing a binary file. My code (edited for brevity) looks like:
std::ifstream ifs;
...
ifs.open(argv[i], std::ios::binary);
if (ifs.is_open()) {
    ...
    std::string header;
    std::string data;
    std::copy_n(std::istream_iterator<char>(ifs), HEADER_SIZE, std::back_inserter(header));
    while (!ifs.eof()) {
        if (!ifs.good()) {
            std::cerr << "Error reading message header." << std::endl;
            size = 0u;
        } else {
            data.clear();
            std::copy_n(std::istream_iterator<char>(ifs), size, std::back_inserter(data));
            if (!ifs.good()) {
                std::cerr << "Error reading message data." << std::endl;
                size = 0u;
            } else {
                ...
                header.clear();
                std::copy_n(std::istream_iterator<char>(ifs), DCP_LOG_HEADER_SIZE, std::back_inserter(header));
            }
        }
    }
}
The binary data contains byte 0x20 as one of the bytes in a header part way through the file. The hexdump of the relevant part of the file looks like:
00000080 01 00 00 00 99 7a 2b 50 dd 00 04 05 00 00 00 00 |.....z+P........|
00000090 99 7c 20 50 dd 00 04 05 01 00 00 00 99 7c 21 50 |.| P.........|!P|
After adding debug output to my code, I see that the bytes are read as:
Header: 00 99 7a 2b 50 dd 00 04
Data: 05 00 00 00
Header: 00 99 7c 50 dd 00 04 05
Error reading message data.
The hexdump lines up quite nicely. You can clearly see that 00 99 7c 20 50 dd 00 04 is read as 00 99 7c 50 dd 00 04, and then the subsequent byte 05 is read too.
Why is the 0x20 byte (space character) not read?
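A minimal sketch that may help isolate this, assuming a std::istringstream stands in for the file; the helper names read_with_iterator and read_unformatted are hypothetical and not part of my real code. std::istream_iterator<char> extracts with operator>>, which skips whitespace while the skipws flag is set (it is set by default, and std::ios::binary does not clear it), whereas std::noskipws or an unformatted read() should keep the 0x20 byte:

```cpp
#include <cstddef>
#include <ios>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads every char through std::istream_iterator,
// which uses formatted extraction (operator>>) under the hood.
std::string read_with_iterator(const std::string& bytes, bool skip_whitespace) {
    std::istringstream in(bytes, std::ios::binary);
    if (!skip_whitespace) {
        in >> std::noskipws;  // clear the skipws flag so whitespace is kept
    }
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Hypothetical helper: reads up to n bytes with unformatted input,
// which never skips whitespace regardless of the skipws flag.
std::string read_unformatted(const std::string& bytes, std::size_t n) {
    std::istringstream in(bytes, std::ios::binary);
    std::string out(n, '\0');
    in.read(&out[0], static_cast<std::streamsize>(n));
    out.resize(static_cast<std::size_t>(in.gcount()));  // keep only bytes actually read
    return out;
}
```

Feeding the three bytes 7c 20 50 to read_with_iterator with skip_whitespace set to true drops the space, while the other two variants keep all three bytes.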
As a side question (maybe needing a separate Stack Overflow question), if I create a function-scope variable for std::istream_iterator<char>(ifs) and use it to avoid the overhead of constructing such an object twice each time around the loop, I get some very odd behaviour. The first read is fine, but the second has a single null byte prepended to the read data. The third read gets two null bytes, at which point the code fails. I guess that the nth read gets n-1 null bytes prepended to the read data. Why can't I re-use the object?
Also, if I use std::istreambuf_iterator<char>(ifs), then even without storing it in a function-scope variable, I get null bytes prepended to the data.
Frankly, I am very disappointed in C++ when it comes to file I/O. Trying to do things in a "proper" C++ way can lead to some very awkward code, and I have read several articles which show that, without some really awkward C++ code, reading files is simply quicker with basic C functions than with C++ streams.