I have binary files whose contents I'm trying to read into a vector. All the files are the same size, yet with the code below the final vector size is always slightly smaller than the file size. The shortfall differs from file to file, but is consistent for any given file. I don't understand what is going on here.

#include <fstream>
#include <vector>
#include <iostream>
#include <iterator>
#include <string>
int main(int argc, char *argv[]) {
  std::string filename(argv[1]);

  // Get file size
  std::ifstream ifs(filename, std::ios::binary | std::ios::ate);
  int size = (int)ifs.tellg();
  std::cout << "Detected " << filename << " size: " << size << std::endl; // seems correct!

  // Load file
  ifs.seekg(0, std::ios::beg);
  std::istream_iterator<char unsigned> start(ifs), end;
  std::vector<char unsigned> v;
  v.reserve(size);
  v.assign(start, end);

  std::cout << "Loaded data from " << filename << ", with " << v.size() << " elements" << std::endl; 
}

Trying this on a file, I get this:

Detected foo_binary.bin size: 2113753
Loaded data from foo_binary.bin, with 2099650 elements

The 2113753 number is the correct size of the file in bytes.

Trying this on another file of the same size, the vector ends up with 2100700 elements. A few more, but again not the whole file.

What's going on here?

Stan
  • Do you get the same thing using one of [these](http://stackoverflow.com/questions/7241871/loading-a-file-into-a-vectorchar) methods? – NathanOliver Dec 23 '15 at 20:24

1 Answer

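There are multiple stream iterators. The class template std::istream_iterator<T> is for formatted input, i.e., it is going to skip leading whitespace before trying to read an object of type T. Opening the stream with std::ios::binary does not change this, so every byte that happens to be a whitespace character (space, tab, newline, and so on) is silently dropped. That is why the shortfall depends on the contents of each file.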
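To make the skipping visible, here is a small sketch that counts how many elements each kind of iterator yields from the same in-memory data. The helper names count_formatted and count_raw are made up for this illustration, and std::istringstream stands in for the file stream.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: counts the elements produced by formatted extraction.
// std::istream_iterator uses operator>>, which skips whitespace by default.
std::size_t count_formatted(const std::string& data) {
  std::istringstream in(data);
  std::istream_iterator<unsigned char> first(in), last;
  return static_cast<std::size_t>(std::distance(first, last));
}

// Hypothetical helper: counts the characters read straight from the stream
// buffer. std::istreambuf_iterator does no formatting and skips nothing.
std::size_t count_raw(const std::string& data) {
  std::istringstream in(data);
  std::istreambuf_iterator<char> first(in), last;
  return static_cast<std::size_t>(std::distance(first, last));
}
```

For the six-byte input "a b\nc\t", count_formatted should report only the three non-whitespace bytes, while count_raw reports all six.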

From the looks of it you want std::istreambuf_iterator<char>, which iterates over the characters in a file without skipping anything.
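A minimal sketch of the loading step rewritten with std::istreambuf_iterator<char> might look like this. The function name load_file and the choice to return an empty vector when the file cannot be opened are assumptions made for this example, not part of the original code.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads every byte of the file at `path`, including
// bytes that happen to be whitespace characters. Returns an empty vector
// if the file cannot be opened (an assumption made for this sketch).
std::vector<unsigned char> load_file(const std::string& path) {
  std::ifstream ifs(path, std::ios::binary | std::ios::ate);
  if (!ifs) {
    return {};
  }

  // Opening with std::ios::ate puts the read position at the end,
  // so tellg() gives the file size, as in the question.
  const std::streamoff size = ifs.tellg();
  ifs.seekg(0, std::ios::beg);

  std::vector<unsigned char> v;
  if (size > 0) {
    v.reserve(static_cast<std::vector<unsigned char>::size_type>(size));
  }

  // Unformatted iteration over the stream buffer: nothing is skipped.
  v.assign(std::istreambuf_iterator<char>(ifs),
           std::istreambuf_iterator<char>());
  return v;
}
```

With this change, v.size() should match the size reported by tellg(). Another option, if you would rather keep std::istream_iterator, is to clear the skipws flag on the stream first with ifs >> std::noskipws.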

Dietmar Kühl