39

I'd like to avoid unnecessary copies. I'm aiming for something along the lines of:

std::ifstream testFile( "testfile", std::ios::binary );
std::vector<char> fileContents;
int fileSize = getFileSize( testFile );
fileContents.reserve( fileSize );
testFile.read( &fileContents[0], fileSize );

(which doesn't work because reserve doesn't actually insert anything into the vector, so I can't access [0]).

Of course, std::vector<char> fileContents(fileSize) works, but there is an overhead of initializing all elements (fileSize can be rather big). Same for resize().

This question is not so much about how important that overhead would be. Rather, I'm just curious to know if there's another way.

Pedro d'Aquino
  • 1
    If you want to avoid the reallocation cost required by `push_back` _and_ you want to avoid the cost of zeroing the buffer required by using `resize`, don't use a `std::vector` at all: use a `boost::scoped_array` or something similar. – James McNellis Jan 21 '11 at 17:32
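
The suggestion in the comment above can be sketched roughly as follows. This is only an illustration: it swaps in `std::unique_ptr<char[]>` for `boost::scoped_array`, and the names `FileBuffer` and `readIntoRawBuffer` are hypothetical, not from the thread. It assumes the file size can be obtained by opening the stream positioned at the end.

```cpp
#include <cstddef>
#include <fstream>
#include <memory>
#include <string>

// Hypothetical holder for a raw buffer and its length (not from the thread).
struct FileBuffer {
    std::unique_ptr<char[]> data;
    std::size_t size = 0;
};

// Hypothetical helper: reads a whole file into an uninitialized heap buffer.
FileBuffer readIntoRawBuffer(const std::string& path)
{
    FileBuffer result;

    // std::ios::ate opens the stream positioned at the end of the file,
    // so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return result;

    const std::streamoff size = file.tellg();
    if (size <= 0)
        return result;
    file.seekg(0, std::ios::beg);

    // new char[n] (without parentheses) default-initializes the array,
    // which for char means the bytes are not zeroed.
    result.data.reset(new char[static_cast<std::size_t>(size)]);

    file.read(result.data.get(), size);
    result.size = static_cast<std::size_t>(file.gcount());
    return result;
}
```

The trade-off is that the caller has to carry the size separately and loses the `std::vector` interface.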

3 Answers

66

The canonical form is this:

#include <fstream>
#include <iterator>
#include <vector>
// ...

std::ifstream testFile("testfile", std::ios::binary);
std::vector<char> fileContents((std::istreambuf_iterator<char>(testFile)),
                               std::istreambuf_iterator<char>());

If you are worried about reallocations then reserve space in the vector:

#include <fstream>
#include <iterator>
#include <vector>
// ...

std::ifstream testFile("testfile", std::ios::binary);
std::vector<char> fileContents;
fileContents.reserve(fileSize);
fileContents.assign(std::istreambuf_iterator<char>(testFile),
                    std::istreambuf_iterator<char>());
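
The snippet above leaves `fileSize` undefined. One possible way to fill that in is sketched below, assuming the size can be found by seeking to the end of the stream. The function name `readWithReserve` is hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answer): reads a whole
// file into a vector, reserving capacity up front to avoid reallocations.
std::vector<char> readWithReserve(const std::string& path)
{
    std::vector<char> contents;

    std::ifstream file(path, std::ios::binary);
    if (!file)
        return contents;

    // Find the file size by seeking to the end, then rewind.
    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    file.seekg(0, std::ios::beg);

    if (size > 0)
        contents.reserve(static_cast<std::vector<char>::size_type>(size));

    // assign() appends element by element from the input iterators;
    // the capacity reserved above is kept, so no reallocation is needed
    // as long as the file does not grow in the meantime.
    contents.assign(std::istreambuf_iterator<char>(file),
                    std::istreambuf_iterator<char>());
    return contents;
}
```

This still copies one character at a time through the stream buffer, so it avoids reallocations but not the per-element work.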
wilhelmtell
  • Won't that do reallocations while the vector is growing? (Since the iterators might not support subtraction, the constructor cannot determine the size in advance.) – Thomas Jan 21 '11 at 17:22
  • Yes, it would. If that's really a concern, then reserve and use `std::copy()`. Updated. – wilhelmtell Jan 21 '11 at 17:26
  • In the second example, `reserve` needs to be `resize`, no? – James McNellis Jan 21 '11 at 17:44
  • But `resize()` would initialize the elements. That's not strictly necessary. – wilhelmtell Jan 21 '11 at 17:46
  • 6
    Yes, it is. As written, the code is incorrect because `fileContents.begin()` is not dereferenceable (it is equal to `fileContents.end()`). An STL implementation with debugging support (like the Visual C++ 2010 STL) should raise an assertion when executing this code. – James McNellis Jan 21 '11 at 17:48
  • 1
    Better late than never: simplified the code a little. Removed the `<algorithm>` dependency by replacing the `std::copy()` call with `std::vector::assign()`. Also, for `std::ifstream` there's no need to pass `std::ios::in` to the constructor. The constructor knows that. – wilhelmtell Nov 10 '11 at 11:10
  • `(std::istreambuf_iterator<char>(testFile))`: Why the extra parentheses? – cubuspl42 Aug 19 '14 at 16:28
  • @cubuspl42 Because of the most vexing parse. – Étienne Oct 25 '14 at 09:47
  • 4
    @wilhelmtell is this (the 2nd option) more efficient than simply doing `vector<char> fileContents(fileSize);` and `testFile.read(&fileContents[0], fileSize);`? Judging from a quick test (150MB file), using `read` seems considerably faster. – LyK Oct 24 '15 at 16:12
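
For comparison, the `read`-based alternative raised in the last comment might look like the sketch below. It accepts the cost of zero-initializing the vector in exchange for a single bulk read. The function name `readWithRead` is hypothetical, and the code assumes the file size fits in `std::size_t`.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): sizes the vector first,
// then fills it with one call to read().
std::vector<char> readWithRead(const std::string& path)
{
    // std::ios::ate positions the stream at the end, so tellg() gives the size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return std::vector<char>();

    const std::streamoff size = file.tellg();
    if (size <= 0)
        return std::vector<char>();
    file.seekg(0, std::ios::beg);

    // This constructor value-initializes (zeroes) every element.
    std::vector<char> contents(static_cast<std::size_t>(size));

    file.read(contents.data(), size);

    // Shrink to the number of bytes actually read, in case of a short read.
    contents.resize(static_cast<std::size_t>(file.gcount()));
    return contents;
}
```

Which variant is faster depends on the platform and file size, so it is worth measuring both before choosing.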
5

If you want true zero-copy reading, that is, to eliminate copying from kernel to user space, just map the file into memory. Write your own mapped file wrapper or use one from boost::interprocess.

Maxim Egorushkin
0

If I understand you correctly, you want to read each element but don't want to load it all into `fileContents`, correct? Personally, I don't think reading it all in makes unnecessary copies, because opening the file multiple times would hurt performance more. Reading the file once into the `fileContents` vector is a reasonable solution in this case.

roxrook
  • I did not mean to vote this down, but it is locked in. If you edit the answer I can / will remove the down vote. – ditkin Jan 21 '15 at 17:25