153

I have an array of values that is passed to my function from a different part of the program, and I need to store it for later processing. Since I don't know how many times my function will be called before it is time to process the data, I need a dynamic storage structure, so I chose a std::vector. I don't want to write the standard loop that calls push_back for each value individually; it would be nice if I could just copy everything at once using something similar to memcpy.

phoenix
bsruth

10 Answers

264

There have been many answers here and just about all of them will get the job done.

However there is some misleading advice!

Here are the options:

vector<int> dataVec;

int dataArray[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
unsigned dataArraySize = sizeof(dataArray) / sizeof(int);

// Method 1: Copy the array to the vector using back_inserter.
{
    copy(&dataArray[0], &dataArray[dataArraySize], back_inserter(dataVec));
}

// Method 2: Same as 1 but pre-extend the vector by the size of the array using reserve
{
    dataVec.reserve(dataVec.size() + dataArraySize);
    copy(&dataArray[0], &dataArray[dataArraySize], back_inserter(dataVec));
}

// Method 3: Memcpy
{
    dataVec.resize(dataVec.size() + dataArraySize);
    memcpy(&dataVec[dataVec.size() - dataArraySize], &dataArray[0], dataArraySize * sizeof(int));
}

// Method 4: vector::insert
{
    dataVec.insert(dataVec.end(), &dataArray[0], &dataArray[dataArraySize]);
}

// Method 5: vector + vector
{
    vector<int> dataVec2(&dataArray[0], &dataArray[dataArraySize]);
    dataVec.insert(dataVec.end(), dataVec2.begin(), dataVec2.end());
}

To cut a long story short: Method 4, using vector::insert, is the best for bsruth's scenario.

Here are some gory details:

Method 1 is probably the easiest to understand. Just copy each element from the array and push it into the back of the vector. Alas, it's slow. Because there's a loop (implied with the copy function), each element must be treated individually; no performance improvements can be made based on the fact that we know the array and vectors are contiguous blocks.

Method 2 is a suggested performance improvement to Method 1; just pre-reserve the size of the array before adding it. For large arrays this might help. However the best advice here is never to use reserve unless profiling suggests you may be able to get an improvement (or you need to ensure your iterators are not going to be invalidated). Bjarne agrees. Incidentally, I found that this method performed the slowest most of the time though I'm struggling to comprehensively explain why it was regularly significantly slower than method 1...

Method 3 is the old school solution - throw some C at the problem! It works fine and fast for POD types. In this case resize must be called first, since memcpy writes to raw memory behind the vector's back and there is no way to tell a vector afterwards that its size has changed. Apart from being an ugly solution (byte copying!), remember that this can only be used for POD types. I would never use this solution.
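If the memcpy route is used anyway, the POD restriction can be checked at compile time. The following is only a sketch under assumptions: the helper name `append_bytes` is hypothetical and not from the answer, and it uses C++11's `std::is_trivially_copyable` as the modern stand-in for "POD" in the sense that matters for byte copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper (not from the answer): append `count` elements from
// `src` to `dst` with memcpy, refusing to compile for element types where
// a byte copy is not valid.
template <typename T>
void append_bytes(std::vector<T>& dst, const T* src, std::size_t count)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    if (count == 0) {
        return;  // nothing to copy; also avoids indexing an empty vector
    }
    assert(src != nullptr);
    const std::size_t old_size = dst.size();
    dst.resize(old_size + count);  // grow first so memcpy stays in bounds
    std::memcpy(&dst[old_size], src, count * sizeof(T));
}
```

Instantiating it with an element type such as `std::string` fails to compile instead of silently corrupting the objects.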

Method 4 is the best way to go. Its meaning is clear, it's (usually) the fastest and it works for any objects. There is no downside to using this method for this application.

Method 5 is a tweak on Method 4 - copy the array into a vector and then append it. Good option - generally fast-ish and clear.
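For the scenario in the question (a function called an unknown number of times, each time with a new array), Method 4 might be wrapped up roughly as follows. This is a sketch, not the answer's code: the class name `SampleStore` and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical storage class: every call to add() appends one incoming
// array to the data collected so far.
class SampleStore {
public:
    void add(const int* values, std::size_t count)
    {
        assert(values != nullptr || count == 0);
        // Range insert at end(): the vector receives both pointers, so it
        // can work out the element count up front.
        data_.insert(data_.end(), values, values + count);
    }

    const std::vector<int>& data() const { return data_; }

private:
    std::vector<int> data_;
};
```

The same call works unchanged for element types that are not POD, which is the main advantage over the memcpy variant.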

Finally, you are aware that you can use vectors in place of arrays, right? Even when a function expects c-style arrays you can use vectors:

vector<char> v(50); // Ensure there's enough space
strcpy(&v[0], "prefer vectors to c arrays");
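A slightly fuller sketch of the same idea, assuming C++11 or later so that `data()` is available. The function `fill_c_buffer` is a hypothetical stand-in for any C-style API that writes into a caller-supplied buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style API that fills a caller-supplied buffer.
void fill_c_buffer(char* out, std::size_t capacity)
{
    assert(out != nullptr && capacity > 0);
    const char* message = "prefer vectors to c arrays";
    std::strncpy(out, message, capacity - 1);
    out[capacity - 1] = '\0';  // strncpy does not terminate on truncation
}

std::string read_message()
{
    std::vector<char> buffer(50);  // 50 zero-initialized chars
    // data() points at the same storage as &buffer[0] for a non-empty vector.
    fill_c_buffer(buffer.data(), buffer.size());
    return std::string(buffer.data());
}
```

Passing `buffer.size()` along with the pointer keeps the callee from writing past the end, which the bare strcpy call cannot guarantee.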
starball
MattyT
  • 1
    @ Method1, std::copy may use traits to optimize the copy (implementations exist). In particular, vector-to-vector copies without back_inserters are likely to be faster than memcpy(). std::memcpy() must deal with unaligned memory. std::copy() does not. – MSalters Nov 04 '08 at 12:09
  • 10
    You can't safely & portably refer to "&dataArray[dataArraySize]"--it's dereferencing a past-the-end pointer/iterator. Instead, you can say dataArray + dataArraySize to get the pointer without having to dereference it first. – Drew Hall Nov 07 '08 at 03:59
  • 4
    @Drew: yes, you can, at least in C. It is defined that `&expr` doesn't evaluate `expr`, it only computes the address of it. And a pointer *one* past the last element is perfectly valid, too. – Roland Illig May 27 '11 at 06:22
  • 3
    Have you tried doing method 4 with 2? i.e. reserving the space before inserting. It seems that if the data size is big, multiple insertions will need multiple reallocations. Because we know the size a priori, we can do the reallocation, before inserting. – Jorge Leitao Dec 31 '13 at 07:01
  • all those proposals but method 5 have undefined behaviour if the original array is empty – jyavenard May 11 '18 at 08:10
  • @jyavenard `ISO C++ forbids zero-size array ‘dataArray’ [-Wpedantic]` – Ruslan Dec 21 '19 at 10:58
  • 3
    @MattyT what is the point of method 5? Why make an intermediate copy of the data? – Ruslan Dec 21 '19 at 11:00
  • 7
    I personally would rather profit from arrays decaying to pointers automatically: `dataVec.insert(dataVec.end(), dataArray, dataArray + dataArraySize);` – appears much clearer to me. Cannot gain anything from method 5 either, only looks pretty inefficient – unless compiler is able to optimise the vector away again. – Aconcagua Feb 19 '20 at 15:18
  • For primitive data types, method 4 is still less efficient than using `std::unique_ptr` + `memcpy` because it boils down to `memmove` preceded by a few if-branches. – Mikhail Vasilyev Jul 18 '21 at 12:54
  • 1
    How about vector.assign()? – Tumb1eweed Jul 26 '22 at 04:15
140

If you can construct the vector after you've gotten the array and array size, you can just say:

std::vector<ValueType> vec(a, a + n);

...assuming a is your array and n is the number of elements it contains. Otherwise, std::copy() with resize() will do the trick.

I'd stay away from memcpy() unless you can be sure that the values are plain-old data (POD) types.

Also, worth noting that none of these really avoids the for loop--it's just a question of whether you have to see it in your code or not. O(n) runtime performance is unavoidable for copying the values.

Finally, note that C-style arrays are perfectly valid containers for most STL algorithms--the raw pointer is equivalent to begin(), and (ptr + n) is equivalent to end().
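As a small illustration of the pointer-as-iterator point, here is a sketch in which a raw array is handed straight to standard algorithms and to the two-iterator constructor. The function names are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum a raw array with a standard algorithm: `a` plays the role of begin()
// and `a + n` the role of end().
int sum_array(const int* a, std::size_t n)
{
    assert(a != nullptr || n == 0);
    return std::accumulate(a, a + n, 0);
}

// Build a sorted vector from a raw array with the two-iterator constructor;
// the source array itself is left untouched.
std::vector<int> sorted_copy(const int* a, std::size_t n)
{
    assert(a != nullptr || n == 0);
    std::vector<int> vec(a, a + n);
    std::sort(vec.begin(), vec.end());
    return vec;
}
```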

phoenix
Drew Hall
  • 4
    The reason why looping and calling push_back is bad is because you might force the vector to resize multiple times if the array is long enough. – bradtgmurray Nov 03 '08 at 17:56
  • @bradtgmurray: I think any reasonable implementation of the "two iterators" vector constructor I suggested above would call std::distance() first on the two iterators to get the needed number of elements, then allocate just once. – Drew Hall Nov 04 '08 at 02:38
  • 5
    @bradtgmurray: Even push_back() wouldn't be too bad because of the exponential growth of vectors (aka "amortized constant time"). I think runtime would only be on the order of 2x worse in the worst case. – Drew Hall Nov 04 '08 at 02:40
  • 2
    And if the vector is already there, a vec.clear(); vec.insert(vec.begin(), a, a + n); would work as well. Then you wouldn't even require a to be a pointer, just an iterator, and the vector assignment would be fairly general (and the C++/STL way). – MP24 Nov 06 '08 at 21:18
  • 7
    Another alternative when unable to construct would be [assign](http://stackoverflow.com/questions/259297/): `vec.assign(a, a+n)`, which would be more compact than copy & resize. – mMontu Oct 28 '13 at 17:33
52

If all you are doing is replacing the existing data, then you can do this

std::vector<int> data; // evil global :)

void CopyData(int *newData, size_t count)
{
   data.assign(newData, newData + count);
}
Torlack
12

std::copy is what you're looking for.

luke
12

Since I can only edit my own answer, I'm going to make a composite answer from the other answers to my question. Thanks to all of you who answered.

Using std::copy, this still iterates in the background, but you don't have to type out the code.

int foo(int* data, int size)
{
   static std::vector<int> my_data; //normally a class variable
   std::copy(data, data + size, std::back_inserter(my_data));
   return 0;
}

Using regular memcpy. This is probably best used for basic data types (e.g. int) but not for more complex arrays of structs or classes.

vector<int> x(size);
memcpy(&x[0], source, size*sizeof(int));
bsruth
  • I was going to recommend this approach. – mmocny Nov 03 '08 at 17:09
  • It is most likely more efficient to resize your vector up front if you know the size ahead of time, and not use the back_inserter. – luke Nov 03 '08 at 17:09
  • you could add my_data.reserve(size) – David Nehme Nov 03 '08 at 17:10
  • Note that internally this is doing exactly what you seem to want to avoid. It is not copying bits, it is just looping and calling push_back(). I guess you only wanted to avoid typing the code? – mmocny Nov 03 '08 at 17:11
  • 1
    Why not use the vector constructor to copy the data? – Martin York Nov 03 '08 at 17:33
  • because that would only work for the first iteration. When adding more data on subsequent iterations, I can't just use the constructor. – bsruth Nov 03 '08 at 20:28
  • This is great (`std::copy`) because it gives more flexibility, especially if you don't want to copy the entire vector and can append another vector. – Chef Pharaoh Dec 10 '19 at 16:19
4

Yet another answer: since the asker said "I don't know how many times my function will be called", you could use the vector insert method like so to append arrays of values to the end of the vector:

vector<int> x;

void AddValues(int* values, size_t size)
{
   x.insert(x.end(), values, values+size);
}

I like this way because the implementation of the vector should be able to optimize for the best way to insert the values based on the iterator type and the type itself. You are somewhat relying on the implementation of the STL.

If you need to guarantee the fastest speed and you know your type is a POD type then I would recommend the resize method in Thomas's answer:

vector<int> x;

void AddValues(int* values, size_t size)
{
   size_t old_size(x.size());
   x.resize(old_size + size, 0);
   memcpy(&x[old_size], values, size * sizeof(int));
}
Shane Powell
4
int dataArray[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; // source
unsigned dataArraySize = sizeof(dataArray) / sizeof(int);

std::vector<int> myvector(dataArraySize); // target, already sized to hold the copy

std::copy(dataArray, dataArray + dataArraySize, myvector.begin());

// myvector now has 1,2,3,...10 :-)
Toby Speight
Antonio Ramasco
  • 2
    Whilst this code snippet is welcome, and may provide some help, it would be [greatly improved if it included an explanation](//meta.stackexchange.com/q/114762) of *how* and *why* this solves the problem. Remember that you are answering the question for readers in the future, not just the person asking now! Please [edit] your answer to add explanation, and give an indication of what limitations and assumptions apply. – Toby Speight Mar 01 '17 at 15:01
  • 4
    Wait, what's `myints`? – mavavilj Aug 02 '18 at 17:32
  • I guess this example is from https://www.cplusplus.com/reference/algorithm/copy/, where you can find myints :) – willSapgreen Dec 31 '21 at 17:06
2

Avoid the memcpy, I say. There is no reason to mess with pointer operations unless you really have to. Also, it will only work for POD types (like int) and would fail if you're dealing with types that require construction.

Assaf Lavie
  • 10
    Maybe this should be a comment on one of the other answers, as you do not actually propose a solution. – finnw Jul 01 '13 at 19:14
1

In addition to the methods presented above, you need to make sure you either call std::vector::resize() or construct the vector to size, so that your vector has enough elements in it to hold your data. Note that std::vector::reserve() alone is not enough for this: it only allocates capacity and does not create elements you can write to. If the elements are not there, you will corrupt memory. This is true of both std::copy() to a plain iterator and memcpy().

This is the reason to use vector.push_back(): you can't write past the end of the vector.
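The two safe ways of appending with std::copy can be put side by side. This is only a sketch; the function names are hypothetical. The first makes the elements exist before writing to them, the second lets back_inserter call push_back() so no pre-sizing is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Option A: make room first, then overwrite the newly created elements.
void append_with_resize(std::vector<int>& dst, const int* src, std::size_t n)
{
    assert(src != nullptr || n == 0);
    const std::size_t old_size = dst.size();
    dst.resize(old_size + n);  // the destination elements now exist
    std::copy(src, src + n,
              dst.begin() + static_cast<std::ptrdiff_t>(old_size));
}

// Option B: back_inserter calls push_back() for each element, so the
// vector grows by itself.
void append_with_back_inserter(std::vector<int>& dst, const int* src, std::size_t n)
{
    assert(src != nullptr || n == 0);
    dst.reserve(dst.size() + n);  // optional: changes capacity, not size
    std::copy(src, src + n, std::back_inserter(dst));
}
```

Copying to `dst.begin() + old_size` without the resize() call in option A would write past the last element, which is the memory corruption described above.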

Thomas Jones-Low
  • If you are using a back_inserter, you don't need to pre-reserve the size of the vector you're copying to. back_inserter does a push_back(). – John Dibling Nov 03 '08 at 17:49
0

Assuming you know how big the items in the vector are:

std::vector<int> myArray;
myArray.resize (item_count, 0);
memcpy (&myArray.front(), source, item_count * sizeof(int));

http://www.cppreference.com/wiki/stl/vector/start

Thomas Jones-Low