1

I have a std::vector [1, 2, 3, 4, 5] and I want to get another vector containing all the elements but the 2nd one: [1, 3, 4, 5]. One way to do is (vec1 is my input vector):

std::vector<int> vec2;
vec2 = vec1;
vec2.erase(vec2.begin()+1)

Here I don't really like the erase which is O(n) complexity, so considering the copy of the array I will have 2n operations. I was thinking the old dummy way would be better:

std::vector<int> vec2;
for(int i=0; i<vec1.size(); ++i){
    if (i != 1) 
        vec2.push_back(vec1[i]);
}

This is amortized O(n) time. The asymptotic behaviors are the same but the number of operations might be smaller.

I have to do this on rather small vectors (about 100 elements), but I have billions of them. Will I notice a significant difference ?

How would you do it?

user3091275
  • 1,013
  • 2
  • 11
  • 27
  • 5
    Call reserve first. As to whether or not you'll notice the difference, well profile it, that's the only way to get a reliable answer. – Borgleader Dec 03 '16 at 18:46
  • Maybe you should reconsider your algorithm, don't copy if you don't have to. Also check out insert() – Tony J Dec 03 '16 at 19:22
  • You can iterate your vector backwards and erase the last element, so you don't have to copy. Or you can track the index of what should be the first element and iterate from it to end. There are many ways to avoid copying. – Tony J Dec 03 '16 at 19:26

3 Answers3

2

Concerning complexity, you cannot seek better than O(n) because you have to copy n elements anyway. But for the purpose of some micro-optimization, you can:

  • reserve a priori the size, as in the comments
  • avoid checking the condition inside a loop, by doing 2 consecutive copies but without checking:

    std::vector<int> vec1 = { 1,2,3,4,5 };
    std::vector<int> vec2(vec1.size()-1);
    
    constexpr int i = 1; // let i be the position you want to skip
    
    std::copy(vec1.cbegin(), vec1.cbegin() + i, vec2.begin());
    std::copy(vec1.cbegin() + i+1, vec1.cend(), vec2.begin()+i);
    

since there is no if statement or std::copy-if, the copies will be straightforward and moreover there will be good room for compiler optimization.

EDIT: as in the comments below, resizing the vector incurs some additional of initialization, see a discussion of ways to avoid the inititlization here: Using vector<char> as a buffer without initializing it on resize()

Community
  • 1
  • 1
A.S.H
  • 29,101
  • 5
  • 23
  • 50
  • 1
    "*reservation at creation*" That doesn't reserve; it *initializes* those values. It's an O(n) process. If you want to reserve, you should call *reserve*. – Nicol Bolas Dec 03 '16 at 19:57
  • @NicolBolas I said that the process will be O(n) anyway. However you are right that the initializer will be invoked. Reservation or, I mean, resizing at creation is known to be hard (still feasible) with std::vector (see http://stackoverflow.com/questions/15219984/using-vectorchar-as-a-buffer-without-initializing-it-on-resize). – A.S.H Dec 03 '16 at 20:11
1

This can be done using the existing routines of std::vector. Given i, which is the location you want to skip (which is within the range of vec):

template<typename T>
vector<T> skip_copy(const vector<T> &vec, size_t i)
{
  vector<T> ret;
  ret.reserve(vec.size() - 1);
  ret.insert(ret.end(), vec.begin(), vec.begin() + i);
  ret.insert(ret.end(), vec.begin() + i + 1, vec.end());
  return ret;
}

By reserving space, we avoid any needless memory allocations in ret. And vector::insert, when done at the end, will not need to shift elements around.

Granted, it is possible that insert is doing a few extra conditions than are strictly needed (checking to see if the range of iterators will fit in the existing storage), but it's unlikely that a series of push_back calls will be faster. And not having to pre-initialize the array pays for itself if the size of the vector is significant.

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
0

This is the fastest way I can think of:

std::vector<int> vec1{1, 2, 3, 4, 5};

// Do not call the constructor with the known size here. That would allocate
// AND initialize a lot of data that will be overwritten later anyway.
std::vector<int> vec2;

// Instead, use it on 'reserve' since it only allocates the memory and
// leaves it uninitialized. The vector is still considered empty, but its
// underlying capacity will be big enough to be filled with new data
// without any unnecessary initializations and reallocations occuring (as long
// as you don't exceed the capacity)
vec2.reserve(vec1.size() - 1);

// Nothing expensive should be going on behind the scenes now, just fill it up
for(auto it = vec1.begin(), skip = vec1.begin() + 1 ; it != vec1.end() ; ++it) {
    if(it != skip) {
        vec2.push_back(*it);
    }
}
zee01
  • 1
  • 1