Best way to extract a subvector from a vector?

Question

Suppose I have a std::vector (let's call it myVec) of size N. What's the simplest way to construct a new vector consisting of a copy of elements X through Y, where 0 <= X <= Y <= N-1? For example, myVec [100000] through myVec [100999] in a vector of size 150000.

If this cannot be done efficiently with a vector, is there another STL datatype that I should use instead?

you say you want to extract a subvector, but it seems to me that what you really want is a view / access to the subvector - difference being that a view would not copy - old school C++ would be to use start pointer and end pointer, given the fact that mem on a std::vector is contiguous, then it should be possible for you to iterate using pointers and thereby avoid copy, however if you do not mind copy, then just initialize a new vector with the scope of your previous vector — serup, Feb 13 '17 at 13:18
There is .data()(http://www.cplusplus.com/reference/vector/vector/data/) since c++11. However, using pointers is discouraged within stl containers , see https://stackoverflow.com/questions/31663770/c-safety-of-accessing-element-of-vector-via-pointers — Dávid Tóth, Nov 06 '19 at 06:46
@serup maybe not interested to OP but I would need to know how to " initialize a new vector with the scope of your previous vector". — meolic, Apr 11 '21 at 10:10
Note) There's also [`assign`](https://en.cppreference.com/w/cpp/container/vector/assign) in std::vector, such as `v.assign(s.begin()+M, s.begin()+N);` — starriet, Jul 06 '23 at 10:35

score 469 · Accepted Answer · answered Jan 07 '09 at 19:04

469

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
vector<T> newVec(first, last);

It's an O(N) operation to construct the new vector, but there isn't really a better way.

answered Jan 07 '09 at 19:04

Greg Rogers

35,641
17
67
94

16

+1, also it's O(Y-X), which is less than or equal to O(N) (and in his example much less) – orip Jan 07 '09 at 21:53
97

@orip Well, then it's O(N) after all. – Johann Gerell Jan 08 '09 at 07:56
66

@GregRogers: It doesn't make sense to use the big-O notation where N is a specific number. Big-O communicates the rate of growth with respect to how N changes. Johann: It's best not to use one variable name in two ways. We'd normally say either `O(Y-X)`, or we'd say `O(Z) where Z=Y-X`. – Mooing Duck Sep 11 '13 at 17:51
2

@GregRogers By using this way, we need to declare a new vector. Is there a way to change the original vector? something like myVec(first, last)? I know this is wrong, but I really need the solution as I want to use recursion in my codes, and need to repeatedly use the same vector (although changed). Thanks! – ulyssis2 Nov 05 '15 at 12:24
@ulyssis2 Create a new vector and swap? – AturSams Jan 27 '16 at 15:07
27

Why not just `vector newVec(myVec.begin() + 100000, myVec.begin() + 101000);`? – aquirdturtle Mar 20 '17 at 23:57
2

In addition, If `newVec` already exists and you want to replace it's contents use `newVec.asign(first, last)`. It has the same semantics as the constructor in the answer. – Elvis Teixeira Jun 07 '17 at 11:33
1

what happens if iterator last is outofbounds? – Dr Deo Aug 20 '17 at 11:54
2

You can also use `auto` instead of `vector::const_iterator` – Ahmed Karaman Dec 14 '17 at 07:50
1

Just for my own reference, the resulting vector is a shallow copy right? the sub-vector is not pointers to the original vectors underlying data structure right? I basically want to know that if I return back up a stackframe, that the internal pointers to the resulting sub-vector are still valid because they're separate from the original super-vector. – searchengine27 Feb 19 '18 at 23:57
1

Is line 3 a copy? Is it possible to construct a reference subvector that way? – Sandburg Oct 05 '18 at 09:56
1

As a side note std::vector range constructor `constructs the container with the contents of the range [first, last)`. – AZTECCO Feb 28 '20 at 10:27
Great proposition. Could we use `auto cbegin` instead of `begin` ? ```auto first = myVec.cbegin() + 100000; auto last = myVec.cbegin() + 101000;``` – pamplemousse_mk2 Mar 01 '21 at 10:33
@AZTECCO that's the constructor this answer uses. – luizfls Sep 29 '21 at 18:01
@pamplemousse_mk2 Yes. – luizfls Sep 29 '21 at 18:01
1

@luizfls yes it is the constructor used, I thought pointing out that `last` element is not included was relevant , hence the [x,y) convention, in fact the answer used vector.begin + ..1000 to get ...999 elements – AZTECCO Sep 29 '21 at 19:34
@DrDeo, i tried out of bounds `last` in a toy program: https://godbolt.org/z/87MsPqnWT It doesn't do a bounds check. Compiler :clang 14.0.0, c++2a – patro Jul 06 '22 at 17:20

Martin York · Answer 2 · 2009-01-07T19:22:02.950

109

Just use the vector constructor.

std::vector<int>   data();
// Load Z elements into data so that Z > Y > X

std::vector<int>   sub(&data[100000],&data[101000]);

edited Jan 07 '09 at 19:22

answered Jan 07 '09 at 19:14

Martin York

257,169
86
333
562

2

Ok, I didn't realize it was that simple to obtain an iterator from an arbitrary vector element. – An̲̳̳drew Jan 07 '09 at 19:58
Well, you actually get pointers to the elements, which in most cases are equivalent to iterators. However, I assume that they won't be checked like iterators when you have a checked-iterators implementation. Could be some magic I'm unaware of though that makes that work too... – Niklas Jan 07 '09 at 20:56
7

Taking the address of those vector elements is an unportable hack that will break if the vector storage is not in fact contiguous. Use begin() + 100000 etc. – j_random_hacker Jan 08 '09 at 06:29
4

My bad, apparently the standard guarantees that vector storage is contiguous. Nevertheless it's bad practice to work with addresses like this as it is certainly not guaranteed to work for all containers supporting random access, while begin() + 100000 is. – j_random_hacker Jan 08 '09 at 06:36
40

@j_random_hacker: Sorry have to disagree. The STL specification for std::vector was explicitly changed to support this type of procedure. Also a pointer is valid type of iterator. Look up iterator_traits<> – Martin York Jan 08 '09 at 07:35
1

Yep, pointers aren't necessarily valid std::vector::iterator's but they are valid iterators for an array in continuous storage, which is how it is used. – Greg Rogers Jan 08 '09 at 20:04
@Uri: This will always work. But like all code it is limited by system constraints. If you try and do something that is not physically possible you will get an exception. – Martin York Aug 17 '15 at 18:12
How would you slice up to the last element using that ? If you do sub(&data[100000],&data[data.size()]); That would not always work and may throw an access violation ? – azerty Sep 30 '16 at 23:19
6

@taktak004 Nope. Remember that `operator[]` returns a reference. It is only at the point where you read or write the reference that it would become an access violation. Since we do neither but instead get the address we have not invoked UB,. – Martin York Oct 01 '16 at 01:24
This won't work if you need to copy data till end of source vector. – bialix Dec 06 '17 at 17:58
3

@bialix: It will. I assume you are talking about taking the address of one past then end being the problem. This is covered by the standard and is allowed. It is UB to dereference this addresses but not to take its address. Note the definition of `X[Y]` is defined as `*(X + Y)` in the standard. And the operators `&*` placed together cancel each other out (assuming they have not been overridden (which they can't be for pointers)). => `&X[Y]` => `&*(X + Y)` => `(X + Y)`. Which is also why `10000[data]` is a valid expression for array access. – Martin York Dec 06 '17 at 21:35
actually MSVC 2017 does raise an access violation for the case that @taktak004 considered (although only in Debug mode and with Debuglevel 2) – mrclng Feb 17 '22 at 10:04

score 57 · Answer 3 · answered Nov 06 '19 at 06:47

This discussion is pretty old, but the simplest one isn't mentioned yet, with list-initialization:

 vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2};

It requires c++11 or above.

Example usage:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(){

    vector<int> big_vector = {5,12,4,6,7,8,9,9,31,1,1,5,76,78,8};
    vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2};

    cout << "Big vector: ";
    for_each(big_vector.begin(), big_vector.end(),[](int number){cout << number << ";";});
    cout << endl << "Subvector: ";
    for_each(subvector.begin(), subvector.end(),[](int number){cout << number << ";";});
    cout << endl;
}

Result:

Big vector: 5;12;4;6;7;8;9;9;31;1;1;5;76;78;8;
Subvector: 6;7;8;9;9;31;1;1;5;76;

it defines the subrange: Starts from the 3rd element, and goes until the 2nd from behind — Dávid Tóth, Apr 24 '22 at 17:03
Can you please clarify whether this way includes the last element? — Hudson, Jan 29 '23 at 18:32
It's inclusive, as the example shows: the last element is the one 2 positions before .end() — Dávid Tóth, Jan 30 '23 at 20:05

einpoklum · Answer 4 · 2023-08-30T22:35:28.570

These days, we use spans! So you would write:

#include <span>

...
auto start_pos = 100000;
auto length = 1000;
auto span_of_myvec = std::make_span(myvec);
auto my_subspan = span_of_myvec.subspan(start_pos, length);

to get a span of 1000 elements of the same type as myvec's. Or a more terse form:

auto my_subspan = std::make_span(myvec).subspan(1000000, 1000);

(but I don't like this as much, since the meaning of each numeric argument is not entirely clear; and it gets worse if the length and start_pos are of the same order of magnitude.)

Anyway, remember that this is not a copy, it's just a view of the data in the vector, so be careful. If you want an actual copy, you could do:

std::vector<T> new_vec(my_subspan.cbegin(), my_subspan.cend());

Notes:

If you use a version of C++ older than C++20, you can use gsl::span - from the Guidelines Support Library. For more information, see: http://www.modernescpp.com/index.php/c-core-guideline-the-guidelines-support-library and https://github.com/martinmoene/gsl-lite for a concrete implementation
For more information about spans, see: What is a "span" and when should I use one?
std::vector has a gazillion constructors, it's super-easy to fall into one you didn't intend to use, so be careful.

would use `cbegin` and `cend` just for the principle ;) `std::cbegin` etc even. — JHBonarius, Jun 29 '18 at 20:16
@JHBonarius: Seeing how this code isn't templated on the choice of container, I don't see there's a particular benefit; a matter of taste I suppose. — einpoklum, Jun 29 '18 at 21:16

score 31 · Answer 5 · edited Jun 13 '20 at 15:40

31

std::vector<T>(input_iterator, input_iterator), in your case foo = std::vector<T>(myVec.begin () + 100000, myVec.begin () + 150000);, see for example here

edited Jun 13 '20 at 15:40

jackw11111

1,457
1
17
34

answered Jan 07 '09 at 19:04

Anteru

19,042
12
77
121

1

Since Andrew is trying to construct a new vector, I would recommend "std::vector foo(..." instead of copying with "foo = std::vector(..." – Drew Dormann Jan 07 '09 at 19:34
6

Yeah, of course, but whether you type std::vector foo = std::vector(...) or std::vector foo (...) should not matter. – Anteru Jan 07 '09 at 20:06

score 12 · Answer 6 · answered Jan 07 '09 at 19:26

If both are not going to be modified (no adding/deleting items - modifying existing ones is fine as long as you pay heed to threading issues), you can simply pass around data.begin() + 100000 and data.begin() + 101000, and pretend that they are the begin() and end() of a smaller vector.

Or, since vector storage is guaranteed to be contiguous, you can simply pass around a 1000 item array:

T *arrayOfT = &data[0] + 100000;
size_t arrayOfTLength = 1000;

Both these techniques take constant time, but require that the length of data doesn't increase, triggering a reallocation.

This is also good if you want the original vector and the subvector to be linked. — PyRulez, Mar 01 '15 at 19:38

score 6 · Answer 7 · answered Oct 16 '15 at 02:03

6

You didn't mention what type std::vector<...> myVec is, but if it's a simple type or struct/class that doesn't include pointers, and you want the best efficiency, then you can do a direct memory copy (which I think will be faster than the other answers provided). Here is a general example for std::vector<type> myVec where type in this case is int:

typedef int type; //choose your custom type/struct/class
int iFirst = 100000; //first index to copy
int iLast = 101000; //last index + 1
int iLen = iLast - iFirst;
std::vector<type> newVec;
newVec.resize(iLen); //pre-allocate the space needed to write the data directly
memcpy(&newVec[0], &myVec[iFirst], iLen*sizeof(type)); //write directly to destination buffer from source buffer

answered Oct 16 '15 at 02:03

MasterHD

2,264
1
32
41

2

I wonder if with -O3, @Anteru's "using constructor" `std::vector(myVec.begin () + 100000, myVec.begin () + 150000);` , wouldn't the longer-version of this produce into exactly the same assembly? – sandthorn Feb 14 '18 at 15:58
1

MSVC++ 2015, for example, compiles `std::vector<>(iter, iter)` to `memmove()`, if appropriate (if constructor is trivial, for a suitable definition of trivial). – Pablo H Jun 26 '18 at 19:18
1

Don't call `memcpy`. Do a `std::copy` or a constructor which accepts a range (two iterators), and the compiler and the std.library will conspire to call `memcpy` when appropriate. – Bulletmagnet Apr 12 '19 at 08:38

score 5 · Answer 8 · edited Jun 10 '19 at 05:51

5

You could just use insert

vector<type> myVec { n_elements };

vector<type> newVec;

newVec.insert(newVec.begin(), myVec.begin() + X, myVec.begin() + Y);

edited Jun 10 '19 at 05:51

MasterAler

1,614
3
23
35

answered Jun 10 '19 at 00:27

Matheus Vinícius de Andrade

353
5
7

score 4 · Answer 9 · answered Jan 07 '09 at 19:03

4

You can use STL copy with O(M) performance when M is the size of the subvector.

answered Jan 07 '09 at 19:03

Yuval F

20,565
5
44
69

Upvoted because it pointed me in the right direction but I can see why @LokiAstari suggests it's not the correct choice - since the STL::copy works with two std::vector arrays of the same size and type. Here, the OP wants to copy a subsection into a new, smaller array as outlined here in the OP's post: "0 <= X <= Y <= N-1" – Andrew Dec 31 '16 at 23:22
@Andrew, see example using std::copy and std::back_inserter – chrisg Jul 26 '17 at 06:03
@LokiAstari why not? – chrisg Jul 26 '17 at 06:04
@chrisg: The clue is in the name `copy`. To use std::copy you need to build the destination first (which means initializing all the members to the their default value). Then you can copy them over. So its two distinct operations. Why not simply use the vector constructor and initialize the members directly to their final values. See the nice answer that has +50 votes. – Martin York Jul 26 '17 at 08:03
2

@LokiAstari I was referring to an edit to this that did not survive peer review, which posed the example
vector newvec; std::copy(myvec.begin()+10000, myvec.begin() +10100, std::back_inserter(newvec));
in this case, you don't need to build the destination first, but sure, direct initialization is more... direct. – chrisg Jul 26 '17 at 19:16
1

@chrisg: Its also two lines. Additionally you need to stick a third line in to make sure it is efficient. `newvec.reserve(10100 - 10000);`. ITs definitely an option and technically it will work. But out of the two which are you going to recommend? – Martin York Jul 26 '17 at 21:11
@LokiAstari ah, ok, I'm sold. The need for reserve on larger sets pushed me over the line. I also didn't like the "under the hood" feel of using naked pointers, dependency on an underlying detail (or "guarantee" that vector is contiguous), and unchecked array-like accessing, because they feel very C-like and un-C++-like, but then I decided its no worse than the iterator arithmetic used for std::copy approach – chrisg Jul 26 '17 at 22:14
@chrisg: The most up-voted answer uses standard iterators and the constructor. – Martin York Jul 27 '17 at 16:00

score 2 · Answer 10 · answered Nov 15 '21 at 02:13

Suppose there are two vectors.

 vector<int> vect1{1, 2, 3, 4};
 vector<int> vect2;

Method 1. Using copy function. copy(first_iterator_index, last_iterator_index, back_inserter()) :- This function takes 3 arguments, firstly, the first iterator of old vector. Secondly, the last iterator of old vector and third is back_inserter function to insert values from back.

    // Copying vector by copy function
    copy(vect1.begin(), vect1.end(), back_inserter(vect2));

Method 2. By using Assign Function. assign(first_iterator_o, last_iterator_o). This method assigns the same values to new vector as old one. This takes 2 arguments, first iterator to old vector and last iterator to old vector.

    //Copying vector by assign function
    vect2.assign(vect1.begin(), vect1.end());

score 1 · Answer 11 · answered Jan 07 '09 at 19:06

1

The only way to project a collection that is not linear time is to do so lazily, where the resulting "vector" is actually a subtype which delegates to the original collection. For example, Scala's List#subseq method create a sub-sequence in constant time. However, this only works if the collection is immutable and if the underlying language sports garbage collection.

answered Jan 07 '09 at 19:06

Daniel Spiewak

54,515
14
108
120

in c++ way to do that would be to have vector of shared_ptr to X instead of vector of X and then copy SPs, but unfortunately I dont think that is faster because atomic operation involved with cpying SP. Or the original vector could be a const shared_ptr of vector instead and you just take reference to subrange in it. ofc you dont need to make it a shared_ptr of vector but then you have lifetime problems... all this is off top of my head, could be wrong... – NoSenseEtAl Oct 22 '13 at 14:30

score 0 · Answer 12 · edited May 23 '17 at 12:10

0

Maybe the array_view/span in the GSL library is a good option.

Here is also a single file implementation: array_view.

edited May 23 '17 at 12:10

Community

1
1

answered Apr 27 '17 at 07:36

myd7349

1
1

1

Kindly add answer here along with link. As external link might change in future – Panther Apr 27 '17 at 08:03

score 0 · Answer 13 · answered Jun 24 '18 at 05:25

Copy elements from one vector to another easily
In this example, I am using a vector of pairs to make it easy to understand
`

vector<pair<int, int> > v(n);

//we want half of elements in vector a and another half in vector b
vector<pair<lli, lli> > a(v.begin(),v.begin()+n/2);
vector<pair<lli, lli> > b(v.begin()+n/2, v.end());


//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
//then a = [(1, 2), (2, 3)]
//and b = [(3, 4), (4, 5), (5, 6)]

//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
//then a = [(1, 2), (2, 3), (3, 4)]
//and b = [(4, 5), (5, 6), (6, 7)]

'
As you can see you can easily copy elements from one vector to another, if you want to copy elements from index 10 to 16 for example then we would use

vector<pair<int, int> > a(v.begin()+10, v.begin+16);

and if you want elements from index 10 to some index from end, then in that case

vector<pair<int, int> > a(v.begin()+10, v.end()-5);

hope this helps, just remember in the last case v.end()-5 > v.begin()+10

JHBonarius · Answer 14 · 2018-06-29T20:29:51.650

Yet another option: Useful for instance when moving between a thrust::device_vector and a thrust::host_vector, where you cannot use the constructor.

std::vector<T> newVector;
newVector.reserve(1000);
std::copy_n(&vec[100000], 1000, std::back_inserter(newVector));

Should also be complexity O(N)

You could combine this with top anwer code

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
std::copy(first, last, std::back_inserter(newVector));

score 0 · Answer 15 · edited Sep 21 '22 at 03:22

0

vector::assign may be another solution

// note: size1 < src.size() && size2 < src.size()
std::vector<int> sub1(size1), sub2(size2);
sub1.assign(src.begin(), src.begin() + size1);
sub2.assign(src.begin(), src.begin() + size2);

edited Sep 21 '22 at 03:22

vilalabinot

1,420
4
17

answered Sep 15 '22 at 10:06

tangmr

1
2

score -2 · Answer 16 · answered Nov 18 '13 at 20:06

-2

Posting this late just for others..I bet the first coder is done by now. For simple datatypes no copy is needed, just revert to good old C code methods.

std::vector <int>   myVec;
int *p;
// Add some data here and set start, then
p=myVec.data()+start;

Then pass the pointer p and a len to anything needing a subvector.

notelen must be!! len < myVec.size()-start

answered Nov 18 '13 at 20:06

mrrgu

17

1

This doesn't perform a copy. – NoDataDumpNoContribution Jan 30 '17 at 08:32

Best way to extract a subvector from a vector?

16 Answers16

Linked

Related