Splitting a vector by a pattern

Question

I have a vector 'a' which contains huge amount of data and should be split into two seperate vectors 'b' and 'c'.

vector<unsigned char> a; //contains a lot of data

vector<unsigned char> b; //data should be split into b and c
vector<unsigned char> c;

The layout of the data in vector 'a' is as follows:

bbbbccccbbbbccccbbbbcccc

The first 4 bytes should go into vector 'b', the next 4 bytes into vector 'c', etc..

I could iterate through my data and push_back (or insert) every element into the corresponding vector (based on the index they have in vector 'a'). However, I tried this and the result was very slow.

Is there a more performant way in C++ to achieve this?

Make sure you reserve the vector before pushing back. You could also consider `std::array` if the sizes are known at compile time. You should also show the code you attempted so far, in case there are any obvious optimizations. — Neil Kirk, Sep 14 '15 at 15:09
std::vector b(a.size() / 2, 'b'); std::vector c(a.size() / 2, 'c'); — João Augusto, Sep 14 '15 at 15:25
Does the data need to absolutely be broken into two vectors? Have you considered special iterators `b` and `c` that will properly stride across the data in `a` instead? — AndyG, Sep 14 '15 at 15:32
Yes, I have to split the data into two seperate data sets. Maybe it would be faster to copy the vector and then erase the elements not needed? I haven't thought of a parrallel solution yet. I wonder if the bottleneck is on the CPU or RAM side. — user3067395, Sep 14 '15 at 15:42
AndyG asked a question, then you said "yes, I have to do something completely unrelated to AndyG's question". What are "two separate data sets", and how are they not handled by pairs of iterators *exactly*. In short, what are you doing with the resulting vectors that require they be of the form "3 pointers to a block of independently allocated contiguous memory" **exactly**? Do the blocks need to be independently allocated? Contiguous? Have 3 pointers (begin, end, capacity)? Does the type have to be `std::vector` *exactly*? — Yakk - Adam Nevraumont, Sep 15 '15 at 17:36
I need two seperate vectors because I need them to be contiguous in memory. The data is processed after splitting the vectors, and the algorithms pretend that the data is contiguous (for example: they iterate over the data using indices and not iterators). I'd have to change a lot of code in different places. The type is always vector. — user3067395, Sep 15 '15 at 18:08

ChronoTrigger · Accepted Answer · 2015-09-14T16:44:19.963

9

Try to pre-allocate the memory that you are going to use to avoid copies. Assuming a contains full sequences, you can do:

b.reserve(a.size() / 2);
c.reserve(a.size() / 2);
for (auto it = a.begin(); it < a.end(); it += 8) {
  b.insert(b.end(), it, it + 4);
  c.insert(c.end(), it + 4, it + 8);
}

Update

If you don't mind modifying the original vector a, you can use it to keep one of the subsequences and avoid allocating more memory. Assuming a contains full sequences:

b.reserve(a.size() / 2);
auto writer = a.begin();
for (auto reader = a.cbegin(); reader < a.cend(); reader += 8, writer += 4) {
  b.insert(b.end(), reader, reader + 4);
  std::copy(reader + 4, reader + 8, writer);
}
a.resize(a.size() / 2);

edited Sep 14 '15 at 16:44

answered Sep 14 '15 at 15:15

ChronoTrigger

8,459
1
36
57

2

Now cross your fingers and hope that everything lines up and you won't iterate past `a.end()` :-) – AndyG Sep 14 '15 at 15:16
Yep, assuming the OP's sequences are complete. – ChronoTrigger Sep 14 '15 at 15:17
I think I'll go for this solution. Still not fast, but reserving the vectors speeds it up significantly. – user3067395 Sep 14 '15 at 15:39
2

Your second solution is wrong. You need a second iterator through vector a to keep track of the copy destination, Let me correct it. – KevinZ Sep 14 '15 at 16:07

score 5 · Answer 2 · edited Jun 20 '20 at 09:12

You can go quite a bit faster than ChronoTrigger's Updated Answer if you're friendly to the cache.

Bear in mind one thing with vectors, you want to iterate over as many elements as possible in a single go, doing exactly one thing to them.

So long as you don't mind messing with your original vector<T> a, this solution will work:

Assumption 1

Your elements are evenly distributed about your array with a stride of K. K = 4 in your example: BBBBCCCCBBBBCCCC

Assumption 2

The size of your vector is a multiple of K. In your example, N = 24 = 6 * 4.

Assumption 3

The algorithm does not need to be stable. That is, the relative ordering of elements in b does not need to be the same as it was while they were in a. Same for c.

(I did implement a stable version of this... you don't want to use it)

Approach

Iterate over vector a with two iterators, the first begins at 0, the second at N
Swap elements at iterators for K elements
Repeat until iterators meet
- BBBBCCCCBBBBCCCC becomes CCCCCCCCBBBBBBBB

In layman's terms does the following:

Maintain elements of c at the beginning
Create vector b as the second half of a
'a' is now 'c'

It may appear to be more work, but because we're being really friendly to memory, it will in practice be much faster (take 1/3 of the time in my own experiments)

Code

 void FastSplit(std::vector<int>& a, int stride)
 {
    auto asize = a.size();
    size_t j = asize-1;
    if ((asize / stride) % 2 == 1)
    {
       j -= stride;
       asize = asize - (asize+stride)/2; // asize now represents number of C elements
    }
    else
    {
        asize /= 2; // asize now represents number of C elements
    }
    
    for (size_t i=0; i < j; i+=stride, j-=stride)
    {
        for (size_t k = 0; k < stride; ++k, ++i, --j)
        {
            std::swap(a[i], a[j]);
        }
    }

Live Demo

In my test on 4 million integers, ChronoTrigger's first answer takes time T, his second answer takes time 0.6T, and my answer takes time 0.2T (two tenths the time!)

One additional benefit to this code is that it handles the case where there is not an equal number of elements to be distributed:

BBBBCCCCBBBB

Whereas the linked answer can only handle cases where there is an equal number of B and C elements.

Edit

I've also implemented a stable version of the above algorithm, following sp2danny's comment, however it's remarkably slow and you'd never want to use it because it has to iterate over the array O(n) times to perform the in-place swapping. At that point, I'd prefer ChronoTrigger's answer.

Code just in case someone wants to sit around for a while (it's also in the linked demo code):

 // Slow!
 void StableSwappingSplit(std::vector<int>& a, int stride)
 {
    auto asize = a.size();

    auto starti = 0;
    while(starti < asize)
    {
        for (size_t i=starti, j = starti+stride;j < asize-starti; i+=stride, j+=stride)
        {
            for (size_t k = 0; k < stride; ++k, ++i, ++j)
            {
                std::swap(a[i], a[j]);
            }
        }
        starti += stride;
    }
    
    if ((asize / stride) % 2 == 1)
    {
       asize = asize - (asize+stride)/2; // asize now represents number of C elements
    }
    else
    {
        asize /= 2; // asize now represents number of C elements
    }
    //std::cout << "After swapping: \n";
    //PrintVec(a);
     
    std::vector<int> b(std::make_move_iterator(a.begin() + asize), std::make_move_iterator(a.end())); // copy second half into b
    a.resize(asize);
    //a is now c
 }

I don't think the condition `a[0] == a[j]` is going to work for live data. The algorithm is also not stable. — sp2danny, Sep 14 '15 at 18:50
@sp2danny: Well I got around to implementing an actual in-place stable version of my algorithm, and it's super slow! Hopefully OP doesn't need stability. — AndyG, Sep 16 '15 at 17:59

West · Answer 3 · 2015-09-15T04:14:44.903

0

vector<char>b(a.size()/2),c(a.size()/2);
for(auto i=a.begin();i<a.end();i=i+8){
    move(i,i+4,b.begin()+(i-a.begin())/2);
    move(i+4,i+8,c.begin()+(i-a.begin())/2);
}

Moving here is the same as copying. So we move the first half to the B array and the second to the C array, using i as out index and deciding where the data goes in the b array by using the distance between i and the beginning of a divided by 2.

edited Sep 15 '15 at 04:14

answered Sep 14 '15 at 15:16

West

722
7
16

You are copying twice the data you want to `c` with `i=i+4` in the `for`. – ChronoTrigger Sep 14 '15 at 15:19
Unfortunately, results in a 'vector iterator + offset out of range' assertion, althought i - i+8 are still 'inside' the vector. – user3067395 Sep 14 '15 at 15:37
1

While this code snippet may solve the question, [including an explanation](http://meta.stackexchange.com/questions/114762/explaining-entirely-code-based-answers) really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – Lynn Crumbling Sep 15 '15 at 02:05