How do I build a vector<> of search results from a source vector<>?

Question

Considering this example:

std::vector<Student> students;
// populate students from a data source
std::vector<Student> searched(students.size());
auto s = std::copy_if(students.begin(), students.end(), searched.begin(),
    [](const Student &stud) {
        return stud.getFirstName().find("an") != std::string::npos;
    });
searched.resize(std::distance(searched.begin(), s));

I have the following questions:

1. Is it OK to allocate memory for the searched vector equal to the size of the initial vector? There may be 500 objects that are not small, and possibly none of them satisfies the search criteria. Is there any other way?
2. When copying to the searched vector, the copy assignment operator is called and, obviously, a copy is made. What if 400 of those 500 objects satisfy the search criteria? Isn't that just wasting memory?

I am a C++ noob, so I may be saying something stupid. I don't see why one would ever use vector<T> where T is an object. I would always use vector<shared_ptr<T>>. If T is a primitive type like an int, I guess it's fairly straightforward to use vector<T>.

I considered this example because I think it's very general: you always have to pull some data out of a database, an XML file, or some other source. Would you ever have vector<T> in your data access layer, or vector<shared_ptr<T>>?

Honestly, I would use a `std::back_inserter(searched)` for the output iterator of the `copy_if` and forgo the initial sizing entirely. — WhozCraig, Mar 08 '13 at 21:18
A good response for most of this (especially #2) comes down to a question of why you're making a copy in the first place. If possible, avoid making the copy at all, and use something like a `transform_if` to filter *and process* the subset, instead of just creating and storing the subset. — Jerry Coffin, Mar 08 '13 at 21:32

score 8 · Accepted Answer · edited May 23 '17 at 10:25

Concerning your first question:

1 - Is it OK to allocate memory for the searched vector equal to the size of the initial vector? There may be 500 objects that are not small, and possibly none of them satisfies the search criteria. Is there any other way?

You could use a back inserter iterator, using the std::back_inserter() standard function to create one for the searched vector:

#include <vector>
#include <string>
#include <algorithm>
#include <iterator> // This is the header to include for std::back_inserter()

// Just a dummy definition of your Student class,
// to make this example compile...
struct Student
{
    std::string getFirstName() const { return "hello"; }
};

int main()
{
    std::vector<Student> students;

    std::vector<Student> searched;
    //                   ^^^^^^^^^
    //                   Watch out: no parentheses here, or you will be
    //                   declaring a function accepting no arguments and
    //                   returning a std::vector<Student>

    // The iterator returned by copy_if is not needed here, so it is discarded.
    std::copy_if(
        students.begin(),
        students.end(),
        std::back_inserter(searched),
    //  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    //  Returns an insert iterator
        [] (const Student &stud) 
        { 
            return stud.getFirstName().find("an") != std::string::npos; 
        });
}
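
To see this approach produce actual results, here is a minimal sketch wrapped in a function. The `Student` struct with a public `firstName` member and the helper name `searchByFirstName` are assumptions made for illustration; they are not part of the original question.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical Student type; the real class is not shown in the question.
struct Student
{
    std::string firstName;
    std::string getFirstName() const { return firstName; }
};

// Returns copies of all students whose first name contains `fragment`.
// The result vector starts empty and grows only as matches are found,
// so no elements are default-constructed up front.
std::vector<Student> searchByFirstName(const std::vector<Student>& students,
                                       const std::string& fragment)
{
    std::vector<Student> searched;
    std::copy_if(students.begin(), students.end(),
                 std::back_inserter(searched),
                 [&fragment](const Student& stud)
                 {
                     return stud.getFirstName().find(fragment) != std::string::npos;
                 });
    return searched;
}
```

For a source vector holding students named "Dan", "Bob" and "Joanna", calling `searchByFirstName(students, "an")` should give a vector with two elements, and the final `resize()` from the question is no longer needed.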

Concerning your second question:

2 - When copying to the searched vector, the copy assignment operator is called and, obviously, a copy is made. What if 400 of those 500 objects satisfy the search criteria? Isn't that just wasting memory?

Well, if you have no statistical information on the selectivity of your predicate, then there is not much you can do about it. Of course, if your purpose is to process in some way all the students for which a certain predicate is true, then you should use std::for_each() on the source vector rather than create a separate vector:

std::for_each(students.begin(), students.end(), [] (const Student &stud) 
{ 
    if (stud.getFirstName().find("an") != std::string::npos)
    {
        // ...
    }
});

However, whether this approach satisfies your requirements depends on your particular application.

I don't see why one would ever use vector<T> where T is an object. I would always use vector<shared_ptr<T>>.

Whether or not to use (smart) pointers rather than values depends on whether or not you need reference semantics (apart from possible performance considerations about copying and moving those objects around). From the information you provided, it is not clear whether this is the case, so it may or may not be a good idea.
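
As a rough illustration of what reference semantics means here, the following sketch filters a vector of shared pointers. The `Student` struct, the `StudentPtr` alias and the function name `searchShared` are hypothetical, and the code assumes the source vector contains no null pointers.

```cpp
#include <algorithm>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

// Hypothetical Student type for illustration.
struct Student
{
    std::string firstName;
};

using StudentPtr = std::shared_ptr<Student>;

// Reference semantics: only the shared_ptr handles are copied, so the
// result shares its Student objects with the source vector.
// Assumes no element of `students` is a null pointer.
std::vector<StudentPtr> searchShared(const std::vector<StudentPtr>& students,
                                     const std::string& fragment)
{
    std::vector<StudentPtr> searched;
    std::copy_if(students.begin(), students.end(),
                 std::back_inserter(searched),
                 [&fragment](const StudentPtr& stud)
                 {
                     return stud->firstName.find(fragment) != std::string::npos;
                 });
    return searched;
}
```

With this version no `Student` is copied, but a change made through an element of the result is also visible through the source vector. With `std::vector<Student>` the results are independent copies instead. Which of the two you want is the real question.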

@JackWillson #2 is entirely dependent on whether you want to maintain independence of the results from the original container for modification purposes. If you need them for modifications that you do *not* want propagated to the objects in the original container, shared pointers are not a good move. If you are OK with modifications being reflected in both vectors, or if you're not planning on modifying them at all, shared pointers are a worthy consideration if memory footprint is an issue. *But make sure it is an issue **first**.* Don't over-optimize unless you know it's a problem. — WhozCraig, Mar 08 '13 at 21:29
@JackWillson: I tried to answer the second question as well, hopefully the answer makes sense. — Andy Prowl, Mar 08 '13 at 21:34

score 0 · Answer 2 · answered Mar 08 '13 at 21:44

What are you going to do with all those students?

Just do that instead:

for(Student& student: students) {
    if(student.firstNameMatches("an")) {
        //.. do something
    }
}
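
`firstNameMatches` is not defined in the question. One possible way to provide it, as a sketch only, is a small member function wrapping the `find` test. The `Student` layout and the `countMatches` helper below are assumptions for illustration, with counting standing in for whatever the real processing would be.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Student type; the real class is not shown in the question.
struct Student
{
    std::string firstName;

    // True if the first name contains `fragment` (case-sensitive).
    bool firstNameMatches(const std::string& fragment) const
    {
        return firstName.find(fragment) != std::string::npos;
    }
};

// Processes matching students in place, without building a second vector.
// Counting is only a stand-in for the real work.
std::size_t countMatches(const std::vector<Student>& students,
                         const std::string& fragment)
{
    std::size_t count = 0;
    for (const Student& student : students)
    {
        if (student.firstNameMatches(fragment))
        {
            ++count;
        }
    }
    return count;
}
```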

Peter Wood
