I have code that creates many object instances (each with a fitness value, among other things), from which I want to sample N unique objects using weighted selection based on their fitness values. All objects that are not sampled are then discarded, but they all have to be created first so that their fitness values can be determined.
My current code looks something like this:
std::vector<Item> getItems(..) {
    std::vector<Item> items;
    ..  // generate the candidate items (more than N of them)
    // N is the number of items to keep (N <= items.size())
    std::vector<double> fitnessVals;
    for (auto it = items.begin(); it != items.end(); ++it)
        fitnessVals.push_back(it->getFitness());
    std::mt19937& rng = getRng();
    for (int i = 0; i < N; ++i) {
        // weights of the items not picked yet, i.e. positions i..end
        std::discrete_distribution<int> dist(fitnessVals.begin() + i, fitnessVals.end());
        int pick = i + dist(rng);  // dist returns an index relative to position i
        std::swap(fitnessVals.at(i), fitnessVals.at(pick));
        std::swap(items.at(i), items.at(pick));
    }
    items.erase(items.begin() + N, items.end());  // keep only the N picked items
    return items;
}
Typically ~10,000 instances are created initially, with N being ~200. The fitness value is non-negative and typically around 70. It can go as high as ~3000, but higher values are increasingly unlikely.
Is there an elegant way to get rid of the fitnessVals vector? Or perhaps a better way to do this in general? Efficiency is important, but I'm also wondering about good C++ coding practices.
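For comparison, one candidate alternative is the Efraimidis-Spirakis key method for weighted sampling without replacement: give each item a random key u^(1/w), with u uniform in [0, 1) and w its fitness, and keep the N items with the largest keys. The following is only a minimal sketch under stated assumptions. The `Item` struct, its `id` field, and the name `sampleWeighted` are made up for illustration; only `getFitness()` comes from the code above. It still needs one temporary vector (of keys rather than fitness values), but it avoids rebuilding a distribution in every iteration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real Item class; only getFitness() is assumed.
struct Item {
    int id;
    double fitness;
    double getFitness() const { return fitness; }
};

// Returns n items drawn without replacement, weighted by fitness.
// Assumes fitness values are finite and non-negative.
std::vector<Item> sampleWeighted(std::vector<Item> items, std::size_t n,
                                 std::mt19937& rng) {
    if (n >= items.size()) return items;  // nothing to discard

    using Keyed = std::pair<double, std::size_t>;  // (key, index into items)
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::vector<Keyed> keyed;
    keyed.reserve(items.size());
    for (std::size_t i = 0; i < items.size(); ++i) {
        const double w = items[i].getFitness();
        // log(u) / w orders items the same way as u^(1/w) and avoids underflow.
        // Zero-fitness items get the lowest possible key, so they come last.
        const double key = w > 0.0
            ? std::log(uniform(rng)) / w
            : -std::numeric_limits<double>::infinity();
        keyed.emplace_back(key, i);
    }

    // Move the n largest keys to the front (average linear time, unordered).
    std::nth_element(keyed.begin(),
                     keyed.begin() + static_cast<std::ptrdiff_t>(n),
                     keyed.end(),
                     [](const Keyed& a, const Keyed& b) { return a.first > b.first; });

    std::vector<Item> picked;
    picked.reserve(n);
    for (std::size_t k = 0; k < n; ++k)
        picked.push_back(std::move(items[keyed[k].second]));
    return picked;
}
```

With the numbers above this would be called as `sampleWeighted(std::move(items), 200, getRng())`. The work is one pass over the ~10,000 items plus a selection step, instead of ~200 passes over the remaining weights. The picked items come back in no particular order.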