
My code spends 40% of its time searching through unsorted vectors. More specifically, the searching function my_search repeatedly receives a single unsorted vector of length N, where N can take any value between 10 and 100,000. The weights associated with each element have relatively little variance (e.g. [0.8, 0.81, 0.85, 0.78, 0.8, 0.7, 0.84, 0.82, ...]).

The algorithm my_search starts by summing the weights of all the elements and then samples, on average, N elements (as many as the length of the vector) with replacement. The algorithm is quite similar to

int sum_of_weight = 0;
for(int i=0; i<num_choices; i++) {
  sum_of_weight += choice_weight[i];
}
// random(x) returns a uniform value in [0, x)
int rnd = random(sum_of_weight);
for(int i=0; i<num_choices; i++) {
  if(rnd < choice_weight[i])
    return i;                 // rnd fell within element i's weight
  rnd -= choice_weight[i];
}

from this post.
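
For context, the surrounding loop looks roughly like this (the names are purely illustrative, not my actual code; pick_index stands in for the linear scan above, and I use double weights as in my actual data). Each draw is O(N), so one call of my_search on a vector of length N costs about O(N^2):

// Illustrative outer loop: draw about N indices with replacement.
// pick_index() stands in for the linear-scan snippet above.
std::vector<int> my_search(const std::vector<double>& choice_weight) {
    std::vector<int> picks;
    for (std::size_t s = 0; s < choice_weight.size(); ++s)
        picks.push_back(pick_index(choice_weight));   // O(N) per draw
    return picks;
}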

I could sort the vector before searching, but that takes time of the order of O(N log N) (depending on the sorting algorithm used), and I doubt (though I might be wrong, as I haven't tried) that I would gain much time, especially as the weights have little variance.

Another solution would be to store information about how much weight has accumulated before certain points. For example, while summing the vector, every N/10 elements I could record how much weight has been summed so far. Then I could first compare rnd to these 10 breakpoints and search in only a tenth of the total length of the vector.
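
Concretely, the breakpoint idea would look something like this (purely illustrative, using double weights as in my data and the same random(x) placeholder as above; 10 buckets is safe here because N >= 10):

// Store a running total every num_choices/10 elements, find the right
// bucket first, then scan only within that bucket.
std::vector<double> breakpoints;                  // partial sums at bucket boundaries
const int block_size = num_choices / 10;          // >= 1 because N >= 10
double running = 0;
for (int i = 0; i < num_choices; ++i) {
    running += choice_weight[i];
    if ((i + 1) % block_size == 0)
        breakpoints.push_back(running);
}

double rnd = random(running);                     // placeholder, uniform in [0, running)
int block = 0;
while (block < (int)breakpoints.size() && rnd >= breakpoints[block])
    ++block;                                      // locate the bucket containing rnd
if (block > 0)
    rnd -= breakpoints[block - 1];                // offset of rnd inside that bucket
for (int i = block * block_size; i < num_choices; ++i) {
    if (rnd < choice_weight[i])
        return i;
    rnd -= choice_weight[i];
}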

  • Would this be a good solution?
  • Is there a name for the process I described?
  • How can I estimate the right number of breakpoints to store as a function of N?
  • Is there a better solution?
Remi.b
  • If you are searching the same list repeatedly, that N log (N) hit from sorting gets more and more attractive. Assuming you sort once and cache. – user4581301 Nov 03 '16 at 05:13
  • @user4581301 OP says he repeatedly receives unsorted vectors – smac89 Nov 03 '16 at 05:14
  • @smac89 OP doesn't say whether or not the unsorted vector is repeated. OP has two choices: Impose order or linear search. – user4581301 Nov 03 '16 at 05:16
  • A better solution is to create a second array of length N, and store every partial sum in that array. Then you can use a binary search to find the index where the partial sum exceeded `rnd`. In fact, if all you care about is the index, then you don't even need the second array. Just modify the input array, so that it contains the partial sums. – user3386109 Nov 03 '16 at 05:18
  • What type are choice_weight's elements? If double/float, you should define sum_of_weight to be the same type (not an integer). – Slava Nov 03 '16 at 05:22
  • Is this an attempt to choose numbers at random with probability according to their weights? If so, have a look at https://en.wikipedia.org/wiki/Alias_method - sampling takes O(1) time after O(n) setup time (a rough sketch follows after these comments) – mcdowella Nov 03 '16 at 05:22
  • Also, make sure that the random value you generate is also a floating-point number in the appropriate range. You can use std::uniform_real_distribution(0.0, sum_of_weight); combined with the binary search over partial sums (already mentioned above), this will give you a good and fast solution. – Slava Nov 03 '16 at 05:33
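
For reference, the alias method mentioned in mcdowella's comment works roughly as sketched below (a minimal, illustrative implementation of Vose's variant; the struct and member names are made up and it has not been benchmarked against the actual data). Setup is O(N) per vector, and each of the roughly N draws is then O(1).

#include <random>
#include <vector>

struct AliasTable {
    std::vector<double> prob;   // probability of keeping column i
    std::vector<int>    alias;  // fallback index for column i

    explicit AliasTable(const std::vector<double>& w) {
        const int n = (int)w.size();
        prob.resize(n);
        alias.resize(n, 0);
        double total = 0;
        for (double x : w) total += x;

        std::vector<double> scaled(n);
        std::vector<int> small, large;
        for (int i = 0; i < n; ++i) {
            scaled[i] = w[i] * n / total;              // rescale so the mean weight is 1
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }
        while (!small.empty() && !large.empty()) {
            int s = small.back(); small.pop_back();
            int l = large.back(); large.pop_back();
            prob[s]  = scaled[s];                      // column s keeps scaled[s] of itself
            alias[s] = l;                              // and defers the rest to column l
            scaled[l] += scaled[s] - 1.0;              // l donated (1 - scaled[s]) to s
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }
        for (int i : large) prob[i] = 1.0;             // leftovers are full columns
        for (int i : small) prob[i] = 1.0;             // (can remain due to rounding)
    }

    int draw(std::mt19937& gen) const {
        std::uniform_int_distribution<int>     col(0, (int)prob.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        int i = col(gen);
        return coin(gen) < prob[i] ? i : alias[i];     // keep column i or jump to its alias
    }
};

Usage would be along the lines of: build AliasTable table(choice_weight) once per vector, then call table.draw(gen) for each sample.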

1 Answer


log(N) Solution

// Body of the single-draw routine; choice_weight and num_choices are as in the question.
{
    std::vector<double> sums;          // sums[i] = choice_weight[0] + ... + choice_weight[i]
    double sum_of_weight = 0;
    for(int i=0; i<num_choices; i++) {
       sum_of_weight += choice_weight[i];
       sums.push_back(sum_of_weight);
    }

    // random(x) is the placeholder from the question (uniform value in [0, x));
    // std::upper_bound binary-searches for the first partial sum greater than it.
    std::vector<double>::iterator high = std::upper_bound(sums.begin(), sums.end(), random(sum_of_weight));

    return std::distance(sums.begin(), high);
}

This is essentially the same idea as your proposed improvement, but rather than storing only a tenth of the partial sums, store all of them and use binary search (std::upper_bound) to find the index of the first one that exceeds your random value.
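
As a side note (following Slava's comment above), random(sum_of_weight) is just the placeholder from your question; with the standard <random> header the value could be generated along these lines (a minimal sketch):

#include <random>

static std::mt19937 gen(std::random_device{}());                 // seed once, reuse across calls
std::uniform_real_distribution<double> dist(0.0, sum_of_weight); // uniform in [0, sum_of_weight)
double rnd = dist(gen);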


Analysis

Even though this solution is O(log N), you really have to ask yourself if it's worth it. Is it worth having to create an extra vector, and thus paying extra clock cycles to store things in that vector, the time it takes for the vector to resize, the time it takes to call a function to perform the binary search, and so on?

As I was writing the above, I realised you could use a deque instead, which would almost get rid of the performance hit from having to resize and copy the contents of the vector, while keeping the O(1) lookup that vectors give you.

So I guess the question remains: is it worth it to copy the elements over into another container just to do an O(log N) search?

Conclusion

TBH, I don't think you've gained much from this optimization. In fact, I think you've gained an overhead of O(log N).

smac89
  • I am a little unclear about the role of the if statement. Can you please elaborate on it? When `high` points to the first element of the vector `sums`, then shouldn't the index `std::distance(sums.begin(), high) - 1` be `-1`? Thank you! – Remi.b Nov 07 '16 at 18:55
  • Ok, I see the edit to fix the issue of having a negative index. I still don't quite get the necessity of this if statement. I feel like upper_bound returns an iterator pointing to the value that is just higher than the random value drawn, which is exactly what we are looking for, isn't it? – Remi.b Nov 07 '16 at 19:06
  • Thanks :). I could modify the rest of the code to work with a cumulative sum of the choice weights all along without much of a computation time cost. So your solution ended up making a fair difference in performance. Thanks! – Remi.b Nov 07 '16 at 19:22
  • @Remi.b Not a problem. In the analysis I was trying to draw your attention to some dangers of early optimization, but I'm glad you found a way to make this actually worth the effort – smac89 Nov 07 '16 at 19:25
  • I suddenly got a little confused. Is [upper_bound](http://en.cppreference.com/w/cpp/algorithm/upper_bound) doing a binary search? It does not appear that it does, and one would need to implement it oneself (see [here](http://stackoverflow.com/questions/446296/where-can-i-get-a-useful-c-binary-search-algorithm)). But I did see better performance after modifying the code, which got me confused. Thanks again – Remi.b Nov 16 '16 at 19:05
  • @Remi.b Yes upper_bound does a binary search but it returns the first item larger than what you are searching for. The answer in the linked question says you can implement binary search using any one of those three. I specifically picked `upper_bound` because of the nature of what you were looking for – smac89 Nov 16 '16 at 19:24