1

I need to run an efficient function to generate string combinations. I used multiple answers in SO to write something that works. The vectors are combined, the resultant strings then sorted by character and then duplicates are removed by inserting the resultant vector into an unordered set. The function needs to run 1e7 times on long vectors (100K) and I need help running more efficient code. Here's what I have now:

vector<string> vec_combos(vector<string> v1, vector<string> v2) {
    vector<string> res;
    unordered_set<string> s;
    for(int i=0;i<v1.size();i++) {
        for(int j=0;j<v2.size();j++){
            res.push_back(v1[i]+v2[j]);
            sort(res.back().begin(),res.back().end());
        }
    }
    for( int i = 0; i < res.size(); ++i ) s.insert( res[i] );
    res.assign( s.begin(), s.end() );
    return res;
}

int main() {
    vector<string> v1 = {"ab","bc","ca"}, v2 = {"a","b"},v3;
    // combined vector = {"aba","bca","caa","abb","bcb","cab"}
    // sort string chars = {"aab","abc","aac","abb","bbc","abc"}
    // remove duplicates = {"aab","abc","aac","abb","bbc"}
    v3 = vec_combos(v1, v2);
    for(int i=0;i<v3.size();i++) {
        cout << v3[i] << ",";
    }
    return(0);
}
Eric
  • 1,381
  • 9
  • 24
  • 1
    A) youre passing vectors by copy to the functions, these are useless copies, use references instead B) to remove duplicates take advantage of the fact that you sort and use [unique](http://en.cppreference.com/w/cpp/algorithm/unique), C) use emplace_back D) reserve upfront since you know exactly how many you'll be adding – Borgleader Jun 09 '17 at 13:19
  • Borgleader, I will try A - thank you. As far as B, I followed the advice [here](https://stackoverflow.com/questions/1041620/whats-the-most-efficient-way-to-erase-duplicates-and-sort-a-vector) and confirmed that vec.erase unique is slower than the unordered set solution. I am new to c++ and am no tfamiliar with C and D but I'll read up. Thank you. – Eric Jun 09 '17 at 13:25
  • [Look here](http://coliru.stacked-crooked.com/a/9c23137548589b16), the output order is different but the elements looked the same. – Borgleader Jun 09 '17 at 13:28
  • @Jarod42 that sort is sorting each string so the letters are in alphabetical order – Borgleader Jun 09 '17 at 13:37
  • yes. I only need the strings themselves sorted, not the large vector. I save some time with vec_combos1. – Eric Jun 09 '17 at 13:38

1 Answers1

2

Passing by reference, and avoid temporary unneeded container. You may use something like:

std::vector<std::string> vec_combos2(const std::vector<std::string>& v1,
                                     const std::vector<std::string>& v2)
{
    std::unordered_set<std::string> words; // Or std::set<std::string>

    for (const auto& s1 : v1) {
        for (const auto& s2 : v2) {
            std::string s = s1 + s2;
            std::sort(s.begin(), s.end());
            words.insert(std::move(s));
        }
    }
    return { words.begin(), words.end() };
}

Demo

Jarod42
  • 203,559
  • 14
  • 181
  • 302
  • You could std::move the string inside the set and avoid a copy. – Borgleader Jun 09 '17 at 13:53
  • @Borgleader: indeed, fixed. (move_iterator also used BTW). – Jarod42 Jun 09 '17 at 13:55
  • Jarod, I am probably screwing something up but I can't shake getting a compiler error with vec_combos2: Xcode throws the following :"/iterator:959:14: Cannot cast from lvalue of type 'const value_type' (aka 'const std::__1::basic_string') to rvalue reference type 'reference' (aka 'std::__1::basic_string &&'); types are not compatible". is the return not a vector? Any ideas? – Eric Jun 09 '17 at 15:31
  • Indeed, I cannot move string from `set`, so a copy is needed. Fixed and demo link added. (strange that error happens only for `stdlib=libc++` ). – Jarod42 Jun 09 '17 at 17:34