1

this is my function to erase all elements from a vector of strings 'results' which aren't as long as 'itemsnum'. However I'm a bit concerned that it calls itself multiple times, is there an easier way?

vector<string> eraselower(vector<string> results, int itemsnum){ //erases all elements in vector which are not long enough
    for (unsigned j=0; j<results.size(); j++){
        if(results[j].length()<itemsnum ){ results.erase(results.begin()+j); }}
    for (unsigned j=0; j<results.size(); j++){
        if(results[j].length()<itemsnum ){ results=eraselower(results,itemsnum);}}
    return results;
}

Thanks.

Joao Noch
  • 179
  • 8

3 Answers3

4

The best way is the erase-remove idiom:

#include <algorithm>
#include <string>
#include <vector>

std::vector<std::string> eraselower(std::vector<std::string> results, int itemsnum)
{
    results.erase(
        std::remove_if(results.begin(), results.end(),
                      [itemsnum](const std::string & s) {
                          return s.size() < itemsnum; }),
        results.end());
    return results;
}
Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
  • You forgot an `s.length()` in the lambda body. – aschepler Jul 02 '17 at 23:08
  • @aschepler: Thanks, fixed! – Kerrek SB Jul 02 '17 at 23:12
  • That should be `std::remove_if`, you're passing a predicate not a value. – Blastfurnace Jul 02 '17 at 23:20
  • @Blastfurnace: Yes, indeed, thanks. Today is not a good day. – Kerrek SB Jul 02 '17 at 23:21
  • In the spirit of friendly debate, I'd like to share my thoughts. In my opinion, these small typos are an indicator that no matter how "conceptually clean" the erase-remove solution is, nothing beats a down-to-earth implementation. There's a real cognitive cost in using advanced constructs, which often manifests itself in making mistakes. In this specific case, it would have been caught by the compiler, but one may not always be so lucky. Thanks for the answer anyway, I've learnt something :) – Boris Dalstein Jul 02 '17 at 23:44
  • 1
    @Boris: I suppose the question is whether the mistakes in your down-to-earth implementation are easier to spot :-) – Kerrek SB Jul 03 '17 at 00:05
  • @KerrekSB I posted the down-to-earth implementation below :) It'd be interesting to make an actual experiment to measure "readibility / bug-proneness". We could make 3-4 variations of each of our implementations, some corrects, some with bugs. Then, ask thousands of programmers to answer the question "is the following implementation correct?" by randomly picking one of the implementation, and measure how long it takes them, and whether they spot the bugs. – Boris Dalstein Jul 03 '17 at 00:13
  • @Boris: This is missing the point somewhat. Algorithms are abstractions, and abstractions are the way to deal with complexity. The reader can *choose* to either dig into or skip over a call to an algorithm, and if the names are well chosen, the reader may be entirely satisfied with understanding that part of the program. If all your codebase consists of "down-to-earth" raw code, then the reader has *no choice* but to dig through each and every detail to understand what the code is doing. Raw code forbids the reader to *not* pay attention to uninteresting details. – Kerrek SB Jul 03 '17 at 11:24
  • @KerrekSB I agree that as a rule of thumb, abstractions are great for the reasons you mention. I'm arguing that in this specific case, the API of the abstraction is so verbose, full of noise, and non-obvious that, as far as readability is concerned, I find that it's hurting more than it helps. In fact, using the abstraction takes the same number of symbols than not using it! The reader has no choice but look up the doc of std::remove_if to understand what the code is doing. If the API was `results.erase_if([n](const string & s) {return s.length() < n;});`, then of course I would use that. – Boris Dalstein Jul 03 '17 at 23:59
  • @KerrekSB Or are you telling me that the behaviour of `std::remove_if` is self-explanatorary from the name? ;) – Boris Dalstein Jul 04 '17 at 00:00
  • Haha, I just googled `erase_if`... it seems I'm not the only one that thought about it ;) https://stackoverflow.com/questions/36384571/is-there-a-better-alternative-to-stdremove-if-to-remove-elements-from-a-vector – Boris Dalstein Jul 04 '17 at 00:10
0

Your main problem is with the first for loop. If for example results[2] and results[3] are both short strings, you reach j==2 and erase the second element - which moves everything after that down one slot, so the short string that was results[3] is now in results[2]. Then the for loop immediately increments j to 3, so you never check the second short string which is now results[2].

The second recursive loop does take care of "fixing" that correctly, but you're right, it's silly.

std::remove_if is perfect for this. Note that it doesn't actually erase anything from the vector itself, it just slides elements down and leaves garbage at the end, so we call vector::erase afterward to shrink the vector to the correct size:

vector<string> eraselower(vector<string> results, int itemsnum){ 
    //erases all elements in vector which are not long enough
    results.erase( std::remove_if( results.begin(), results.end(),
        [itemsnum](const std::string& s) { return s.length() < itemsnum; } ),
        results.end() );

    return results;
}

You might want to consider taking your vector by non-const reference, rather than passing by copy and then returning another modified vector.

aschepler
  • 70,891
  • 9
  • 107
  • 161
0

The erase-remove idiom works, though I personally prefer solutions which are easily readable even by C++ beginners. I would do the following:

vector<string> eraselower(const vector<string> & strings, int itemsnum)
{
    vector<string> res;
    res.reserve(strings.size());
    for (const string & s: strings) {
        if (s.length() >= itemsnum) {
            res.push_back(s);
        }
    }
    return res;
}
Boris Dalstein
  • 7,015
  • 4
  • 30
  • 59
  • 1
    "Easily readable" is highly subjective. The other solutions basically state what is being done `remove_if`, `erase`, etc, A `for` loop doesn't convey this information. – PaulMcKenzie Jul 02 '17 at 23:40
  • while I can see why for loops *should* be more easily readable by the beginners, I think that if you understand both concepts (your and the `remove_if` syntax), the second one is far more readable and simple – Fureeish Jul 02 '17 at 23:44
  • @Fureeish I agree, it is highly subjective. Beginners will find my solution more readable. Experts may find the erase-remove idiom more readable. I can personally read both fine (now that I looked up the erase-remove idiom). My point is that I prefer to write code which is more readable by beginners rather than by experts, since the latter are less likely to introduce a bug when editing the code anyway. :) – Boris Dalstein Jul 02 '17 at 23:46
  • @Boris -- *since the latter are less likely to introduce a bug* -- it could be more likely to introduce a bug by hand-coding a `for` loop. Algorithm functions never fail when given the correct parameters -- for the most part, mistakes done in algorithm functions (typos) result in compiler errors. A `for` loop can fail, for example `for (int i = 0; i < 10;j++)` -- How many times have you seen that mistake done in a nested loop (mixing up `i` and `j`)? – PaulMcKenzie Jul 02 '17 at 23:53
  • @PaulMcKenzie "Algorithm functions can never fail when given the correct parameters.": no for loop can fail used correctly either. My opinion is that you are more likely to use the erase-remove idiom incorrectly than you are to use a for loop incorrectly. Especially a for-range loop. – Boris Dalstein Jul 02 '17 at 23:57