2

Given a std::vector of strings, what is the best way of removing all elements starting from the end that are empty (equal to empty string or whitespace). The removal of elements should stop when a non-empty element is found.

My current method, (work in progress) is something like:

while (Vec.size() > 0 && (Vec.back().size() == 0 || is_whitespace(Vec.back()))
{
    Vec.pop_back();
}

where is_whitespace returns a bool stating if a string is whitespace or not

I suspect that my method will resize the vector at each iteration and that is suboptimal. Maybe with some algorithm it is possible to do in one step.

Input: { "A", "B", " ", "D", "E", " ", "", " " }

Desired Output: { "A", "B", " ", "D", "E" }

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
Sturm
  • 3,968
  • 10
  • 48
  • 78

2 Answers2

4

As I did not find a good dupe on first glance, here is a simple solution:

// Helper function to see if string is all whitespace
// Can also be implemented as free-function for readablity and
// reusability of course
auto stringIsWhitespace = [](const auto &str)
{
    return std::all_of(
        begin(str), end(str), [](unsigned char c) { return std::isspace(c); });
};

// Find first non-whitespace string from the back
auto it = std::find_if_not(rbegin(Vec), rend(Vec), stringIsWhitespace);
// Erase from there to the end
Vec.erase(it.base(), end(Vec));

Note the unsigned in the lambda due to this gotcha.

Live example thanks to @Killzone Kid.

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
  • This is definitely a more modern C++ solution, but note that it does require iterating through the whitespace elements twice. – Mike Borkland Apr 30 '18 at 14:11
  • 1
    @MikeBorkland But on the other hand, we use a bulk-delete operation over repeated pops. Especially for simple stuff as `char`, I'm very reluctant to believe there is a non-negligible performance difference without benchmarks. That second "iteration" can be implemented as assignment to the vector's size member – Baum mit Augen Apr 30 '18 at 14:13
  • Doesn't seem to work with vector of strings, which is as I understand OP's requirement – Killzone Kid Apr 30 '18 at 14:14
  • @KillzoneKid Ah right; his code tests chars not strings, went from there. Hang on, I'll fix it. Thanks. – Baum mit Augen Apr 30 '18 at 14:15
  • @BaummitAugen Apart from missing closing brace for lambda, it still doesn't seem to work :( https://ideone.com/GJ202T – Killzone Kid Apr 30 '18 at 14:23
  • You are probably right. I'd be interested to see how they performed against each other with a `vector` of, say, 1 million elements, with 500,000 "whitespace" strings on the end. – Mike Borkland Apr 30 '18 at 14:25
  • @KillzoneKid Right, messed up what we wanted to keep and what to delete, the `find_if` is obviously wrong and should be `find_if_not`. Or switch the logic in the predicate, either works. Thanks again; not my day it seems. – Baum mit Augen Apr 30 '18 at 14:27
  • @MikeBorkland Honestly, I wouldn't worry about it until I actually find it's a bottleneck in my real code. Clarity first, optimization later. – Baum mit Augen Apr 30 '18 at 14:31
  • @BaummitAugen No worries, scratched my head as well ;) Have an upvote – Killzone Kid Apr 30 '18 at 14:32
3

Here's a better way:

for (auto it = Vec.rbegin(); it != Vec.rend() && is_whitespace(*it); )
{
    it = Vec.erase(it);
}

It will start from the end and stop once non-whitespace has been encountered or the beginning of the vector is reached, whichever comes first. Note that I don't increment the iterator in the for loop.

Mike Borkland
  • 714
  • 1
  • 5
  • 12