For example, I have a file with the following contents:

     Hello John Smith
               Hello Jack Brown
                     OK I love you

Note that each line has some leading whitespace. I want to use std::fstream to read the file line by line and remove the leading whitespace, while keeping the spaces between the words in each sentence.

My desired output should be as follows:

Hello John Smith
Hello Jack Brown
OK I love you

I also found a post that gives many simple solutions to my question. However, I think none of them is elegant in terms of modern C++. Is there a more elegant way?

xmllmx

2 Answers

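#include <algorithm>
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream file("input.txt");

    std::string line;

    while (std::getline(file, line))
    {
        // take unsigned char: std::isspace is undefined for negative values
        auto is_space = [](unsigned char ch) { return std::isspace(ch) != 0; };

        // find the first non-space character
        auto it = std::find_if_not(line.begin(), line.end(), is_space);

        line.erase(line.begin(), it); // erase everything before the first non-space

        std::cout << line << "\n";
    }
}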

Note that we could pass std::isspace directly as the third argument to std::find_if_not, but std::isspace is overloaded, so the bare name causes a compilation error. A cast fixes that:

auto it = std::find_if_not(line.begin(), 
                           line.end(), 
                           static_cast<int(*)(int)>(std::isspace));

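This looks ugly, but the function type in the cast tells the compiler which overload you intend to use. As pointed out in the comments, std::find_if_not then passes a plain char to std::isspace, which is undefined behavior for negative values, so the lambda taking unsigned char is the safer choice.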

Nawaz
  • Several comments: first, you have undefined behavior. You cannot call the one argument version of `isspace` with a `char`. You should declare the lambda to take an `unsigned char` (or use `std::ctype`; a bit trickier, but it allows you to use a specific locale without modifying the global locale). Second, why not just specify the lambda as the third argument of `find_if` (and I'd specify the lambda so that `std::find_if` worked). And finally: it's probably worth pointing out that this won't work for multi-byte characters, which are becoming more and more common. – James Kanze Jul 01 '14 at 15:25
  • @JamesKanze: Thanks for your comment. I used `std::find_if_not` just to demonstrate its usage and also I wanted to point out we could pass `std::isspace` as third argument (after casting). – Nawaz Jul 01 '14 at 15:28
  • But it's a C++11 function; if you use `std::find_if`, then you don't need C++11. (Of course, you can't use lambda anyway, but you'll probably want to provide a global `IsSpace` and `NotIsSpace` predicate anyway, unless this is the only time you'll ever be processing text.) – James Kanze Jul 01 '14 at 15:40
  • And of course, while you _can_ pass `std::isspace` directly, it will result in undefined behavior, since `std::find_if_not` will end up passing it a `char`. (For historical reasons, character processing in C++ is more complex than it should be, and has a number of sneaky traps. Back when I was doing it a lot, one of my test inputs was always a list of Paris suburbs. With `"L'Haÿ-les-Roses"`. You'd be surprised at how many programs stopped at the `'ÿ'`. Character code 255 in Latin-1.) – James Kanze Jul 01 '14 at 15:44
  • @JamesKanze: Great comments. Hope viewers of this answer read your comments as well. – Nawaz Jul 01 '14 at 15:55
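
The `std::ctype` alternative mentioned in the comments might look like the sketch below. It is an illustration, not code from the thread: the function name `ltrim_with_locale` is hypothetical, and defaulting to the classic "C" locale is an assumption. It classifies single `char` values only, so the multi-byte caveat raised above still applies.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: strip leading whitespace as classified by `loc`,
// without reading or modifying the global locale.
void ltrim_with_locale(std::string& line,
                       const std::locale& loc = std::locale::classic())
{
    const auto& facet = std::use_facet<std::ctype<char>>(loc);
    const char* first = line.data();
    const char* last = first + line.size();
    // scan_not returns a pointer to the first char that is NOT whitespace,
    // or `last` if every char is whitespace.
    const char* p = facet.scan_not(std::ctype_base::space, first, last);
    line.erase(0, static_cast<std::string::size_type>(p - first));
}
```

Because the facet works on `char` directly, no cast to `unsigned char` is needed here.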

As a complement to Nawaz's answer: it's worth pointing out that Boost has a String_Algo library which provides, among many other things, functions like trim that will simplify the code a lot. If you're doing any text processing at all, and you can't or don't want to use Boost, you should implement something similar yourself for your toolkit (e.g. a function MyUtils::trim, based on Nawaz's algorithm).
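
Such a toolkit could start out as small as the sketch below. Everything in it is an assumption made for illustration: the namespace my_utils and the function names are hypothetical, and the functions return copies rather than modifying their argument.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

namespace my_utils {  // hypothetical toolkit namespace

// Predicate taking unsigned char, so std::isspace never sees a negative value.
inline bool is_space(unsigned char ch) { return std::isspace(ch) != 0; }

// Return a copy of s without leading whitespace.
inline std::string trim_left(std::string s)
{
    s.erase(s.begin(), std::find_if_not(s.begin(), s.end(), is_space));
    return s;
}

// Return a copy of s without trailing whitespace.
inline std::string trim_right(std::string s)
{
    // Search backwards; base() converts the reverse iterator to the
    // position just past the last non-space character.
    s.erase(std::find_if_not(s.rbegin(), s.rend(), is_space).base(), s.end());
    return s;
}

// Return a copy of s without leading or trailing whitespace.
inline std::string trim(std::string s)
{
    return trim_left(trim_right(std::move(s)));
}

}  // namespace my_utils
```

With Boost available, boost::algorithm::trim_left from <boost/algorithm/string/trim.hpp> covers the leading-whitespace case in place.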

Finally, if you might someday need to handle UTF-8 input, you should look into ICU.

James Kanze