25

Is there a built-in function to convert a C++ string from uppercase to lowercase? If not, is converting it to a C string and calling tolower on each char the only option?

Thank you very much in advance.

Mosty Mostacho
brett
  • 4
    In Germany, tolower("STRASSE") should result in "straße". In Switzerland, it should be "strasse". And there are many more cases like this around the world. A built-in function would have to handle those cases correctly. If you don't care, just use tolower() on each character as most answers show. – Sjoerd Aug 04 '10 at 08:56
  • @Sjoerd: Nice example. Have you heard of any library that deals with those cases gracefully ? I could be interested in it. – ereOn Aug 04 '10 at 09:07
  • @ereOn: No, I never needed one. I know there are problems with tolower() but where I live and in the applications I write, tolower() usually is "good enough". – Sjoerd Aug 04 '10 at 09:23
  • I'd imagine ICU can handle it correctly (http://icu-project.org/), but it might be overkill for the OP's purposes. – jalf Aug 04 '10 at 10:51
  • @ereOn: it is actually nearly impossible without a dictionary that contains all ambiguous words. In German, SS is only ß if spoken slowly, otherwise it should become ss. – Sebastian Mach Jun 16 '11 at 10:48
  • @Jalf: Does ICU come with dictionaries? In German, for instance, there is no 1-to-1 mapping of `SS -> lowercase`; e.g. `STRASSE -> straße`, but `TRASSE -> trasse`. Some words are even ambiguous: `MASSE -> maße` (= measures, dimensions) and `MASSE -> masse` (= mass). (edit: I realise just now that I already visited this question in the past, xd) – Sebastian Mach Apr 25 '12 at 06:08
  • @phresnel: honestly? I don't know how (or if) this case is handled. ICU just follows Unicode's rules, and I don't know what the Unicode standard has to say about this case. – jalf Apr 25 '12 at 06:26

7 Answers

37

If boost is an option:

#include <boost/algorithm/string.hpp>    

std::string str = "wHatEver";
boost::to_lower(str);

Otherwise, you may use std::transform:

std::string str = "wHatEver";
std::transform(str.begin(), str.end(), str.begin(), ::tolower);

You can also use another function if you have some custom locale-aware tolower.
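As a rough illustration of the locale-aware option, the sketch below uses the two-argument `std::tolower` from `<locale>`. The helper name `to_lower_with_locale` is made up for this example, and which characters get converted depends on the locale you pass in. It still works one `char` at a time, so it will not handle multi-byte encodings such as UTF-8 or mappings like SS -> ß.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper (not part of any library): lowercases s using the
// per-character rules of the given locale and returns the result.
std::string to_lower_with_locale(std::string s, const std::locale& loc)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [&loc](char c) { return std::tolower(c, loc); });
    return s;
}
```

Calling it with `std::locale::classic()` gives plain "C" locale behaviour; passing a named locale only works if that locale is installed on the system.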

ereOn
  • Note that [according to cppreference](https://en.cppreference.com/w/cpp/string/byte/tolower), a call to `tolower` with a parameter that can't be cast to `unsigned char` results in UB. So what you probably want is something like: `std::transform(str.begin(), str.end(), str.begin(), [](unsigned char c){ return std::tolower(c); });` – Pacopenguin Jan 17 '23 at 15:21
22
std::transform(myString.begin(), myString.end(), myString.begin(), std::tolower);
Philipp
TortoiseTNT
  • 9
    This and other transform + tolower answers should take into account that this won't necessarily compile, depending on what standard headers are included in this file. There is one `tolower` in `<cctype>` and an overload in `<locale>`. If both get included, you'll get a compiler error. See for example: http://stackoverflow.com/questions/1350380/problems-using-stl-stdtransform-from-cygwin-g – UncleBens Aug 04 '10 at 15:40
  • 2
    Note that this answer (and all of the other `transform` answers) can cause undefined behaviour, because `std::tolower` from `<cctype>` requires an argument that is representable as `unsigned char` (or equal to `EOF`), and a plain `char` may be negative – M.M Aug 16 '15 at 15:18
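One way to sidestep both issues raised in these comments (the overload ambiguity and the negative `char` values) without a lambda is a small named wrapper. This is only a sketch; `lower_char` and `lower_in_place` are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical wrapper: a single, non-overloaded function, so
// std::transform can take it without any ambiguity.
char lower_char(char c)
{
    // Convert to unsigned char first so a negative char value is never
    // passed to std::tolower.
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper that lowercases a string in place.
void lower_in_place(std::string& s)
{
    std::transform(s.begin(), s.end(), s.begin(), lower_char);
}
```

Because `lower_char` is an ordinary function with one signature, it also works on compilers that predate lambdas.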
2

Like ereOn says: std::transform(str.begin(), str.end(), str.begin(), std::tolower );

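Or via for_each, which discards the callable's return value, so the callable has to modify each character through a reference: std::for_each(str.begin(), str.end(), [](char& c){ c = static_cast<char>(std::tolower(static_cast<unsigned char>(c))); });

Transform is probably the better of the two.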

graham.reeds
1

For this problem you can use the standard library's std::transform algorithm:

std::string str = "simple";
std::transform(str.begin(), str.end(), str.begin(), std::tolower);
Nanne
1

The above answers produced errors. This works perfectly:

std::transform(str.begin(), str.end(), str.begin(),
        [](unsigned char c){ return std::tolower(c); });
Ronny Sherer
0

I have an implementation that I found to be faster than std::transform, compiled with g++ -O3 on Fedora 18. My example converts a std::string.

performance time in seconds :
transform took         : 11 s
my implementation took : 2 s
Test data size = 26*15*9999999 chars
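#include <algorithm>  // std::transform
#include <ctype.h>    // declares ::tolower in the global namespace
#include <ctime>      // time, time_t
#include <iostream>
#include <string>

inline void tolowerPtr(char *p);

// Lowercase every character of s in place (ASCII letters only).
inline void tolowerStr(std::string& s)
{
  char* c = &s[0];  // writable buffer (C++11); avoids const_cast on c_str()
  size_t l = s.size();
  for (char* c2 = c; c2 < c + l; c2++) tolowerPtr(c2);
}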

inline void tolowerPtr(char *p) 
{
switch(*p)
{
  case 'A':*p='a'; return;
  case 'B':*p='b'; return;
  case 'C':*p='c'; return;
  case 'D':*p='d'; return;
  case 'E':*p='e'; return;
  case 'F':*p='f'; return;
  case 'G':*p='g'; return;
  case 'H':*p='h'; return;
  case 'I':*p='i'; return;
  case 'J':*p='j'; return;
  case 'K':*p='k'; return;
  case 'L':*p='l'; return;
  case 'M':*p='m'; return;
  case 'N':*p='n'; return;
  case 'O':*p='o'; return;
  case 'P':*p='p'; return;
  case 'Q':*p='q'; return;
  case 'R':*p='r'; return;
  case 'S':*p='s'; return;
  case 'T':*p='t'; return;
  case 'U':*p='u'; return;
  case 'V':*p='v'; return;
  case 'W':*p='w'; return;
  case 'X':*p='x'; return;
  case 'Y':*p='y'; return;
  case 'Z':*p='z'; return;
};
return ;
}

void testtransform( std::string& word )
{
std::string word2=word; 
time_t t;
time_t t2;
time(&t);
std::cout << "testtransform: start " << "\n";
int i=0;
for(;i<9999999;i++) 
{    word2=word;
    std::transform(word2.begin(), word2.end(), word2.begin(), ::tolower);
}
time(&t2);
std::cout << word2 << "\n";
std::cout << "testtransform: end " << i << ":"<< t2-t << "\n";
}

void testmytolower( std::string& word )
{
std::string word2=word; 
time_t t;
time_t t2;
time(&t);
std::cout << "testmytolower: start " << "\n";
int i=0;
for(;i<9999999;i++)
{   word2=word;
    tolowerStr(word2);  // tolowerStr above is not declared inside a namespace
}
time(&t2);
std::cout << word2 << "\n";
std::cout << "testmytolower: end " << i << ":"<< t2-t << "\n";
}

int main(int argc, char* argv[])
{
   std::string word ="ABCDEFGHIJKLMNOPQRSTUVWXYZ";
   word =word+word+word+word+word+word+word+word+word+word+word+word+word+word+word;
   testtransform( word);
   testmytolower( word);
   return 0;
}

I will be glad to know if performance can be improved further.

Anand Rathi
  • 8
    You are just covering the ASCII alphabet here, and in a quite horrible way. The same can be achieved by: `if (*p >= 'A' && *p <= 'Z') *p += 'a' - 'A'`. And you are still potentially missing many other letters, e.g. 'Á'. – gatopeich Apr 13 '18 at 13:34
  • Your test has a problem. Run it on a long random word (I generated mine with `cat /dev/urandom | LC_CTYPE=C tr -dc '[:alpha:]' | fold -w ${1:-100000000} | head -n 1`), and your function is only a bit faster than transform (about 10%). Use the suggested `if (*p >= 'A' && *p <= 'Z') *p += 'a' - 'A'` and it gets 45-48% faster. Pass the char parameter by reference (not by pointer) and it gets an additional 5% faster. – Yuval Zilber Mar 10 '23 at 19:16
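For reference, the range-check variant suggested in these comments, with the character passed by reference, might look like the sketch below. The names `ascii_lower` and `ascii_lower_str` are hypothetical, and the code assumes an ASCII-compatible encoding in which 'A'..'Z' are contiguous; it makes no claim about the speedups quoted above, which will vary with compiler and input.

```cpp
#include <string>

// Hypothetical helper: lowercases one ASCII letter in place.
// Assumes 'A'..'Z' and 'a'..'z' are contiguous (true for ASCII).
inline void ascii_lower(char& c)
{
    if (c >= 'A' && c <= 'Z')
        c = static_cast<char>(c + ('a' - 'A'));
}

// Hypothetical helper: applies ascii_lower to every character of s.
// Non-ASCII letters such as 'Á' are left untouched.
inline void ascii_lower_str(std::string& s)
{
    for (char& c : s)
        ascii_lower(c);
}
```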
0

There is no built-in function to do this, and doing it is surprisingly complicated, because of locales et al. If tolower does what you need, it may be your best bet.