3

I'm trying to parse a string character by character so I can load an image for each letter. So if the text is "Hello", I will print 5 images showing the same letters but made in Photoshop. It works fine until I want to parse the € symbol.

std::string al = "Test €";

std::string letter="";
for (int i=0; i< al.length();++i)
{
    if (al[i]=='.') letter ="dot";
    else if (al[i]==',') letter ="coma";
    else if (al[i]==' ') letter ="space";
    //else if (al[i]=='€') letter ="euro";
    else letter=al[i];
}

This works fine: letter will take the values "T", "e", "s", "t", "space". But if I uncomment the else if (al[i]=='€') letter ="euro"; line and try to build it, I get a red error message that says:

warning: multi-character character constant

So I need to know whether al[i] is the € symbol so I can assign "euro" to letter (then my code will be able to work with it).

I've searched on Google and found that link, which says that "\u20AC" is the C++ code for €, and I suppose the symbol needs more than one byte, but I still can't find how to deal with it and parse it in my code. Any idea how I could do it?

Thank you so much.

Note: I don't know which C++ version is used (I don't know where to check it), but I know it's not C++11.

Megasa3
  • It's an Eclipse C++ project on Fedora. I don't think that is the problem, but maybe someone wants to know. – Megasa3 Jul 16 '15 at 11:58
  • 1
    [Required reading](http://www.joelonsoftware.com/articles/Unicode.html) – n. m. could be an AI Jul 16 '15 at 12:14
  • 1
    read this first: [Joel on Software's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](http://www.joelonsoftware.com/articles/Unicode.html), http://www.teknically-speaking.com/2014/02/unicode-utf-8-and-character-encodings_23.html – phuclv Jul 16 '15 at 12:50
  • € is not a single byte character in Unicode but encoded as `20AC` in UTF-16 and `E2 82 AC` in UTF-8 – phuclv Jul 16 '15 at 12:52

3 Answers

2

The first issue is that you should be careful about using Unicode characters in your source code. Compilers are only required to support a specific character set, and not every compiler will accept your code. I suggest you read this answer for a more detailed explanation.

The second problem is that the character is too large to be represented in a narrow character literal. You need to explicitly tell the compiler to use a wide character literal instead.

L'\x20AC'   // Notice the preceding L

The third problem is the rest of your code still uses narrow character strings. Change std::string to std::wstring.
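
The three changes above might be combined roughly as follows. This is a minimal sketch, not the asker's final code: the helper name letterNames is hypothetical, it collects the names in a vector instead of assigning to letter, and it sticks to pre-C++11 features because the question rules out C++11.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns, for each wide character in `text`, the
// name the question's loop would have assigned to `letter`.
std::vector<std::wstring> letterNames(const std::wstring& text)
{
    std::vector<std::wstring> names;
    for (std::wstring::size_type i = 0; i < text.length(); ++i)
    {
        const wchar_t c = text[i];
        if (c == L'.') names.push_back(L"dot");
        else if (c == L',') names.push_back(L"coma");
        else if (c == L' ') names.push_back(L"space");
        else if (c == L'\x20AC') names.push_back(L"euro");  // U+20AC EURO SIGN
        else names.push_back(std::wstring(1, c));           // any other character as-is
    }
    return names;
}
```

Calling letterNames(L"Test \x20AC") should yield six entries, the last two being L"space" and L"euro". Writing the euro sign as an escape keeps the source file plain ASCII, which sidesteps the source character set concern from the first point.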

Captain Obvlious
  • Well I never had to face that problem before so it's the first time I see that. Ofc I knew there are a lot of types of encoding but just that, the theory. You helped me a lot to understand what and why I had to change my code and with so simple steps and changes. I did it and now it works fine. Thank you so much!!! – Megasa3 Jul 16 '15 at 14:05
1

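std::string stores a sequence of single-byte char elements and knows nothing about multi-byte encodings. The symbol you want is a Unicode character that is encoded in more than one byte (three bytes, E2 82 AC, in UTF-8), which is why you get the 'multi-character character constant' warning.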

The best thing is to use a library that understands Unicode and stick with that one. This question might be relevant: unicode string in c++ with boost
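
If a full Unicode library is more than the task needs, a hand-rolled alternative is possible. The sketch below assumes the std::string holds well-formed UTF-8, which the question does not state. It reads the lead byte of each character to find how many bytes the character occupies, then compares the whole byte sequence. The names utf8Length and letterNamesUtf8 are hypothetical, and the code does no validation of malformed input.

```cpp
#include <string>
#include <vector>

// Number of bytes in the UTF-8 sequence that starts with `lead`.
// Assumes well-formed UTF-8; a stray continuation byte counts as length 1.
static std::string::size_type utf8Length(unsigned char lead)
{
    if (lead >= 0xF0) return 4;
    if (lead >= 0xE0) return 3;
    if (lead >= 0xC0) return 2;
    return 1;
}

// Hypothetical helper: splits UTF-8 text into characters and maps each
// one to the name the question's loop would have assigned to `letter`.
std::vector<std::string> letterNamesUtf8(const std::string& text)
{
    std::vector<std::string> names;
    std::string::size_type i = 0;
    while (i < text.length())
    {
        const std::string::size_type n =
            utf8Length(static_cast<unsigned char>(text[i]));
        const std::string ch = text.substr(i, n);  // substr clamps at the end of the string
        if (ch == ".") names.push_back("dot");
        else if (ch == ",") names.push_back("coma");
        else if (ch == " ") names.push_back("space");
        else if (ch == "\xE2\x82\xAC") names.push_back("euro");  // UTF-8 bytes of U+20AC
        else names.push_back(ch);
        i += ch.length();  // always at least 1, so the loop advances
    }
    return names;
}
```

With letterNamesUtf8("Test \xE2\x82\xAC") the result should again have six entries ending in "space" and "euro". A real Unicode library remains the safer choice once input can be malformed or needs normalization.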

Sorin
0

"\u20AC" is a string, not a single character, so split your big string into substrings and compare each one against it. If a substring matches, replace it with "euro"; otherwise process that substring character by character.

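// needs <cstring>, <string> and <vector>
std::string al = "Test €";
std::string letter = "";
std::vector<char> buf(al.begin(), al.end()); // strtok needs a writable, null-terminated buffer
buf.push_back('\0');
char* ch = std::strtok(&buf[0], " ");

while (ch != NULL) {
    if (std::strcmp(ch, "€") == 0) { // compare the token itself, not the whole string
        letter = "euro";
    }
    //your code here
    ch = std::strtok(NULL, " "); // advance to the next token so the loop terminates
}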
Frank Bryce
  • I don't see how this is going to work. You extract a _substring_ from `al` then compare it _to_ `al`, so there will never be a match. The way you've presented this, the `while` loop will be infinite unless you set `ch` to `nullptr` at some point. – Captain Obvlious Jul 17 '15 at 14:40