1

I'm trying to assign 'o͝' (a phonetic character) to a Character in a Java program, but I get the error "Invalid character constant". My file uses UTF-8 and other phonetic characters work fine, but not this one. It looks as if this character is in fact two characters (an 'o' plus a ligature or something similar), but I cannot break it into its component parts.

Code example:

Character test = 'o͝';

Any help would be appreciated.

user936580

4 Answers

4

The glyph is called "small letter o with combining double breve" and can be written in source as:

String a = "\u006f\u035d";

Since it is a combining sequence (i.e., two characters: a base letter followed by a combining mark), the resulting value cannot be assigned to a single Java char. You'll need to use a String.
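A minimal sketch of why a single char cannot hold this value. The class name `CombiningSequenceDemo` and the constant `O_DOUBLE_BREVE` are made up for illustration; only the two code points come from the answer above.

```java
class CombiningSequenceDemo {
    // 'o' (U+006F) followed by COMBINING DOUBLE BREVE (U+035D)
    static final String O_DOUBLE_BREVE = "\u006f\u035d";

    public static void main(String[] args) {
        // The string renders as one glyph but holds two char values.
        System.out.println(O_DOUBLE_BREVE.length()); // 2

        // Print each char as a U+XXXX code point.
        for (char c : O_DOUBLE_BREVE.toCharArray()) {
            System.out.println(String.format("U+%04X", (int) c));
        }
        // U+006F
        // U+035D
    }
}
```

Each half fits in a char on its own (`'\u006f'` or `'\u035d'`), but the pair does not, which is what the "Invalid character constant" error is reporting.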

Joachim Isaksson
2

You can try looking up the character's code point in a character table and assigning that to the variable, something like:

char a = '\u0040';
  • This is absolutely the right thing to do. Including obscure characters literally in your literals always puts you at the mercy of the file system that stores your code and of the environment that compiles it. The `\uXXXX` escape always works and can handle any character in Unicode. – Kilian Foth Apr 21 '12 at 18:43
  • It may well be that your "char" was actually two, with a combining diacritical char as the second, so you would have to use String a = "o͝", or use an int code point with the one-char version. – Joop Eggen Apr 21 '12 at 18:47
  • Yes, it looks like it is a combining diacritical char, which is why I could not find the combined character in a table. After using a hexadecimal editor and decoding the UTF-8 values, I found that they are U+006F (the 'o') and U+035D (the COMBINING DOUBLE BREVE). Thank you. – user936580 Apr 21 '12 at 18:56
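The hex-editor step described in the last comment can also be done from Java itself. This is only a sketch: `CodePointLister`, `describe` and `utf8Hex` are hypothetical names, and it assumes Java 8 or later for `String.codePoints()`.

```java
import java.nio.charset.StandardCharsets;

class CodePointLister {
    // Lists the code points of a string as "U+XXXX U+XXXX ...".
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Shows the UTF-8 bytes a hex editor would display for the string.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "o\u035d";
        System.out.println(describe(s)); // U+006F U+035D
        System.out.println(utf8Hex(s));  // 6F CD 9D
        // Character.getName looks up the Unicode name of each code point.
        s.codePoints().forEach(cp -> System.out.println(Character.getName(cp)));
    }
}
```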
0

As already said, you shouldn't hardcode characters like that. Use the Unicode code point values instead, which you can look up here:

http://www.utf8-chartable.de/

What you want actually involves a "combining character":

http://en.wikipedia.org/wiki/Combining_character

The combining diacritical marks occupy the range U+0300 to U+036F. So, e.g., to create the character you want ('o' with double breve), use:

String o_doubleBreve = "o\u035d";

Prints as o͝
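As a rough sketch, the range above can be checked in code, and normalization shows why no single-char version of this particular glyph turns up in a table. `CombiningMarkCheck` and its method names are invented for this example. The comparison with 'o' plus U+0308 (combining diaeresis), which does have a precomposed form, is an added illustration and not something from the question.

```java
import java.text.Normalizer;

class CombiningMarkCheck {
    // True if the code point lies in the Combining Diacritical Marks block (U+0300 to U+036F).
    static boolean isCombiningDiacritic(int cp) {
        return Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.COMBINING_DIACRITICAL_MARKS;
    }

    // True if NFC normalization composes the whole sequence into a single code point.
    static boolean hasPrecomposedForm(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        return nfc.codePointCount(0, nfc.length()) == 1;
    }

    public static void main(String[] args) {
        System.out.println(isCombiningDiacritic(0x035D));  // true
        System.out.println(isCombiningDiacritic('o'));     // false
        System.out.println(hasPrecomposedForm("o\u0308")); // true: composes to U+00F6
        System.out.println(hasPrecomposedForm("o\u035d")); // false: stays two code points
    }
}
```

When `hasPrecomposedForm` returns false, a String (or a pair of chars) is the only way to represent the glyph.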

CodeClown42
0

I agree with the answers above that the \u representation is best in any new code you write. However, you will come across projects whose source code has this issue and which were supposedly able to compile. One such example I am working with now is openNLP.

If you run into something like this in an IDE such as Eclipse, you can change the workspace default encoding to UTF-8 by following a procedure like this. The code will then compile successfully.

demongolem