
I am working on a simple Java program that can take a string like this:

⛔️✋STOP✋⛔️ You've violated the law! But now... You

and replace each emoji with the corresponding Java Unicode escape sequence. (I'm not sure what to call them.)

Here is an example:

The automobile emoji 🚗 would be replaced with: "\\uD83D\\uDE97".

This allows me to have a string such as

"I am a car: \uD83D\uDE97"

in Java source code, and let it look like this:

[image]

I can easily do this for one type of emoji by doing this:

emojistring = emojistring.replace("🚗", "\\uD83D\\uDE97");

The problem is that I will be translating strings, like my example string, that contain many different emojis. I don't want to write an emojistring.replace("Emoji", "Java Character") call for every single emoji that might appear in my string.

Is there an automatic way to detect an emoji in a string and replace it with the relevant Java escape sequence?

Foobar
  • Check http://stackoverflow.com/questions/24840667/what-is-the-regex-to-extract-all-the-emojis-from-a-string out – klor Apr 22 '16 at 20:25
    Possible duplicate of [Replace emoji with appropriate java code](http://stackoverflow.com/questions/36802371/replace-emoji-with-appropriate-java-code) – f_puras Apr 22 '16 at 20:39
  • I think there is some confusion here. Java already stores supplementary characters as UTF-16 surrogate pairs. Writing `"🚗"` is *exactly the same* as writing `"\uD83D\uDE97"`. Your emojistring.replace call does nothing. – VGR Apr 22 '16 at 21:25
  • @VGR Whoops, fixed that. – Foobar Apr 22 '16 at 21:28
  • @f_puras It is not a duplicate. If you go to the bottom of each post the question asked is different. – Foobar Apr 22 '16 at 21:29
  • https://stackoverflow.com/questions/12013341/removing-characters-of-a-specific-unicode-range-from-a-string/12013465#12013465 – Yogev Sep 05 '17 at 18:16

2 Answers


Take a look at emoji-java and more specifically its EmojiParser class.

You can parse your strings to aliases (text representations), HTML decimal or HTML hexadecimal. You can also remove the emojis.

Example:

import com.vdurmont.emoji.EmojiParser;

String str = "An 😀awesome 😃string with a few 😉emojis!";
String result = EmojiParser.parseToAliases(str);
System.out.println(result);
// Prints:
// "An :grinning:awesome :smiley:string with a few :wink:emojis!"

Disclaimer: I wrote this library

Vincent Durmont

The placeholder character shown is the Unicode code point U+1F697. Java represents strings as sequences of UTF-16 chars, so a code point above U+FFFF needs a pair of chars (a surrogate pair).

You could also have done:

int[] codepoints = { 0x1F697 };
String s = new String(codepoints, 0, codepoints.length);

In effect, that solves nothing. The actual problem is that the font cannot display the emoji and falls back to a box character.
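If the goal is still to produce the escaped text from the question, a minimal sketch using only the JDK could look like the following. The class and method names (`EmojiEscaper`, `escape`) are made up for illustration. Note the assumption: it escapes every non-ASCII UTF-16 unit, not only emoji, which also catches emoji below U+10000 such as ⛔ and ✋ but will escape accented letters too.

```java
class EmojiEscaper {
    /**
     * Rewrites each non-ASCII UTF-16 unit as six literal characters:
     * a backslash, the letter u, and four hex digits. ASCII passes through.
     * Assumption: "non-ASCII" is used as a stand-in for "emoji".
     */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char unit = text.charAt(i);
            if (unit < 0x80) {
                out.append(unit);
            } else {
                // A supplementary code point arrives here as two units
                // (high and low surrogate), so it yields two escapes.
                out.append(String.format("\\u%04X", (int) unit));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build the car emoji from its code point to avoid source-encoding issues.
        String car = new String(Character.toChars(0x1F697));
        // Prints the sentence followed by the two escapes for D83D and DE97.
        System.out.println(escape("I am a car: " + car));
    }
}
```

Working on UTF-16 units rather than code points is deliberate here: the surrogate pair falls out of the loop without any extra arithmetic.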

Java can do a lot with styled text, for instance HTML in a Java Swing GUI. You could then replace emoji characters with images. Alternatively, you could build a font that contains the glyphs with a font editor and register it with GraphicsEnvironment.registerFont.
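The image-replacement idea might be sketched as below. Everything specific here is an assumption: the class name `EmojiToHtml`, the convention of one PNG per code point named by its lowercase hex value, and the base URL passed in. It also treats every supplementary code point as an emoji, which is a simplification (it misses emoji below U+10000 and catches non-emoji supplementary characters).

```java
class EmojiToHtml {
    /**
     * Builds an HTML string in which each supplementary code point is
     * replaced by an <img> tag. imageBaseUrl and the "<hex>.png" naming
     * scheme are hypothetical; adapt them to wherever the images live.
     */
    static String toHtml(String text, String imageBaseUrl) {
        StringBuilder out = new StringBuilder("<html>");
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append("<img src=\"").append(imageBaseUrl)
                   .append(Integer.toHexString(cp)).append(".png\">");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over both halves of a pair
        }
        return out.append("</html>").toString();
    }

    public static void main(String[] args) {
        String car = new String(Character.toChars(0x1F697));
        System.out.println(toHtml("car " + car, "file:/emoji/"));
        // prints: <html>car <img src="file:/emoji/1f697.png"></html>
    }
}
```

The result could be passed to something like a JLabel. In Swing the image source generally has to be a full URL (for example one obtained from getResource), which is why the base URL is a parameter here.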

You can programmatically check whether a font can display a given code point:

Font font = ...
if (!font.canDisplay(0x1F697)) {
    ...
}
Joop Eggen