I'm trying to make a program that reads some text from a .txt file. I want to count how many times a certain word was used.

The text, however, also includes emojis. Java prints these emojis as square brackets '[]' in the console when I print the line.

Is there any way NetBeans can detect/support these emojis? A few examples: (,,,,✋)

I'm using a buffered reader and writer.

    while ((line = bufferedReader.readLine()) != null) {
        System.out.println(line);
    }

Cheers!

Omair Arif

2 Answers

You are seeing squares (probably tofus) because you don't have a font able to render those characters. So the first step would be to ensure that you have such a font.

Even with a font that can render those characters, they may still not be printed correctly in the NetBeans console. This is because emojis are typically non-BMP code points (> 0xFFFF) and are therefore encoded as two UTF-16 chars (for example, U+1F648 -> "\uD83D\uDE48"). These two chars form a surrogate pair, which is how UTF-16 represents a non-BMP code point using two 16-bit values from a reserved range of the BMP.

The IDE is supposed to convert "\uD83D\uDE48" to a single code point (0x1F648) and then ask the font to render that code point, not the two surrogates separately.
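
To make the surrogate pair mechanics concrete, here is a minimal sketch of the conversion described above. The class name `SurrogatePairDemo` and the helper `toCodePoint` are hypothetical names chosen for illustration; `Character.isSurrogatePair` and `Character.toCodePoint` are standard library methods.

```java
class SurrogatePairDemo {
    // Combines a high/low surrogate pair into the single code point it encodes.
    // (Hypothetical helper; it only wraps the standard Character methods.)
    static int toCodePoint(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE48";
        // The string holds two UTF-16 chars but only one code point.
        System.out.println(s.length());                      // prints 2
        System.out.println(s.codePointCount(0, s.length())); // prints 1
        int cp = toCodePoint(s.charAt(0), s.charAt(1));
        System.out.println("U+" + Integer.toHexString(cp).toUpperCase()); // prints U+1F648
    }
}
```

This is why code that walks a string char by char sees two meaningless halves, while code that walks it by code point sees one emoji.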

Java's String and Character classes have several methods that deal with code points instead of chars:

String.codePoints()
String.codePointAt(int i)
Character.isBmpCodePoint(int cp)
Character.isSurrogate(char c)
Character.isHighSurrogate(char c)
Character.isLowSurrogate(char c)

For example:

Integer.toHexString("\uD83D\uDCA9".codePointAt(0)) -> 1f4a9
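
If the console cannot draw the glyph, one option is to print the code point value in its place. The following is only a sketch under that assumption: `CodePointPrinter` and `escapeNonBmp` are hypothetical names, and it needs Java 8 or later for `String.codePoints()`.

```java
class CodePointPrinter {
    // Returns the line with every non-BMP code point replaced by "U+XXXX".
    // BMP characters (including BMP emojis such as U+270B) are kept as they are.
    static String escapeNonBmp(String line) {
        StringBuilder out = new StringBuilder();
        line.codePoints().forEach(cp -> {
            if (Character.isBmpCodePoint(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append("U+").append(Integer.toHexString(cp).toUpperCase());
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonBmp("nice \uD83D\uDCA9 day")); // prints: nice U+1F4A9 day
    }
}
```

In the reading loop from the question, `System.out.println(escapeNonBmp(line))` could replace `System.out.println(line)`.
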
Simone Rondelli

I think these emojis are encoded in UTF-8, so you could use an InputStreamReader with a charset like this:

 BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF8"));
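
Once the file is decoded as UTF-8, the emojis can be counted by code point even if the console cannot display them. The sketch below is one possible approach, not a definitive one. It swaps in `Files.newBufferedReader` with `StandardCharsets.UTF_8` as an equivalent way to get a UTF-8 reader. The names `EmojiCounter`, `looksLikeEmoji`, and `tally` are hypothetical, and treating the Unicode category "Symbol, other" as "emoji" is a rough assumption that will miss some emojis and include some non-emoji symbols.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class EmojiCounter {
    // Rough heuristic (an assumption): treat "Symbol, other" code points as emojis.
    static boolean looksLikeEmoji(int codePoint) {
        return Character.getType(codePoint) == Character.OTHER_SYMBOL;
    }

    // Adds the emoji-like code points found in one line to the running totals.
    static void tally(String line, Map<Integer, Integer> counts) {
        line.codePoints()
            .filter(EmojiCounter::looksLikeEmoji)
            .forEach(cp -> counts.merge(cp, 1, Integer::sum));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java EmojiCounter <file.txt>");
            return;
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        // Decode the file explicitly as UTF-8 instead of the platform default.
        try (BufferedReader bufferedReader =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                tally(line, counts);
            }
        }
        // Print each code point as U+XXXX with its count, avoiding the glyph itself.
        counts.forEach((cp, n) ->
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase() + " x " + n));
    }
}
```

Keying the map by code point rather than by char keeps each surrogate pair together as one entry.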
Marvin
  • No luck unfortunately, still facing the same issue :/ – Omair Arif Sep 26 '16 at 13:30
  • I'm sorry, but I misunderstood your question. I don't think there is a way to display these emojis in a terminal. My last idea is to convert them into their code points and display the numbers with println() – Marvin Sep 26 '16 at 13:34
  • Marvin, thank you for pointing this out. Getting their code point numbers is good enough for me (I just need to count how many times they have been used). Can you kindly guide me on how to print the code point value and not the square brackets []? Thanks a lot. – Omair Arif Sep 26 '16 at 13:51