I initialized the Scanner class object in this manner:

Scanner scanner = new Scanner(new File("data.txt"),"utf-8");  

When I try to read a file containing chars like ç or é, scanner.hasNextLine() returns false and the scanner reads nothing.

I tried "iso-8859-1" instead, and the file was read successfully. But the file is a UTF-8 file, so chars like 'ç' are displayed as "Ã§".

Please help me solve the problem and make the program properly read and display UTF-8 characters.

HackPack

2 Answers

Specify the encoding when constructing the UTF-8 encoded text:

new String(scanner.next().getBytes(), Charset.forName("UTF-8"))

To get the complete line, set the delimiter on the Scanner:

scanner.useDelimiter("\n");
Saravana
  • `scanner.next()` already returns a String, so why would we need to do that? – Nicolas Filotto May 12 '16 at 10:37
  • From the docs: `Constructs a new String by decoding the specified array of bytes using the specified Charset.` – Saravana May 12 '16 at 11:34
  • You convert a String into a String, which doesn't make any sense. Moreover, you don't do it properly: `getBytes()` uses the default encoding, so if that is ISO-8859-1, it will serialize the String in ISO-8859-1 and then deserialize the result as UTF-8, which simply cannot work – Nicolas Filotto May 12 '16 at 11:42
  • The `scanner` was given `UTF-8` as its input encoding, which is why I constructed the String as `UTF-8` when decoding – Saravana May 12 '16 at 11:57
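
The objection raised in the comments can be checked with a small experiment. The sketch below is only an illustration: the class `RoundTripDemo` and the helper `reencode` are hypothetical names, and the test string is made up. It encodes a String to bytes and decodes it again, once with the same charset on both sides and once with mismatched charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Hypothetical helper: encode a String with one charset, then decode the bytes with another.
    public static String reencode(String s, Charset encodeWith, Charset decodeWith) {
        return new String(s.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // "façade", written with an escape so the source file encoding does not matter.
        String text = "fa\u00e7ade";

        // Same charset both ways: the result equals the input, so the call changes nothing.
        String same = reencode(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(text));

        // Mismatched charsets: the single ISO-8859-1 byte for 'ç' (0xE7) is not valid
        // UTF-8 in this position, so the decoder substitutes U+FFFD and 'ç' is lost.
        String mixed = reencode(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(mixed.equals(text));
        System.out.println(mixed.indexOf('\uFFFD') >= 0);
    }
}
```

In other words, once the Scanner has produced a String, re-encoding it either does nothing or damages it; the decoding has to be right at the point where the bytes are first read.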

Use:

new String(scanner.next().getBytes("UTF-8"), Charset.forName("UTF-8"))
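
As an alternative to converting the String afterwards, here is a minimal sketch of reading the file as UTF-8 directly and surfacing any error the Scanner hides. It rests on an assumption the question does not confirm: that `hasNextLine()` returns false because the bytes in data.txt are not valid UTF-8 (for example, the file was actually saved in a single-byte encoding). `Scanner` swallows the underlying `IOException` and exposes it only through `ioException()`. The class `Utf8ScannerDemo`, the helpers `readLines` and `demo`, and the sample file contents are all hypothetical.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class Utf8ScannerDemo {
    // Hypothetical helper: read all lines as UTF-8 and fail loudly on a hidden read error.
    public static List<String> readLines(Path file) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
            // Scanner does not throw on read errors; it records the last one here.
            IOException failure = scanner.ioException();
            if (failure != null) {
                throw new UncheckedIOException(failure);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Writes a small UTF-8 sample file (made-up contents) and reads it back.
    public static List<String> demo() {
        try {
            Path file = Files.createTempFile("data", ".txt");
            try {
                Files.write(file, "fa\u00e7ade\ncaf\u00e9\n".getBytes(StandardCharsets.UTF_8));
                return readLines(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Encode console output as UTF-8; the terminal must also be set to UTF-8 to render it.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        for (String line : demo()) {
            out.println(line);
        }
    }
}
```

If `readLines` throws on the real file, the reported exception should indicate whether the file's actual encoding differs from UTF-8, in which case re-saving the file as UTF-8 or passing its true charset name to the Scanner would be the next thing to try.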