I have some Java code that reads the contents of a file using FileReader and BufferedReader. The lines are read correctly, except that the first character of the first line is an unexpected symbol. I have no idea where this symbol comes from, since the content of the input file is correct.

Here is the code:

// FileReader decodes bytes using the platform default charset
FileReader readFile = new FileReader(chosenFile);
BufferedReader input = new BufferedReader(readFile);
String line;
while ((line = input.readLine()) != null) {
    System.out.println(line);
}

[Screenshot: Console Output]

[Screenshot: File Content]

Jose Da Silva Gomes
BoJack Horseman
  • possible duplicate of http://stackoverflow.com/questions/1069922/bufferedreader-returns-iso-8859-15-string-how-to-convert-to-utf16-string – Japu_D_Cret Mar 30 '17 at 05:38
  • If you save your file with Windows Notepad, your data is ANSI-encoded, but FileReader uses your platform default charset, which may differ from the file's encoding. To force an encoding, use `new InputStreamReader(new FileInputStream(pathToFile), )` instead of FileReader. – Japu_D_Cret Mar 30 '17 at 05:41
  • What is the correct encoding for this instance? – BoJack Horseman Mar 30 '17 at 05:51

1 Answer


If it appears only in the first line, it is probably a BOM (byte order mark). Most modern text editors recognize it and do not display it as part of the text. When you save the file, there should be an option to save with or without it.

If you wish to handle the BOM in Java, see "Reading UTF-8 - BOM marker".
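As a rough illustration of the BOM explanation, here is a minimal sketch that reads a file with an explicit charset and drops a leading U+FEFF from the first line. It assumes the file is UTF-8, which the question does not confirm. The class name `BomSkipExample` and the helpers `stripBom` and `readLines` are hypothetical names made up for this example. Note that `Files.newBufferedReader` throws an exception on malformed input, so a file in a different encoding would fail rather than print garbage.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BomSkipExample {
    // The UTF-8 BOM bytes (EF BB BF) decode to the single character U+FEFF.
    private static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM character, if one is present.
    public static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    // Reads all lines as UTF-8 (an assumption about the file) and strips
    // the BOM from the first line only, since that is where it can occur.
    public static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader input =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = input.readLine()) != null) {
                lines.add(first ? stripBom(line) : line);
                first = false;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Build a small temporary file that starts with a UTF-8 BOM.
        Path file = Files.createTempFile("bom-demo", ".txt");
        Files.write(file, new byte[] {
            (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i', '\n'
        });
        System.out.println(readLines(file));
        Files.delete(file);
    }
}
```

Passing the charset explicitly also avoids depending on the platform default that `FileReader` uses. If the file turns out not to be UTF-8, the charset argument would need to change accordingly.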

Sharon Ben Asher