2

Using the same project and text file as in my earlier question, "Java.NullPointerException null (again)", the program outputs the data but prefixes it with ï»¿. To put you in the picture:

This program is a telephone directory. Ignoring the first code block in that question, look at the second one: that is the text file with the entries. The program outputs them as it should, but it prints ï»¿ at the beginning of the entries read from the text file ONLY.

Any help as to how to remove it? I am using a BufferedReader wrapped around a FileReader.

  • Encoding of Text File: UTF-8
  • Using Java 7
  • Windows 7
nquincampoix
  • Please specify: the encoding of the input file, the OS you are running on, the default encoding of Java Readers on your platform, and how and where the data is shown (i.e. in a GUI, in a terminal window, or whatnot). – Ingo Feb 05 '13 at 14:56
  • 2
    Google for BOM (Byte order mark) – leonbloy Feb 05 '13 at 14:57

2 Answers

5

Does the text file being read use UTF-8 with a BOM? Those look like BOM characters: "ï»¿" http://en.wikipedia.org/wiki/Byte_order_mark

Are you running Windows? Notepad++ should be able to convert the file. If you are using Linux or vi(m), you can use ":set nobomb".

Jens Peters
1

I suppose your input file is encoded in UTF-8 with BOM.

You can either save your input file without a BOM, or handle this in Java.

The obvious thing to try here is an InputStreamReader with the appropriate encoding. Sadly, that alone does not solve it: Java assumes that a UTF-8 encoded file has no BOM, so its UTF-8 decoder passes a leading BOM through as the character U+FEFF instead of discarding it. You have to handle that case manually.
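To illustrate how Java treats the BOM when decoding UTF-8, here is a minimal sketch. The class name `BomAwareLines` and the helper `stripBom` are made up for this example, and a byte array stands in for the OP's text file. It shows that the BOM arrives as a single U+FEFF character on the first line, which can then be dropped by hand.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class BomAwareLines {

    /** Hypothetical helper: removes a leading U+FEFF (the decoded BOM), if present. */
    public static String stripBom(String line) {
        if (line != null && !line.isEmpty() && line.charAt(0) == '\uFEFF') {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file contents: UTF-8 BOM (EF BB BF) followed by "Anna".
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A', 'n', 'n', 'a'};

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String first = reader.readLine();
            System.out.println(first.length());   // 5: the BOM survives as one char
            System.out.println(stripBom(first));  // Anna
        }
    }
}
```

Only the first line of a file can carry the BOM, so the helper needs to be applied once, before the regular read loop.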

A quick hack would be to check if the first three bytes of your file are 0xEF, 0xBB, 0xBF, and if they are, ignore them.
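A rough sketch of that byte-level check, assuming the file really is UTF-8. The class name `BomSkipper` and the method `skipUtf8Bom` are invented for illustration, and a byte array again stands in for the real file. It reads up to three bytes and pushes them back if they are not EF BB BF.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

public class BomSkipper {

    /** Returns a stream positioned after a leading UTF-8 BOM, if there is one. */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
        while (total < 3) {
            int n = pushback.read(head, total, 3 - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        boolean hasBom = total == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && total > 0) {
            // Not a BOM: put the bytes back so the caller sees the whole file.
            pushback.unread(head, 0, total);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'J', 'o', 'h', 'n'};
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                skipUtf8Bom(new ByteArrayInputStream(data)), StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine()); // prints "John"
        }
    }
}
```

For a real file, the `ByteArrayInputStream` would be replaced by a `FileInputStream` on the directory file. Naming the charset explicitly also avoids depending on the platform default that `FileReader` uses.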

For a more sophisticated example, have a look at the UnicodeBOMInputStream class in this answer.

Community
Carsten
  • Wouldn't it be better to read the file in the encoding it is? How do you guarantee that the rest of the file is ASCII? – Ingo Feb 05 '13 at 15:15
  • @Ingo I never assumed it was ASCII; `FileReader` uses the system's default encoding. I just wanted to give the OP a quick hack to deal with it, but I have edited my answer to elaborate a bit. Better? :-) – Carsten Feb 05 '13 at 15:53