
I am writing a file reader/writer that saves data to a given file and reads it back. I have a problem when reading from the file. Each line contains an integer, a string and two doubles separated by "|" delimiters. I use StringTokenizer to split the line into tokens and store each one in its own variable, but when parsing the integer I get a NumberFormatException even though the string appears to contain only an int.

Here is the code:

FileReader fr = new FileReader(filename);
BufferedReader buff = new BufferedReader(fr);
String line;

while ((line = buff.readLine()) != null) {
    StringTokenizer st = new StringTokenizer(line, "|");
    while (st.hasMoreElements()) {
         int Id = Integer.parseInt(st.nextToken());
         String Name = st.nextToken();
         double cordX = Double.parseDouble(st.nextToken());
         double cordY = Double.parseDouble(st.nextToken());
    }
}

An example line of the file:

8502113|Aarau|47.391355|8.051251

And the error:

java.lang.NumberFormatException: For input string: "8502113"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at storage.FileUtilities.readCitiesFromFile(FileUtilities.java:63)
at basics.Test.main(Test.java:16)

Am I missing something here? Is StringTokenizer altering the string in some way that I don't know about?

EDIT: Here is the code that creates the file:

// FileWriter's second argument means append: true = append, false = overwrite, so we flip the value.
FileWriter fw = new FileWriter(filename, !overwrite);
BufferedWriter buff = new BufferedWriter(fw);
String coordConvertor;

// Write one "id|name|x|y" line per city.
for (int i = 0; i < cities.size(); i++) {
    buff.write(Integer.toString(cities.get(i).getId()));
    buff.write("|");
    buff.write(cities.get(i).getName());
    buff.write("|");
    coordConvertor = Double.toString(cities.get(i).getCoord().getX());
    buff.write(coordConvertor);
    buff.write("|");
    coordConvertor = Double.toString(cities.get(i).getCoord().getY());
    buff.write(coordConvertor);
    buff.newLine();
}
Akaitenshi
  • I can't reproduce your problem. Maybe your file contains some invisible characters (like a BOM, which is usually placed at the start of a file). Read that part as a string and, instead of parsing it, print its `length()` to see whether the number of characters matches what you see. – Pshemo May 15 '16 at 14:42
  • The string "8502113" contains the Unicode character `U+FEFF`. – Madhawa Priyashantha May 15 '16 at 14:44
  • @Pshemo Indeed you are correct. The length appears to be +1. How can I resolve this? Is there a way to trim the extra character at the start of the file? EDIT: The file is a standard .txt file – Akaitenshi May 15 '16 at 14:46
  • Usually the solution is to not put that mark in the file in the first place. How are you creating that file? – Pshemo May 15 '16 at 14:48
  • FileWriter fw = new FileWriter(filename, !overwrite); Just this line – Akaitenshi May 15 '16 at 14:49
  • @YassinHajaj If you copy "8502113" from the error above and parse it as an int, you can reproduce it. Example: http://ideone.com/y3vy2T – Madhawa Priyashantha May 15 '16 at 14:51
  • `FileWriter fw = new FileWriter(filename, !overwrite);` shouldn't be able to add that BOM (byte order mark) by itself. Then it should come from input which you are storing later in that file. – Pshemo May 15 '16 at 14:51
  • I updated the question to include the code for making and writing to the file. – Akaitenshi May 15 '16 at 14:52
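
The check suggested in the comments above (print the token's length and look at what it really contains) can be sketched as follows. This is only an illustration: the class name `TokenInspector` and the helper `describe` are made up for this example, and the token is hard-coded with a leading `U+FEFF` to imitate the first token read from the file.

```java
public class TokenInspector {

    /** Lists every code point of the token as U+XXXX, separated by spaces. */
    public static String describe(String token) {
        StringBuilder sb = new StringBuilder();
        token.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: this imitates the first token of the file, with an invisible BOM in front.
        String token = "\uFEFF8502113";

        // Seven visible digits, but the length is 8 because of the hidden character.
        System.out.println(token.length());

        // The first entry printed is U+FEFF, followed by the code points of the digits.
        System.out.println(describe(token));
    }
}
```

If the first code point printed is `U+FEFF`, the token starts with a byte order mark, which is why `Integer.parseInt` rejects a string that looks like plain digits.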

1 Answer


There are hidden Unicode characters in the String you retrieved with st.nextToken(). Use this code instead to remove them:

int Id = Integer.parseInt(st.nextToken().replaceAll("\\p{C}", ""));
String Name = st.nextToken().replaceAll("\\p{C}", "");
double cordX = Double.parseDouble(st.nextToken().replaceAll("\\p{C}", ""));
double cordY = Double.parseDouble(st.nextToken().replaceAll("\\p{C}", ""));
tfosra
  • This seems to have fixed the problem, but I don't understand where these Unicode characters come from. Also, can you explain what the expression in replaceAll does exactly? – Akaitenshi May 15 '16 at 14:58
  • I just took that solution from [How can I replace non-printable Unicode characters in Java](http://stackoverflow.com/questions/6198986/how-can-i-replace-non-printable-unicode-characters-in-java) – tfosra May 15 '16 at 15:01
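
On what the expression does: in a Java regex, `\p{C}` matches any character in the Unicode general category "Other", which covers control, format, surrogate, private-use and unassigned characters. `U+FEFF` is a format character, so it is matched and removed, but so is a tab inside a name, for example. A narrower alternative, sketched below, is to strip a single leading `U+FEFF` from the line before tokenizing. The class and method names (`BomStripDemo`, `stripLeadingBom`, `parseId`) are hypothetical, and the sketch assumes the mark sits at the very start of the line, as the comments suggest.

```java
import java.util.StringTokenizer;

public class BomStripDemo {

    /** Removes one leading U+FEFF (byte order mark) if present; otherwise returns the line unchanged. */
    public static String stripLeadingBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == '\uFEFF') {
            return line.substring(1);
        }
        return line;
    }

    /** Parses the first "|"-separated token of the line as an int, ignoring a leading BOM. */
    public static int parseId(String line) {
        StringTokenizer st = new StringTokenizer(stripLeadingBom(line), "|");
        return Integer.parseInt(st.nextToken());
    }

    public static void main(String[] args) {
        // Assumption: the first line of the file starts with an invisible BOM.
        String firstLine = "\uFEFF8502113|Aarau|47.391355|8.051251";
        System.out.println(parseId(firstLine)); // prints 8502113

        // The category-based regex removes the BOM too, but also the tab between a and b.
        System.out.println("\uFEFFa\tb".replaceAll("\\p{C}", "")); // prints ab
    }
}
```

In the reading loop from the question, this would mean calling `stripLeadingBom(line)` before building the StringTokenizer, ideally only for the first line. As the comments point out, the cleaner fix is usually to keep the mark out of the file in the first place, for instance by stripping it from whatever input the city names or ids were originally read from.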