5

My program reads a text file and performs actions based on the text. But the first line of the text is problematic: apparently it starts with "". This is messing up my startsWith() checks.

To understand the problem, I used this code:

   System.out.println(thisLine 
        + " -- First char : (" + thisLine.charAt(0) 
        + ") - starts with ! : " 
        + thisLine.startsWith("!"));

String thisLine is the first line in the text file.

It writes this to the console: ! use ! to add comments. Lines starting with ! are not read. -- First char : () - starts with ! : false

Why is this happening and how do I fix it? I want it to realize that the line starts with "!", not "".

Arek
WVrock

5 Answers

4

Collecting my and others' comments into one answer for posterity: your string probably contains unprintable control characters. Try

System.out.println((int) thisLine.charAt(0));

to print out the numerical code of the first character, or

my_string = my_string.replaceAll("\\p{C}", "?");

to replace the control characters with '?'.

System.out.println((int) thisLine.charAt(0)) printed 65279 for you, which is the Unicode code point U+FEFF: the zero-width no-break space, also used as the byte order mark. It is not a control character, but it is effectively invisible on output. (See Why is ï»¿ appearing in my HTML?).

Either remove the extra character from the file, or remove all control and format characters from the string (my_string = my_string.replaceAll("\\p{C}", "");). Note that trimming the string as in @arvind's answer (thisLine = thisLine.trim();) only removes leading and trailing characters up to U+0020, so it does not remove U+FEFF.
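If you only want to drop that one character rather than every control or format character, here is a minimal sketch. The class name BomStripper and the helper stripBom are hypothetical names made up for illustration; the sample line is the one from the question with U+FEFF prepended.

```java
class BomStripper {
    // U+FEFF: zero-width no-break space, also used as the byte order mark.
    private static final char BOM = '\uFEFF';

    // Hypothetical helper: returns the line without a leading U+FEFF, if present.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) {
        String thisLine = "\uFEFF! use ! to add comments.";

        // The invisible first character defeats the check.
        System.out.println(thisLine.startsWith("!"));                          // false
        // trim() only strips characters <= U+0020, so U+FEFF survives.
        System.out.println(thisLine.trim().startsWith("!"));                   // false
        // Removing just the leading U+FEFF fixes the check.
        System.out.println(stripBom(thisLine).startsWith("!"));                // true
        // \p{C} also matches format characters such as U+FEFF.
        System.out.println(thisLine.replaceAll("\\p{C}", "").startsWith("!")); // true
    }
}
```

You would typically call something like stripBom only on the first line read from the file, since that is where a byte order mark ends up.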

EDIT: Notepad won't show most 'special' characters. If you want to edit the file, try a hex editor or a more advanced editor such as Notepad++.

Community
Buurman
  • I'm looking for a programmatic way to remove them. Trimming didn't work. – WVrock Jun 25 '15 at 11:34
  • It worked, thanks. But where did that character come from? I wrote the text programmatically. – WVrock Jun 25 '15 at 11:39
  • You might have inadvertently copy-pasted the character in when you copied the string value, i.e. even if you create a string like `String s = "abcdef";`, if you copy the `abcdef` part from somewhere else you might copy in a special character which then wouldn't show in your IDE but would actually be there. – Buurman Jun 25 '15 at 11:51
  • `65279` is `0xFEFF`, which happens to be the [Byte Order Mark](https://en.wikipedia.org/wiki/Byte_order_mark#UTF-16) for a file in UTF-16 encoding. So, if someone chose to write a file with a BOM in UTF-16, the first Unicode character would look like that "invisible whitespace", which at least indicates that you are using the correct endianness when reading the file. – JimmyB Jun 26 '15 at 08:55
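To illustrate the byte order mark comment: the thread does not say how the file was encoded, so the following is only one plausible scenario. It assumes the file was saved as UTF-8 with a BOM (the bytes EF BB BF). Java's UTF-8 decoder keeps that mark, so it shows up as the first char of the first line. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class BomDecodeDemo {
    // Decodes the bytes as UTF-8 and returns the first UTF-16 code unit as an int.
    static int firstCodeUnit(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return decoded.charAt(0);
    }

    public static void main(String[] args) {
        // Assumed file start: UTF-8 BOM (EF BB BF) followed by '!'.
        byte[] fileStart = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '!'};
        System.out.println(firstCodeUnit(fileStart)); // 65279
    }
}
```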
2

Try trimming the whitespace first:

thisLine = thisLine.trim();
System.out.println(thisLine 
        + " -- First char : (" + thisLine.charAt(0) 
        + ") - starts with ! : " 
        + thisLine.startsWith("!"));
1

I agree with what @Arvind said. It should address the problem if the string has leading whitespace.

But always remember that startsWith(String arg) returns true if the arg passed is "" (the empty string).

source: Javadocs
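A small sketch of that behavior, using a sample line with a leading U+FEFF like the one in the question. The helper hasNonEmptyPrefix is a hypothetical name, added only to show one way to guard against an accidentally empty prefix.

```java
class StartsWithEmptyDemo {
    // Hypothetical guard: an empty prefix never counts as a match.
    static boolean hasNonEmptyPrefix(String line, String prefix) {
        return !prefix.isEmpty() && line.startsWith(prefix);
    }

    public static void main(String[] args) {
        String line = "\uFEFF! comment";

        // Every string starts with the empty string.
        System.out.println(line.startsWith(""));          // true
        // The real first character is U+FEFF, not '!'.
        System.out.println(line.startsWith("!"));         // false
        System.out.println(line.startsWith("\uFEFF!"));   // true
        // With the guard, an empty prefix is rejected.
        System.out.println(hasNonEmptyPrefix(line, ""));  // false
    }
}
```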

JavaHopper
0

Ignore the first line if it is empty.

If you are reading lines in a loop, do it like this:

thisLine = thisLine.trim();
if (thisLine.isEmpty()) {
    continue;
}
// Remaining logic here including sysout
Raman Shrivastava
0

Use the following code to see for sure what the first character of the line is and how long the line is:

System.out.println(thisLine 
    + " -- First char : (" + ((int)thisLine.charAt(0))
    + ") - Line length: " +  thisLine.length());
dosw
  • It is `65279`. Notepad does not show anything there. – WVrock Jun 25 '15 at 11:29
  • In this case, the best solution might be to always trim() the line you read before processing it (as @Arvind already mentioned). – dosw Jun 25 '15 at 11:32