I've run into an interesting issue.
I'm trying to read this file in Java. It contains the 1000 most common English words in alphabetical order:
http://www.file-upload.net/download-6679295/basicVocabulary.txt.html
The file begins like this:
a
able
about
above
according
account
across
act
action
added
afraid
after
My problem is that, although the text file seems to be read correctly, the first line cannot be found later in my result list. In this case that line is the letter "a", since it comes first in the file.
To reproduce the problem, run this sample code with the text file above (don't forget to update the file path). The console output I get is noted in the comments.
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    public class MyWrongBehaviour {

        public static void main(String[] args) {
            MyWrongBehaviour wrong = new MyWrongBehaviour();
            List<String> list = wrong.loadLanguageFile();
            System.out.println("size of the list: " + list.size()); // prints 1000, which is the correct size
            for (String s : list) {
                System.out.println(s); // "a" appears, so it is somehow included
            }
            if (list.contains("a")) {
                System.out.println("found \"a\""); // never printed, "a" is not found
            }
            for (String s : list) {
                if (s.equals("a")) {
                    System.out.println("found \"a\""); // never printed, "a" is not found
                }
            }
        }

        private List<String> loadLanguageFile() {
            List<String> result = null;
            try (InputStream vocIn = getClass().getResourceAsStream(
                    "/test/basicVocabulary.txt")) {
                if (vocIn == null) {
                    throw new IllegalStateException(
                            "InputStream for the basic vocabulary must not be null");
                }
                BufferedReader in = new BufferedReader(new InputStreamReader(vocIn,
                        "UTF-8"));
                String zeile = null;
                result = new ArrayList<>();
                while ((zeile = in.readLine()) != null) {
                    result.add(zeile.trim());
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            return result;
        }
    }
Does anyone have an idea why this is happening and what I can do to fix it? My guess is that there is either a charset error, although I saved the file as UTF-8, or some invisible character that corrupts the first line, but I don't know how to identify it.
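One way to test the invisible-character guess is to print the code points of the first line instead of the line itself. The sketch below is only an illustration under an assumption that is not confirmed for the file above: that the file was saved with a UTF-8 byte order mark (bytes EF BB BF), which Java decodes to U+FEFF and which `trim()` does not remove. The class name `InvisibleCharCheck` and the helpers `describe` and `stripLeadingBom` are made up for this example, and the first line is simulated from raw bytes rather than read from the real file.

```java
import java.nio.charset.StandardCharsets;

public class InvisibleCharCheck {

    // Lists every code point of s as U+XXXX so that invisible characters become visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Removes a single leading U+FEFF (byte order mark) if one is present.
    static String stripLeadingBom(String s) {
        return s.startsWith("\uFEFF") ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // Assumption: simulates the first line of a UTF-8 file saved with a BOM.
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a'};
        String first = new String(raw, StandardCharsets.UTF_8);

        System.out.println(describe(first));                    // U+FEFF U+0061
        System.out.println(first.trim().equals("a"));           // false
        System.out.println(stripLeadingBom(first).equals("a")); // true
    }
}
```

If calling `describe` on the first element of the real list shows something other than `U+0061`, the extra code point is the invisible character. If it turns out to be U+FEFF, stripping it from the first line after reading, or saving the file without a BOM, should make `contains("a")` succeed.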
By the way, I used a HashSet before, but with a Set the first line didn't even get added. Now it gets added to the list, but I can't find it.
Thanks for any answers and thoughts you can share.