
I tried to use a Java BufferedReader to read an empty txt file.

Here's my code:

    File recentFile = new File(path);
    try {
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(recentFile), "UTF-8"));
        String temp = reader.readLine();
        reader.close();
        if (temp == null) {System.out.println("your file is empty");}
        else {System.out.println(temp);}
    } catch (IOException ex) {}

The txt file is completely empty, but when I run the program, the command prompt prints "?" instead of "your file is empty".

When I change "UTF-8" to "Unicode" and change the txt file's encoding to Unicode, the prompt prints "your file is empty".

Why do I get this result when I use UTF-8?

By the way, if this is a duplicate, please let me know. I searched Google several times but couldn't find anything helpful.

Kcits970
  • `... else if(temp.isEmpty()) { System.out.println("your file is empty");} else ..` ;) [String#isEmpty()](https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#isEmpty--) – xerx593 Mar 03 '20 at 11:10
  • Sorry, I still get the same result. – Kcits970 Mar 03 '20 at 11:12
  • I assume you are on a Windows machine and might have this "nasty-windows-BOM-UTF-8"? – xerx593 Mar 03 '20 at 11:13
  • Are you sure the file is really completely empty? What's its size in bytes? Try using a hex editor to see its content. EDIT: see the previous comment (if `temp` is not `null`, print its content as hexadecimal or as bytes). – user85421 Mar 03 '20 at 11:15
  • https://stackoverflow.com/q/4897876/592355, https://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html, [apache-commons-io:bomInputStream:(2.6)](https://commons.apache.org/proper/commons-io/javadocs/api-2.6/org/apache/commons/io/input/BOMInputStream.html) – xerx593 Mar 03 '20 at 11:18
  • @xerx593 thank you, that answered my question nicely =D – Kcits970 Mar 03 '20 at 11:22
  • welcome, thx for the feedback, glad2help :) – xerx593 Mar 03 '20 at 11:24

1 Answer

The file is not completely empty; that is the only explanation. Most likely there is a byte order mark (BOM) at the start. A BOM is not a visible character (if you open the file in Notepad, it will probably look completely empty), but it still counts as content.

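Note that for a truly empty file, `readLine()` returns null on the very first call; it does not return an empty string first. Your program printed ?, so `readLine()` returned a non-null string. Most likely that string is the single BOM character U+FEFF, which your console cannot display and therefore shows as ?.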
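To make the difference between "empty" and "contains only a BOM" concrete, here is a minimal sketch. It feeds in-memory byte arrays to the same kind of reader as in the question, so no file is needed. The class name `BomDemo` and the helper `firstLine` are made up for this illustration, and the UTF-16 case assumes the file was saved the way Notepad saves "Unicode" (little-endian with a BOM).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomDemo {
    // Hypothetical helper: decodes the bytes with the given charset
    // and returns whatever the first readLine() call returns.
    static String firstLine(byte[] data, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] empty = {};
        byte[] utf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF}; // UTF-8 BOM
        byte[] utf16Bom = {(byte) 0xFF, (byte) 0xFE};             // UTF-16LE BOM

        // Truly empty input: readLine() returns null right away.
        System.out.println(firstLine(empty, StandardCharsets.UTF_8));

        // Java's UTF-8 decoder keeps the BOM, so we get a one-character string.
        String s = firstLine(utf8Bom, StandardCharsets.UTF_8);
        System.out.println(s.length() + " " + Integer.toHexString(s.charAt(0)));

        // The UTF-16 decoder uses the BOM to pick the byte order and drops it.
        System.out.println(firstLine(utf16Bom, StandardCharsets.UTF_16));
    }
}
```

This would also explain why switching to "Unicode" made the problem go away: that decoder consumes the BOM, while the UTF-8 decoder passes it through as a character.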

You can check the actual bytes in the file with a hex editor; a UTF-8 BOM shows up as the three bytes EF BB BF. Alternatively, this snippet of Java code will print them:

    try (var in = new FileInputStream("/path/to/the/file")) {
        // Print every byte of the file as two hex digits.
        for (int c = in.read(); c != -1; c = in.read()) {
            System.out.printf("%02X ", c & 0xFF);
        }
    }
    System.out.println();
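
If the bytes do turn out to be a BOM, one way to deal with it is to skip a leading U+FEFF before reading lines. The following is a sketch under that assumption, not the only way to do it; `BomSkippingReader` and `openUtf8` are hypothetical names, and the `BOMInputStream` class from Apache Commons IO linked in the comments is an alternative.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BomSkippingReader {
    // Hypothetical helper: wraps the stream in a UTF-8 reader and
    // discards the first character if (and only if) it is U+FEFF.
    static BufferedReader openUtf8(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1);                 // remember the start position
        if (reader.read() != '\uFEFF') {
            reader.reset();             // not a BOM (or end of input): rewind
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a file that contains only a UTF-8 BOM.
        byte[] bomOnly = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        try (BufferedReader reader = openUtf8(new ByteArrayInputStream(bomOnly))) {
            String temp = reader.readLine();
            System.out.println(temp == null ? "your file is empty" : temp);
        }
    }
}
```

With a real file, you would pass `new FileInputStream(recentFile)` to `openUtf8` instead of the byte array.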
rzwitserloot