public static void main(String[] args) throws IOException
{
    Scanner in = new Scanner(System.in);
    String fileName = in.nextLine();

    Writer out = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream("C:/temp/" + fileName + ".txt"), "UTF-8")); // FileNotFoundException thrown here
    out.close();
}

I'm trying to create a writer that can handle Chinese characters in the file name, so that I can create a file called 你好.txt, for example.

However, I get a FileNotFoundException with the above code. It works perfectly fine for English characters but not for Chinese characters.

I followed the answers to "How to write a UTF-8 file with Java?" to produce the above code, but it doesn't work.

Does anyone know how I can accomplish this?

Stack Trace:

Exception in thread "main" java.io.FileNotFoundException: C:\temp\??.txt (The filename, directory name, or volume label syntax is incorrect)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(Unknown Source)
    at java.io.FileOutputStream.<init>(Unknown Source)
    at java.io.FileOutputStream.<init>(Unknown Source)

Using NIO:

Path path = Paths.get("C:/temp/" + fileName + ".txt"); // InvalidPathException thrown here
Charset charset = Charset.forName("UTF-8");
Path file = Files.createFile(path);
BufferedWriter bufferedWriter = Files.newBufferedWriter(file, charset);
bufferedWriter.close();

Stack:

Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <?> at index 8: C:/temp/?.txt
    at sun.nio.fs.WindowsPathParser.normalize(Unknown Source)
    at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
    at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
    at sun.nio.fs.WindowsPath.parse(Unknown Source)
    at sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)
    at java.nio.file.Paths.get(Unknown Source)
Aequitas
  • I believe this is OS-dependent. OS's control what strings are allowable as file names. What OS are you using? P.S. OS's don't particularly care what bytes are in the file's data; that's up to the apps that read the files. That's why the link you tried to follow won't help you. – ajb Aug 25 '15 at 04:50
  • Can you please provide stackTrace ? – akash Aug 25 '15 at 04:51
  • @TAsk Added stack trace and indicated in sscce which line throws it – Aequitas Aug 25 '15 at 04:54
  • @ajb 你好.txt is a valid file name, if I make it through explorer (windows 8) I have no problems, only if I try and make it through code. – Aequitas Aug 25 '15 at 04:56
  • You should also set the charset for the `scanner`. Change it to `Scanner in = new Scanner(System.in, "UTF-8");` and try. – Codebender Aug 25 '15 at 04:57
  • @Codebender did not work – Aequitas Aug 25 '15 at 04:59
  • @Aequitas, are you still receiving the same exception with `C:\temp\??.txt` – Codebender Aug 25 '15 at 05:00
  • @Codebender yep it's exactly the same stack trace. – Aequitas Aug 25 '15 at 05:01
  • [This question](http://stackoverflow.com/questions/2050973/what-encoding-are-filenames-in-ntfs-stored-as) might be relevant; NTFS itself uses UTF-16, and there are weird specific steps that you have to take on Windows to get certain characters working. The JRE might not be using those calls. – chrylis -cautiouslyoptimistic- Aug 25 '15 at 05:01
  • Also, please post the `\uXXXX` values for the characters you're trying to use. – chrylis -cautiouslyoptimistic- Aug 25 '15 at 05:02
  • Looking into it, it appears you cannot make this work with `java.io`, but `java.nio` will work. Still don't know all the details. – ajb Aug 25 '15 at 05:02
  • @chrylis https://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_1_of_4 the few I tested here don't work, specifically the first one I tested (一) – Aequitas Aug 25 '15 at 05:05
  • @ajb I have a very strong suspicion that NTFS requires explicit encoding steps for high-code characters, and that those steps were implemented in NIO but not (for backwards compatibility) retrofitted. – chrylis -cautiouslyoptimistic- Aug 25 '15 at 05:07
  • @chrylis one of the answers to [this question](http://stackoverflow.com/questions/14171565/java-read-write-unicode-utf-8-filenames-not-contents) seems to support your suspicion. – ajb Aug 25 '15 at 05:11
  • @chrylis I tried using NIO and no luck, will update question with what I tried. – Aequitas Aug 25 '15 at 05:14

1 Answer

I found that this problem is caused by the character encoding of the Eclipse console, not by Java itself.

I ran the same code with the console encoding in the Eclipse Run Configuration (Common tab, Encoding) set to ISO-8859-1.

After running the program, I got the following output in my console:

Exception in thread "main" java.io.FileNotFoundException: C:\temp\??.txt (The filename, directory name, or volume label syntax is incorrect)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:95)
    at Test.main(Test.java:21)

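Conclusion: with ISO-8859-1 set as the encoding in the run configuration, the Scanner cannot read those characters properly from the console, because the console encoding cannot represent them. Each character arrives as ?, so the file name becomes ??.txt, and since ? is not allowed in a Windows file name, opening the file fails.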
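
The substitution can be reproduced without any console. The following is a minimal sketch, assuming the console behaves like a plain encode step into its configured charset; the class name `ConsoleEncodingDemo` and the method `roundTrip` are made up for illustration and are not part of the question's code. It prints booleans rather than the strings themselves, so the result does not depend on the encoding of the console it runs in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleEncodingDemo {

    // Encodes the text into the given charset and decodes it again,
    // mimicking input that passes through a console using that charset.
    // (Assumption: the console is modelled as a simple encode/decode step.)
    static String roundTrip(String text, Charset consoleCharset) {
        // getBytes replaces characters the charset cannot represent
        // with the charset's replacement byte, which is '?' for ISO-8859-1.
        byte[] bytes = text.getBytes(consoleCharset);
        return new String(bytes, consoleCharset);
    }

    public static void main(String[] args) {
        // The two characters of the file name from the question, written as
        // escapes so the source file encoding does not matter.
        String name = "\u4f60\u597d";

        String viaLatin1 = roundTrip(name, StandardCharsets.ISO_8859_1);
        String viaUtf8 = roundTrip(name, StandardCharsets.UTF_8);

        System.out.println(viaLatin1.equals("??")); // prints true
        System.out.println(viaUtf8.equals(name));   // prints true
    }
}
```

With ISO-8859-1 the two characters are lost before the program ever builds the path, while UTF-8 carries them through unchanged.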

Please change the character encoding of your console. I assume you are using an IDE; you may have changed the console encoding yourself, or the console may have inherited an encoding that cannot represent those characters.
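
To check whether the characters were already damaged before they reached the file API, it can help to print their code points instead of the characters themselves. This is only a diagnostic sketch; the class name `FileNameDiagnostics` and the method `codePoints` are hypothetical helpers, and in a real run you would pass the string read from the Scanner rather than the hard-coded samples used here.

```java
import java.nio.charset.Charset;
import java.util.StringJoiner;

public class FileNameDiagnostics {

    // Formats each code point of the text as U+XXXX so it can be inspected
    // even when the console cannot display the characters.
    static String codePoints(String text) {
        StringJoiner joiner = new StringJoiner(" ");
        text.codePoints().forEach(cp -> joiner.add(String.format("U+%04X", cp)));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // The JVM's default charset; the value depends on the machine and
        // launch settings, so no particular output is assumed here.
        System.out.println("default charset: " + Charset.defaultCharset());

        // An intact name versus one that was already replaced by '?'.
        System.out.println(codePoints("\u4f60\u597d")); // prints U+4F60 U+597D
        System.out.println(codePoints("??"));           // prints U+003F U+003F
    }
}
```

If the input shows up as U+003F, the characters were lost on the way into the program, and no charset passed to the writer can recover them.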

akash
  • Oh I misunderstood what you meant, Cp1252 is what I had originally, and yep changing it to UTF-8 fixes it – Aequitas Aug 25 '15 at 05:27
  • Then what's the point of using a char set for encoding in the outputstreamwriter? Is there a way to do this without changing the encoding of eclipse? – Aequitas Aug 25 '15 at 23:09
  • @TAsk Okay, what if I have some other method of inputting characters into the program? The example in the question was just a simplified one for the purpose of an SSCCE; in reality I have a save dialog where the user can type in their file name, as well as a load dialog that can load a file by its name. – Aequitas Aug 26 '15 at 06:52