I have a servlet that writes uploaded files to disk using Apache Commons FileUpload. That's all working fine in general.

It runs on both Windows and Linux servers using Tomcat. On Windows it handles files with non-ASCII file names correctly and the files are saved properly.

On Linux (CentOS 6), however, file names containing non-ASCII characters are not saved correctly.

I have tried three different versions of the code that writes the file. On Windows all of them work; on Linux none do, but they fail in different ways.

Version 1:

String fileName = URLDecoder.decode(encFilename, "UTF-8");
String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);

item.write(uploadedFile);

Version 2:

String fileName = URLDecoder.decode(encFilename, "UTF-8");
String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);

InputStream input = item.getInputStream();                    
try {
    Files.copy(input, uploadedFile.toPath());
} catch (Exception e) {
    log.error("Error writing file to disk: " + e.getMessage());
} finally {
    input.close();
}

Uploading a file called Это тестовый файл.txt, I get the following results on Linux:

Version 1: A file named: ??? ???????? ????.txt

Version 2: Error writing file to disk: Malformed input or input contains unmappable characters: /tmp/Это тестовый файл.txt

On a Windows machine with Tomcat 7 and Java 7, by contrast, the file name is written correctly as Это тестовый файл.txt

A third version uses the approach from this post and doesn't use FileUpload. The result is the same as what's produced by version 2.

Version 3:

Part filePart = request.getPart("file");
String fileName = "";
for (String cd : filePart.getHeader("content-disposition").split(";")) {
    if (cd.trim().startsWith("filename")) {
        fileName = cd.substring(cd.indexOf('=') + 1).trim().replace("\"", "");
        fileName = fileName.substring(fileName.lastIndexOf('/') + 1).substring(fileName.lastIndexOf('\\') + 1); // MSIE fix.
    }
}

String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);

InputStream input = filePart.getInputStream();                    
try {
    Files.copy(input, uploadedFile.toPath());
} catch (Exception e) {
    log.error("Error writing file to disk: " + e.getMessage());
} finally {
    input.close();
}

Tomcat is running with -Dfile.encoding=UTF-8 and locale shows LANG=en_US.UTF-8

touch "Это тестовый файл.txt" produces a file with that name.
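Since the shell's locale and the locale of the Tomcat process can differ, it may help to log what the JVM itself reports. The following is a minimal sketch, not part of the servlet above: `EncodingReport` is a hypothetical helper name, and `sun.jnu.encoding` is an implementation-specific property of Oracle/OpenJDK builds (not a documented API) that, as far as I know, reflects the charset used for file names and is separate from `file.encoding`.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper; call collect() from the servlet and log the result.
final class EncodingReport {

    /** Collects the settings that can influence how Java strings become file names. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        // Default charset for file *contents* (what -Dfile.encoding sets).
        report.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific: charset for file *names* on Oracle/OpenJDK; may be null elsewhere.
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        report.put("defaultCharset", Charset.defaultCharset().name());
        // Environment as seen by the JVM process; null if the variable is not set.
        report.put("LANG", System.getenv("LANG"));
        report.put("LC_ALL", System.getenv("LC_ALL"));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

If the values logged from inside Tomcat differ from what `locale` prints in an interactive shell, the service is probably being started with a different environment than the one tested with `touch`.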

The file contents are always written correctly (except, of course, where no file is written at all).

What am I missing or doing wrong?

IamNaN

1 Answer

I solved the problem by converting all uses of java.io.File to java.nio.file.Files and java.nio.file.Path, so it seems the java.io.File API is buggy. With java.nio.file it works fine on both Windows and Linux.

// The filename is passed as a URLencoded string
String fileName = URLDecoder.decode(request.getParameter("fileName"), "UTF-8");
Path filePath = Paths.get(uploadFolder, fileName);
Part filePart = request.getPart("file");

// try-with-resources (Java 7+) closes the stream even if the copy fails
try (InputStream input = filePart.getInputStream()) {
    Files.copy(input, filePath);
} catch (Exception e) {
    log.error("Error writing file to disk: " + e.getMessage());
}

I ran into the same problem in several other parts of the app that work with the uploaded files, and in every case replacing java.io.File with the java.nio.file classes solved it.
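For reuse across those other parts of the app, the same steps can be wrapped in a small helper. This is only a sketch under assumptions: `UploadWriter`, `decodeFileName` and `save` are hypothetical names, and stripping directory components from the decoded name is an extra precaution (similar to the "MSIE fix" in version 3), not something the fix itself requires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that uses only java.nio.file for the write.
final class UploadWriter {

    /** Decodes a URL-encoded name and keeps only the part after the last '/' or '\'. */
    static String decodeFileName(String encoded) {
        String name;
        try {
            name = URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should never happen.
            throw new IllegalStateException(e);
        }
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        return name.substring(cut + 1);
    }

    /**
     * Copies the stream to uploadFolder/fileName and closes it.
     * Files.copy fails with FileAlreadyExistsException if the target exists.
     */
    static Path save(InputStream input, String uploadFolder, String encodedName)
            throws IOException {
        Path target = Paths.get(uploadFolder, decodeFileName(encodedName));
        try (InputStream in = input) {
            Files.copy(in, target);
        }
        return target;
    }
}
```

In the servlet this would be called as `UploadWriter.save(filePart.getInputStream(), uploadFolder, request.getParameter("fileName"))`, with the error logging kept at the call site.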

IamNaN