I have a servlet that writes uploaded files to disk, using Apache Commons FileUpload. In general, that works fine.
It runs on both Windows and Linux servers using Tomcat. On Windows it handles files with non-ASCII file names correctly and the files are saved properly.
On Linux (CentOS 6), however, file names containing non-ASCII characters are not saved correctly.
I have tried three different ways of writing the file. On Windows all of them work; on Linux none do, but they fail in different ways.
Version 1:
String fileName = URLDecoder.decode(encFilename, "UTF-8");
String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);
item.write(uploadedFile);
Version 2:
String fileName = URLDecoder.decode(encFilename, "UTF-8");
String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);
InputStream input = item.getInputStream();
try {
    Files.copy(input, uploadedFile.toPath());
} catch (Exception e) {
    log.error("Error writing file to disk: " + e.getMessage());
} finally {
    input.close();
}
When I upload a file called Это тестовый файл.txt, I get the following results on Linux:
Version 1: A file named: ??? ???????? ????.txt
Version 2: Error writing file to disk: Malformed input or input contains unmappable characters: /tmp/Это тестовый файл.txt
On a Windows machine with Tomcat 7 and Java 7, by contrast, the file name is written correctly as Это тестовый файл.txt
A third version uses the approach from this post and doesn't use FileUpload. The result is the same as what's produced by version 2.
Version 3:
Part filePart = request.getPart("file");
String fileName = "";
for (String cd : filePart.getHeader("content-disposition").split(";")) {
    if (cd.trim().startsWith("filename")) {
        fileName = cd.substring(cd.indexOf('=') + 1).trim().replace("\"", "");
        // MSIE fix: strip any client-side path, whichever separator comes last.
        fileName = fileName.substring(Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\')) + 1);
    }
}
String filePath = uploadFolder + File.separator + fileName;
File uploadedFile = new File(filePath);
InputStream input = filePart.getInputStream();
try {
    Files.copy(input, uploadedFile.toPath());
} catch (Exception e) {
    log.error("Error writing file to disk: " + e.getMessage());
} finally {
    input.close();
}
Tomcat is running with -Dfile.encoding=UTF-8, and locale shows LANG=en_US.UTF-8.
Running touch "Это тестовый файл.txt" produces a file with that name.
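Since the locale of my interactive shell is not necessarily the environment the Tomcat process runs in, it may be worth checking what the JVM itself sees. The following is only a minimal sketch of such a check: the class EncodingDiagnostics and its methods are hypothetical names, and sun.jnu.encoding is an implementation-specific property of Oracle/OpenJDK JVMs (an assumption about the runtime, not a documented API) that is commonly reported to control how file names are encoded.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper; not part of the servlet above.
public class EncodingDiagnostics {

    // Collects the encoding-related settings visible to this JVM.
    // Values may be null if a property or variable is not set.
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<String, String>();
        info.put("file.encoding", System.getProperty("file.encoding"));
        // Assumption: implementation-specific property on Oracle/OpenJDK JVMs.
        info.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("LANG", System.getenv("LANG"));
        info.put("LC_ALL", System.getenv("LC_ALL"));
        return info;
    }

    // Returns true if every character of fileName can be encoded in the named charset.
    static boolean canEncode(String fileName, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(fileName);
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        // Cyrillic sample name written with Unicode escapes,
        // so the encoding of this source file does not matter.
        String name = "\u042d\u0442\u043e.txt";
        System.out.println("UTF-8 can encode name: " + canEncode(name, "UTF-8"));
        System.out.println("US-ASCII can encode name: " + canEncode(name, "US-ASCII"));
    }
}
```

If the same collect() call, logged from inside the webapp, were to report an ASCII-like charset (for example ANSI_X3.4-1968) for sun.jnu.encoding or no LANG at all, that would suggest Tomcat is not being started with the locale my shell shows.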
The file contents are always written correctly (except, of course, where no file is written at all).
What am I missing or doing wrong?