
I'm having trouble uploading a file and parsing it as UTF-8 strings. I use the following code:

protected void doPost(HttpServletRequest request, HttpServletResponse response) 
        throws ServletException, IOException {
    Part filePart = request.getPart("file");
    InputStream filecontent = filePart.getInputStream();
    // ...
}

And my webpage looks like this:

<%@ page language="java" contentType="text/html; charset=UTF-8"
         pageEncoding="UTF-8"%>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  </head>
  <body>
    <form action="UploadServlet" method="post" enctype="multipart/form-data">
      <input type="file" name="file" />
      <input type="submit" />
    </form>
  </body>
</html>

I found a great post about UTF-8 encoding in Java web apps, but unfortunately it didn't work for me. I still see random symbols in the strings in the NetBeans debugger. When I display them on a webpage, most of them are displayed correctly, but some Cyrillic letters (я, с, Н, А) get replaced by '�?'.

prazuber

2 Answers


Uploading a file through an HTML form does not apply any character encoding to the file content. The file is transferred byte for byte, as is. See here under "multipart/form-data".

So if the original file on the client side is a text file encoded as UTF-8, then the bytes that arrive on the server side are also UTF-8.

Then you can use an InputStreamReader to decode the bytes as UTF-8 text:

InputStreamReader reader = new InputStreamReader(filecontent, "UTF-8");

That's it.
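For illustration, here is a minimal sketch of reading the whole upload as UTF-8 lines. The class name `Utf8UploadReader` and the method `readLines` are hypothetical, not part of the servlet API; the stream passed in would be the `filecontent` from the question. It assumes the uploaded file really is UTF-8. The reader wraps the entire stream, so a multi-byte character is never decoded in two separate pieces.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decodes an uploaded stream as UTF-8 text, line by line.
final class Utf8UploadReader {

    // Assumes the bytes in the stream are UTF-8 encoded text.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        // Name the charset explicitly; without it the platform default is used.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

In `doPost` this could be called as `Utf8UploadReader.readLines(filePart.getInputStream())`. If garbled characters still show up only in the browser, the response encoding may also need to be set to UTF-8 before writing, which is a separate step from decoding the upload.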

vanje

javax.servlet.http.Part, which you use in the very first line of your code, has a getContentType() method that tells you the content type of the uploaded part. Nothing you have written so far constrains the uploaded data to any particular character set, so you need to determine the character set and handle it accordingly.
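As a rough sketch of that idea, the helper below pulls a `charset` parameter out of a content-type string such as the one returned by `getContentType()`. The class `ContentTypeCharset` and the method `charsetOf` are hypothetical names for illustration. Browsers often send no `charset` parameter for file parts, so the caller has to supply a fallback, and that fallback (UTF-8 here) is an assumption about the file, not something the request guarantees.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: extracts the charset parameter from a Content-Type value.
final class ContentTypeCharset {

    // Returns the declared charset, or the fallback if none is declared or usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Drop optional quotes around the charset name.
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

A call might look like `ContentTypeCharset.charsetOf(filePart.getContentType(), StandardCharsets.UTF_8)`, with the result passed to the `InputStreamReader` constructor.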

dcsohl