I use the code below to let users upload a CSV file on my web page. The CSV file contains the following data:

12345,account,password,ABC,Tom,0
12346,account,password,ABC,Jerry,0
12347,account,password,ABC,Mary,0

doPost.java

protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
{
    ServletFileUpload upload = new ServletFileUpload();
    upload.setHeaderEncoding("UTF-8");
    response.setContentType("text/html");
    response.setCharacterEncoding("UTF-8");
    request.setCharacterEncoding("UTF-8");
    String type = "";
    String mode = "";
    String name = "";
    String remark = "";
    String id = "";
    Enumeration params = request.getParameterNames();
    while (params.hasMoreElements())
    {
        String param = (String) params.nextElement();
        if (param.equals("type")) type = request.getParameter(param);
        if (param.equals("mode")) mode = request.getParameter(param);
        if (param.equals("name")) name = request.getParameter(param);
        if (param.equals("remark")) remark = request.getParameter(param);
        if (param.equals("id")) id = request.getParameter(param);
    }

    FileItemIterator iterator = upload.getItemIterator(request);
    while (iterator.hasNext())
    {
        FileItemStream item = iterator.next();

        if (!item.isFormField())
        {
            InputStream stream = item.openStream();
            //try print stream
            BufferedReader lesen = new BufferedReader(new InputStreamReader(stream));
            String line = lesen.readLine();
            while(line!=null)
            {
                System.out.println("stream: "+line);
                line = lesen.readLine();
            }

            if (type.equals("csv"))
            {
                List<BaseModel> devices = CsvParser.csv2ListBaseModel(stream);
            }
        }
    }
}   

The System.out.println call prints the first line of the CSV file incorrectly:

Stream: 嚜?2345,account,password,ABC,Tom,0
Stream: 12346,account,password,ABC,Jerry,0
Stream: 12347,account,password,ABC,Mary,0

CsvParser.csv2ListBaseModel(stream) also returns incorrect content.

import dk.lindhardt.gwt.geie.server.CSV2TableLayout;
import dk.lindhardt.gwt.geie.shared.Cell;
import dk.lindhardt.gwt.geie.shared.TableLayout;
public class CsvParser
{
    public static List<BaseModel> csv2ListBaseModel(InputStream stream)
    {
        CSV2TableLayout csv2TableLayout = new CSV2TableLayout(stream);
        TableLayout tableLayout = csv2TableLayout.build();
        List<Cell> cells = tableLayout.getCells();
        List<BaseModel> devices = new ArrayList<BaseModel>();
        BaseModel device = null;

        for (int row = 0; row < tableLayout.rows(); row++)
        {
            device = new BaseModel();
            for (int column = 0; column < tableLayout.columns(); column++)
            {
                String value = null;
                try
                {
                    value = (String) tableLayout.getCell(row, column).getValue();
                } catch (NullPointerException npe)
                {
                    //
                }
                device.set(column + "", value);
            }

            devices.add(device);
        }
        return devices;
    }
}

Finally, when I store the devices in the database, the first value (12345) in the first line becomes ?12345. The CSV file is UTF-8 encoded. Any suggestions are appreciated. Thanks.

user3616668

1 Answer

This hidden character is the byte order mark (BOM), U+FEFF. It is used to signal the byte order of a Unicode file; in a UTF-8 file it appears as the three bytes EF BB BF at the very start.

Anyway, you can remove it from your string, for example:

yourString = yourString.replace("\uFEFF", "");
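Note that the replace call only finds the BOM if the bytes were decoded as UTF-8 in the first place. The garbled output in the question (嚜?2345) suggests the stream was decoded with the platform default charset, because new InputStreamReader(stream) is called without a charset. A minimal sketch of reading the upload as UTF-8 and dropping a leading BOM follows; BomUtil, stripBom and readLines are hypothetical names that do not appear in the question's code.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the question's code.
final class BomUtil {
    private static final String BOM = "\uFEFF";

    /** Removes one leading BOM character (U+FEFF) if the string starts with it. */
    static String stripBom(String s) {
        return (s != null && s.startsWith(BOM)) ? s.substring(1) : s;
    }

    /** Decodes the stream as UTF-8 and returns its lines, without a BOM on the first line. */
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try {
            // Pass the charset explicitly; the platform default may not be UTF-8.
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                // Only the very first line can carry the BOM.
                lines.add(lines.isEmpty() ? stripBom(line) : line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample data modelled on the question's CSV, with a BOM prepended.
        byte[] csv = ("\uFEFF12345,account,password,ABC,Tom,0\n"
                + "12346,account,password,ABC,Jerry,0\n").getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(csv))) {
            System.out.println("stream: " + line);
        }
    }
}
```

Stripping only a leading U+FEFF, rather than replacing every occurrence, leaves any such character inside the data untouched.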

If you search SO or Google for how to remove a BOM, you will also find plenty of resources:

http://www.javapractices.com/topic/TopicAction.do?Id=257

Reading UTF-8 - BOM marker
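Since CsvParser.csv2ListBaseModel in the question takes an InputStream rather than a String, the BOM can also be skipped at the byte level before the stream is handed to the parser. The following is only a sketch under that assumption; BomSkippingStreams, skipUtf8Bom and toUtf8String are hypothetical names, and how CSV2TableLayout decodes the bytes it receives is not known from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the question's code.
final class BomSkippingStreams {

    /** Returns a stream positioned after a UTF-8 BOM (EF BB BF), if one is present. */
    static InputStream skipUtf8Bom(InputStream in) {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        try {
            byte[] head = new byte[3];
            int n = 0;
            // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
            while (n < 3) {
                int r = pushback.read(head, n, 3 - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
            boolean isBom = n == 3
                    && (head[0] & 0xFF) == 0xEF
                    && (head[1] & 0xFF) == 0xBB
                    && (head[2] & 0xFF) == 0xBF;
            if (!isBom && n > 0) {
                // Not a BOM: put the bytes back so the caller sees the full content.
                pushback.unread(head, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pushback;
    }

    /** Reads the remaining bytes and decodes them as UTF-8 (used here for checking only). */
    static String toUtf8String(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            byte[] chunk = new byte[1024];
            int r;
            while ((r = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, r);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In the question's doPost this would be used roughly as CsvParser.csv2ListBaseModel(BomSkippingStreams.skipUtf8Bom(item.openStream())). Also note that the debugging loop in doPost reads the stream to its end, so a parser called afterwards on the same stream has nothing left to read; the stream needs to be opened once and passed straight to the parser, or buffered first.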

Wajdy Essam