
I want to read the XML content below, and I'm using JAXB to convert the XML to an object. The XML document is UTF-8 encoded and contains some non-ASCII (Korean) characters. Those characters do not come through in my object; I get ??? instead.

XML file data:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CallDetails>
    <APPOINTMENTDATE>29.11.2016</APPOINTMENTDATE>
    <APPOINTMENTTIME>29.11.2016 11:11:00</APPOINTMENTTIME>
    <ASCCODE>83000220</ASCCODE>
    <CALLDESC>작동불</CALLDESC>
    <CALLRECEIVEDBY>김정권</CALLRECEIVEDBY>
    <CALLRECEIVEDMODECODE></CALLRECEIVEDMODECODE>
    <CALLREGBYCAT></CALLREGBYCAT>
    <CALLREGBYCODE></CALLREGBYCODE>
    <CALLREGDATE>29.11.2016</CALLREGDATE>
    <CALLREGTIME>29.11.2016 09:11:00</CALLREGTIME>
    <CALLTYPECODE>SVC</CALLTYPECODE>
    <COVERAGETYPECODE>UW</COVERAGETYPECODE>
    <SPECIALREQUEST></SPECIALREQUEST>
</CallDetails>

I read the file as follows:

// Open the file and decode its bytes explicitly as UTF-8
InputStream inputStream = new FileInputStream(path);
InputStreamReader reader = new InputStreamReader(inputStream, "UTF-8");
JAXBContext context = JAXBContext.newInstance(clazz);
Unmarshaller um = context.createUnmarshaller();
// Hand the character stream to the parser; setEncoding has no effect
// once a character stream (Reader) has been supplied
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
return um.unmarshal(is);

and read the values from the resulting object as follows:

THIRDPARTYSERVICEORDERNO = serviceOrderListDTO.getServiceOrderList().get(0).getThirdPartyServiceOrderNo();
CALLDESC = serviceOrderListDTO.getServiceOrderList().get(0).getCallDetailsList().getCallDesc();
System.out.println("THIRDPARTYSERVICEORDERNO : "+THIRDPARTYSERVICEORDERNO);
System.out.println("CALLDESC: "+CALLDESC);

After running this code, I get the output below:

THIRDPARTYSERVICEORDERNO : AJ16110004904;
CALLDESC: ???;
theoretisch
Rahul Deore
    I've run into situations where the XML declaration contradicts the actual encoding of the file. What editor or program are you using to view the original XML data that shows the characters correctly? That program is probably not relying on the XML declaration and may have detected the proper encoding by looking for a byte order mark. I would suggest loading the file in a tool that shows what the [BOM](https://en.wikipedia.org/wiki/Byte_order_mark) is. Also, take a look at this [answer](http://stackoverflow.com/questions/10933620/display-special-characters-using-system-out-println) – Tung Dec 22 '16 at 11:33
    Thanks Tung, it was an Eclipse issue. The code works after setting the text file encoding. – Rahul Deore Dec 22 '16 at 14:38

1 Answer


I tested your code, and the result it produces is correct: in debug mode, the in-memory values are displayed correctly. When you print those characters to the console you see ??? because, by default, the console cannot display them. Make sure that:

  1. The encoding of the project in your IDE is set to UTF-8.
  2. The font used to display the output contains glyphs for the characters you print (take a look at http://unifoundry.com/unifont.html).
  3. You run your JRE with -Dfile.encoding=UTF-8.
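
To illustrate why a correctly parsed string can still show up as ???, here is a minimal sketch that is not part of the original code. The class name `ConsoleEncodingDemo` and the helper `printWith` are made up for this example. It writes the same string through a `PrintStream` configured with two different charsets and captures the bytes; US-ASCII stands in for a console encoding that cannot represent Korean text.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class ConsoleEncodingDemo {

    // Hypothetical helper: prints the text through a PrintStream that uses
    // the given charset and returns the raw bytes that were written.
    static byte[] printWith(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // The CALLDESC value from the question, written with Unicode escapes
        // so the example does not depend on the source file encoding.
        String callDesc = "\uC791\uB3D9\uBD88";

        // A charset that cannot encode the characters replaces each one with '?'.
        byte[] ascii = printWith(callDesc, "US-ASCII");
        System.out.println("US-ASCII bytes as text: "
                + new String(ascii, StandardCharsets.US_ASCII));

        // UTF-8 can encode them; each of these characters takes three bytes.
        byte[] utf8 = printWith(callDesc, "UTF-8");
        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

If the console itself expects UTF-8, one option is to wrap standard output the same way, for example `new PrintStream(System.out, true, "UTF-8")`, instead of relying on the platform default encoding. Whether the characters then render correctly still depends on the console's encoding setting and font.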
Igor