I am inserting Hindi characters into my XML, but after parsing they are displayed as "?" instead.

I am parsing the XML in my Java program, and the Hindi characters are being replaced by "?".

Other Hindi characters that come from my database are displayed correctly.

XML Sample:

<?xml version="1.0" encoding="UTF-8"?>
<attributexml version=""><attribute cat_id="127"><key>ATTRIB_Courses_Offered</key><value>व्यापार लेखा में कैरियर, प्रोग्रामिंग , सॉफ्टवेयर</value></attribute><attribute cat_id="127"><key>ATTRIB_Training_for_individuals</key><value>Yes</value></attribute></attributexml>

Here is the code parsing my XML:

        Map<String, CategoryBeanSpecific> keyValuePair = new HashMap<String, CategoryBeanSpecific>();
        Map<String, CategoryBean> attributesMapping = ListingAttributesHelper.getAttributesMapping();
        Map<String, String> visitingMap = new HashMap<String, String>();
        String attribTobeSingle = CafeConfigInitializer.getInstance().getConfig()
                .getAsString("CAFE", "ATTRIBUTES_TO_BE_SINGLE");
        String removeKey = null;
        try {
            if (attrib == null || attrib.length() == 0) {
                return keyValuePair;
            }
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(attrib.getBytes()));
            NodeList attrNodes = doc.getElementsByTagName(Constants.ATTRIBUTE_XML_OBJ);
            for (int i = 0; i < attrNodes.getLength(); i++) {
                int catId = listingCatId;
                if (attrNodes.item(i).hasAttributes()) {
                    try {
                        catId = Integer.parseInt(attrNodes.item(i).getAttributes().getNamedItem("cat_id")
                                .getTextContent());
                    } catch (Exception exception) {
                        logger.error(exception);
                    }
                }
                NodeList children = attrNodes.item(i).getChildNodes();

                List<String> values = new ArrayList<String>();
                for (int childIndex = 0; childIndex < children.getLength(); childIndex++) {
                    values.add(children.item(childIndex).getTextContent());
                }
Aman
  • The ??? usually appears when UTF-8 characters are rendered in a different encoding (ASCII, for example). Where do you see the "???": on a JSP page, in the debug-console output, or both? [Here](http://stackoverflow.com/questions/14772275/utf-8-text-hindi-not-getting-displayed-on-browser-window-or-eclipse-console) is a similar issue. – mad_manny Nov 05 '15 at 12:55
  • Both the debug console and the JSP page. – Aman Nov 05 '15 at 12:56
  • It's most likely some configuration problem. Or do you have the XML in a file? If yes, you could also check the file encoding. – mad_manny Nov 05 '15 at 13:00
  • Have a look at this: http://stackoverflow.com/questions/8854106/java-string-encoding-utf-8 – MrPublic Nov 05 '15 at 13:06
  • Or at this: http://stackoverflow.com/questions/16400136/why-my-dom-parser-cant-read-utf-8 – mad_manny Nov 05 '15 at 13:11
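
The encoding mismatch described in the comments can be reproduced in isolation. The sketch below is not the asker's code: the class `HindiXmlEncodingDemo` and the method `parseValue` are hypothetical names, and it assumes (the question does not confirm this) that the platform default charset is not UTF-8. In that case `attrib.getBytes()`, which uses the platform default, would already turn each Hindi character into `?` before the parser sees the data. Here `US_ASCII` stands in for such a default, and the fix shown is to encode with the charset named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of the code in the question.
public class HindiXmlEncodingDemo {

    // Encodes the XML string with the given charset, parses it, and
    // returns the text of the first <value> element.
    public static String parseValue(String xml, Charset charset) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(charset)));
        return doc.getElementsByTagName("value").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // One Hindi word from the sample XML, written as Unicode escapes so
        // the demo does not depend on the encoding of the source file.
        String hindi = "\u0938\u0949\u092B\u094D\u091F\u0935\u0947\u092F\u0930";
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<attributexml><attribute cat_id=\"127\">"
                + "<key>ATTRIB_Courses_Offered</key>"
                + "<value>" + hindi + "</value>"
                + "</attribute></attributexml>";

        // Console output needs an explicit encoding as well; the terminal
        // itself must also be set to UTF-8 for the text to render.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");

        // US-ASCII stands in for a non-UTF-8 platform default charset:
        // every Hindi character is replaced by '?' before parsing starts.
        out.println("lossy: " + parseValue(xml, StandardCharsets.US_ASCII));

        // The bytes now match the encoding named in the XML declaration.
        out.println("utf-8: " + parseValue(xml, StandardCharsets.UTF_8));
    }
}
```

If this is the cause, changing `attrib.getBytes()` to `attrib.getBytes(StandardCharsets.UTF_8)` in the question's code would be the corresponding change. The JSP page would still need to declare UTF-8 as its own page and response encoding, which is a separate setting from the parsing step.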

0 Answers