I'm running into an issue where certain values in my .properties file do not render correctly in my UI, yet most do. Instead of letters with diacritics, I see HTML entity names. I'll explain what I've done so far:

At first I could not get any letters with diacritics to render correctly. Luckily I found this post, and I was able to make progress by using escaped Unicode in my .properties file.

(On a side note, while escaped Unicode mostly solved the issue, it made the .properties file difficult to read. Luckily, IDEA gives you an option to store escaped Unicode and still view the file with human-readable characters. Read more here.)
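To illustrate what "escaped Unicode" means here, below is a minimal sketch that turns every non-ASCII character of a value into the backslash-u form that .properties files accept. The class and method names (`PropertiesEscaper`, `escape`) are made up for this example; the JDK's `native2ascii` tool and IDEA's transparent conversion do the same job for a whole file.

```java
final class PropertiesEscaper {
    private PropertiesEscaper() {}

    /**
     * Hypothetical helper: replaces each UTF-16 code unit above 127 with a
     * backslash-u escape (four lowercase hex digits), leaving ASCII untouched.
     */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 127) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escaped output is pure ASCII, so it survives any file encoding.
        System.out.println(escape("gar\u00e7on"));
    }
}
```

With this, the value `garçon` would be stored in the file as `gar\u00e7on`.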

Now here's my current issue: In certain parts of my app, letters with diacritics appear as Latin-1 HTML entity names. For example, instead of 'ç', I see "& amp;ccedil;" (I added an extra space in between the & and amp, otherwise it renders as an ampersand). At first, I had no idea what that even meant, but after looking at this table, I know it's an ISO-8859-1 entity name.

Here's what I've tried so far, although nothing has successfully given me the chars with diacritics.

  1. Although I'm still using Glassfish 2, I found this post, and tried adding the following to my web.xml

    <jsp-config>
        <jsp-property-group>
            <url-pattern>*.jsp</url-pattern>
            <page-encoding>UTF-8</page-encoding>
        </jsp-property-group>
    </jsp-config>
    

    Now, when I check the response headers in Chrome dev tools, I can see the following:

     Content-Type:text/html;charset=UTF-8
    

    However, I still see the aforementioned HTML entity names in my UI.

  2. I tried explicitly setting the charset within the JSP itself by adding the following to the JSP where the values are pulled from the .properties file:

    <%@page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" %>
    
  3. While trying to fix this, I've read that ISO 8859-1 is the default encoding for properties files, so I tried changing the encoding within IDEA (I'm using 11, btw). You can do so by going to Settings > File Encodings. At the bottom is an option entitled "Default encoding for properties file", and I changed it to UTF-8. However, I still see the HTML entity names.
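One thing worth noting about step 3: the IDE setting only controls how the file is saved. `java.util.Properties.load(InputStream)` still decodes the bytes as ISO 8859-1, so a UTF-8 file only helps if the code that reads it decodes UTF-8 as well. A minimal sketch, assuming the properties are loaded by your own code (the class name `Utf8PropertiesLoader` is hypothetical, and this does not cover bundles loaded through `ResourceBundle`):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Properties;

final class Utf8PropertiesLoader {
    private Utf8PropertiesLoader() {}

    /** Hypothetical helper: reads a .properties stream as UTF-8 instead of ISO 8859-1. */
    static Properties load(InputStream in) throws IOException {
        // Properties.load(Reader) (Java 6+) uses whatever decoding the Reader applies.
        Reader reader = new InputStreamReader(in, Charset.forName("UTF-8"));
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Simulate a UTF-8 encoded file held in memory.
        byte[] file = "greeting=gar\u00e7on\n".getBytes(Charset.forName("UTF-8"));
        Properties props = load(new ByteArrayInputStream(file));
        System.out.println("gar\u00e7on".equals(props.getProperty("greeting"))); // prints true
    }
}
```

If the file keeps its escaped-Unicode form instead, the default ISO 8859-1 loading works unchanged, since the escapes are plain ASCII.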

I've been trying for a while now, and I'm finally at my wits' end. Any advice?

elefont

1 Answer

I don't really understand how the properties files fit into this, but I would suggest writing a function that walks a string character by character and replaces every character with a char code above 127 with &#charcode;, where charcode is the decimal code. Run all of your non-ASCII text through it before display. That way, even if the page encoding is not set correctly, the browser should still be able to render the characters properly. (For example, 'ç' would become &#231;.)
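A rough sketch of such a function in Java (the class and method names are just placeholders I picked, so adapt them to your code base):

```java
final class NumericEntityEncoder {
    private NumericEntityEncoder() {}

    /** Replaces every code point above 127 with a decimal numeric character reference. */
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 127) {
                // Non-ASCII: emit the reference using the decimal code point.
                out.append("&#").append(codePoint).append(';');
            } else {
                // ASCII passes through unchanged.
                out.append((char) codePoint);
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Fran\u00e7ais")); // prints Fran&#231;ais
    }
}
```

Note that this only handles non-ASCII characters. It does not escape `<`, `>` or `&`, and if the result is later passed through something that HTML-escapes its output, the `&` of each reference will be escaped again.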

For example, instead of 'ç', I see "& amp;ccedil;" (I added an extra space in between the & and amp, otherwise it renders as an ampersand)

What you should do instead is change &amp;ccedil; to &ccedil;. Then it will display the actual character in the browser.
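The leading &amp; suggests the value was already an entity (&ccedil;) and then got HTML-escaped a second time somewhere on the way to the page. Finding and removing that second escaping step is the cleaner fix, but as a stopgap the replacement could look something like this (names are hypothetical, and the regex is only an approximation of what an entity looks like):

```java
import java.util.regex.Pattern;

final class DoubleEscapeFixer {
    // "&amp;" directly followed by an entity body such as "ccedil;" or "#231;".
    private static final Pattern DOUBLE_ESCAPED =
            Pattern.compile("&amp;(#?[A-Za-z0-9]+;)");

    private DoubleEscapeFixer() {}

    /** Turns a double-escaped entity back into a single entity; other text is untouched. */
    static String fix(String html) {
        return DOUBLE_ESCAPED.matcher(html).replaceAll("&$1");
    }

    public static void main(String[] args) {
        System.out.println(fix("gar&amp;ccedil;on")); // prints gar&ccedil;on
        System.out.println(fix("salt &amp; pepper")); // prints salt &amp; pepper
    }
}
```

A genuine ampersand followed by a space is left alone, so ordinary text like "salt & pepper" is not affected.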

developerwjk