-1

I'm creating a webapp which involves displaying financial data to the user. Being from the UK and using GBP £ for currency, this character is used a lot.

However, every now and then the £ is shown as a diamond with a question mark in the middle, and the web page throws an "invalid character" error mentioning UTF-8 byte 1 of a 1-byte string.

Is there a UTF safe way to display the £ sign? Here is an example of what I am doing at the moment:

 "Rent Per Annum: £" + '${tenant.currentRent}'
BalusC
Stephen Leake
    UTF-8 is an encoding. It's entirely capable of encoding the £ sign. It's incapable of "having a hissy". If something isn't working, it's almost certainly because you've done something wrong - but it's hard to tell what based on so little information. – Jon Skeet Sep 09 '12 at 11:46
  • It's not "UTF" that's throwing the hissy fit, it means that your actual text encoding is not actually UTF-8 as you say it is. – deceze Sep 09 '12 at 11:47
  • The pound sign has the value `U+00A3` which means that it's probably one of the few characters in the UK where you need to get the encoding correct to display it. You're very likely outputting the UTF-8 encoding but the web page has encoding set to ISO-8859-1 or the other way around. – Joachim Isaksson Sep 09 '12 at 11:48
  • 1
    I would recommend reading ["The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"](http://www.joelonsoftware.com/articles/Unicode.html) by Joel Spolsky – Uwe Keim Sep 09 '12 at 11:57
  • Okay, fair enough, I created the project through Spring Roo and Cloud Foundry in STS, and all the pages say they are UTF-8, I just wondered if it was like with the & in web links where I have to write more than just the £ sign – Stephen Leake Sep 09 '12 at 12:24
  • 2
    In the future, try asking the question the smart and neutral way instead of making assumptions which would possibly only work against you because they make actually no utter sense (which most likely explains all those downvotes you received). I have edited your question to neutralize it. – BalusC Sep 09 '12 at 14:55

2 Answers

5

This problem can have one or more of the following causes:

  1. The editor (Eclipse, NetBeans, Notepad, etc.) did not save the JSP file using UTF-8 encoding.

  2. The server didn't use UTF-8 to encode the characters produced by the JSP to a byte stream before sending it over the network.

  3. The browser didn't use UTF-8 to decode the byte stream from the network to characters.
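Causes 2 and 3 both come down to one side writing bytes in one encoding and the other side reading them in another. The following is only a minimal sketch of that mismatch in plain Java, outside any JSP or servlet container; the class and method names are made up for illustration, and the `\u00A3` escape is used so the example does not depend on how this source file itself is saved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PoundEncodingDemo {

    // Encode text with one charset, then decode the resulting bytes with
    // another, mimicking a writer and a reader that disagree on the encoding.
    static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    // Render each code point as U+XXXX so the result does not depend on
    // the console's own encoding.
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign, written as an ASCII-only escape

        // Written as ISO-8859-1 (single byte 0xA3), read as UTF-8: the lone
        // byte is not valid UTF-8, so it becomes the replacement character.
        System.out.println(describe(
                roundTrip(pound, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)));

        // Written as UTF-8 (bytes 0xC2 0xA3), read as ISO-8859-1: each byte
        // is shown as its own character.
        System.out.println(describe(
                roundTrip(pound, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));

        // Same charset on both sides: the pound sign survives.
        System.out.println(describe(
                roundTrip(pound, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));
    }
}
```

The first case yields U+FFFD, which browsers typically draw as a diamond with a question mark; the second yields U+00C2 followed by U+00A3, i.e. the familiar "Â£".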

Those problems can be solved as follows:

  1. Configure the editor to save JSP files using UTF-8. I'm not familiar with STS, but I know that it is Eclipse based, so it'll probably be the same as in standard Eclipse. Go to Window > Preferences > General > Workspace > Text File Encoding and then pick the right encoding in the dropdown.

    An alternative is to use the HTML entity &pound; (as suggested in the other answer); this way the encoding in which the JSP file is saved no longer matters. All characters in the string &pound; are already supported by basic ASCII (practically every character encoding in common use "extends" ASCII, so it'll always work), and the HTML interpreter (the web browser) will translate the entity into the right character.

  2. The server has to be instructed to use UTF-8 to encode the JSP output. This can be done on a per-JSP basis by

    <%@page pageEncoding="UTF-8" %>
    

    or on an application-wide basis by

    <jsp-config>
        <jsp-property-group>
            <url-pattern>*.jsp</url-pattern>
            <page-encoding>UTF-8</page-encoding>
        </jsp-property-group>
    </jsp-config>
    
  3. The browser has to be instructed to use UTF-8 to decode the HTTP response. This is solved by setting the charset attribute of the HTTP response Content-Type header to UTF-8, which is already done implicitly by the solution to cause #2.

BalusC
  • 1
    Thanks so much for the help with that (and also sorting my question out, I was just frustrated and sarcastic at the time) – Stephen Leake Sep 09 '12 at 19:17
0

A portable way of writing this in HTML is the named entity &pound; or, in the general case, a numeric character reference: &#163; (decimal) or &#xA3; (hexadecimal), both of which render as £. This way, your source is plain 7-bit ASCII, so basically independent of encoding (ignoring esoterica like EBCDIC). See also http://www.fileformat.info/info/unicode/char/a3/index.htm
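
If the text is produced in Java code rather than typed into the page, the same idea can be applied programmatically. This is just a rough sketch with a hypothetical helper (`escapeNonAscii` is not part of any library): it replaces every non-ASCII code point with a decimal numeric character reference so the output is pure ASCII. It deliberately does not escape `&`, `<` or `>`, so it is not a substitute for proper HTML escaping.

```java
public class HtmlEntityEscaper {

    // Replace each code point outside 7-bit ASCII with &#NNN; and copy
    // ASCII characters through unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u00A3" is the pound sign, written as an ASCII-only escape.
        System.out.println(escapeNonAscii("Rent Per Annum: \u00A3"));
    }
}
```

For the pound sign (U+00A3, decimal 163) this produces `&#163;`, matching the numeric form mentioned above.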

tripleee