
We have an application that outputs the £ (pound) sign correctly when it runs on my Mac but outputs a ? (question mark) when it runs on the test server.

Below is a sample of the code and the generated output.

    LOGGER.debug(" TESTING  file.encoding=" + System.getProperty("file.encoding"));
    LOGGER.debug(" TESTING Charset.defaultCharset=" + Charset.defaultCharset());
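    // try-with-resources closes the reader; needs java.io.IOException instead of FileNotFoundException
    try (InputStreamReader reader = new InputStreamReader(new FileInputStream("/tmp/PrintCharSets.java"))) {
        LOGGER.debug(" TESTING InputStreamReader.getEncoding=" + reader.getEncoding());
    } catch (IOException e) {
        e.printStackTrace();
    }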
    
    String pound = "£";
    
    LOGGER.debug(
            " TESTING - 0 - Test Character  [" + pound + "]");
    

Output On Test Server

    TESTING  file.encoding=UTF-8
    TESTING Charset.defaultCharset=US-ASCII
    TESTING InputStreamReader.getEncoding=ASCII
    TESTING - 0 - Test Character  [?]

Output On My Mac

    TESTING  file.encoding=UTF-8
    TESTING Charset.defaultCharset=UTF-8
    TESTING InputStreamReader.getEncoding=UTF8
    TESTING - 0 - Test Character  [£]

I suppose this is due to the encoding of the string: the output shows that `Charset.defaultCharset()` and the `InputStreamReader` encoding differ between the two machines, even though `file.encoding` is UTF-8 on both. Could you advise how I can get the £ sign to be output correctly on the test server by making changes in the code?

This application will run on different servers, so I can't assume the encoding is consistent.

Anshul Sharma
Pete Long
  • Have a look at https://stackoverflow.com/questions/1006276/what-is-the-default-encoding-of-the-jvm... Apparently it depends on the OS the JVM is running on. So I would recommend explicitly setting the character encoding on the input stream you're reading, as shown in http://tutorials.jenkov.com/java-io/inputstreamreader.html#set-inputstreamreader-character-encoding. (I'm not the author of either of these pages.) – Jordan Simba Aug 05 '20 at 06:45
  • Try using a Unicode escape, i.e. `String pound = "\u00A3";` – Abra Aug 05 '20 at 07:48
  • Jordan, it seems the default encoding configured on each server differs, so setting it will not work, as it may differ from the default. In my scenario, the defaultCharset on the test server is ASCII but on my Mac it is UTF-8. Abra, that works. I am just wondering whether I should replace the £ char in a string with \u00A3. Thank you – Pete Long Aug 05 '20 at 08:38
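
The two suggestions in the comments above can be sketched roughly as follows. This is only an illustration, not the asker's code: the class name `PoundEscapeSketch` and the helper `readUtf8` are hypothetical, and it assumes the data being read really is UTF-8. The escape keeps the source file pure ASCII, so the result does not depend on how the compiler decodes the source, and the explicit charset on the reader means `Charset.defaultCharset()` is never consulted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class PoundEscapeSketch {
    // Unicode escape for the pound sign: the source file itself stays pure ASCII.
    public static final String POUND = "\u00A3";

    // Hypothetical helper: decodes the whole stream as UTF-8,
    // ignoring the platform default charset.
    public static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xC2 0xA3 is the UTF-8 encoding of U+00A3.
        byte[] utf8Bytes = {(byte) 0xC2, (byte) 0xA3};
        String decoded = readUtf8(new ByteArrayInputStream(utf8Bytes));
        System.out.println(decoded.equals(POUND)); // prints "true"
    }
}
```

Note that this only covers the string literal and the input side; whatever writes the log file still has to encode the character with a charset that can represent it.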

1 Answer


The file is being loaded the same way on both machines, as a UTF-8 encoded text file. This means one of two things differs between the machines:

  1. The contents of the file are different
  2. One of the machines is missing a glyph for the character.

The "replacement character" is typically a diamond with a question mark in it. However, many fonts do not include that character. When it is missing, a plain question mark is used as an alternate replacement character.

If you copy the file that shows "?" over to the Mac and it now shows the pound sign, the problem is not the file or the encoding; the problem is the font.
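
A complementary check is to look at the bytes themselves, since a literal `?` can also be produced at encoding time rather than at display time. The sketch below is an illustration under assumptions, not a diagnosis of the asker's logging setup; the class and method names are hypothetical. It shows that Java's `String.getBytes(Charset)` substitutes the byte for `?` when the target charset (here US-ASCII) cannot represent the character, whereas UTF-8 produces the two-byte sequence for £.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementByteSketch {
    // Encodes with US-ASCII; characters it cannot represent become '?' (byte 63).
    public static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Encodes with UTF-8, which can represent every Unicode character.
    public static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign
        System.out.println(Arrays.toString(encodeAscii(pound))); // prints "[63]"
        System.out.println(Arrays.toString(encodeUtf8(pound)));  // prints "[-62, -93]"
    }
}
```

If a hex dump of the log file shows the single byte `3F` where the pound sign should be, the character was lost when the file was written, and no font can bring it back. In that case the fix belongs on the writing side, for example by giving the writer or log appender an explicit UTF-8 charset where the logging framework allows it.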

Edwin Buck
  • Is it possible the output (terminal) does not support / use UTF-8? –  Aug 05 '20 at 06:40
  • Edwin, I copied the log file (the one that displayed the £ as ?) from the test server onto my Mac, and unfortunately it still shows as ? – Pete Long Aug 05 '20 at 08:04