
Working on a simple translator project in Netbeans using JavaFX. Running it from Netbeans, it compiles and works perfectly. No rendering issues:

[Screenshot: appearance when launched from Netbeans]

But when running the same executable ([project-folder]\dist\Translator.jar):

[Screenshot: mojibake in \dist\Translator.jar]

Mojibake. Same thing for ([project-folder]\dist\run##########\Translator.jar):

[Screenshot: mojibake in \dist\run##########\Translator.jar]

There are four places the text could be misformatted: a list of terms is sent to the translator, which uses a web service to retrieve the translations (1). These are then cached in files (2), and are loaded by a parser (3), which makes data available for display in the JavaFX window (4). I've inspected the files and they're valid UTF-8, and the parser only runs when it's loading an existing file, which a new deployment wouldn't have any of. So I've narrowed it down to the display in the JavaFX window.
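One way to check whether the two launch paths really differ is to log the JVM's encoding defaults at startup and compare the output from Netbeans with the output from the jar. This is only a diagnostic sketch; the class name `EncodingReport` and the method `describeDefaults` are made up for illustration and are not part of the project.

```java
import java.nio.charset.Charset;

public class EncodingReport {

    /** Builds a one-line summary of the encoding defaults of the running JVM. */
    static String describeDefaults() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Run once from the IDE and once from the jar, then compare the two lines.
        System.out.println(describeDefaults());
    }
}
```

If the two runs report different charsets, any conversion that relies on the default is a suspect.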

ndm13
  • The file encodings may be valid UTF-8, but the default encoding used by the JVM may be different from UTF-8, and the encoding in the program may not be properly specified... – fabian Nov 12 '16 at 21:18
  • @fabian How can I force a UTF-8 as the default? I thought the principle of Java was that it would work the same everywhere. – ndm13 Nov 13 '16 at 00:30
  • Better to specify the encoding for whatever you use to read those files/the web service. A short description and/or code sample would be helpful (post the code as a [mcve] in the question itself, not just a link to the code; using `System.out` will probably suffice to demonstrate the issue). – fabian Nov 13 '16 at 08:00
  • @fabian Sorry about the code sample; it was too integrated to really isolate it. Was more interested in seeing if anyone else had encountered the issue than a specific use-case solution (as is the case with most StackExchange answers). Finally found a solution, answered my own question. – ndm13 Nov 15 '16 at 19:47

1 Answer


I'm sorry my question wasn't fantastic, but I've come across many people with similar issues and it was hard to isolate the specific case for mine.

The crux of the issue is that Netbeans launches projects with UTF-8 as the JVM's default encoding (as far as I'm aware, by passing the project encoding to the JVM). A JVM started outside Netbeans, for example by running the jar directly, picks its own default instead. Normally this goes unnoticed, because plain ASCII text looks the same in most encodings. But any non-ASCII text that passes through a conversion relying on the default charset will come out as mojibake on a JVM whose default is not UTF-8. That was the majority of them, because the JVM took its default charset from the host system's locale, which is frequently not UTF-8 (particularly on Windows). Since JDK 18 (JEP 400) the default is UTF-8 regardless of the host.
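The mechanism can be reproduced in a few lines. The sketch below is illustrative only: `MojibakeDemo` and `misdecode` are hypothetical names, and ISO-8859-1 stands in for whatever non-UTF-8 default a given machine happens to use. Unicode escapes are used so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /**
     * Encodes the text as UTF-8 and then decodes those bytes with a
     * different charset, which is what happens when UTF-8 data is read
     * using a non-UTF-8 default.
     */
    static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // e with acute accent, bytes C3 A9 in UTF-8
        String garbled = misdecode(original, StandardCharsets.ISO_8859_1);

        // One character became two: U+00C3 followed by U+00A9.
        System.out.println(garbled.length());
        System.out.println(garbled.equals("\u00c3\u00a9"));
    }
}
```

Each byte of the UTF-8 sequence is turned into its own character, which is the typical two-characters-per-accented-letter pattern seen in mojibake.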

The question "Java compiler platform file encoding problem" helped me to address the issue. Since I can't realistically control the JVM arguments on every system my code will run on, the solution below is what I personally opted for.

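import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/**
 * Repairs a String whose UTF-8 bytes were mistakenly decoded with the
 * system default encoding. Re-encoding with the default charset recovers
 * the original bytes, which are then decoded as UTF-8. This fixes
 * rendering issues where the default encoding would yield mojibake.
 * Should be run against any Strings that will be displayed to the end
 * user directly and that may contain non-ASCII characters. Only apply it
 * to text that was decoded with the default charset: on a JVM whose
 * default is not UTF-8, correctly decoded text would be garbled by it.
 *
 * @param string    the String to be re-decoded
 * @return          the String re-decoded as UTF-8
 */
public static String convertToUTF8(String string){
    // Undo the wrong decoding, then decode the recovered bytes as UTF-8.
    return new String(string.getBytes(Charset.defaultCharset()), StandardCharsets.UTF_8);
}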

A simple little method that gets the job done, called as necessary.
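For comparison, the approach suggested in the comments is to name the charset at the point where bytes become text, so nothing has to be repaired later. The round trip above can also lose data if the default charset has no mapping for some of the bytes. The sketch below assumes the data arrives as an `InputStream` (for example a web service response or a cache file); my actual code isn't shown here, so `ExplicitCharsetRead` and `readUtf8` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class ExplicitCharsetRead {

    /** Reads the whole stream as UTF-8 text, ignoring the platform default charset. */
    static String readUtf8(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            // Lines are joined with "\n"; a trailing newline is not preserved.
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for bytes received from a service or read from a file.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String text = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(text.equals("caf\u00e9")); // true on any default charset
    }
}
```

The same idea applies when writing: pass `StandardCharsets.UTF_8` to the writer or to `getBytes` instead of relying on the default.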
