I'm having an encoding issue with JavaFX's WebView. When loading a UTF-8 encoded file, special characters are displayed incorrectly (e.g. â€™ is displayed instead of ’). Here's an SSCCE:
WebViewTest.java
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

public class WebViewTest extends Application {
    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().load(getClass().getResource("/test.html").toExternalForm());
        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }
}
test.html
<!DOCTYPE html>
<html>
<body>
<p>RIGHT SINGLE QUOTATION MARK: ’</p>
</body>
</html>
Output of file -bi test.html
src:$ file -bi test.html
text/plain; charset=utf-8
The same thing happens on Windows with Java 17 and the latest JavaFX (I used Linux and Java 8 for this demonstration).
I've tried:
Declaring the charset in the HTML with <meta charset="UTF-8"> (works, but I'm making an editor program, so I don't have control over the HTML)
Using the JVM argument -Dfile.encoding=UTF-8 (doesn't work)
Setting the charset using reflection (doesn't work, and throws an exception in newer Java versions):
System.setProperty("file.encoding", "UTF-8");
Field charset = Charset.class.getDeclaredField("defaultCharset");
charset.setAccessible(true);
charset.set(null, null);
Declaring the charset after the page loads using the DOM API (doesn't work):
webView.getEngine().getLoadWorker().stateProperty().addListener((o, oldState, newState) -> {
    if (newState == Worker.State.SUCCEEDED) {
        Document document = webView.getEngine().getDocument();
        Element meta = document.createElement("meta");
        meta.setAttribute("charset", "UTF-8");
        document.getElementsByTagName("html").item(0).appendChild(meta);
    }
});
Using WebEngine.loadContent(String) instead of load(String) (wouldn't work; relative links would be broken)
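One possible way around the broken relative links, untested here, is to decode the file as UTF-8 in Java and add a <base href> element before handing the markup to loadContent(String). The helper below is only a sketch: BaseHrefInjector and withBase are hypothetical names, the regex-based tag search is deliberately naive, and it assumes WebView resolves relative URLs against <base href>.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BaseHrefInjector {
    // Opening <head> / <html> tags, with or without attributes, any letter case.
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);
    private static final Pattern HTML_OPEN =
            Pattern.compile("<html(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    // Returns the markup with a <base href> element added (hypothetical helper).
    // baseUrl is assumed to be the URL of the directory containing the file.
    public static String withBase(String html, String baseUrl) {
        String baseTag = "<base href=\"" + baseUrl + "\">";

        // Preferred: put the tag right after an existing <head ...>.
        Matcher head = HEAD_OPEN.matcher(html);
        if (head.find()) {
            return html.substring(0, head.end()) + baseTag + html.substring(head.end());
        }

        // No <head>: create one right after <html ...>.
        Matcher root = HTML_OPEN.matcher(html);
        if (root.find()) {
            return html.substring(0, root.end())
                    + "<head>" + baseTag + "</head>"
                    + html.substring(root.end());
        }

        // Fragment without <html>: prepend the tag.
        return baseTag + html;
    }

    public static void main(String[] args) {
        String html = "<!DOCTYPE html>\n<html>\n<body>\n<p>text</p>\n</body>\n</html>";
        System.out.println(withBase(html, "file:///tmp/"));
    }
}
```

Under the same assumptions, the result could be passed on with webView.getEngine().loadContent(BaseHrefInjector.withBase(html, baseUrl)), where html is the file's text decoded as UTF-8 and baseUrl is the URL of the file's directory.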
It appears that WebView ignores file encodings, and uses ISO-8859-1 unless a charset is specified in the HTML.
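The symptom is consistent with the file's UTF-8 bytes being decoded with a single-byte charset. As a minimal sketch (MojibakeDemo and its methods are names made up for illustration, and windows-1252 is included on the assumption that WebView, like browsers, treats the ISO-8859-1 label as windows-1252), decoding the three UTF-8 bytes of ’ this way yields three separate characters:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes text as UTF-8, then decodes those bytes with a different charset,
    // imitating a reader that ignores the file's real encoding.
    public static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    // Formats each code point as U+XXXX so the result does not depend on
    // the console's own encoding.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String quote = "\u2019"; // RIGHT SINGLE QUOTATION MARK, UTF-8 bytes E2 80 99

        // Strict ISO-8859-1: each byte maps to the code point of the same value.
        System.out.println(codePoints(misdecode(quote, StandardCharsets.ISO_8859_1)));
        // prints: U+00E2 U+0080 U+0099

        // windows-1252: 0x80 and 0x99 map to the euro and trademark signs.
        System.out.println(codePoints(misdecode(quote, Charset.forName("windows-1252"))));
        // prints: U+00E2 U+20AC U+2122
    }
}
```

The second result is the three-character sequence â€™ shown in the rendered page.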