I'm having an encoding issue with JavaFX's WebView. When loading a UTF-8 encoded file, special characters are displayed incorrectly (e.g. â€™ is displayed instead of ’). Here's an SSCCE:

WebViewTest.java

import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

public class WebViewTest extends Application {

    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().load(getClass().getResource("/test.html").toExternalForm());

        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }

}

test.html

<!DOCTYPE html>
<html>
  <body>
      <p>RIGHT SINGLE QUOTATION MARK: ’</p>
  </body>
</html>

Output of file -bi test.html

src:$ file -bi test.html
text/plain; charset=utf-8

Result:

[Screenshot: WebView encoding issue]

The same thing happens in Windows using Java 17 and the latest JavaFX (I used Linux and Java 8 for the demonstration).

I've tried:

  • Declaring the charset in the HTML: <meta charset="UTF-8">

    (works, but I'm making an editor program, so I don't have control over the HTML)

  • Using the JVM argument -Dfile.encoding=UTF-8 (doesn't work)

  • Setting the charset using reflection (doesn't work, and throws an exception in newer Java versions):

    System.setProperty("file.encoding","UTF-8");
    Field charset = Charset.class.getDeclaredField("defaultCharset");
    charset.setAccessible(true);
    charset.set(null,null);
    
  • Declaring the charset after the page loads using the DOM API (doesn't work):

    webView.getEngine().getLoadWorker().stateProperty().addListener((o, oldState, newState) -> {
        if(newState == Worker.State.SUCCEEDED) {
            Document document = webView.getEngine().getDocument();
            Element meta = document.createElement("meta");
            meta.setAttribute("charset", "UTF-8");
    
            document.getElementsByTagName("html").item(0).appendChild(meta);
        }
    });
    
  • Using WebEngine.loadContent(String) instead of load(String) (wouldn't work; relative links would be broken)

It appears that WebView ignores file encodings, and uses ISO-8859-1 unless a charset is specified in the HTML.
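For illustration, the garbling is consistent with the file's UTF-8 bytes being decoded with a single-byte Western charset. Browsers commonly treat content labelled ISO-8859-1 as windows-1252, so the sketch below uses that charset; this is an assumption about what WebView does internally, and the class and method names are made up for the example. The right single quotation mark (U+2019) is encoded in UTF-8 as the bytes E2 80 99, and decoding those bytes as windows-1252 yields the three characters â€™.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
public class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes the bytes with another charset.
    public static String misdecode(String text, String wrongCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(wrongCharset));
    }

    public static void main(String[] args) {
        // \u2019 is RIGHT SINGLE QUOTATION MARK, written as an escape so the
        // demo does not depend on the encoding of this source file.
        String garbled = misdecode("\u2019", "windows-1252");
        // Print code points instead of glyphs to avoid console encoding issues.
        garbled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```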

Henry Sanger

3 Answers

While writing the question, I found a hacky solution:

webView.getEngine().getLoadWorker().stateProperty().addListener((o, oldState, newState) -> {
    if(newState == Worker.State.SUCCEEDED) {
        try {
            String newContent = new String(Files.readAllBytes(Paths.get(new URI(getClass().getResource("/test.html").toExternalForm()))), "UTF-8");
            webView.getEngine().executeScript("document.documentElement.innerHTML = '" + newContent.replace("'", "\\'").replace("\n", "\\n") + "'");
        } catch(Exception e) {
            e.printStackTrace();
        }
    }
});

[Screenshot: correctly displayed result]
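One caveat with the snippet above: it only escapes single quotes and line feeds, so a file containing backslashes or carriage returns would produce a broken script. A slightly more defensive escaping helper could look like the sketch below. The class and method names are hypothetical, and it only covers the characters that cannot appear raw inside a single-quoted JavaScript string literal.

```java
// Hypothetical helper, not part of JavaFX.
public final class JsEscape {

    private JsEscape() {
    }

    // Wraps the text in single quotes and escapes characters that would
    // otherwise end or corrupt a JavaScript string literal.
    public static String toJsStringLiteral(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2).append('\'');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\'': sb.append("\\'"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case 0x2028: sb.append("\\u2028"); break; // LINE SEPARATOR
                case 0x2029: sb.append("\\u2029"); break; // PARAGRAPH SEPARATOR
                default: sb.append(c);
            }
        }
        return sb.append('\'').toString();
    }
}
```

With this in place, the script passed to `executeScript` would become `"document.documentElement.innerHTML = " + JsEscape.toJsStringLiteral(newContent)`.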

Henry Sanger

WebView determines the encoding from either the HTML file or the HTTP header, as per the W3C specification.

As you already noted in your question, you can declare the character encoding in the head element within the HTML document and the WebView will pick it up:

<!DOCTYPE html>
<html lang="en"> 
<head>
<meta charset="utf-8"/>
...

But you also note in your question that you don't have control over the input HTML files or whether they include the necessary header for declaring the charset.

You can also have the HTTP protocol specify the encoding of the file using an appropriate header.

 Content-Type: text/html; charset=UTF-8

If you do that, the HTML file content will be correctly UTF-8 decoded by the WebView, even if the input file does not include a charset header.

Here is an example:

import com.sun.net.httpserver.*;
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

import java.io.*;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class WebViewTest extends Application {

    private static final String TEST_HTML = "test.html";

    private HttpServer server;

    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void init() throws Exception {
        server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/", new MyHandler());
        server.setExecutor(null); // creates a default executor
        server.start();
    }

    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().load("http://localhost:8000/" + TEST_HTML);

        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }

    @Override
    public void stop() throws Exception {
        server.stop(0);
    }

    static class MyHandler implements HttpHandler {
        public void handle(HttpExchange httpExchange) {
            try {
                String path = httpExchange.getRequestURI().getPath().substring(1);  // strips leading slash from path, so resource lookup will be relative to this class, not the root.
                String testString = resourceAsString(path);
                System.out.println("testString = " + testString);
                if (testString != null) {
                    httpExchange.getResponseHeaders().put("Content-Type", List.of("text/html; charset=UTF-8"));
                    httpExchange.sendResponseHeaders(200, testString.getBytes(StandardCharsets.UTF_8).length);
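                    // Encode as UTF-8 so the bytes match the declared charset and Content-Length.
                    try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(httpExchange.getResponseBody(), StandardCharsets.UTF_8))) {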
                        writer.write(testString);
                        writer.flush();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
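                } else {
                    System.out.println("Unable to find resource: " + path);
                    // Answer with 404 and no body so the client is not left waiting.
                    httpExchange.sendResponseHeaders(404, -1);
                }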
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private String resourceAsString(String fileName) throws IOException {
            try (InputStream is = WebViewTest.class.getResourceAsStream(fileName)) {
                if (is == null) return null;
                // Decode the resource as UTF-8 rather than the platform default charset.
                try (InputStreamReader isr = new InputStreamReader(is, StandardCharsets.UTF_8);
                     BufferedReader reader = new BufferedReader(isr)) {
                    return reader.lines().collect(Collectors.joining(System.lineSeparator()));
                }
            }
        }
    }
}

For this example to work, place the HTML test file from your question in the same location as your compiled WebViewTest.class, so that it can be loaded from there as a resource.

To run the example as a modular app, add the following to your module-info.java (in addition to your javafx module requirements and any other app requirements you need):

requires jdk.httpserver;
jewelsea
  • Thanks for your answer! This works, but relative links are broken. If I put an image in the same folder as test.html and refer to it in an <img> tag, it doesn't load, because the file is being served from a different location. – Henry Sanger Jan 12 '22 at 01:46
  • I’m not sure what your ultimate goal is but you said that you are making an editor program. Resources are read-only if in a jar, which would be standard deployment, so they can’t be edited. I just used a resource lookup because that is what you had in your question. – jewelsea Jan 12 '22 at 02:47
  • It may be possible to get the relative links to work with the current resource based setup (I don’t know), but the key to this answer is to serve over http and provide a header specifying a character encoding. You can do that from any web server, including this setup if [appropriately configured](https://stackoverflow.com/questions/15902662/how-to-serve-static-content-using-suns-simple-httpserver) or an embedded server like jetty or tomcat, sourcing content from a file directory instead of a resource. – jewelsea Jan 12 '22 at 02:56
  • Perhaps the image lookup failed because you didn’t update the code to return the correct mime type when asked for image content. If you want something beyond simple text and HTML content serving, a proper http server like jetty, nginx or Apache is the best thing to use. It is hard to know what is best for you without knowing your app. Obviously, just using a file protocol is easiest if your files already have the correct meta info on encoding in the http header, but unfortunately you said that you cannot depend on that. – jewelsea Jan 12 '22 at 03:02
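
To illustrate the points in the comments about MIME types and serving from a file directory, here is a minimal sketch of a handler for the JDK's built-in HttpServer. It is an assumption-laden example, not a tested drop-in: the class name `DirectoryHandler`, the extension table, and the root directory are all hypothetical. It sends an explicit UTF-8 charset for text content and a plain image type for images, so that relative links such as an image next to test.html can resolve against the same server.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical handler: serves files found below a root directory.
public class DirectoryHandler implements HttpHandler {

    private final Path root;

    public DirectoryHandler(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    // Maps a file name to a Content-Type; text types get an explicit UTF-8 charset.
    public static String contentTypeFor(String fileName) {
        String name = fileName.toLowerCase();
        if (name.endsWith(".html") || name.endsWith(".htm")) return "text/html; charset=UTF-8";
        if (name.endsWith(".css")) return "text/css; charset=UTF-8";
        if (name.endsWith(".js")) return "text/javascript; charset=UTF-8";
        if (name.endsWith(".png")) return "image/png";
        if (name.endsWith(".jpg") || name.endsWith(".jpeg")) return "image/jpeg";
        if (name.endsWith(".gif")) return "image/gif";
        return "application/octet-stream";
    }

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        try {
            // Strip the leading slash so the path resolves relative to the root.
            String requestPath = exchange.getRequestURI().getPath().substring(1);
            Path file = root.resolve(requestPath).normalize();
            // Reject paths that escape the root (e.g. "../x") and anything that is not a file.
            if (!file.startsWith(root) || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
                return;
            }
            byte[] body = Files.readAllBytes(file);
            exchange.getResponseHeaders().set("Content-Type", contentTypeFor(file.getFileName().toString()));
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body); // raw bytes, so no re-encoding takes place
            }
        } finally {
            exchange.close();
        }
    }
}
```

It would be registered in place of MyHandler, for example `server.createContext("/", new DirectoryHandler(Paths.get("site")));`, where `site` stands for whatever directory holds the HTML and images.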

I found another simple solution using Spark Java:

WebViewTest.java

import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;
import spark.Spark;
import spark.staticfiles.StaticFilesConfiguration;

public class WebViewTest extends Application {

    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void start(Stage stage) {

        Spark.port(8000);

        StaticFilesConfiguration staticHandler = new StaticFilesConfiguration();
        staticHandler.configure("/");
        Spark.before((req, res) -> {
            if(req.url().endsWith(".html")) staticHandler.putCustomHeader("Content-Type", "text/html; charset=UTF-8");
            else staticHandler.putCustomHeader("Content-Type", null);
            staticHandler.consume(req.raw(), res.raw());
        });

        Spark.init();

        WebView webView = new WebView();

        webView.getEngine().load("http://localhost:8000/test.html");

        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }

}

test.html

<!DOCTYPE html>
<html>
    <body>
        <p>RIGHT SINGLE QUOTATION MARK: ’</p>
        <p>Image:</p>
        <img src="image.png">
    </body>
</html>

image.png

[Image: example image]

Result:

[Screenshot: working WebViewTest]

Henry Sanger