I'm writing an app in Java that uses a GET request with the OkHttp library to fetch some information from a web page. The web page uses ISO-8859-1; there is this tag at the top of the page: <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>.

The code for the GET request is the following:

Request request = new Request.Builder()
        .url(webpage)
        .get()
        .addHeader("upgrade-insecure-requests", "1")
        .addHeader("user-agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36")
        .addHeader("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
        .addHeader("accept-language", "es-ES,es;q=0.9")
        .addHeader("cache-control", "no-cache")
        .build();

Response response = client.newCall(request).execute();
String html = response.body().string();

If I print the response headers, I get this: Content-Type: text/html; charset=ISO-8859-1.

The html string contains the message that I want to use in the app, but some characters are not readable. For example, the euro symbol (€) appears as a question mark (?) when printed to the terminal.

I wanted to know if I can get these symbols in UTF-8 encoding.

Mr. Kevin
  • Possible duplicate of [How do I convert special UTF-8 chars to their iso-8859-1 equivalent using javascript?](https://stackoverflow.com/questions/5396560/how-do-i-convert-special-utf-8-chars-to-their-iso-8859-1-equivalent-using-javasc) – bowl0stu Apr 30 '18 at 15:11
  • Try using `curl` to save the page to a file, then run the `file` command on it. It’ll confirm whether the charset is what the headers claim. – Jesse Wilson Apr 30 '18 at 16:32
  • iso-8859-1 doesn't contain the Euro symbol, so the site cannot contain that symbol if it is using iso-8859-1, or it isn't actually using iso-8859-1 (but for example iso-8859-15 which is iso-8859-1 + Euro symbol and some other changes). – Mark Rotteveel Apr 30 '18 at 17:36
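
The point in the last comment can be checked with the JDK alone. The following is a minimal sketch, not a confirmed fix for this particular site: it assumes the page is really served as windows-1252 or ISO-8859-15 while being labelled ISO-8859-1. The class name `CharsetDemo` and the helper `decode` are hypothetical names used only for illustration. With OkHttp, the raw bytes would presumably come from `response.body().bytes()` rather than `response.body().string()`, so that the declared charset is not applied automatically.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: decode raw bytes with an explicitly chosen charset
    // instead of trusting the charset named in the Content-Type header.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] cp1252Euro = {(byte) 0x80}; // euro sign as encoded in windows-1252
        byte[] latin9Euro = {(byte) 0xA4}; // euro sign as encoded in ISO-8859-15

        // Decoded as ISO-8859-1, byte 0x80 becomes a control character, not a euro sign.
        String wrong = decode(cp1252Euro, "ISO-8859-1");
        System.out.printf("ISO-8859-1:   U+%04X%n", (int) wrong.charAt(0));  // U+0080

        // Decoded with the charset actually used, the euro sign (U+20AC) is recovered.
        String fromCp1252 = decode(cp1252Euro, "windows-1252");
        String fromLatin9 = decode(latin9Euro, "ISO-8859-15");
        System.out.printf("windows-1252: U+%04X%n", (int) fromCp1252.charAt(0)); // U+20AC
        System.out.printf("ISO-8859-15:  U+%04X%n", (int) fromLatin9.charAt(0)); // U+20AC

        // A Java String is Unicode internally; UTF-8 bytes are produced only on output.
        byte[] utf8 = fromCp1252.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 length: " + utf8.length); // 3 bytes: E2 82 AC
    }
}
```

Even with a correctly decoded string, a terminal whose own encoding lacks the euro sign may still print `?`, so comparing code points as above is a more reliable check than reading the printed character.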

0 Answers