I'm trying to read the text of a page, but I get this output:

������ ������ done

It should be:

привет привет done

public static void main(String... args) throws IOException {
    URL url = new URL("https://script.google.com/macros/s/AKfycbzvHqXSXY0LgwfFeltbNS_iYCcge8re0s-uY0-lvSJ0uuMDENoS/exec?message=" + "hello" + "&langin=en&langout=ru");
    URLConnection con = url.openConnection();
    InputStream in = con.getInputStream();
    String encoding = con.getContentEncoding();
    // encoding = "UTF-8";
    encoding = encoding == null ? "UTF-8" : encoding;
    String body = IOUtils.toString(in, encoding);
    System.out.println(body);

    Document doc = Jsoup.connect("https://script.google.com/macros/s/AKfycbzvHqXSXY0LgwfFeltbNS_iYCcge8re0s-uY0-lvSJ0uuMDENoS/exec?message=" + "hello" + "&langin=en&langout=ru").get();

    System.out.println(doc.text());

    System.out.println("done");
}
  • Have you looked at the value of `con.getContentEncoding()`? – Kayaman Jul 02 '20 at 08:29
  • @Kayaman `null` before `encoding = encoding == null ? "UTF-8" : encoding;` and `"UTF-8"` after – E1ZY Jul 02 '20 at 08:52
  • Then the encoding isn't `UTF-8`. It looks to be a single character encoding, maybe [Windows-1251](https://en.wikipedia.org/wiki/Windows-1251). – Kayaman Jul 02 '20 at 08:55
  • @Kayaman Is there a better encoding similar to Windows-1251? I'm asking because it can't translate some characters. – E1ZY Jul 02 '20 at 10:37
  • Like "И" in Russian. – E1ZY Jul 02 '20 at 10:40
  • Well, in the link at the bottom there are other encodings you can try. Of course it would be better if you could receive `UTF-8` from the server. Maybe you can try setting the [request header](https://stackoverflow.com/questions/8934797/java-utf-8-encoding-not-set-to-urlconnection)? – Kayaman Jul 02 '20 at 10:44
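
One detail that may explain the `null` discussed in the comments: `URLConnection.getContentEncoding()` returns the `Content-Encoding` header (values such as `gzip`), not the character set. The charset, when the server sends one, is a parameter of the `Content-Type` header, available through `con.getContentType()`. Below is a minimal sketch of picking the charset from that header instead of assuming UTF-8. The class name `CharsetSketch` and the helper `charsetFromContentType` are hypothetical names made up for this illustration, and the example assumes the server really answers in Windows-1251, which the comments only guess at. It works on a hard-coded byte array rather than the real URL so that it runs offline.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class CharsetSketch {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/html; charset=windows-1251". Returns the fallback
    // when the header is missing, has no charset parameter, or names a
    // charset this JVM does not know.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // "привет" written with Unicode escapes so the source file encoding does not matter.
        String word = "\u043f\u0440\u0438\u0432\u0435\u0442";

        // Assumption: the server sends Windows-1251 bytes, as guessed in the comments.
        Charset assumed = Charset.forName("windows-1251");
        byte[] raw = word.getBytes(assumed);

        // Decoding single-byte Cyrillic as UTF-8 produces replacement characters.
        String wrong = new String(raw, StandardCharsets.UTF_8);
        System.out.println("UTF-8 decode has U+FFFD: " + (wrong.indexOf('\uFFFD') >= 0));

        // Decoding with the charset named in the Content-Type header round-trips.
        Charset cs = charsetFromContentType("text/html; charset=windows-1251", StandardCharsets.UTF_8);
        String right = new String(raw, cs);
        System.out.println("Header charset round-trips: " + right.equals(word));
    }
}
```

In the question's code this would roughly mean passing `charsetFromContentType(con.getContentType(), StandardCharsets.UTF_8)` to `IOUtils.toString` in place of the value from `getContentEncoding()`. If the server sends no charset parameter at all, the fallback is still only a guess, and asking for UTF-8 with something like `con.setRequestProperty("Accept-Charset", "UTF-8")`, as suggested above, may or may not be honored by the server.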
