I have some problems with the encoding behaviour of the Jsoup library.
I want to parse the content of a webpage, and to do so I have to insert people's names, which can contain German umlauts such as ä, ö, etc.
This is the code I am using:
doc = Jsoup.parse(new URL(searchURL).openStream(), "UTF-8", searchURL);
to parse the HTML of the respective webpage.
But when I look at the parsed document, the ä is shown as follows:
Käse
What am I doing wrong with the encoding?
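For what it's worth, I can reproduce exactly this garbled string without Jsoup by decoding UTF-8 bytes as ISO-8859-1, so I suspect a charset mismatch somewhere along the way. This is only a minimal sketch of the symptom; the class and method names (`MojibakeDemo`, `garble`) are made up for illustration and are not part of my real code:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text as UTF-8, then (wrongly) decode the same bytes as ISO-8859-1.
    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "K\u00e4se";       // "Käse", written with an escape to avoid source-encoding issues
        String garbled = garble(original);
        // "\u00c3\u00a4" is the two-character sequence "Ã¤" that shows up in my document
        System.out.println(garbled.equals("K\u00c3\u00a4se")); // prints "true"
    }
}
```

So the two bytes of the UTF-8 encoded ä seem to be read as two separate single-byte characters at some point, but I do not see where that would happen in my call.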
The webpage has the following header:
<!doctype html>
<html>
<head lang="en">
<title>Käse - Semantic Scholar</title>
<meta charset="utf-8">
</html>
Can someone help? Thanks :)
EDIT: I tried Stephan's answer and it worked for the webpage www.semanticscholar.org, but I am also parsing another webpage, http://www.authormapper.com/
The same code does not work for this webpage if the name of an author contains a German umlaut. Does anyone know why this is not working? It's very embarrassing not to know this...
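One thing I have not ruled out yet is the way the name ends up in the URL. Below is a minimal sketch of how I could percent-encode the name as UTF-8 before building the URL. The base URL, the `q` parameter and the names `SearchUrlDemo` and `buildSearchUrl` are placeholders I made up for this example; they are not the real query format of either site, and I do not know whether this is the actual cause:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SearchUrlDemo {
    // Hypothetical helper: percent-encodes the name as UTF-8 and appends it to a base URL.
    static String buildSearchUrl(String baseUrl, String name) {
        // URLEncoder.encode(String, Charset) requires Java 10 or newer
        return baseUrl + URLEncoder.encode(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder endpoint, not the real search URL of either site
        String url = buildSearchUrl("https://example.org/search?q=", "K\u00e4se");
        System.out.println(url); // https://example.org/search?q=K%C3%A4se
    }
}
```

The resulting string contains only ASCII characters, so it could be passed to `new URL(...)` without depending on how the raw ä would otherwise be sent to the server.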