How do I set the encoding correctly so that I can also read the special characters?
When I call System.out.println(response.body()), the body has already lost its UTF-8 characters: every special character is turned into a question mark.
For example, the String title ends up as "Would you draw this for me? ?",
and I want to get the original title, "Would you draw this for me?" with its emoji intact.
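
    import java.io.ByteArrayInputStream;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.net.http.HttpResponse.BodyHandlers;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    // Entry is my own class holding the six fields of one feed entry.
    ArrayList<Entry> pullRss() {
        ArrayList<Entry> output = new ArrayList<>();
        try {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://www.reddit.com/r/anysubreddit/.rss"))
                    .build();
            HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = factory.newDocumentBuilder();
            // The question marks already show up in this output.
            System.out.println(response.body());
            // body() is already a String, so no toString() call is needed.
            Document doc = db.parse(new ByteArrayInputStream(response.body().getBytes(StandardCharsets.UTF_8)));
            NodeList nList = doc.getElementsByTagName("entry");
            for (int i = 0; i < nList.getLength(); i++) {
                Element eElement = (Element) nList.item(i);
                String user = eElement.getElementsByTagName("name").item(0).getTextContent();
                String userUri = eElement.getElementsByTagName("uri").item(0).getTextContent();
                String id = eElement.getElementsByTagName("id").item(0).getTextContent();
                // getNodeValue() returns the attribute value; toString() on the attribute node is not guaranteed to.
                String link = eElement.getElementsByTagName("link").item(0).getAttributes().getNamedItem("href").getNodeValue();
                String date = eElement.getElementsByTagName("published").item(0).getTextContent();
                String title = eElement.getElementsByTagName("title").item(0).getTextContent();
                output.add(new Entry(user, userUri, id, link, date, title));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return output;
    }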
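
One thing I could rule out first: BodyHandlers.ofString() is documented to decode the body with the charset from the Content-Type header and to fall back to UTF-8, so the String itself may be intact and the question marks may come from System.out encoding with a charset that cannot represent emoji. Below is a minimal sketch to check that, with no network involved. The class name Utf8ConsoleCheck, the helper printedBytes, and the sample title with its emoji are all made up for illustration and are not taken from the feed.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class Utf8ConsoleCheck {

    // Hypothetical helper: returns the bytes a PrintStream using the
    // given charset name would write for this text.
    static byte[] printedBytes(String text, String charsetName) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, charsetName)) {
            out.print(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample title; the emoji is written as a surrogate-pair escape.
        String title = "Would you draw this for me? \uD83D\uDE00";

        // A stream whose charset cannot represent the emoji substitutes a
        // replacement character while encoding; the String is unchanged.
        String lossy = new String(printedBytes(title, "US-ASCII"), StandardCharsets.US_ASCII);
        System.out.println(lossy);

        // Wrapping System.out in a UTF-8 PrintStream writes the emoji bytes;
        // whether they render still depends on the terminal and its font.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(title);
    }
}
```

If the second line prints correctly in my terminal, the response body was never the problem and only the console output needs a UTF-8 stream.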