Jsoup is returning text which I do not see in the HTML document

Question

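import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Test {
    public static void main(String[] args) throws IOException {
        // Fetch the page with jsoup's default request settings (no custom user agent).
        Document doc = Jsoup.connect("https://bs.to/Game-of-Thrones").get();
        // Select every <p> element and print their combined text.
        Elements link = doc.select("p");

        System.out.println(link.text());
    }
}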

This is the code I use to get the only p element of the given website. But I get text that is not in the HTML document. It does seem to belong to the website in general, though (it's in German, so I don't mind posting the result text).

Also, if I loop over all p elements, I get more text that should not be in the document, but not the text that I'm looking for.

Why could that be? Thanks in advance!

Edit:

  Document doc = Jsoup.connect("https://bs.to/andere-serien")
          .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
          .referrer("http://www.google.com")
          .get();

Adding the userAgent did solve the issue, thanks Sean Patrick Floyd!

Are you sure that the text you are seeing is not on the website? Just because the browser does not display it does not mean the text is not there. – Jagrut, Jun 01 '16 at 08:49

score 0 · Answer 1 · edited May 23 '17 at 10:28

They may be serving different content for different user agents. Try setting your user agent to that of a real browser.

See this question for solutions:
JSoup UserAgent, how to set it right?
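
To make the idea concrete, here is a minimal, JDK-only sketch of a server that answers differently depending on the User-Agent header. Everything in it is an assumption for illustration: the `UserAgentDemo` class, the `bodyFor` rule and the placeholder texts are hypothetical, and nothing here is known about how bs.to actually decides what to send. It uses `HttpURLConnection` instead of jsoup so that it runs without extra dependencies (Java 9 or newer).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class UserAgentDemo {
    // Hypothetical server rule: browser-like agents get the real page,
    // every other client gets a placeholder page.
    static String bodyFor(String userAgent) {
        boolean looksLikeBrowser = userAgent != null && userAgent.startsWith("Mozilla/");
        return looksLikeBrowser
                ? "<p>Real page content</p>"
                : "<p>Placeholder for unknown clients</p>";
    }

    // Fetch the URL while sending the given User-Agent header.
    static String fetch(URL url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        // Start a throwaway local server on a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            byte[] body = bodyFor(ua).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            URL url = URI.create("http://localhost:" + server.getAddress().getPort() + "/").toURL();
            // Same URL, two different user agents, two different documents.
            System.out.println(fetch(url, "Java/1.8.0_91"));
            System.out.println(fetch(url, "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"));
        } finally {
            server.stop(0);
        }
    }
}
```

If a site behaves like this, the p elements you select come from whatever page the server chose for your client, which would explain seeing text that is missing from the HTML your browser shows.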

answered Jun 01 '16 at 08:51

Sean Patrick Floyd
