-1

I want to view the source code of a web page, but JavaScript changes it. For example, on https://delicious.com/search/ali, pressing Ctrl+U shows the source without the changes JavaScript makes, not the page as it actually appears. If you look at the code using Inspect Element, it shows the complete source code, and that complete source code is what I want to get. Is there any technique to get the source code as shown by Inspect Element? I am building software that collects URLs from this site, and this is a requirement for it. Ideally the technique or API you refer me to would be in Java. Because of the changes made by JavaScript, I can't get the complete source code.

Bergi
  • first, remember - or learn - that `java != javascript` – Joshua Apr 28 '14 at 07:03
  • The source code is the one visible with Ctrl+U. The inspector shows the current state of the DOM in memory, which is different than the source code. – Sebastien C. Apr 28 '14 at 07:08
  • So you want the original HTML source? Right-click -> view source. Want it programmatically? Just download the HTML (using Ajax if javascript or httprequest if java), and it will be the original because you won't have run any scripts. – Dave Apr 28 '14 at 07:08
  • Most browsers support a `view-source:` prefix that you can add to the beginning of the URL. That gives the original HTML, `view-source:https://delicious.com/search/ali`, without the changes made by JavaScript – paulitto Apr 28 '14 at 07:16
  • sebcap26 I want to get the source code that the inspector shows, using a Java program. – user2495978 Apr 28 '14 at 07:16
  • When you open https://delicious.com/search/ali there are several links, but when you open the source code you can't find them. When you inspect the page using the browser's Inspect Element feature, you can see those links. I want to get all those links using a Java program. – user2495978 Apr 28 '14 at 07:19
  • At the moment I am using an API, but that API targets the source code (Ctrl+U), so it can't find anything there and returns nothing. I need an API (in Java) or a coding technique that gives me the same source code as Inspect Element, so that my purpose is served. – user2495978 Apr 28 '14 at 07:23
  • Those links are added dynamically, so you need the code as already changed by JavaScript, not the original – paulitto Apr 28 '14 at 07:24
  • Yes paulitto, you are on the right track. I need exactly that, either through some Java code or through a Java API. Thanks – user2495978 Apr 28 '14 at 07:26
  • In PHP I would use curl requests to scrape the page; I don't know what the equivalent would be for Java. Also check whether [this tool](http://htmlunit.sourceforge.net/) works for you – paulitto Apr 28 '14 at 07:48
  • @user2495978: Please post the code (you can [edit] your question) with the api you're currently using, so we know a bit about your current setup. – Bergi Apr 29 '14 at 01:44
  • @Bergi Currently I am using Jsoup for this purpose. JavaScript changes the page's code after it loads, so Jsoup does not get that latest code, but I need the code that Inspect Element shows. – user2495978 Apr 29 '14 at 07:10
  • possible duplicate of [Jsoup Java HTML parser : Executing javascript events](http://stackoverflow.com/questions/7344258/jsoup-java-html-parser-executing-javascript-events) – Bergi Apr 29 '14 at 08:06
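
To illustrate the point made in the comments above (links added by a script are not in the source that a plain download returns), here is a minimal sketch. The page content, the class name `StaticSourceDemo`, and the helper `extractLinks` are all made up for the example, and the regex is only a crude stand-in for a real parser such as Jsoup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaticSourceDemo {
    // Hypothetical page: one link is in the markup, a second one is added by a script at load time.
    static final String RAW_HTML =
        "<html><body>\n"
      + "<a href=\"/static-link\">in the source</a>\n"
      + "<script>\n"
      + "  var a = document.createElement('a');\n"
      + "  a.setAttribute('href', '/dynamic-link');\n"
      + "  document.body.appendChild(a);\n"
      + "</script>\n"
      + "</body></html>";

    // Matches the double-quoted href value of an <a ...> tag that appears literally in the text.
    private static final Pattern ANCHOR_HREF =
        Pattern.compile("<a\\b[^>]*\\bhref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Collects the href values of all <a> tags found in the given HTML text.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = ANCHOR_HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        // The script is just text here; nothing executes it, so its link is never created.
        System.out.println(extractLinks(RAW_HTML)); // prints [/static-link]
    }
}
```

Only `/static-link` is found. `/dynamic-link` would exist only in the DOM of something that actually runs the script, which is why a tool that executes JavaScript (such as the one linked above) is needed to see what Inspect Element shows.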

1 Answer

-1

I'm not sure, but this might be what you are asking for. The code takes a URL object, gets the server's response, and returns the body of the response. This should be an HTML document in your case.

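import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Returns the response body for the URL: the HTML as the server sends it, before any script runs.
String getSource(URL url) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setRequestMethod("GET"); // plain GET, no request body

    byte[] buffer = new byte[512];
    try (InputStream in = connection.getInputStream()) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        int read;
        while ((read = in.read(buffer)) != -1) {
            body.write(buffer, 0, read);
        }
        // getInputStream() yields only the body, so there are no headers to split off.
        // Decoding as UTF-8 is an assumption; the Content-Type header may name another charset.
        return new String(body.toByteArray(), StandardCharsets.UTF_8);
    } finally {
        connection.disconnect();
    }
}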
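
As a usage sketch, a method like `getSource` can be tried against a throwaway local server instead of the real site. Everything here is an assumption made for the example: the class name `LocalFetchDemo`, the page content, and the helper `fetchFromLocalServer`. The block carries its own plain-GET copy of the method so that it compiles on its own, and it relies on the JDK's built-in `com.sun.net.httpserver.HttpServer`.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class LocalFetchDemo {
    // Hypothetical page body served by the local test server below.
    static final String PAGE =
        "<html><body><script>document.title = 'changed';</script></body></html>";

    // Plain GET that returns the response body, decoded as UTF-8 (an assumption).
    static String getSource(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buffer = new byte[512];
            int read;
            while ((read = in.read(buffer)) != -1) {
                body.write(buffer, 0, read);
            }
            return new String(body.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }

    // Serves PAGE on a free loopback port, fetches it once, and returns what was fetched.
    static String fetchFromLocalServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] bytes = PAGE.getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
                exchange.sendResponseHeaders(200, bytes.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(bytes);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                URL url = URI.create("http://127.0.0.1:" + port + "/").toURL();
                return getSource(url);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The fetched text is compared with what the server sent.
        System.out.println(fetchFromLocalServer().equals(PAGE));
    }
}
```

The fetched text is byte-for-byte what the server sent, with the `<script>` element still present as plain text. That is the Ctrl+U view of a page, not the Inspect Element view.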
Andrew Vitkus