
In a browser such as Google Chrome, when I want to get information about a page, I just select View Page Source. On some sites that use JavaScript that doesn't work, but I discovered that if I right-click any element, such as a button, and select Inspect Element, it shows me the information I need. That's really great, but I want to do this automatically. For simple pages that use plain HTML, I just use wget to save the page to a text file and analyze it later, but for pages that use JavaScript and CSS this is impossible. So my question is: is there a tool, or any other way, to get the information I need and save it to text? Say I specify the site and the element (a button), and it downloads the code and saves it to a text file.

feeela
Leo92
  • Now I am thinking about a way to use a batch file to make Chrome or Firefox save the output of `Ctrl + Shift + J` into a text file. Is that possible? – Leo92 Jul 05 '12 at 10:12
  • Prepend `view-source:` to the URL to see the source code behind a webpage. – starbeamrainbowlabs Jul 05 '12 at 11:12

1 Answer


If you use Chrome, you can press Ctrl + Shift + J to open the Developer Tools window. Click the top-left icon (Elements) to see the DOM as it stands after JavaScript modification.

In IE (8 or above, I think) use F12 to open a similar window.

In Firefox you can use Ctrl + Shift + I to open a similar window.

To automate this process, try PhantomJS (http://www.phantomjs.org/), as suggested here: wget + JavaScript?

Edit:

There is a Save button in the IE developer tools which saves the current DOM.

In Firefox, after you have pressed Ctrl + Shift + I, press Ctrl + S and it will save the current DOM.

Edit 2:

Download PhantomJS, create a file called script.js, and paste this into it:

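var system = require('system');
var page = require('webpage').create();

// system.args[0] is the script name; system.args[1] is the URL to load.
page.open(system.args[1], function (status) {
    if (status === 'success') {
        // Runs inside the page, after its scripts have executed on load.
        // outerHTML keeps the <html> element itself in the output.
        var html = page.evaluate(function () {
            return document.documentElement.outerHTML;
        });
        console.log(html);
    } else {
        console.log('Page could not be loaded');
    }
    phantom.exit();
});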

Run `phantomjs script.js http://www.website.co.uk > website.html` at the command line (script.js and phantomjs.exe will have to be in the current working directory). Change http://www.website.co.uk to the website you need to download, and website.html to the HTML file you want to save to.
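
If you only need one element (say, a button) rather than the whole page, the same idea can be narrowed down. What follows is a rough sketch under some assumptions of mine, not a definitive script: the file name element.js, the `grabElement` helper, the default `button` selector, the 10 second timeout and the 250 ms polling interval are all made up for illustration, and passing extra arguments to `page.evaluate` needs PhantomJS 1.6 or later. It polls because the element may only appear after an AJAX call finishes.

```javascript
// Sketch only. Hypothetical usage:
//   phantomjs element.js <url> <css-selector> > element.html

// Runs inside the loaded page; returns the first match's markup, or null.
function grabElement(selector) {
    var el = document.querySelector(selector);
    return el ? el.outerHTML : null;
}

// Only drive the browser when run under PhantomJS (the `phantom` global exists there).
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var page = require('webpage').create();
    var url = system.args[1];
    var selector = system.args[2] || 'button'; // default selector is an assumption
    var deadline = Date.now() + 10000;         // give up after 10 s (arbitrary choice)

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Page could not be loaded');
            phantom.exit(1);
            return;
        }
        // Poll, because the element may be added by AJAX after the page has loaded.
        var timer = setInterval(function () {
            var html = page.evaluate(grabElement, selector);
            if (html !== null || Date.now() > deadline) {
                clearInterval(timer);
                console.log(html !== null ? html : 'No element matched ' + selector);
                phantom.exit(html !== null ? 0 : 1);
            }
        }, 250);
    });
}
```

Run it the same way as before, for example `phantomjs element.js http://www.website.co.uk "#submit" > element.html`, where `#submit` stands in for whatever CSS selector identifies your element.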

Community
OdinX
  • Wow, `Ctrl + Shift + J` is the third shortcut I've learned: `Ctrl + Shift + I` works too (like in Opera) and `F12` also works (like in Firefox) – feeela Jul 05 '12 at 09:55
  • Haha, I didn't know about those other two. – OdinX Jul 05 '12 at 09:57
  • @Chief17 thank you, `Ctrl + Shift + J` gave me what I want. Now the hard part is finding a tool to do the same thing from a batch file. I will read the topic that you posted. – Leo92 Jul 05 '12 at 10:04
  • @Chief17 but I want to do this automatically using a batch file, so I wonder if I can do this with commands. – Leo92 Jul 05 '12 at 11:49
  • That's what PhantomJS can do. I've posted a working example for you that you can use right away. Check `Edit 2`. – OdinX Jul 05 '12 at 12:51
  • OK, thanks for your help, but this will download the HTML code, which I can already do with wget. I want to download a page that uses JavaScript, where only Inspect Element shows the content in the browser. – Leo92 Jul 05 '12 at 13:28
  • This will download the HTML code after JavaScript has been run on page load, so if there is no page content and it is loaded with AJAX, you will get the resulting DOM. If I have misunderstood you, please could you try and rephrase what you are trying to achieve. – OdinX Jul 05 '12 at 13:47
  • No, it's possible though. Maybe try writing a console application in C#.NET and use the web browser component to get the source code of the page. – OdinX Jul 09 '12 at 08:48