9

Obviously modifying it would be out of the question.

But you would think just reading it should not be a problem?

If I have my .js running on someone's system and I want to analyze the DOM of another URL, client side, is there a way to do this?

Something simple, like pulling the title tag or the URL... maybe by loading the site into an iframe?

CS_2013

4 Answers

13

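You can do this with XMLHttpRequest, as long as the target URL is on the same origin as your page or its server allows cross-origin requests:

function getSourceAsDOM(url)
{
    // Synchronous request (third argument is false): blocks until the response arrives.
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.open("GET", url, false);
    xmlhttp.send();
    // Parse the returned markup into a Document that can be queried like a normal DOM.
    var parser = new DOMParser();
    return parser.parseFromString(xmlhttp.responseText, "text/html");
}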
Megachip
4

If I understand your question correctly, here is a cross-domain example using YQL:

var url = 'xyz.com'; // website you want to scrape
var yql = 'http://query.yahooapis.com/v1/public/yql?q=' + encodeURIComponent('select * from html where url="' + url + '"') + '&format=json&callback=?';
// The trailing callback=? makes jQuery send this as a JSONP request.
$.getJSON(yql, function (data) {
    if (data.results[0]) {
        // Strip <script> blocks from the scraped data (the whole webpage)
        var html = data.results[0].replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');
        console.log(html);
    }
});

Reference: How can i get Equivalent method of HttpwebRequest in javascript

Community
Jashwant
  • yahoo query language...I'm googled out..can you just give me a brief synopsis? – CS_2013 May 21 '12 at 20:25
  • some sort of yahoo api that does the parsing for you? – CS_2013 May 21 '12 at 20:26
  • It does a lot of things. You will get everything [here](http://developer.yahoo.com/yql/console/). What it does, how it does, how to do it and more ? – Jashwant May 21 '12 at 20:28
  • It acts as a proxy. Since a server can parse any page, it can... and then it sends result back to you as `jsonp`. Since, `jsonp` is cross domain, you can use it from any domain :) – Jashwant May 21 '12 at 20:30
  • I don't want to hit another server, though that looks cool there are about 3 different ways to do it mentioned in the comment by Mic. – CS_2013 May 21 '12 at 21:08
  • SO is saying to avoid discussion. Mark as answer if my answer helped you :) – Jashwant May 21 '12 at 21:17
  • Actually...the ways above require access to second origin...your way requires...3rd party. – CS_2013 May 21 '12 at 22:56
1

If the domains do not match, you will not be able to do this: the browser's same-origin policy blocks the access with a security exception. If, however, you control the other domain, you should research enabling cross-domain access from JavaScript, for example by having that server send CORS response headers.
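
As a rough illustration of what "the domains do not match" means in practice, the sketch below compares two URLs by origin (scheme, host, and port) using the standard `URL` class. The helper name `isSameOrigin` and the example URLs are made up for illustration; the browser performs the real check itself.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
function isSameOrigin(urlA, urlB) {
    // URL.origin drops default ports, so https://example.com:443 matches https://example.com
    return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(isSameOrigin("https://example.com/a", "https://example.com:443/b")); // true
console.log(isSameOrigin("https://example.com/", "http://example.com/"));        // false (scheme differs)
console.log(isSameOrigin("https://example.com/", "https://api.example.com/"));   // false (host differs)
```

Note that a subdomain counts as a different origin, so a script on one host cannot read a page on the other unless that server opts in.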

Steve Binder
1

You could get the HTML source with an AJAX GET request. Then you can search the HTML code or assign it to an iframe/...
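
One way to picture the "search in the HTML code" step: once the markup is available as a string, a small helper can pull out the title. The function name `extractTitle` and the sample markup are illustrative assumptions, and the regular expression is a deliberately simple approach; in a browser, parsing the string with `DOMParser` and reading the resulting document's `title` would likely be more robust.

```javascript
// Hypothetical helper: returns the text of the first <title> element, or null if none is found.
function extractTitle(html) {
    var match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
    return match ? match[1].trim() : null;
}

// Sample markup standing in for the response text of the GET request.
var sample = "<html><head><title> Example page </title></head><body></body></html>";
console.log(extractTitle(sample)); // Example page
```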

Marduk