2

I'm trying to make a field similar to the Facebook share box, where you can enter a URL and it gives you data about the page: title, pictures, etc. I have set up a server-side service that returns the page's HTML as a string, and I am trying to get just the page title. I tried this:

function getLinkData(link) {
  link = '/Home/GetStringFromURL?url=' + link;
  $.ajax({
    url: link,
    success: function (data) {
      $('#result').html($(data).find('title').html());
      $('#result').fadeIn('slow');
    }
  });
}

which doesn't work, however the following does:

$(data).appendTo('#result');
var title = $('#result').find('title').html();
$('#result').html(title);
$('#result').fadeIn('slow');

but I don't want to write all the HTML to the page, as in some cases it redirects and does all sorts of nasty things. Any ideas? Thanks

Ben

Tim S. Van Haren

4 Answers

3

Try using filter rather than find:

$('#result').html($(data).filter('title').html());
lonesomeday
  • Thanks that solves the problem nicely, although I think it makes more sense for me to do the processing server-side as there is a fair bit to do. – Ben Nov 08 '10 at 15:12
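One way to picture why `.filter` can succeed where `.find` fails, as the comments further down describe it: `.filter` tests the elements in the collection themselves, while `.find` only searches their descendants. The sketch below is a toy model in plain JavaScript, not jQuery's implementation. The names `filterByTag`, `findByTag`, and `parsed` are hypothetical, and the shape of `parsed` assumes a browser that keeps `<title>` as a top-level element, which other comments on this page report is not the case everywhere.

```javascript
// Toy model of a jQuery-like collection: an array of top-level nodes.
// Each node is a plain object: { tag: string, children: node[] }.

// Like .filter(selector): tests the top-level nodes themselves.
function filterByTag(nodes, tag) {
  return nodes.filter(function (node) {
    return node.tag === tag;
  });
}

// Like .find(selector): searches only the descendants of the top-level nodes.
function findByTag(nodes, tag) {
  var matches = [];
  function walk(children) {
    children.forEach(function (child) {
      if (child.tag === tag) {
        matches.push(child);
      }
      walk(child.children);
    });
  }
  nodes.forEach(function (node) {
    walk(node.children);
  });
  return matches;
}

// Assumed shape of a parsed page in browsers where the approach works:
// <title> sits at the top level, next to the former <body> content.
var parsed = [
  { tag: 'title', children: [] },
  { tag: 'div', children: [{ tag: 'p', children: [] }] }
];

console.log(filterByTag(parsed, 'title').length); // 1
console.log(findByTag(parsed, 'title').length);   // 0
console.log(findByTag(parsed, 'p').length);       // 1
```

In this model the title is never a descendant of anything in the collection, so only the filter-style lookup reaches it.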
2

To do this with jQuery, .filter is what you need (as lonesomeday pointed out):

$("#result").text($(data).filter("title").text());

However, do not insert the HTML of the foreign document into your page, as this will leave your site open to XSS attacks. Also, as has been pointed out in the comments, this jQuery approach depends on the browser's innerHTML implementation, so it does not work consistently.
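To make the XSS point concrete: jQuery's `.text()` setter treats its argument as plain text rather than markup, which is why the snippet above uses it instead of `.html()`. The sketch below shows roughly what that amounts to, using a hypothetical helper `escapeHtml` and a made-up hostile title. It is an illustration in plain JavaScript, not a replacement for a vetted sanitizer.

```javascript
// Hypothetical helper: escapes the characters that let a string be read as
// markup, so the browser would display it as text instead of parsing it.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Made-up example of a title taken from an untrusted page.
var hostileTitle = '<img src=x onerror="alert(1)">';

console.log(escapeHtml(hostileTitle));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Inserted as-is with `.html()`, a string like `hostileTitle` would be parsed as an element. Escaped, or set through `.text()`, it is only displayed.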

Even better is to do all the relevant HTML processing on the server. Sending only the relevant information to your JS will make the client code vastly simpler and faster. You can whitelist safe/desired tags/attributes without ever worrying about anything dangerous getting sent to your users. Processing the HTML on the server will not slow down your site. Your language already has excellent HTML parsers, so why not use them?

Angiosperm
  • Fails in IE7, IE8, Opera, Safari. Your jQuery solution appears to be browser dependent. – user113716 Nov 08 '10 at 13:57
  • 1
    Thanks, it does look like server side is the best option for me, as I want to grab various bits from it. I'm using .NET with C#, so I will try and find a good HTML parser for that. Html Agility Pack seems to be recommended in http://stackoverflow.com/questions/100358/looking-for-c-html-parser so I will give it a go. Thanks :-) – Ben Nov 08 '10 at 15:11
0

When you place an entire HTML document into a jQuery object, all but the content of the <body> gets stripped away.

If all you need is the content of the <title>, you could try a simple regex:

// `data` is the HTML string passed to the success callback
var title = /<title>([^<]+)<\/title>/.exec(data)[1];
alert(title);

Or using .split():

var title = data.split('<title>')[1].split('</title>')[0];
alert(title);
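Both one-liners above throw if the page has no `<title>`, and they miss an upper-case tag or one that carries attributes. A slightly more defensive sketch of the same regex idea follows. The function name `extractTitle` and the sample strings are hypothetical, and this is still string matching, not HTML parsing.

```javascript
// Returns the text of the first <title> element in an HTML string,
// or null when no title is present.
function extractTitle(html) {
  // \b[^>]* tolerates attributes on the tag, [\s\S]*? matches lazily across
  // newlines, and the i flag accepts <TITLE> as well.
  var match = /<title\b[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

console.log(extractTitle('<html><head><TITLE>\n  Example page\n</TITLE></head></html>'));
// Example page
console.log(extractTitle('<p>No title here</p>'));
// null
```

The result is returned exactly as it appears in the source, so entities such as `&amp;` are not decoded.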
user113716
  • "When you place an entire HTML document into a jQuery object, all but the content of the <body> gets stripped away" where did you read this? His second example wouldn't work if that was the case. – Angiosperm Nov 08 '10 at 13:21
  • @Angiosperm - Thanks for the down-vote. Did you test first? I did. That's exactly what happened. I also tried appending to a newly created `
    `, and got the same result. Here's an example for you. http://jsfiddle.net/YbLFy/
    – user113716 Nov 08 '10 at 13:27
  • 1
    @Angiosperm - This usually *is* the case, though it depends on the `innerHTML` implementation of the browser – Nick Craver Nov 08 '10 at 13:32
  • I did test, doing something like this http://jsfiddle.net/cauNS/ (your example slightly edited). I didn't try to just add the DOM element directly to the page as that's not what the OP asked (and would be a terrible idea in this case), so I guess there's some disconnect here. – Angiosperm Nov 08 '10 at 13:51
  • @Nick Alright this *is* pretty unreliable. Just tested in a few browsers: works in Chrome 6 and FF4, fails in IE9, Opera and Safari – Angiosperm Nov 08 '10 at 14:01
-1

The alternative is to look for the title yourself. Fortunately, unlike most parse-your-own-HTML questions, finding the title is very easy because it doesn't allow any nested elements. Look in the string for something like <title>(.*)</title> and you should be set.

(yes yes yes I know never use regex on HTML, but this is an exceptionally simple case)

Joeri Hendrickx
  • $(string) parses the string into a collection of DOM elements. .find(selector) doesn't work because it's searching for the *descendants* of the elements in the collection that match the selector, not the elements themselves. – Angiosperm Nov 08 '10 at 13:17
  • @Angiosperm - `$(string)` does *many* things, you've oversimplified things here. Also, it creates a document fragment. – Nick Craver Nov 08 '10 at 13:33
  • @Nick yeah, really bad wording there :\ But isn't the document fragment just used while parsing the HTML, and only its descendants get added to the returned jQuery collection? – Angiosperm Nov 08 '10 at 13:53
  • @Nick Hmm yeah, looks like I lost a bit there :) This way it makes more sense. I didn't know jQuery could work on a fragment that isn't attached to the main DOM though. – Joeri Hendrickx Nov 08 '10 at 14:46
  • it can, and it's *much* faster to do many, many operations this way, and insert it when finished. – Nick Craver Nov 08 '10 at 14:52