
I have a string that may or may not be valid HTML, but it should contain a title tag.
I want to replace the content of the title with new content.

Example 1:

lorem yada yada <title>Foo</title> ipsum yada yada  

Should turn into:

lorem yada yada <title>Bar</title> ipsum yada yada  

Example 2:

lorem yada yada <title attributeName="value">Foo</title> ipsum yada yada  

Should turn into:

lorem yada yada <title attributeName="value">Bar</title> ipsum yada yada  

I don't want to parse HTML with regex, just replace the title tag... Please don't send me here...

EDIT: After numerous downvotes and a lot of patronizing attitude:
I am aware (as admitted in the original post) that regex is usually not the way to handle HTML. I am open to any solution that solves my problem, but so far every jQuery / DOM solution has failed to work. Being "right" is not enough.

seldary

3 Answers


It's difficult to do such a thing reliably with regex (read: "will not work for all cases"), so using a proper parser is best where possible.

That said, here is a simple expression that would work for your examples:

var re = /(<title\b[^>]*>)[^<>]*(<\/title>)/i;
str = str.replace(re, "$1Bar$2");

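This does not handle, and will give wrong results for, input such as: comments, quoted attribute values that contain >, CDATA sections, and title text that itself contains < or >, among other cases.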
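As a minimal sketch (not part of the original answer), the same idea can be wrapped in a function that also tolerates `>` inside quoted attribute values and inserts the new title literally. The name `replaceTitleText` and the extended pattern are assumptions made for illustration; comments and CDATA are still not handled.

```javascript
// Hypothetical helper: replaces the text of the first <title> element in a
// string and leaves every other character of the input untouched.
function replaceTitleText(str, newTitle) {
    // Group 1: the opening tag. Quoted attribute values are matched as a
    // unit, so a ">" inside quotes does not end the tag early.
    // Group 2: the closing tag. The content in between is matched lazily,
    // so it may contain "<" or ">" as long as it is not "</title>".
    var re = /(<title\b(?:"[^"]*"|'[^']*'|[^>"'])*>)[\s\S]*?(<\/title\s*>)/i;
    // A replacer function inserts newTitle as-is; with a replacement string,
    // sequences such as "$1" or "$&" inside newTitle would be expanded.
    return str.replace(re, function (match, open, close) {
        return open + newTitle + close;
    });
}

// Usage with the two examples from the question:
console.log(replaceTitleText("lorem yada yada <title>Foo</title> ipsum yada yada", "Bar"));
console.log(replaceTitleText('lorem yada yada <title attributeName="value">Foo</title> ipsum yada yada', "Bar"));
```

If the string contains no title element, the input is returned unchanged, which matches the requirement that nothing outside the title content is altered.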

Qtax

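function replaceTitle( str, replacement ) {
    // Parse the string inside a detached element so it never enters the page.
    var tmp = document.createElement("ihatechrome");
    tmp.innerHTML = str;
    // Assumes the string contains at least one title element.
    tmp.getElementsByTagName("title")[0].innerHTML = replacement;
    // Serialize back to a string; the parser may normalize the markup.
    return tmp.innerHTML;
}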

replaceTitle( "lorem yada yada <title>Foo</title> ipsum yada yada", "Bar" );
//"lorem yada yada <title>Bar</title> ipsum yada yada"

For some reason, Google Chrome makes requests if the string contains img tags with a src attribute. It doesn't make any sense, but that's what happens.

Edit:

This seems to work in Chrome (it does not load images):

var doc = document.implementation.createHTMLDocument("");

doc.body.innerHTML = "<img src='/'>";

doc.body.innerHTML; //"<img src="/">"
Esailija
  • Won't this execute any `<script>` tags? – Qtax Jun 19 '12 at 11:47
  • And will there be any chance that text other than the title's inner text will be changed? even slightly? – seldary Jun 19 '12 at 11:49
  • It will load images but not execute scripts or style tags – Esailija Jun 19 '12 at 11:50
  • @Esailija - This is why I thought of regex... I don't want any side effects. just to replace the string. – seldary Jun 19 '12 at 11:51
  • @seldary fine, so use a regex then – Esailija Jun 19 '12 at 11:56
  • I just checked this with a random html page, and your solution REMOVED the and tags, among some more changes to the original string. I thought it was obvious, but I can't allow this to alter the input beyond the title's content. – seldary Jun 19 '12 at 12:27

Please god, don't try to parse HTML with a regex (I know you said you aren't parsing it, but you are...). jQuery has a perfectly good set of primitives for manipulating HTML that isn't in the DOM:

var htmlishString = "almost <title>HTML</title>";
var $fakeDiv = jQuery('<div />');
$fakeDiv.html(htmlishString).find('title').text('Bar');
var manipulatedString = $fakeDiv.html();

http://jsfiddle.net/4kQkx/

tobyodavies
  • Have you tested this? The third line is throwing "Uncaught TypeError: Cannot read property 'slice' of undefined". – seldary Jun 20 '12 at 04:58
  • @seldary It works in Chrome, but I did not test in any other browser – tobyodavies Jun 20 '12 at 05:10
  • It crashed with this: view-source:http://blogs.msdn.com/b/ericlippert/archive/2012/06/18/implementation-defined-behaviour.aspx – seldary Jun 20 '12 at 05:18
  • It also works in FF. Not sure I understand the relevance of that link – tobyodavies Jun 20 '12 at 05:24
  • Admittedly this also appears to suffer from the script-fetching bug in Chrome – tobyodavies Jun 20 '12 at 05:26
  • I ran your code on the HTML of that page (which contains a `<script>` tag), and it crashed, while your (rather simple) example worked. It doesn't help me if it fails on real-world input and suffers from bugs. – seldary Jun 20 '12 at 05:27
  • @seldary Yeah, that's due to downloading and running scripts that are executed on a different host, so the same-origin policy breaks it... – tobyodavies Jun 20 '12 at 05:33