3

I would like to place an iframe right after the opening body tag. This is tricky because the body tag can have various attributes and odd whitespace. My guess is that this will require regular expressions to do correctly.

EDIT: This solution has to work with PHP 4, and performance is a concern of mine. It's for this: http://drupal.org/node/586210#comment-2567398

Josh Darnell
mikeytown2

3 Answers

5

You can use DOMDocument and friends. Assuming you have a variable $html containing the existing HTML document as a string, the basic code is:

$doc = new DOMDocument();
$doc->loadHTML($html);
$body = $doc->getElementsByTagName('body')->item(0);
$iframe = $doc->createElement('iframe');
$body->insertBefore($iframe, $body->firstChild);

To retrieve the modified HTML text, use

$html = $doc->saveHTML();
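Put together, the DOM approach might look like the following minimal sketch for PHP 5 and later. The function name `insert_iframe_dom` and the `$src` parameter are hypothetical, chosen only for illustration. It assumes the DOM extension is available, and it silences libxml warnings because real-world pages are rarely valid markup.

```php
<?php
// Hypothetical helper: the name and the $src parameter are illustrative assumptions.
function insert_iframe_dom($html, $src)
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return $html; // parsing failed: leave the input untouched
    }
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // no <body> element: leave the input untouched
    }

    $iframe = $doc->createElement('iframe');
    $iframe->setAttribute('src', $src);
    // With a null reference node, insertBefore() appends, which covers an empty body.
    $body->insertBefore($iframe, $body->firstChild);

    return $doc->saveHTML();
}
```

Note that saveHTML() serializes the whole parsed document, so the output can differ from the input in more than the inserted element (for example, a doctype may be added and tag names are normalized).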

EDIT: For PHP4, you can try DOM XML.

Matthew Flaschen
  • I should have been more descriptive in my question about the limitations. This is for drupal 6 & the boost module. As such I need to support php 4... yeah I know it sucks. http://drupal.org/node/586210#comment-2567398 – mikeytown2 Feb 07 '10 at 07:32
  • DOM XML is deprecated, I wouldn't advise using that. Drupal has to be backwards-compatible but not to the extent of using deprecated features deliberately. This extension is not even part of standard PHP distro as of 5.0. – Rowlf Feb 07 '10 at 08:07
  • Rowlf, PHP 4 is deprecated too. In fact, there have been no releases since August 2008 (not even security fixes). See also http://stackoverflow.com/questions/1734072/official-end-of-support-for-php4 . If you're using a deprecated programming language, it shouldn't be surprising to use deprecated libraries. – Matthew Flaschen Feb 07 '10 at 08:32
  • I don't understand your point. Drupal is widely used on any and all versions of PHP from 4 to 5.3. It's only been recently upgraded to be 5.3-compatible, which was a huge step in the right direction. Using a blast-from-the-past DOM extension in one of the core modules is the _wrong_ direction to go in. – Rowlf Feb 07 '10 at 08:51
  • My point is simple. PHP 4 is a blast-from-the-past language, so as long as they are forced to support it, why not conditionally use an API that version provides? – Matthew Flaschen Feb 07 '10 at 08:58
  • Now I see. It wasn't quite clear from your previous posts that you suggested using both libraries conditionally. – Rowlf Feb 07 '10 at 09:04
  • DOM XML is not part of core PHP; so I can't use that either... http://www.php.net/manual/en/domxml.installation.php The whole point of the boost module is it works with very bad hosts and makes Drupal fast; thus I'm limited in what I can do. – mikeytown2 Feb 08 '10 at 07:08
4

Both PHP 4 and PHP 5 should be happy with preg_split():

/* split the string contained in $html in three parts: 
 * everything before the <body> tag
 * the body tag with any attributes in it
 * everything following the body tag
 */
/* [^>]* also matches a tag that spans several lines; a limit of 2 splits at the first <body> tag only */
$matches = preg_split('/(<body\b[^>]*>)/i', $html, 2, PREG_SPLIT_DELIM_CAPTURE);

/* assemble the HTML output back with the iframe code in it */
$injectedHTML = $matches[0] . $matches[1] . $iframeCode . $matches[2];
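
The reassembly above assumes the split always yields three parts. A guarded variant could look like the sketch below. The function name `inject_after_body_tag` is hypothetical, and the pattern is one possible choice, not the only one: it still breaks if an attribute value contains a literal `>`.

```php
<?php
// Hypothetical wrapper around preg_split(); the name is an illustrative assumption.
function inject_after_body_tag($html, $iframeCode)
{
    // Split once, at the first <body ...> tag, and keep the tag itself as a part.
    $parts = preg_split('/(<body\b[^>]*>)/i', $html, 2, PREG_SPLIT_DELIM_CAPTURE);

    // No match (or a regex error): return the document unchanged.
    if ($parts === false || count($parts) < 3) {
        return $html;
    }

    // before-tag . tag . injected code . rest of document
    return $parts[0] . $parts[1] . $iframeCode . $parts[2];
}
```

Because the iframe code is concatenated rather than passed as a preg_replace() replacement string, characters such as `$` or `\` in it need no escaping.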
Rowlf
  • This fails when the body tag is capitalized, which is perfectly valid. Before you add /i, read http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Matthew Flaschen Feb 07 '10 at 08:41
  • @Matthew: I'd rather read and agree with this one, thanks: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1733489#1733489 – Rowlf Feb 07 '10 at 08:47
1

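Regular expressions raise performance concerns, so this is what I'm going for:

<?php
$html = file_get_contents('http://www.yahoo.com/');
// Find the opening body tag case-insensitively, then the '>' that closes it.
$start = stripos($html, '<body');
$end = ($start === false) ? false : strpos($html, '>', $start);
// Insert only when the tag was found; otherwise leave the HTML untouched.
$body = ($end === false) ? $html : substr_replace($html, '<IFRAME INSERT>', $end + 1, 0);
echo htmlentities($body);
?>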

Thoughts?
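
One caveat for the PHP 4 requirement: stripos() only exists from PHP 5 onward. A sketch that sticks to functions available in PHP 4 might look like this. The function name `insert_after_body_offset` is hypothetical, and lowercasing a copy of the document is an assumption about what is acceptable for performance, since it costs one extra pass and one extra copy of the string.

```php
<?php
// Hypothetical helper; the name is an illustrative assumption.
function insert_after_body_offset($html, $insert)
{
    // strtolower() keeps the byte length, so offsets in the copy match $html.
    $start = strpos(strtolower($html), '<body');
    if ($start === false) {
        return $html; // no opening body tag found
    }

    // '>' has no case, so search the original string from the tag start.
    $end = strpos($html, '>', $start);
    if ($end === false) {
        return $html; // tag is never closed
    }

    // A length of 0 makes substr_replace() insert instead of overwrite.
    return substr_replace($html, $insert, $end + 1, 0);
}
```

Like the original, this matches the first occurrence of `<body` anywhere in the string, so it would be misled by that text appearing earlier in a comment or script.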

mikeytown2