I need to change the value of an HTML tag in an HTML file. I tried using preg_replace(), but nothing changes.

The HTML file:

 ...
 <div id="phrase_of_day">
     <div>
         <span class="icon quote"></span>
         <h1>Frase do Dia</h1>
         <blockquote><p>value to change</p></blockquote>
     </div>
 </div>
 ...

I tried this:

$url = '../index.html';

$file = file_get_contents($url);

$o = preg_replace('/.*<div id="phrase_of_day">.*<blockquote><p>(\w+)<\/p><\/blockquote>/','hello world', $file);

file_put_contents('test.html', $o);

Does anyone know what I did wrong?

UPDATE

I tried the DOMDocument class, as Madara Uchiha suggested, but now special characters are encoded incorrectly.

Example:

Original: <h1>Gerar Parágrafos</h1>
After: <h1>Gerar Par&Atilde;&iexcl;grafos</h1>

Code:

libxml_use_internal_errors(true);
$document = new DOMDocument('1.0', 'UTF-8');
$document->loadHTMLFile($url);
$document->encoding = 'UTF-8';

$blockquote = $document
    ->getElementById("phrase_of_day") //Div
    ->getElementsByTagName("blockquote")->item(0);

$new_value = new DOMElement("p", "New Value for Element");
$blockquote->replaceChild($new_value, $blockquote->childNodes->item(0));

$document->saveHTMLFile('test.html');
libxml_use_internal_errors(false);
Miguel Borges
    Please refrain from parsing HTML with RegEx as it will [drive you į̷̷͚̤̤̖̱̦͍͗̒̈̅̄̎n̨͖͓̹͍͎͔͈̝̲͐ͪ͛̃̄͛ṣ̷̵̞̦ͤ̅̉̋ͪ͑͛ͥ͜a̷̘͖̮͔͎͛̇̏̒͆̆͘n͇͔̤̼͙̩͖̭ͤ͋̉͌͟eͥ͒͆ͧͨ̽͞҉̹͍̳̻͢](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Use an [HTML parser](http://stackoverflow.com/questions/292926/robust-mature-html-parser-for-php) instead. – Madara's Ghost Oct 06 '12 at 16:12

2 Answers


With DOM, like a sane human being:

<?php

$html = <<<HTML
 <div id="phrase_of_day">
     <div>
         <span class="icon quote"></span>
         <h1>Frase do Dia</h1>
         <blockquote><p>value to change</p></blockquote>
     </div>
 </div>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);

$blockquote = $document
    ->getElementById("phrase_of_day") //Div
    ->getElementsByTagName("blockquote")->item(0);

$new_value = new DOMElement("p", "New Value for Element");
$blockquote->replaceChild($new_value, $blockquote->childNodes->item(0));

echo $document->saveHTML();
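
The update in the question reports that accented characters come out mangled after the file passes through DOMDocument. One workaround that is often suggested is to convert every non-ASCII character to a numeric entity before parsing, so the parser never has to guess the encoding. The sketch below assumes the input is UTF-8 and that the mbstring extension is available; `replacePhraseOfDay()` is a hypothetical helper name, not part of the answer above.

```php
<?php

// Hypothetical helper (not from the original answer): replaces the <p> inside
// the blockquote of #phrase_of_day and returns the serialized HTML.
function replacePhraseOfDay(string $html, string $newText): string
{
    // Collect parser warnings instead of printing them; remember the old setting.
    $previous = libxml_use_internal_errors(true);

    // Assumption: $html is UTF-8. Turning non-ASCII characters into numeric
    // entities gives the parser pure ASCII, so no encoding guess is needed.
    $ascii = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $document = new DOMDocument();
    $document->loadHTML($ascii);

    $blockquote = $document
        ->getElementById('phrase_of_day')
        ->getElementsByTagName('blockquote')->item(0);

    // Build the replacement <p>; createTextNode escapes the text safely.
    $paragraph = $document->createElement('p');
    $paragraph->appendChild($document->createTextNode($newText));

    $old = $blockquote->getElementsByTagName('p')->item(0);
    $old->parentNode->replaceChild($paragraph, $old);

    $result = $document->saveHTML();

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $result;
}
```

With this approach the saved markup may contain entities such as `&aacute;` rather than the literal character. Browsers render both the same way, but the file will not be byte-identical to the original.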
Madara's Ghost

You should not use regex to parse HTML.

But if you really want to, you can use this regex:

$o = preg_replace(
  '/(<div id="phrase_of_day">.*?<blockquote><p>)([^<]+)(<\/p><\/blockquote>)/s', 
  '$1hello world$3',
  $file);

Check this demo.
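
To see why the pattern from the question changes nothing, it may help to run both patterns against the sample markup. In this sketch the original pattern fails for two reasons: without the `s` modifier, `.` does not match newlines, and `\w+` cannot match the spaces in "value to change". Even if it had matched, the whole match (including the div) would have been replaced, which is why the corrected pattern captures the surrounding markup and puts it back with `$1` and `$3`. The heredoc is the sample from the question, not a real file.

```php
<?php

// Sample input copied from the question; a real file would be read with file_get_contents().
$file = <<<HTML
<div id="phrase_of_day">
    <div>
        <span class="icon quote"></span>
        <h1>Frase do Dia</h1>
        <blockquote><p>value to change</p></blockquote>
    </div>
</div>
HTML;

// Pattern from the question: no "s" modifier, and \w+ does not match spaces.
$original = '/.*<div id="phrase_of_day">.*<blockquote><p>(\w+)<\/p><\/blockquote>/';

// Pattern from this answer: "s" lets "." span lines, [^<]+ matches any text up to the next tag.
$corrected = '/(<div id="phrase_of_day">.*?<blockquote><p>)([^<]+)(<\/p><\/blockquote>)/s';

$unchanged = preg_replace($original, 'hello world', $file);
$changed   = preg_replace($corrected, '$1hello world$3', $file);

var_dump($unchanged === $file);                                // bool(true): nothing matched
var_dump(strpos($changed, '<p>hello world</p>') !== false);    // bool(true): text replaced
```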

Ωmega