0

I have previously saved a string to a .txt file like this:

$text = "<div class='highlight'><div><p>".$date.".</p> <h1> ".$heading."</h1>".$textbox."</div></div>";

I now want to extract $date, $heading and $textbox from the text file back into variables so that I can edit them, and I have no clue how to do this.

Can anyone help me?

tssmid
  • 2
    Can't you make your HTML syntax somewhat simpler so you can use explode()? Otherwise you will need regex. –  May 30 '12 at 11:14
  • 2
    Looks like you need to separate your template and variables. Store a template with pre-defined placeholders for variables. And store your data in a separate store. Some key-val store. – Vikas May 30 '12 at 11:17
  • 1
    Use the solution below (posted by me) if you need to keep saving your variables in the same format as you do now. But I think you should consider the suggestions above by Vikas and Webtecher. – verisimilitude May 30 '12 at 11:24
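The suggestion in the comments to keep the data separate from the template might look roughly like the sketch below. It assumes JSON as the key-value format; the file name and the sample values are hypothetical and only for illustration. The raw values are saved on their own, and the HTML from the question is built only when the entry is displayed.

```php
<?php
// Hypothetical sample values and file location, for illustration only.
$entry = [
    'date'    => '30 May 2012',
    'heading' => 'My heading',
    'textbox' => 'Some <b>body</b> text',
];
$file = sys_get_temp_dir() . '/entry.json';

// Save the raw values instead of the rendered HTML.
file_put_contents($file, json_encode($entry));

// Later, load the values back into variables for editing.
$loaded  = json_decode(file_get_contents($file), true);
$date    = $loaded['date'];
$heading = $loaded['heading'];
$textbox = $loaded['textbox'];

// Build the markup from the question only when it is needed for display.
$text = "<div class='highlight'><div><p>".$date.".</p> <h1> ".$heading."</h1>".$textbox."</div></div>";
```

With this layout nothing has to be parsed back out of the HTML, because the stored values never pass through the template.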

2 Answers

1

You need to use a DOM parser to parse the HTML.

http://simplehtmldom.sourceforge.net/

Code posted from the above site.

$html = file_get_html('http://www.google.com/');

// Find all images
foreach($html->find('img') as $element)
       echo $element->src . '<br>';

// Find all links
foreach($html->find('a') as $element)
       echo $element->href . '<br>';

Or use PHP's built-in DOM extension:

$str = file_get_contents("a.txt");
$DOM = new DOMDocument;
$DOM->loadHTML($str);

// get all H1
$items = $DOM->getElementsByTagName('h1');

// display all H1 text
for ($i = 0; $i < $items->length; $i++)
    echo $items->item($i)->nodeValue . "<br>";
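Applied to the exact string from the question, the DOMDocument approach could be sketched as follows. The sample values and the `$parsed...` variable names are hypothetical. The sketch assumes the saved markup has the same shape as in the question: one `<p>`, one `<h1>`, and the text box as everything that follows the `<h1>` inside the inner `<div>`.

```php
<?php
// Hypothetical sample values standing in for the original variables.
$date    = "30 May 2012";
$heading = "My heading";
$textbox = "Some <b>body</b> text";
$text = "<div class='highlight'><div><p>".$date.".</p> <h1> ".$heading."</h1>".$textbox."</div></div>";

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($text);
libxml_clear_errors();

$p  = $dom->getElementsByTagName('p')->item(0);
$h1 = $dom->getElementsByTagName('h1')->item(0);

// The template appends one "." after the date, so drop the last character.
$parsedDate = substr($p->nodeValue, 0, -1);
// The template puts a space after <h1>, so strip leading whitespace.
$parsedHeading = ltrim($h1->nodeValue);

// Everything after the <h1> inside the inner <div> is the text box.
$parsedTextbox = '';
for ($node = $h1->nextSibling; $node !== null; $node = $node->nextSibling) {
    $parsedTextbox .= $dom->saveHTML($node);
}
```

Serialising each sibling with saveHTML() keeps any markup inside the text box, whereas nodeValue would return only its plain text.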

verisimilitude
0

[Edit - after reading the comments it seems regex is not the way to go. Please try using SimpleHtmlDom parser]

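include 'simple_html_dom.php'; // assumes the library file is on the include path

$html = new simple_html_dom();
$html->load($yourstring);
// find() returns an array unless an index is passed as the second argument
$date = $html->find('p', 0)->innertext; // still ends with the "." from the template
$heading = $html->find('h1', 0)->innertext;
$textbox = $html->find('div div', 0)->innertext; // inner div, including the <p> and <h1> markup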

You can find the documentation for Simple HTML DOM here: http://simplehtmldom.sourceforge.net/manual.htm

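A less efficient way is to match a regular expression against the saved markup. Note that $matches[0] holds the whole match, so the captured groups start at index 1:

preg_match("#<div class='highlight'><div><p>(.*?)\.</p> <h1> (.*?)</h1>(.*)</div></div>#s", $text, $matches);
$date = $matches[1];
$heading = $matches[2];
$textBox = $matches[3];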

Mukesh Soni