I have this kind of code:

<br>
Réaménagement des éclairages : Couloir de circulation de l'accueil – Salle de restauration de l'EHPA Résidence «
        Loubayssens »

</td>

I am trying to obtain:

<p>Réaménagement des éclairages : Couloir de circulation de l'accueil – Salle de restauration de l'EHPA Résidence «
        Loubayssens »
</p>

I would like to remove the <br> tag and wrap the lines of text in <p> tags, but I'm unable to capture the whole text when it spans several lines.

I tried:

<pre>
$pattern = '/<br>(\s*)([\w]([.*]|[\n])[\S|\w])(\s*)<\/td>/i';
$replacement = "\<p>$2</p></td>";
$source = preg_replace($pattern, $replacement, $source);
</pre>

I also tried the /is modifiers, but that doesn't work either.

Could you give me some hints?

Prix
Patrick D
  • [Looks like another job for ...](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – pguardiario Aug 02 '13 at 08:34
  • Can you add more examples and show what you expect to get? – Angga Aug 02 '13 at 08:36
  • `/m` switch for multi-line? – Mark Baker Aug 02 '13 at 08:37
  • @Prix I will do that, but I fetch the HTML page with cURL, and the page is not well formed. I just pass through the HTML document to try to make it more W3C compliant before using DOMDocument. – Patrick D Aug 02 '13 at 09:15

3 Answers

Use this regex: <br>([\w\W\n]+)</td>. It captures everything between the <br> and </td> tags, including the \n characters. Click this link for a detailed example, and this link for the replacement with <p>.
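
To make the idea concrete, here is a minimal sketch of how that pattern could be applied from PHP. The sample string, the variable names, the trim() step and the lazy +? quantifier are assumptions added for illustration, not part of the answer above. The lazy form stops at the first </td> instead of running on to the last one in the document.

```php
<?php
// Sample input modeled on the question (shortened); names are illustrative.
$source = "<td>\n<br>\nRéaménagement des éclairages : Couloir «\n        Loubayssens »\n\n</td>";

// [\w\W] matches any character, newlines included, so no /s modifier is needed.
// Assumption: +? (lazy) is used here so the match ends at the first </td>.
$pattern = '~<br>([\w\W]+?)</td>~i';

// Trim the captured text, then wrap it in <p> and restore the closing </td>.
$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        return '<p>' . trim($m[1]) . '</p></td>';
    },
    $source
);

echo $result, "\n";
```

With the greedy + from the answer, a page containing several cells would be matched from the first <br> to the last </td>, which may be why it behaves differently on a full page.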

Angga
  • Thanks all, it works with the links provided by @Angga. But on the real example it fails. I get

    sentence but missing

    – Patrick D Aug 02 '13 at 08:57
  • `` or `` is just a string you can replace with anything; the only problem is how to place your sentence into a regex group, in my example $1. If this answered your question, please mark it as the accepted answer. – Angga Aug 02 '13 at 09:08

A very simple example of using DOMDocument to remove the br tag within the td tag and wrap the text in a p tag, trimming the whitespace from its start and end:

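<?php
// Sample markup from the question, wrapped in a minimal table.
$str = <<<HTML
<table>
<tr>
<td>
<br>
    Réaménagement des éclairages : Couloir de circulation de l'accueil – Salle de restauration de l'EHPA Résidence «
            Loubayssens »

</td>
</tr>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($str);
$td = $dom->getElementsByTagName('td')->item(0);
// Iterate over a copy: childNodes is a live list, so removing nodes
// while looping over it directly can skip entries.
foreach (iterator_to_array($td->childNodes) as $child)
{
    if ($child->nodeName == 'br')
        $td->removeChild($child);
}
// Build a <p> from the trimmed text of the cell.
$element = $dom->createElement('p', trim($td->nodeValue));
// Note: this swaps the <td> itself for the new <p>.
$td->parentNode->replaceChild($element, $td);
echo $dom->saveHTML();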

Live DEMO.
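
If the goal is to keep the </td> as in the question's expected output, a variation along these lines might help. It is only a sketch under a few assumptions: each cell contains nothing but <br> tags and text, the sample markup is made up (ASCII only, so no charset hint is needed), and the variable names are illustrative.

```php
<?php
// Hypothetical sample input, not taken from the real page.
$html = '<table><tr><td><br>
    First line of text
        second line

</td></tr></table>';

// Collect libxml warnings internally instead of printing them for sloppy markup.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

foreach ($dom->getElementsByTagName('td') as $td) {
    // Read the text before emptying the cell.
    $text = trim($td->textContent);

    // Copy the live node list first; removing while iterating it directly can skip nodes.
    foreach (iterator_to_array($td->childNodes) as $child) {
        $td->removeChild($child);
    }

    // Keep the <td> and place the trimmed text inside a new <p>.
    $p = $dom->createElement('p');
    $p->appendChild($dom->createTextNode($text));
    $td->appendChild($p);
}

$out = $dom->saveHTML();
echo $out;
```

Using createTextNode rather than passing the text to createElement avoids trouble with characters such as & in the cell text. Cells that contain other markup would be flattened to plain text by this sketch.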

Prix
  • Wow, this seems to cure all my headaches. – Patrick D Aug 02 '13 at 09:20
  • @PatrickD glad you liked it. It's just an example to get you started: you can load the cURL result with `$dom->loadHTML` and from there navigate to the places you need to fix. You can also use XPath if you have elements with an ID or that are otherwise easy to locate. – Prix Aug 02 '13 at 09:22

Try this regex pattern:

$pattern = '/<br>([^<]*)<\/td>/i';
$replacement = "<p>$1</p></td>";
$source = preg_replace($pattern, $replacement, $source);

A quick and dirty solution, better than spending your time on parsers for a simple task...
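
To show where this pattern does and does not apply, here is a small sketch. Both sample strings and the variable names are made up for illustration. Since [^<]* matches newlines, multi-line text is captured, but the match fails as soon as the cell contains any other tag.

```php
<?php
$pattern = '/<br>([^<]*)<\/td>/i';
$replacement = '<p>$1</p></td>';

// Plain text between <br> and </td>: the pattern matches, newlines included.
$plain = "<td><br>\nline one\n    line two\n</td>";
$plainResult = preg_replace($pattern, $replacement, $plain);

// Hypothetical cell with an inline tag: [^<]* stops at "<b>", so nothing is replaced.
$nested = "<td><br>\nsome <b>bold</b> text\n</td>";
$nestedResult = preg_replace($pattern, $replacement, $nested);

echo $plainResult, "\n";
echo $nestedResult, "\n";
```

Note that the surrounding whitespace stays inside the <p>; trimming it would need preg_replace_callback or a \s* on each side of the group.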

Desolator