    <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>

I need to remove the whole <tr>...</tr> row if it contains the text "Pricetoreplace". I've tried the following:

$content = preg_replace('~(<tr.*[\'"]Pricetoreplace[\'"].*tr>)~', '', $content);

But it didn't work.

gtktuf
  • What do you mean "it didn't work"? Was there an error? Did it not delete anything? – kchason Nov 15 '17 at 14:20
  • You should *never* parse HTML with regex. Use [a PHP DOM parser](http://simplehtmldom.sourceforge.net/) instead. – Jay Blanchard Nov 15 '17 at 14:21
  • It did not delete anything. – gtktuf Nov 15 '17 at 14:21
  • @gtktuf first off, your regex replaces everything from the first `<tr` to the last `tr>`, so it is not going to do what you expect (you use greedy quantifiers `.*` instead of lazy quantifiers `.*?`). Second, your `.` doesn't match newline characters; use `[\s\S]` instead, or turn on the `s` flag so that `.` also matches newlines. Again, though, you shouldn't even be using regex for this. – ctwheels Nov 15 '17 at 14:24
  • @gtktuf you really should be doing something like what [this](https://stackoverflow.com/questions/9478330/php-how-can-i-retrieve-a-div-tag-attribute-value) question does. – ctwheels Nov 15 '17 at 14:26
  • @ctwheels Thx. So, based on your first reference, regular expressions are not very suitable for this task, right? – gtktuf Nov 15 '17 at 14:27
  • @gtktuf yes. It's usually bad practice to parse HTML or XML with regex. Regex should only be used for parsing HTML or XML if it's a known subset. In your case it doesn't appear to be so. I would recommend you use an HTML/XML parser and have it do the heavy lifting for you. – ctwheels Nov 15 '17 at 14:29
  • If you *really* do want a regex for this, you can use something like `Pricetoreplace<[\s\S]*?tr>`, but I would **highly** recommend you don't go this direction. – ctwheels Nov 15 '17 at 14:33
  • @ctwheels Thx again. I'll try something like this: [link](https://stackoverflow.com/questions/3308530/php-strip-a-specific-tag-from-html-string) – gtktuf Nov 15 '17 at 14:34
  • @gtktuf you're welcome! I hope you find a proper solution. If you do, you can answer your own question. I would also suggest changing the title to something like `How to remove HTML tag if it contains specific string`. This would allow future users that might be struggling with the same issue to easily find a solution (yours) and might help generate traffic to this question (and hopefully give you upvotes). See [this](https://stackoverflow.com/help/how-to-ask) and [this](https://meta.stackexchange.com/questions/10647/how-do-i-write-a-good-title) for more info about writing effective titles. – ctwheels Nov 15 '17 at 14:43
  • @gtktuf [this](https://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) post might help you. The second answer provides a method to get `tr` elements as well as extract the content, which should get you partway there. – ctwheels Nov 15 '17 at 14:54
  • Possible duplicate of [How do you parse and process HTML/XML in PHP?](https://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – miken32 Nov 15 '17 at 21:07
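
The two regex problems raised in the comments (greedy `.*`, and `.` not matching newlines) can be seen in a small sketch. The sample rows in `$content` are hypothetical, and the negative lookahead `(?!</tr>)` is an addition that is not in the comments: without it, a lazy match would still begin at the first `<tr` in the input and swallow earlier rows. This is only reasonable for a known, simple subset of HTML; the DOM approach recommended in the thread remains the safer route.

```php
<?php
// Hypothetical sample input, modelled on the row in the question.
$content = <<<HTML
<tr><td>keep me</td></tr>
<tr>
    <td><span>Pricetoreplace</span></td>
</tr>
<tr><td>keep me too</td></tr>
HTML;

// The pattern from the question: "." stops at newlines, and the markup has
// no quotes around Pricetoreplace, so nothing matches.
$original = preg_replace('~(<tr.*[\'"]Pricetoreplace[\'"].*tr>)~', '', $content);
var_dump($original === $content); // bool(true)

// Lazy quantifiers plus the "s" flag, with a negative lookahead so the match
// cannot start in an earlier row and run across its </tr>.
$pattern = '~<tr\b(?:(?!</tr>).)*?Pricetoreplace.*?</tr>\s*~s';
$cleaned = preg_replace($pattern, '', $content);
echo $cleaned;
```

With this input, only the two "keep me" rows remain. Nested tables, or a `</tr>` inside a comment or attribute, would break the pattern, which is the kind of case the comments warn about.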

1 Answer


One way would be to use an XPath query:

*//td[contains(., 'Pricetoreplace')]/parent::tr

Here, we look for a td whose string value (its text content, including that of its descendants) contains Pricetoreplace, and then step up to its parent tr. That tr is then removed from the DOM.


In PHP:
<?php

$html = <<<DATA
    <tr><td class="some other class">some text here</td></tr>
   <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>
DATA;

# set up the DOM
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

# set up the xpath
$xpath = new DOMXPath($dom);

foreach ($xpath->query("*//td[contains(., 'Pricetoreplace')]/parent::tr") as $row) {
    $row->parentNode->removeChild($row);
}
echo $dom->saveHTML();
?>


This yields:
<tr><td class="some other class">some text here</td></tr>
Jan
  • That's the answer, but in my case I needed to replace `$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);` with `$dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'));` to solve some encoding problems. Also, there are no classes like `class="some other class"` anywhere in the posts that I need to rebuild with this PHP script; that was the main problem. Ty for this method. – gtktuf Nov 16 '17 at 08:13
  • @gtktuf: Glad to help. – Jan Nov 16 '17 at 09:21
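
For reference, the variant described in the comment above might look roughly like the following sketch. The `<table>` wrapper and the "Café" cell are made-up sample data, and the XPath is written with a predicate on `tr` rather than `parent::tr`, which selects the same rows here. Because `loadHTML()` is called without `LIBXML_HTML_NOIMPLIED`, the parser adds `<html>` and `<body>` elements, so the sketch prints only the children of `<body>`. Note that the `'HTML-ENTITIES'` target of `mb_convert_encoding()` is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical UTF-8 input; the table wrapper and the "Café" cell are
// invented for illustration only.
$content = '<table>'
    . '<tr><td>Café</td></tr>'
    . '<tr><td><span>Pricetoreplace</span></td></tr>'
    . '</table>';

$dom = new DOMDocument();
// Convert non-ASCII characters to entities before parsing, as in the comment.
// The 'HTML-ENTITIES' target is deprecated as of PHP 8.2.
$dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'));

$xpath = new DOMXPath($dom);

// Select every row that has a cell containing the marker text, then drop it.
foreach ($xpath->query("//tr[.//td[contains(., 'Pricetoreplace')]]") as $row) {
    $row->parentNode->removeChild($row);
}

// The parser wrapped the fragment in <html><body>, so serialise only the
// children of <body> to get the cleaned fragment back.
$body = $dom->getElementsByTagName('body')->item(0);
$result = '';
foreach ($body->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}
echo $result;
```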