Is it possible to write a regular expression that replaces everything between <div id="somevalue123" class="text-block"> and its closing </div>? I can do this in simple cases, but the problem I am having is that the string contains other div nodes.

Here is the current regular expression that I am using:

public static function replaceStringBetween($start, $end, $new, $source, $limit = 1)
{
    // Reinitialize the replacement count
    self::$replacement_count = 0;

    // Try to perform the replacement
    $result = preg_replace('#(' . preg_quote($start, '#') . ')(.*)(' . preg_quote($end, '#')
        . ')#is', '$1' . $new . '$3', $source, $limit, $count);
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // As a fallback, try again with a different method
    $result = preg_replace ("#{$start}(.*){$end}#is", $new, $source, $limit, $count);
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // Return the original
    return $source;
}

I am passing an HTML file as the source, of course. Thanks

Frank Farmer

2 Answers

An easy-to-use PHP parser that I have used for exactly this in the past is the Simple HTML DOM Parser. You would use the selector div#somevalue123.
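As a rough sketch of the parser-based approach: the example below swaps in PHP's bundled DOM extension (DOMDocument) instead of Simple HTML DOM Parser, purely so that it runs without an extra download. The function name replaceInnerHtmlById and its parameters are hypothetical, and it assumes that the source is a complete HTML document and that the replacement is a well-formed XML fragment.

```php
<?php
// Hypothetical helper (not from the original post): swaps the contents of
// the div with the given id while leaving the div itself in place.
function replaceInnerHtmlById(string $source, string $id, string $newHtml): string
{
    // Collect libxml complaints internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($source);

    $result = $source; // returned unchanged when no div has the id
    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('id') !== $id) {
            continue;
        }
        // Drop every existing child, nested divs included
        while ($div->firstChild !== null) {
            $div->removeChild($div->firstChild);
        }
        // Assumption: $newHtml is well-formed XML; otherwise insert it as text
        $fragment = $doc->createDocumentFragment();
        if ($newHtml !== '' && $fragment->appendXML($newHtml)) {
            $div->appendChild($fragment);
        } else {
            $div->appendChild($doc->createTextNode($newHtml));
        }
        $result = $doc->saveHTML() ?: $source;
        break;
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $result;
}
```

Because the parser builds a tree, nested div nodes are removed together with their parent and no tag matching is needed. Note that saveHTML() re-serializes the whole document, so whitespace and the doctype may differ from the input.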

Iiridayn

Classical regular expressions (in the formal-language sense) are not capable of supporting arbitrary nesting. You may want to consider a push-down automaton (parser) for arbitrary nesting.
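To illustrate the push-down idea, here is a minimal sketch that uses a regular expression only to find div tags and keeps the nesting depth in a counter, which plays the role of the stack. The function name replaceBalancedDiv is hypothetical, and the sketch assumes the exact opening tag is known and that no div tags appear inside comments, scripts, or attribute values.

```php
<?php
// Hypothetical helper: replaces everything between $startTag and the
// </div> that actually closes it, skipping over nested divs.
function replaceBalancedDiv(string $source, string $startTag, string $new): string
{
    $start = strpos($source, $startTag);
    if ($start === false) {
        return $source; // opening tag not found
    }

    $contentStart = $start + strlen($startTag);
    $depth = 1; // we are inside the opening tag's div
    $offset = $contentStart;

    // Visit each <div ...> or </div> that follows the opening tag
    while (preg_match('#<(/?)div\b[^>]*>#i', $source, $m, PREG_OFFSET_CAPTURE, $offset)) {
        $tagPos = $m[0][1];
        $depth += ($m[1][0] === '/') ? -1 : 1;

        if ($depth === 0) {
            // $tagPos is the start of the matching closing tag
            return substr($source, 0, $contentStart) . $new . substr($source, $tagPos);
        }
        $offset = $tagPos + strlen($m[0][0]);
    }

    return $source; // unbalanced markup: leave the input untouched
}
```

The regular expression here never has to understand nesting; it only tokenizes, and the counter does the bookkeeping that a single greedy or lazy `.*` cannot.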

In practice, you could design a series of regular expressions to handle a fixed number of nesting levels. However, once you start handling error conditions such as malformed markup, you are really trying to shoehorn a regular expression into the place of a parser.

It seems you may want to reconsider your approach and design in the modularity you seek from the start, rather than retrofitting it with a regular-expression bait-and-switch.

Brian Stinar
  • Modern "regexes" are perfectly capable of supporting arbitrary nesting. I wish these lies would stop being propagated. – tchrist Jun 18 '11 at 02:29
  • Can you show me an example? I am not intentionally trying to propagate untruths. I remember this from my formal languages and automata class four years ago. – Brian Stinar Jun 21 '11 at 14:17
  • [Here’s an example](http://stackoverflow.com/questions/4840988/the-recognizing-power-of-modern-regexes/4843579#4843579) for you. – tchrist Jun 21 '11 at 14:35
  • Thanks. The next time I need to implement capture on something with arbitrary nesting, I am going to try and use a modern regex. – Brian Stinar Jun 21 '11 at 14:46
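For readers who want to see what the comments are referring to, the following is a minimal sketch of a recursive pattern, which PCRE (and therefore PHP's preg functions) supports through subpattern calls such as `(?2)`. It is not taken from the linked answer. The function name replaceWithRecursiveRegex is hypothetical, and the sketch assumes reasonably balanced markup; on very large or deeply nested input the match may hit PCRE's backtracking or stack limits, in which case the source is returned unchanged.

```php
<?php
// Hypothetical helper: uses a recursive PCRE pattern to find the </div>
// that balances $startTag, then replaces what lies in between.
function replaceWithRecursiveRegex(string $source, string $startTag, string $new): string
{
    $pattern = '#(' . preg_quote($startTag, '#') . ')'  // group 1: opening tag
        // Content: any character that does not begin a div tag, or a
        // complete nested div (group 2, which calls itself via (?2))
        . '(?:(?!</?div\b).|(<div\b[^>]*>(?:(?!</?div\b).|(?2))*+</div>))*+'
        . '(</div>)#is';                                // group 3: matching close

    // A callback avoids "$1"-style sequences in $new being interpreted
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($new): string {
            return $m[1] . $new . $m[3];
        },
        $source,
        1
    );

    // preg_replace_callback() returns null when the regex engine gives up
    return $result ?? $source;
}
```

The possessive quantifiers (`*+`) are a design choice to keep backtracking down; they are safe here because the content alternatives can never consume a `</div>`.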