Suppose we have this input:

<div wrap>1</div>
<div>2</div>
<div wrap>3</div>
<div wrap>4</div>
<div wrap>5</div>

The required output should be:

<div class="wrapper">
  <div wrap>1</div>
</div>
<div>2</div>
<div class="wrapper">
  <div wrap>3</div>
  <div wrap>4</div>
  <div wrap>5</div>
</div>

Also, suppose that these elements are direct children of the body element, and that there can be other unrelated elements or text nodes before or after them.

Notice how consecutive elements are grouped inside a single wrapper and not individually wrapped.

How would you handle body's DOMNodeList and insert the wrappers in the correct place?

Following the conversation in the comments about wrapping only direct children of the body element, here is a further example. For this input:

<body>
  <div wrap>1
    <div wrap>1.1</div>
  </div>
  <div>2</div>
  <div wrap>3</div>
  <div wrap>4</div>
  <div wrap>5</div>
</body>

The required output should be:

<body>
  <div class="wrapper">
    <div wrap>1
      <div wrap>1.1</div>
      <!-- ignored -->
    </div>
  </div>
  <div>2</div>
  <div class="wrapper">
    <div wrap>3</div>
    <div wrap>4</div>
    <div wrap>5</div>
  </div>
</body>

Notice how elements that are not direct descendants of the body element are totally ignored.

double-beep

1 Answer

It's been interesting to write and would be good to see other solutions, but here is my attempt anyway.

I've added comments in the code rather than describing the method here as I think the comments make it easier to understand...

// Test HTML
$startHTML = '<div wrap>1</div>
<div>2</div>
<div wrap>3</div>
<div wrap>4</div>
<div wrap>5</div>';

$doc = new DOMDocument();
$doc->loadHTML($startHTML);

$xp = new DOMXPath($doc);
// Find any div tag with a wrap attribute which doesn't have an immediately preceding
// element with a wrap attribute (this includes a div that is the first element and so
// has no preceding element at all)
$wrapList = $xp->query("//div[@wrap and not(preceding-sibling::*[1][@wrap])]");

// Iterate over each of the first in the list of wrapped nodes
foreach ( $wrapList as $wrap )  {
    // Create new wrapper 
    $wrapper = $doc->createElement("div");
    $class = $doc->createAttribute("class");
    $class->value = "wrapper";
    $wrapper->appendChild($class);

    // Copy subsequent wrap nodes (if any)
    $nextNode = $wrap->nextSibling;
    while ( $nextNode ) {
        $next = $nextNode;
        $nextNode = $nextNode->nextSibling;
        // If it's an element (and not a text node etc)
        if ( $next->nodeType == XML_ELEMENT_NODE ) {
            // If it also has a wrap attribute - copy it
            if ($next->hasAttribute("wrap") ) {
                $wrapper->appendChild($next);
            }
            // If no attribute, then finished copying
            else    {
                break;
            }
        }
    }
    // Replace first wrap node with new wrapper
    $wrap->parentNode->replaceChild($wrapper, $wrap);
    // Move the wrap node into the wrapper
    $wrapper->insertBefore($wrap, $wrapper->firstChild);
}
echo $doc->saveHTML();

Because the document is loaded with loadHTML(), the end result is also wrapped in the standard doctype, html and body tags, but the output (formatted) is...

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
    <body>
        <div class="wrapper">
            <div wrap>1</div>
        </div>
        <div>2</div>
        <div class="wrapper">
            <div wrap>3</div>
            <div wrap>4</div>
            <div wrap>5</div>
        </div>

    </body>
</html>
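
For comparison, the question asks about handling body's DOMNodeList directly, so here is a minimal sketch of that approach without XPath. The function name wrapRuns is hypothetical (it is not part of the code above), and the sketch assumes that text nodes between wrap elements should be left in place without ending a run, as in the loop above.

```php
<?php
// Hypothetical helper: wraps each run of consecutive child elements of
// $parent that carry a "wrap" attribute in a <div class="wrapper">.
function wrapRuns(DOMNode $parent): void
{
    $doc = $parent->ownerDocument;
    $wrapper = null;
    // Snapshot the live child list first, because nodes are moved below.
    foreach (iterator_to_array($parent->childNodes) as $child) {
        if (!($child instanceof DOMElement)) {
            // Text and comment nodes stay where they are and do not end a run.
            continue;
        }
        if ($child->hasAttribute('wrap')) {
            if ($wrapper === null) {
                // Start a new run: place the wrapper where the run begins.
                $wrapper = $doc->createElement('div');
                $wrapper->setAttribute('class', 'wrapper');
                $parent->insertBefore($wrapper, $child);
            }
            $wrapper->appendChild($child);
        } else {
            // Any other element ends the current run.
            $wrapper = null;
        }
    }
}

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML('<body><div wrap>1<div wrap>1.1</div></div><div>2</div>'
    . '<div wrap>3</div><div wrap>4</div><div wrap>5</div></body>');
$body = $doc->getElementsByTagName('body')->item(0);
wrapRuns($body);
echo $doc->saveHTML($body);
```

Because only the children of the node passed in are inspected, nested elements such as 1.1 are never considered for wrapping.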

Edit:

If you only want it to apply to direct descendants of the <body> tag, then update the XPath expression to include it as part of the criteria...

$wrapList = $xp->query("//body/div[@wrap and not(preceding-sibling::*[1][@wrap])]");
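
To see what the //body/ restriction changes, here is a small sketch that runs both forms of the query against the nested example from the question and counts the matches. The predicate in $firstOfRun is an assumption: it is one way to express "first wrap div of a run" using an attribute existence test.

```php
<?php
// Nested input from the question; 1.1 is not a direct child of <body>.
$html = '<body>
  <div wrap>1
    <div wrap>1.1</div>
  </div>
  <div>2</div>
  <div wrap>3</div>
  <div wrap>4</div>
  <div wrap>5</div>
</body>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

// Assumed predicate: the div has a wrap attribute and its nearest preceding
// element sibling (if there is one) does not.
$firstOfRun = "div[@wrap and not(preceding-sibling::*[1][@wrap])]";

$anyLevel = $xp->query("//" . $firstOfRun);      // divs at any depth
$direct   = $xp->query("//body/" . $firstOfRun); // children of <body> only

echo "any level: ", $anyLevel->length, "\n";
echo "body only: ", $direct->length, "\n";
```

The any-level query also matches the nested 1.1 div, which is why it would receive its own wrapper; the //body/ form leaves it alone.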
Nigel Ren
  • clear, concise, easy to read, easy to test, easy to predict. Thank you sir :) – Andreas Myriounis Feb 24 '19 at 20:49
  • A notice: while this code works fine for the input I provided, keep in mind that it will add wrappers to any level of childNodes, not just to body's children. – Andreas Myriounis Feb 25 '19 at 15:40
  • It's difficult to always code for generic solutions and although I tried testing various combinations it's impossible to catch all situations. If you need something particular I can try and look at it. – Nigel Ren Feb 25 '19 at 15:41
  • I'm trying to constrain the wrapper only to body's children (direct descendants) and not any level deeper than that, but XPath is not really my thing... How would you structure the query to achieve that? – Andreas Myriounis Feb 25 '19 at 15:46
  • Could you add an example onto the question. – Nigel Ren Feb 25 '19 at 15:48
  • Just updated it. Just to clarify, your answer was perfectly valid for my initial question. Thanks for helping me further on this :) – Andreas Myriounis Feb 25 '19 at 16:04
  • No problems - it's interesting to challenge myself as well - you should be able to do this by updating the XPath expression to `//body/div[@wra...`. So you're saying that the div tag MUST be one level under the body tag. – Nigel Ren Feb 25 '19 at 16:05
  • Thanks! It won't let me update your answer. Could you include this fix as well, since the selection of direct descendants is part of the question now? – Andreas Myriounis Feb 25 '19 at 16:16