I have the following code:

<?php
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

preg_match_all('/<div class="block">[\s]+<div class="id">(.*?)<\/div>[\s]+<div class="name">(.*?)<\/div>[\s]+<\/div>/ms', $html, $matches);

print_r($matches);

I want to get an array with id and name, but the second block doesn't have an id, so my preg_match_all call skips it. How can I build the array without skipping that block, and print something like [ ... [id => 0 // or null, name => 'second element'] ...]?

Progman
smiady
See if [this](https://www.phpliveregex.com/p/I0L#tab-preg-match-all) satisfies your requirements; I just made the div with the id class optional. – Computable Feb 18 '23 at 13:50
[Don't use regular expressions to process HTML](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – fusion3k Feb 18 '23 at 13:52
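
The optional-group idea from the first comment can be sketched roughly as below. This is an illustration only, not necessarily the pattern behind the link: the id div is wrapped in an optional non-capturing group, and PREG_SET_ORDER is used so each block becomes one row. The $rows variable is a name chosen here for the example. It remains as fragile as any regex over HTML and assumes the markup looks exactly like the sample.

```php
<?php
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

// (?: ... )? makes the whole id div optional; group 1 is the id, group 2 the name.
$pattern = '/<div class="block">\s+(?:<div class="id">(.*?)<\/div>\s+)?<div class="name">(.*?)<\/div>\s+<\/div>/s';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    $rows[] = [
        // An unmatched optional group comes back as an empty string here.
        'id' => $m[1] !== '' ? $m[1] : null,
        'name' => $m[2],
    ];
}

print_r($rows);
```

Any change in whitespace, attribute order, or nesting would break this pattern, which is the concern raised in the second comment.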

1 Answer

Use DOMDocument to solve this task; there are many good reasons not to parse HTML with regular expressions.

Assuming your HTML code is stored in the $html variable, create an instance of DOMDocument, load the HTML code, and initialize DOMXPath:

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTML($html, LIBXML_NOBLANKS);
$dom->formatOutput = true;
$xpath = new DOMXPath($dom);

Use DOMXPath to search for all <div> nodes with class "name" and prepare an empty array for the results:

$nodes = $xpath->query('//div[@class="name"]');
$result = array();

For each node found, run an additional query to find the optional node with class "id", then add a record to the results array:

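foreach ($nodes as $node) {
    // Look for a sibling <div class="id"> under the same parent block
    $id = $xpath->query('div[@class="id"]', $node->parentNode);

    $result[] = array(
        // Fall back to null when the block has no id div
        'id' => $id->count() ? $id->item(0)->nodeValue : null,
        'name' => $node->nodeValue
    );
}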

print_r($result);

This is the result:

Array
(
    [0] => Array
        (
            [id] => 10
            [name] => first element
        )

    [1] => Array
        (
            [id] => 
            [name] => second element
        )

    [2] => Array
        (
            [id] => 30
            [name] => third element
        )

)
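
As a variation on the steps above, the same idea can be wrapped in a function that iterates over the "block" divs rather than the "name" divs. This is only a sketch: extractBlocks is a hypothetical helper name, and it assumes each class attribute holds exactly one class, as in the sample HTML.

```php
<?php
// Hypothetical helper (not part of the original answer): returns one
// ['id' => ..., 'name' => ...] row per <div class="block">.
function extractBlocks(string $html): array
{
    // Keep parser warnings out of the output, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = [];

    // One iteration per block, so a block without an id div still yields a row.
    foreach ($xpath->query('//div[@class="block"]') as $block) {
        $id = $xpath->query('./div[@class="id"]', $block)->item(0);
        $name = $xpath->query('./div[@class="name"]', $block)->item(0);

        $rows[] = [
            'id' => $id !== null ? trim($id->textContent) : null,
            'name' => $name !== null ? trim($name->textContent) : null,
        ];
    }

    return $rows;
}

$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

$rows = extractBlocks($html);
print_r($rows);
```

Iterating over the blocks means a missing name is also handled (it becomes null), at the cost of one extra query per block.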
fusion3k