I have a string of HTML that contains nested divs, and I need to extract only the contents of the top-level div. For example:

$html= '<div class="test">
            <div>
                <div>Some text 1</div> 
                <div>Image content 2</div>
            </div>
            <div>
                <div>Some text 2</div> 
                <div>Image content 2</div>
            </div>
            ....
        </div>';
$regex ='/<div\sclass=[\"\']test[\"\']>.*?<\/div>/is';
preg_match($regex, $html, $matches);    

The problem is that the match stops at the first closing tag, so the result ends at Some text 1</div>. Where did I go wrong?

I need the match to contain the entire contents of the div with class test, like this:

<div>
    <div>Some text 1</div> 
    <div>Image content 2</div>
</div>
<div>
     <div>Some text 2</div> 
     <div>Image content 2</div>
</div>
Murad Hasan
  • A best practice would be to parse the HTML using a library and extract what you need. From the official documentation, try [this](http://www.php.net/manual/en/book.dom.php) – Dan Ionescu Feb 27 '17 at 12:24
  • @DanIonescu, I am fetching the page with file_get_contents and then want to grab those parts with a regular expression. – Murad Hasan Feb 27 '17 at 12:27
  • If you insist, try the regex:
    ([\s\S]((.|\n)*))<\/div>
    You can escape it where necessary.
    – Dan Ionescu Feb 27 '17 at 12:32
  • Possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – simbabque Feb 27 '17 at 16:40

1 Answer

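Your pattern uses the lazy .*?, which stops at the first </div> it finds. A greedy .* placed between a lookbehind and a lookahead runs to the last </div> in the string instead. The following regex should do it: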

(?s)(?<=<div\sclass="test">\n).*(?=<\/div>)

PHP

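<?php
// (?s) lets . match newlines. The lookbehind and lookahead keep the outer
// <div class="test"> and the final </div> out of the match, and the greedy
// .* runs to the last </div> in the string.
// Note: the lookbehind expects a \n directly after the opening tag.
$regex = '/(?s)(?<=<div\sclass="test">\n).*(?=<\/div>)/';
$str = '<div class="test">
            <div>
                <div>Some text 1</div>
                <div>Image content 2</div>
            </div>
            <div>
                <div>Some text 2</div>
                <div>Image content 2</div>
            </div>
            ....
        </div>';
preg_match($regex, $str, $matches);
print_r($matches); // $matches[0] holds the inner HTML of the outer div
?>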
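
The regex above only works while the test div is the last thing in the string, so the parser route suggested in the comments may hold up better on a real page. Here is a minimal sketch of it, assuming PHP's bundled DOM extension is available and that the class attribute is exactly "test". The variable names and the trimmed sample markup are illustrative only, not taken from the asker's real page.

```php
<?php
// Sketch: extract the inner HTML of the top-level div with class "test"
// using the DOM extension instead of a regex.
$html = '<div class="test">
    <div>
        <div>Some text 1</div>
        <div>Image content 2</div>
    </div>
    <div>
        <div>Some text 2</div>
        <div>Image content 2</div>
    </div>
</div>';

$doc = new DOMDocument();
// Collect parser warnings internally; real-world HTML often triggers them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Exact match on the class attribute (assumption: the div has only this
// one class; several classes would need a contains()-style expression).
$nodes = $xpath->query('//div[@class="test"]');

$inner = '';
if ($nodes !== false && $nodes->length > 0) {
    // Serialize each child of the matched div to build its inner HTML.
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo trim($inner), "\n";
```

Because the parser tracks nesting itself, this does not depend on where the closing </div> sits or on what follows it in the page.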
m87