
In WordPress v5.5.3, I am building a bookmark page for pages on an external domain. On each page, I would like to find the element whose text is a specific value and capture the value of the element that follows it:

<html>
        <div>

            <ul class="list">
                <li>
                    <span class="item">
                        <span class="list-item">Name</span>
                        <span>John</span>
                    </span>
                </li>
                <li>
                    <span class="item">
                        <span class="list-item">Age</span>
                        <span>25</span>
                    </span>
                </li>
            </ul>



            <ul class="list">
                <li>
                    <span class="item">
                        <span class="list-item">Brand</span>
                        <span>Honda</span>
                    </span>
                </li>
                <li>
                    <span class="item">
                        .....
                    </span>
                </li>
            </ul>



            <ul class="list">
                <li>
                    <span class="item">
                        <span class="list-item">City</span>
                        <span>New York</span>
                    </span>
                </li>
                <li>
                    ....
                </li>
            </ul>

        </div>
    </html>

None of the elements have unique classes or IDs.

I would like to get the name value (John in this example) and store it in a variable. It is the element immediately after the Name <span>, and that label is unique on the page.

If there were a unique ID, I would have captured it with preg_match:

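// $url is assumed to hold the fetched HTML source of the page
preg_match('~<span id="name"[^>]*>(.*?)</span>~si', $url, $name);
$name = $name[1]; // index 1 is the captured group; index 0 is the whole match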

In the above scenario how can I capture the value John?

theKing

1 Answer


I understand that you may want to use only preg_match, but in my experience a better way to parse HTML is to use DOMDocument with a DOMXPath query.

https://www.php.net/manual/ru/domxpath.query.php

Using it you can extract any data you need.
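As a rough illustration of that approach, here is a minimal sketch using markup modeled on the question. The helper name valueAfterLabel and the sample $html string are made up for this example; in practice $html would be the fetched page source. It assumes the label text contains no double quotes, since it is placed directly into the XPath expression.

```php
<?php
// Sample markup modeled on the question; in practice this would be the fetched page source.
$html = <<<MARKUP
<html><body><div>
  <ul class="list">
    <li><span class="item"><span class="list-item">Name</span><span>John</span></span></li>
    <li><span class="item"><span class="list-item">Age</span><span>25</span></span></li>
  </ul>
  <ul class="list">
    <li><span class="item"><span class="list-item">City</span><span>New York</span></span></li>
  </ul>
</div></body></html>
MARKUP;

// Hypothetical helper: returns the text of the <span> that follows the label span, or null.
// Assumption: $label contains no double quotes (it is embedded in the XPath string as-is).
function valueAfterLabel(string $html, string $label): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Find the label span by its exact text, then take its first following sibling span.
    $query = sprintf(
        '//span[@class="list-item"][normalize-space(.)="%s"]/following-sibling::span[1]',
        $label
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

$name = valueAfterLabel($html, 'Name');
$city = valueAfterLabel($html, 'City');
echo $name, PHP_EOL;
echo $city, PHP_EOL;
```

Because the query matches on the label text rather than on position, it should keep working if the lists are reordered, and the same helper can be reused for Age, Brand, or City.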

BambinoUA