How to get link URLs using regex

<tr class="ipl-zebra-list__item">
        <td class="ipl-zebra-list__label">Official Sites</td>
        <td>
            <ul class="ipl-inline-list">
                    <li class="ipl-inline-list__item">
                        <a href="https://www.indiegogo.com/projects/super-troopers-2">IndieGoGo page</a>
                    </li>
                    <li class="ipl-inline-list__item">
                        <a href="https://www.facebook.com/SuperTroopersMovie/">Official site</a>
                    </li>
            </ul>
        </td>
    </tr>

I tried this code, but it does not return anything:

$arr['sites'] = $this->match_all('/<a.*?>(.*?)<\/a>/ms', $this->match('/Official Sites <a href="(.*?)".*?<\/a>(<\/tr>)/ms', $html, 1), 1);

I also tried a second version, but it returns only the link text (IndieGoGo page and Official site):

        $arr['sites1'] = $this->match_all('/<a.*?>(.*?)<\/a>/ms', $this->match('/Official Sites(.*?)(<\/tr>)/ms', $html, 1), 1);

Please help me get only the URLs: https://www.indiegogo.com/projects/super-troopers-2 and https://www.facebook.com/SuperTroopersMovie/

Here is my IMDb PHP script: https://jadwal21.my.id/imdb.txt

1 Answer

Don't use regex to parse HTML; use something like DOMDocument instead.

<?php
$dom = new DOMDocument();

$dom->loadHTML('
<tr class="ipl-zebra-list__item">
    <td class="ipl-zebra-list__label">Official Sites</td>
    <td>
        <ul class="ipl-inline-list">
            <li class="ipl-inline-list__item">
                <a href="https://www.indiegogo.com/projects/super-troopers-2">IndieGoGo page</a>
            </li>
            <li class="ipl-inline-list__item">
                <a href="https://www.facebook.com/SuperTroopersMovie/">Official site</a>
            </li>
        </ul>
    </td>
</tr>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);

$arr['sites1'] = [];
foreach ($xpath->query("//li[@class=\"ipl-inline-list__item\"]/a") as $link) {
    $href = $link->getAttribute('href');

    $arr['sites1'][] = $href;
}

print_r($arr['sites1']);

https://3v4l.org/KCfXCC

Result:

Array
(
    [0] => https://www.indiegogo.com/projects/super-troopers-2
    [1] => https://www.facebook.com/SuperTroopersMovie/
)
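
The query above matches every `li.ipl-inline-list__item` link on the page, so on the full IMDb HTML it may also pick up links from other rows. A minimal sketch of one way to limit the result to the "Official Sites" row follows. The function name `officialSiteUrls` is hypothetical (it is not part of the original script), and the sketch assumes the page keeps the label in a `<td>` inside the same `<tr>` as the links, as in the snippet from the question.

```php
<?php
// Hypothetical helper, not part of the original script: returns the href of
// every link inside the table row whose label cell reads "Official Sites".
function officialSiteUrls(string $html): array
{
    $dom = new DOMDocument();

    // Real pages are rarely valid markup, so collect libxml warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Select the <tr> that has a <td> whose trimmed text is "Official Sites",
    // then take the href attribute of every <a> below that row.
    $query = '//tr[td[normalize-space(.) = "Official Sites"]]//a/@href';

    $urls = [];
    foreach ($xpath->query($query) as $attribute) {
        $urls[] = $attribute->nodeValue;
    }

    return $urls;
}
```

Assuming `$html` holds the downloaded page as in the question, it could be called as `$arr['sites'] = officialSiteUrls($html);`. Selecting `@href` directly avoids a separate `getAttribute()` call, and matching on the label text rather than on CSS classes keeps the query tied to the row the question asks about.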
Lawrence Cherone