The best thing to do here, because you're dealing with markup, is to treat the data as what it is: HTML. Parse the DOM tree, iterate over the nodes you're interested in and get the data like so:
$dom = new DOMDocument;
$dom->loadHTML($data); // parses the markup string
$li = $dom->getElementsByTagName('li');
$liValues = [];
foreach ($li as $elem) {
    $liValues[] = $elem->textContent; // not nodeValue if the nodes can contain other elements, like in your case an "a" tag
}
To get the first, last and middle elements, just write this:
$length = $li->length; // or count($li); to get the total number of li's
$nodes = [
    $li->item(0), // first
    $li->item(intdiv($length, 2)), // middle: integer index; round($length/2) overshoots for odd lengths
    $li->item($length - 1), // last
];
// or, given that you don't need the actual nodes
$nodes = [
    $li->item(0)->textContent,
    $li->item(intdiv($length, 2))->textContent,
    $li->item($length - 1)->textContent,
];
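Put together, the steps above can be sketched as a small runnable script. The sample markup and the helper name pickFirstMiddleLast are made up for illustration; the helper also guards against an empty list, where item(0) would return null. It uses intdiv() for the middle index so that a list of five items yields index 2 rather than 3.

```php
<?php
// Hypothetical sample markup standing in for $data.
$data = '<ul><li><a href="/a">One</a></li><li>Two</li><li>Three</li><li>Four</li><li>Five</li></ul>';

$dom = new DOMDocument;
$dom->loadHTML($data);
$li = $dom->getElementsByTagName('li');

// Hypothetical helper: text of the first, middle and last node in a list.
function pickFirstMiddleLast(DOMNodeList $list): array
{
    $length = $list->length;
    if ($length === 0) {
        return []; // nothing to pick from
    }
    $indexes = [0, intdiv($length, 2), $length - 1];
    $values = [];
    foreach ($indexes as $i) {
        $values[] = $list->item($i)->textContent;
    }
    return $values;
}

$picked = pickFirstMiddleLast($li);
print_r($picked);
```

For a list with an even number of items there is no single middle element; intdiv() picks the upper of the two candidates here.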
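Or just get the parent node (ul or ol), and use $parent->firstChild and $parent->lastChild to get the first and last elements in the list, as CD001 suggested in the comments. Keep in mind that these properties return any kind of node, so if the markup has whitespace between the tags you may get a text node rather than an li.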
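Here is a minimal sketch of the parent-node approach, assuming the markup contains a single ul. The sample markup and the helper skipToElement are hypothetical. The helper exists because firstChild and lastChild may be whitespace text nodes, depending on the markup and on how the parser treats blanks, so it steps along the siblings until it reaches an element.

```php
<?php
// Hypothetical sample markup; note the whitespace between the tags.
$data = "<ul>\n  <li>One</li>\n  <li>Two</li>\n  <li>Three</li>\n</ul>";

$dom = new DOMDocument;
$dom->loadHTML($data);
$parent = $dom->getElementsByTagName('ul')->item(0);

// Hypothetical helper: step from $node in the given direction
// ('nextSibling' or 'previousSibling') until an element node is found.
function skipToElement(?DOMNode $node, string $direction): ?DOMNode
{
    while ($node !== null && $node->nodeType !== XML_ELEMENT_NODE) {
        $node = $node->$direction;
    }
    return $node;
}

$first = skipToElement($parent->firstChild, 'nextSibling');
$last  = skipToElement($parent->lastChild, 'previousSibling');

echo $first->textContent, PHP_EOL; // text of the first li
echo $last->textContent, PHP_EOL;  // text of the last li
```

On PHP 8.0 and later, $parent->firstElementChild and $parent->lastElementChild should do the same job without a helper.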
You can use the attributes property of the DOMNode elements in the DOMNodeList returned by getElementsByTagName to further expand on the data you store.
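As a rough sketch of what that could look like, the loop below stores each li's text together with its attributes. The sample markup and the array keys 'text' and 'attributes' are made up for illustration; attributes is a DOMNamedNodeMap whose entries expose name and value.

```php
<?php
// Hypothetical sample markup standing in for $data.
$data = '<ul><li class="first"><a href="/one">One</a></li><li class="last" id="end">Two</li></ul>';

$dom = new DOMDocument;
$dom->loadHTML($data);

$items = [];
foreach ($dom->getElementsByTagName('li') as $elem) {
    $attrs = [];
    // Copy the li's own attributes into a plain name => value array.
    foreach ($elem->attributes as $attr) {
        $attrs[$attr->name] = $attr->value;
    }
    $items[] = ['text' => $elem->textContent, 'attributes' => $attrs];
}
print_r($items);
```

Note that this only reads the attributes of the li itself; the href of a nested a tag would need its own lookup, for example $elem->getElementsByTagName('a').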