This is my first question here. :) I have been searching for a solution for a few days, but my problem is still not fully solved. I have a large block of text containing price data, split into two parts by the exact phrase "promoted-after". Here is my regex:

'/price-([\d $гр€\.]*)/i'

It works well for ALL the prices it finds, including the prices before the divider. But when I modify it to:

'/promoted-after.*price-([\d $гр€\.]*)/is'

it correctly skips the top part, but then captures only the last price in the data. How can it be modified to capture all the prices AFTER the "promoted-after" tag, and only those? Here is an example of the input:

price- 2680 $
a lot of some random html code here
price- 3250 $
a lot of some good html code here
price- 3450 $
promoted-after
price- 400 $
a lot of some strange html code here
price- 401 $
a lot of some awesome html code here
price- 402 $
a lot of some ugly html code here
price- 403 $
a lot of some nice html code here
price- 404 $
a lot of some best html code here

P.S. I use preg_match_all

EDIT: OK, let's ignore that it's HTML and treat it as plain text. What should the overall logic for a task like this be?

Tim Yoshi

1 Answer

As an alternative, you could use DOMDocument and DOMXPath with an XPath expression that finds the div with the id promoted-after and then selects the strong elements inside each of its following sibling p elements.

You can then read each value using nodeValue.

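// Parse the HTML held in $data so it can be queried with XPath.
$dom = new DOMDocument();
$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
// Select every <strong> inside a <p> that follows the promoted-after div.
$items = $xpath->query('//div[@id="promoted-after"]/following-sibling::p/strong');
foreach ($items as $item) {
    echo $item->nodeValue . "<br>";
}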

Result

400 $
401 $
402 $
403 $
404 $
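
If you would rather stay with a regex and treat the input as plain text, note why the modified pattern returns a single price: every match has to start at "promoted-after", which occurs only once, and the greedy `.*` (with the `s` flag) runs all the way to the last "price-". One possible approach is to cut the text at the marker first and then run the original pattern on the remainder. The following is only a minimal sketch under some assumptions: `$data` is a shortened plain-text version of the sample from the question, the variable names `$tail` and `$prices` are made up for illustration, and the `u` flag is added on the assumption that the input is UTF-8.

```php
<?php
// Shortened plain-text version of the sample input from the question.
$data = <<<'TXT'
price- 2680 $
a lot of some random html code here
price- 3250 $
promoted-after
price- 400 $
a lot of some strange html code here
price- 401 $
TXT;

$marker = 'promoted-after';

// Keep only the text that follows the first occurrence of the marker.
$pos  = stripos($data, $marker);
$tail = $pos === false ? '' : substr($data, $pos + strlen($marker));

// Run the original price pattern on the tail only.
// The "u" flag is an assumption: it treats pattern and subject as UTF-8.
preg_match_all('/price-([\d $гр€.]*)/iu', $tail, $matches);

// Strip the surrounding spaces from each captured price.
$prices = array_map('trim', $matches[1]);

print_r($prices);
```

With this sample, `$prices` should contain only "400 $" and "401 $". Splitting first keeps the pattern itself unchanged, which may be easier to maintain than a single expression that tries to handle both the marker and the repeated prices.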

The fourth bird