I'm using Simple HTML DOM (https://simplehtmldom.sourceforge.io/) to parse HTML.

Given the following markup:

<div class="item-price offer-price price-tc default-price">
$129.990
<span class="discount-2">-35%</span>
</div>

How can I select just the price? I'm using $html->find('div.offer-price', 0)->plaintext; but it returns the content of the span too.

hanshenrik
Ricardo Mehr
  • I'm not sure about the library you're using, but in a proper DOM the `DIV` will have a list of child nodes, including the text nodes. The first child node of that `DIV` is what you want. – JAAulde Aug 26 '20 at 02:31
  • _"Im using simple html dom"_ welcome to one of the worst DOM libraries for PHP ever written. Consult the list in the linked duplicate – Phil Aug 26 '20 at 02:41
  • Ricardo, PHP has a great DOM implementation, along with XPath. Here's how I would do it with those built in libs. https://pastebin.com/8SrB62SB Hopefully you can translate it to the library you're using, or just convert your code to PHP's built in functionality. – JAAulde Aug 26 '20 at 02:48
  • Also, FWIW, I don't think this question should have been closed. Or at least not closed as a duplicate of "how to parse html with PHP." This is more of a question regarding understanding of the DOM than it is understanding how to get a DOM representation with PHP. ¯\_(ツ)_/¯ I have voted to reopen. – JAAulde Aug 26 '20 at 02:55
  • Thanks @JAAulde, your comments have been really helpful. – Ricardo Mehr Aug 26 '20 at 04:36
  • The question has been re-opened – Phil Aug 27 '20 at 13:23

1 Answer

I'm not sure how to do it in Simple HTML DOM, but you can extract it with DOMDocument + DOMXPath:

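<?php

$html = '<div class="item-price offer-price price-tc default-price">
$129.990
<span class="discount-2">-35%</span>
</div>
';

// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc = new DOMDocument();
@$doc->loadHTML($html); // @ silences warnings about imperfect markup

// text() selects only the div's own text nodes, so the span's content is skipped.
$xpath = new DOMXPath($doc);
$node = $xpath->query("//div[contains(@class,'item-price')]/text()")->item(0);
echo trim($node->textContent); // prints $129.990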

Bonus: both DOMDocument and DOMXPath ship with PHP, so no external library is required to use them.
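If you need this in more than one place, the same idea (take the element's own text nodes and ignore child elements, as suggested in the comments) can be wrapped in a small helper. This is only a sketch under a few assumptions: `ownText` is a made-up name, not part of DOMDocument or Simple HTML DOM; it matches the class as a whole token rather than as a substring; and it assumes the class name contains no quote characters, since it is placed directly into the XPath expression.

```php
<?php

/**
 * Return the trimmed text that belongs directly to the first element carrying
 * the given class, ignoring text inside child elements. Returns null if
 * nothing matches. (ownText is a hypothetical helper, not a library function.)
 */
function ownText(string $html, string $class): ?string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match the class as a whole token, so "price" does not match "offer-price".
    // Assumption: $class contains no quote characters.
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    $element = (new DOMXPath($doc))->query($query)->item(0);
    if ($element === null) {
        return null;
    }

    // Walk the direct children and return the first non-empty text node.
    foreach ($element->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE && trim($child->textContent) !== '') {
            return trim($child->textContent);
        }
    }
    return null;
}

$html = '<div class="item-price offer-price price-tc default-price">
$129.990
<span class="discount-2">-35%</span>
</div>';

echo ownText($html, 'offer-price'), PHP_EOL; // prints $129.990
```

Returning null instead of calling ->textContent on a missing node avoids a fatal error when the page layout changes and the div is no longer there.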

hanshenrik