
I'm using Node and Puppeteer to scrape a website.

I want to return the innerText of an element selected by class, but that element contains a nested span, so the result includes the text of both.

<div class="inner_sm">
  <h1 class="pdp_address ">79 Etwell Street&nbsp;
   <span>East Victoria Park, WA 6101</span>
  </h1>
  <span class="pdp_price">$670,000
    <span class="price_feature">Under Offer</span>
  </span>
</div>

The target is .pdp_price. I want the result to be '$670,000', but I'm getting '$670,000Under Offer'.

const data = await page.evaluate(() => {
  const address = document.querySelector('#app > div > div > article > div.pdp_header > div > div > h1').innerText.replaceAll('\n',',')
  const bed = document.querySelector('.bed')?.innerText || ""
  const bath = document.querySelector('.bath')?.innerText || ""
  const car = document.querySelector('.car')?.innerText || ""
  const price = document.querySelector('.pdp_price').innerText

  return `${address}, ${price}, ${bed}, ${bath}, ${car} \n`
})

I've tried a few things, shown below, but haven't been able to make any of them work.

const price = document.querySelector('.pdp_price:not(.price_feature)').innerText
const price = document.querySelector('.pdp_price')?.innerText.replaceAll(',','*') || ""
const cleanPrice = price.remove('.price_feature').innerText
  • Does this answer your question? [How to get the text node of an element?](https://stackoverflow.com/questions/6520192/how-to-get-the-text-node-of-an-element) – ggorlen Oct 29 '22 at 04:24
  • The [dupe suggestion](https://stackoverflow.com/questions/6520192/how-to-get-the-text-node-of-an-element) isn't specific to Puppeteer, but if you plop it into your `evaluate` and return the text result you should be good to go. Use a [non-jQuery solution](https://stackoverflow.com/a/73693244/6243352) if the page doesn't have jQuery or you don't want to rely on it. Assuming first child, `document.querySelector('.pdp_price').firstChild.textContent.trim();` should basically work for this particular HTML, but untested. – ggorlen Oct 29 '22 at 04:24
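
A minimal sketch of the idea in the comment above, loosened so it doesn't assume the price is the first child: collect only the text nodes that sit directly under the element and ignore nested elements such as `.price_feature`. The helper name `directText` is hypothetical, and the usage line assumes `page` is an already-open Puppeteer page. Like the comment's one-liner, this hasn't been run against the live site.

```javascript
// Returns only the text that sits directly inside `el`, skipping nested elements.
// nodeType 3 is Node.TEXT_NODE; the literal is used so the function has no
// outside references and can be serialized into the page by Puppeteer.
function directText(el) {
  return Array.from(el.childNodes)
    .filter((node) => node.nodeType === 3) // keep direct text nodes only
    .map((node) => node.textContent)
    .join('')
    .trim();
}

// Usage inside the scraper (assumes `page` is an open Puppeteer Page):
//   const price = await page.$eval('.pdp_price', directText);
```

`page.$eval` runs the function in the browser with the matched element as its argument, so the same helper could also be defined and called inside the existing `page.evaluate` callback instead. For the HTML in the question this should yield '$670,000' without 'Under Offer'.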
