I have a simple Puppeteer script to scrape an announcements website. I need to get the content of the page, and after inspecting the DOM I can see that every div containing a link and its text has the same class. How can I get the contents of each div with a loop?
This is an example of the HTML structure of the page. There are about twenty-five divs with the same class, and each one is an announcement.
<div class="container">
  <div class="item-card bordertop show-in-related-free-list">
    <!-- link and text are nested inside here -->
  </div>
</div>
This is the JS code I have at the moment. I created it using the headless-recorder-v2 Chrome extension.
const puppeteer = require('puppeteer');

// await is only valid inside an async function in a CommonJS script
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    slowMo: 300
  })
  const page = await browser.newPage()
  const navigationPromise = page.waitForNavigation()
  await page.goto('https://city.example.com/')
  await page.setViewport({ width: 1280, height: 607 })
  await page.waitForSelector('.bakec > #app > .alert > .btn')
  await page.click('.bakec > #app > .alert > .btn')
  await page.waitForSelector('.row > .col-md-4:nth-child(1) > .card > .cursor-pointer > .card-title-home')
  await page.click('.row > .col-md-4:nth-child(1) > .card > .cursor-pointer > .card-title-home')
  await navigationPromise
  await page.waitForSelector('#lightbox-vm18 > .modal-dialog > .modal-content > .modal-footer > .btn-primary')
  await page.click('#lightbox-vm18 > .modal-dialog > .modal-content > .modal-footer > .btn-primary')
  await page.waitForSelector('.bakec > #app > main > .container')
  await page.click('.bakec > #app > main > .container')
  await page.waitForSelector('#app > main > .container > .item-card:nth-child(3) > .item-container')
  // Here I want to loop over announces and store each link and title inside an array
  //await page.click('#app > main > .container > .item-card:nth-child(3) > .item-container')
  //await navigationPromise
  //await browser.close()
})()
UPDATE
I've added these lines of code to my script. I'm able to get an array of the desired elements, but how can I loop over them? Will a forEach loop do the trick, or do I need to use a for loop?
const nodes = await page.$$('.item-heading > .item-title > a')
const announces = []
nodes.forEach( (el) => {
let href = el.getProperty('href')
announces.push(href)
})
console.log(announces);
If I loop over the nodes variable this way, I get an array like this:
[
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }, Promise { <pending> },
Promise { <pending> }
]
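A minimal sketch of what seems to be happening, under stated assumptions: `fakeHandle` is a hypothetical stand-in for the element handles returned by `page.$$`, so the snippet runs without a browser, and the function names are made up for illustration. It relies only on `getProperty()` and `jsonValue()` both returning promises, as they do in Puppeteer. Since `forEach` never waits for anything, the promises are pushed unresolved, while a `for...of` loop or `Promise.all` inside an async function can await them.

```javascript
// Hypothetical stand-in for a Puppeteer ElementHandle (illustration only):
// getProperty() and jsonValue() are async, like the real methods.
function fakeHandle(href) {
  return {
    getProperty: async (name) => ({
      jsonValue: async () => (name === 'href' ? href : undefined),
    }),
  };
}

// Same shape as the snippet in the update: nothing is awaited,
// so the array ends up holding pending promises.
function collectWithForEach(nodes) {
  const announces = [];
  nodes.forEach((el) => {
    announces.push(el.getProperty('href'));
  });
  return announces;
}

// for...of inside an async function lets each call be awaited in turn.
async function collectWithForOf(nodes) {
  const announces = [];
  for (const el of nodes) {
    const prop = await el.getProperty('href'); // resolves to a handle
    announces.push(await prop.jsonValue());    // resolves to the plain value
  }
  return announces;
}

// Promise.all + map starts all lookups at once and waits for every result.
async function collectWithPromiseAll(nodes) {
  return Promise.all(
    nodes.map(async (el) => (await el.getProperty('href')).jsonValue())
  );
}
```

With a real page, the same idea would be `await collectWithForOf(await page.$$('.item-heading > .item-title > a'))`. Another option that may be simpler is `page.$$eval`, which runs a function in the page and returns serializable data, for example `page.$$eval('.item-heading > .item-title > a', els => els.map(a => ({ href: a.href, title: a.textContent.trim() })))`, assuming the title is the text of the link.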