After scraping a page, I selected the footer of a table using cheerio with:

const $ = cheerio.load(data);
const foot = $('#tblAcctBal > tfoot > tr');
const o = $(foot).html();
console.log(o);

This results in the following HTML:

<tr> <th rowspan=\"1\" colspan=\"1\"></th>
<th rowspan=\"1\" colspan=\"1\"></th>
<th rowspan=\"1\" colspan=\"1\"></th>
<th rowspan=\"1\" colspan=\"1\"></th>
<th rowspan=\"1\" colspan=\"1\"></th>
<th rowspan=\"1\" colspan=\"1\">$0.00</th>
<th rowspan=\"1\" colspan=\"1\">$0.00</th>
<th rowspan=\"1\" colspan=\"1\">$0.00</th>
<th rowspan=\"1\" colspan=\"1\">$0.00</th>
<th rowspan=\"1\" colspan=\"1\">$0.00</th>undefined</tr>\n

I'm trying to get an array of the text values in the footer. I've tried:

$(foot).each( function (th) {
    console.log($(th).text().trim())
  })

but I'm getting no output. How do I fix this?

user1592380
  • There is no `#tblAcctBal > tfoot` in your shown HTML, so there's no way to say why this isn't working. If all the values are empty or 0, what's the point of scraping? Keep in mind that at best, only HTML shown in `view-source:` can be scraped by Cheerio if you're using it with a plain HTTP request library like fetch or axios. – ggorlen Mar 01 '23 at 19:43
  • @ggorlen, as you suggested a while back, I am using puppeteer to get the html, and I have verified that the appropriate data is present in the footer (please see https://stackoverflow.com/questions/75568937/extracting-header-and-footer-table-fields-with-puppeteer). Please see edits above. – user1592380 Mar 01 '23 at 19:53
  • Also, thanks for "Keep in mind that at best, only HTML shown in view-source: can be scraped by Cheerio if you're using it with a plain HTTP request library like fetch or axios." That's a good shortcut to be aware of. – user1592380 Mar 01 '23 at 19:55

2 Answers

Select the th elements and loop over those.

const feet = $('#tblAcctBal > tfoot > tr > th');

for (const el of feet){
  console.log($(el).text())
}

const values = feet
  .map((i, el) => $(el).text())
  .toArray()

console.log(values)

As a side note, .each() and other iterating functions in cheerio supply both the index and element in the function signature.

feet.each((index, el) => {
  console.log(index, $(el).text())
})
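
The argument order matters because it is the reverse of `Array.prototype.forEach`, which passes the element first. The sketch below uses a hypothetical stand-in, `eachLikeCheerio`, that only mimics the `(index, element)` callback order; it is not cheerio itself. It shows that a callback declaring a single parameter, as in the question, receives the index rather than the element.

```javascript
// Hypothetical stand-in: mimics only the (index, element) callback
// order of cheerio's .each(). It is not part of cheerio.
function eachLikeCheerio(items, callback) {
  items.forEach((item, index) => callback(index, item));
}

// Placeholder cell texts standing in for th elements.
const cells = ['', '', '$0.00'];

// One parameter, as in the question: it receives the index.
const firstArgs = [];
eachLikeCheerio(cells, (th) => firstArgs.push(th));

// Two parameters: the second one is the element.
const secondArgs = [];
eachLikeCheerio(cells, (index, th) => secondArgs.push(th));

console.log(firstArgs);
console.log(secondArgs);
```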

In the question's code, the selector returns a single tr element, so you would need something like .children() to reach each th element.

const row = $('#tblAcctBal > tfoot > tr')
console.log(row.length) // 1
$(row).children().each((i, el) => {
  console.log(i, $(el).text())
})
Matt
If they're really there, you can just do:

$('tfoot tr th').get().map(el => $(el).text())
pguardiario