Get nth child of an ul using cheerio

Question

I am trying to scrape Amazon book pages using Cheerio and request in Node.js.

But I can't figure out how to get the print length and publication date from the HTML below:

<table id="productDetailsTable" cellspacing="0" cellpadding="0" border="0">
  <tbody>
    <tr>
      <td class="bucket">
        <h2>Product Details</h2>
        <div class="content">
          <ul>
            <li>
              <b>File Size:</b>
              2544 KB
            </li>
            <li>
              <b>Print Length:</b>
              658 pages
            </li>
            <li>
              <b>Publisher:</b>
              Anchor; 1st edition (September 15, 2009)
            </li>
          </ul>
        </div>
      </td>
    </tr>
  </tbody>
</table>

Any kind of help will be appreciated. Thanks.

I couldn't properly indent this HTML. Sorry for that. – Shafayat Alam, Apr 28 '16 at 10:24
#productDetailsTable ul li:nth-child(1) – Atul, Jul 25 '18 at 06:38

score 0 · Answer 1 · answered Nov 26 '22 at 01:14

You can do this by adapting the approaches in cheerio: Get normal + text nodes and How to get a text that's separated by different HTML tags in Cheerio. The .contents() method returns all of an element's child nodes, including text nodes:

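const cheerio = require("cheerio");

// `html` is the page markup shown in the question
const $ = cheerio.load(html);

// For each <li>, collect the trimmed text of every child node (the <b> label
// and the bare text after it), dropping whitespace-only nodes
const result = [...$("#productDetailsTable .bucket .content li")].map(li =>
  [...$(li).contents()]
    .map(node => $(node).text().trim())
    .filter(Boolean)
);
console.log(result);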

Which gives:

[
  [ 'File Size:', '2544 KB' ],
  [ 'Print Length:', '658 pages' ],
  [ 'Publisher:', 'Anchor; 1st edition (September 15, 2009)' ]
]

You can also turn these pairs into an object, stripping the trailing colon from each label:

const obj = Object.fromEntries(result.map(([a, b]) => [a.slice(0, -1), b]));

which produces:

{
  'File Size': '2544 KB',
  'Print Length': '658 pages',
  Publisher: 'Anchor; 1st edition (September 15, 2009)'
}

If you need the publication date specifically, try:

console.log(obj.Publisher.match(/(?<=\().+(?=\))/g)[0]);

which prints September 15, 2009.
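
Note that String.prototype.match returns null when nothing matches, so the one-liner above throws if a listing has no Publisher entry or no parenthesized date. A minimal sketch of a more defensive version follows; extractPublicationDate is a hypothetical helper name, not part of the original answer or of cheerio, and it assumes the date is the last parenthesized group in the string:

```javascript
// Hypothetical helper: returns the text inside the final pair of parentheses,
// or null when the input is missing or contains no such group.
function extractPublicationDate(publisher) {
  if (typeof publisher !== "string") return null;
  // Capture everything between the last "(" and a ")" at the end of the string.
  const match = publisher.match(/\(([^()]+)\)\s*$/);
  return match ? match[1].trim() : null;
}

console.log(extractPublicationDate("Anchor; 1st edition (September 15, 2009)"));
// prints: September 15, 2009
console.log(extractPublicationDate("Anchor"));
// prints: null
```

With the object built earlier, this would be called as extractPublicationDate(obj.Publisher).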
