
This question has definitely been asked multiple times, but I've looked everywhere and none of the answers worked for me.

I have the following div:

<div class="dataTables_info" id="dt-card-entry_info" role="status" aria-live="polite">
    Showing 1 to 20 of 761,871 entries
    <span class="select-info">
        <span class="select-item">
            1 row selected
        </span>
        <span class="select-item">
            
        </span>
        <span class="select-item">
            
        </span>
    </span>
</div>

I am trying to get the text in the parent div: Showing 1 to 20 of 761,871 entries

I tried:

const text = await page.$eval('div#dt-card-entry_info.dataTables_info', el => el.textContent)

and also

const text = await page.evaluate(() => {
    const el = document.querySelector('#dt-card-entry_info')
    return el.innerText
})

From the browser console, this works:

$('#dt-card-entry_info').text()

and also this:

$('#dt-card-entry_info')[0].innerText

or this:

$('#dt-card-entry_info')[0].textContent
francis
  • Can you try `document.getElementById('dt-card-entry_info')`? https://developer.mozilla.org/en-US/docs/Web/API/Document/getElementById The browser console has jQuery available to it, but Puppeteer does not, IIRC. – Pogrindis Feb 12 '21 at 14:35
  • And how do I get the text? – francis Feb 12 '21 at 14:36
  • I suspect it has to do with the nested `span` element. – francis Feb 12 '21 at 14:38
  • The accepted answer is functionally identical to OP's original code, just written in a slightly different (more verbose) style, so I don't think this is a [mcve]. Most likely, OP wasn't waiting for their element to load, the script was being blocked when running headlessly, or the element was in a frame or shadow DOM. – ggorlen Mar 29 '23 at 13:09
  • Canonical: [how to get text inside div in puppeteer](https://stackoverflow.com/questions/55237748/how-to-get-text-inside-div-in-puppeteer) – ggorlen Mar 29 '23 at 13:22
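
One of the likely causes raised in the comments is that the element had not loaded yet when the text was read. A minimal sketch of waiting first, assuming `page` is an already-open Puppeteer `Page`; the helper name `readWhenReady` and the 10-second timeout are illustrative choices, not something from the original post:

```javascript
// Hypothetical helper: wait for the element, then read its text.
// `page` is assumed to be an open Puppeteer Page; the default timeout is arbitrary.
async function readWhenReady(page, selector, timeout = 10000) {
  // Resolves once the selector matches an element in the DOM, rejects on timeout.
  await page.waitForSelector(selector, { timeout });
  // The callback runs in the browser, so it only uses its own argument.
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

It could be called as `const text = await readWhenReady(page, '#dt-card-entry_info')`. Note that this only waits for the element to exist; if the page fills in the text later, or the element lives in a frame or shadow DOM, a different wait would be needed.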

1 Answer


You can use `document.getElementById`.

You want the text content, so use:

var res = document.getElementById('dt-card-entry_info').textContent;

Your method can then be used like this:

const text = await page.evaluate(() => {
    const el = document.getElementById('dt-card-entry_info');
    return el.textContent;
})

I don't like having the `await page.evaluate(...)` call inside the `const` definition, so I would move it outside the scope of the evaluate callback.

This is because `page.evaluate` returns a promise, so you will in turn need to return a promise of the string content.

[EDITED - SEE BELOW]

You can see it working here: https://jsfiddle.net/9s4zxvLk/

Edit:

const text = await page.evaluate(async () => {
    return document.getElementById('dt-card-entry_info').textContent;
})
console.log(text);
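
Note that `textContent` on the div also includes the text of the nested spans ("1 row selected"). If only the div's own text is wanted, one option is to join just its direct text nodes. This is a sketch under that assumption; `ownText` and `getInfoText` are hypothetical names, and `page` is assumed to be an open Puppeteer `Page`:

```javascript
// Runs inside the browser when passed to page.$eval, so it must not
// reference any variables from the Node.js scope.
function ownText(el) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE
  return Array.from(el.childNodes)
    .filter((node) => node.nodeType === TEXT_NODE) // skip the nested <span> elements
    .map((node) => node.textContent)
    .join("")
    .trim();
}

// Hypothetical wrapper around the selector from the question.
async function getInfoText(page) {
  return page.$eval("#dt-card-entry_info", ownText);
}
```

With the markup from the question, this should leave out the "1 row selected" text that the plain `textContent` version includes.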
Pogrindis
  • This first example looks OK and can be shortened to `await page.$eval("#dt-card-entry_info", el => el.textContent)`. But this is the same as OP's code, so it's hard to understand why it was accepted. The second version [doesn't work](https://blog.appsignal.com/2023/02/08/puppeteer-in-nodejs-common-mistakes-to-avoid.html#trying-to-access-variables-from-an-evaluate-callback) because `text` is outside the scope of the callback, which runs inside the browser. It'll always log `''`. – ggorlen Mar 29 '23 at 13:06
  • @ggorlen--onLLMstrike Sorry, I missed this comment. You're right; it was a couple of years ago, and my JS is still awful. The OP's code, I think, didn't work because the `await` is pointless on a non-`async` method. I haven't tested (I'm on my phone here), but the edit should work. – Pogrindis Jul 13 '23 at 19:16
  • Thanks for the response, although I'd just use my suggested code from my first comment. Your code has an extraneous `async`, which never needs to be added to a function unless it has `await` in it somewhere. Also, `page.$eval("#...", el => )` is a shortcut for `document.getElementById`. Not only is it less to type and cleaner to read, but if the element can't be found, you'll get a much clearer error message, something to the effect of "unable to find matching selector" rather than "evaluation failed: cannot read properties of null". – ggorlen Jul 13 '23 at 19:36
  • "`await` is pointless on a non `async` method" doesn't make sense--syntactically, it's _impossible_ to use `await` outside an `async` method, so it's not a matter of being pointless. We're still facing the fact that all of this is identical to OP's code, extraneous `async` or not. – ggorlen Jul 13 '23 at 19:38
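
As a variation on the error-message point in the comments above: if a script should tolerate a missing element instead of throwing, it can check for the element explicitly. A minimal sketch, assuming `page` is an open Puppeteer `Page`; `textOrNull` is a hypothetical helper name:

```javascript
// Hypothetical helper: return the element's text, or null if nothing matches.
async function textOrNull(page, selector) {
  // page.$ resolves to an ElementHandle, or null when the selector matches nothing.
  const handle = await page.$(selector);
  if (handle === null) return null;
  // The callback runs in the browser with the matched element as `el`.
  return handle.evaluate((el) => el.textContent.trim());
}
```

A caller can then branch on `null` rather than catching an exception, for example `const text = await textOrNull(page, '#dt-card-entry_info')`.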