I'm not entirely sure I understand what I'm asking for, and I'm hoping someone can explain. I'm attempting to scrape a website using Puppeteer on Node.js. I've gotten as far as selecting the element I need and accessing its properties; however, I cannot access the property I need to pull the information I want. The information I want is within the green box below, but I cannot get past __reactEventHandlers$kq2rgk91p6, as that just returns undefined.

enter image description here

I used the following selector, which works and accesses all other properties, just not the one I want.

    const checked = await page.evaluate(() => document.querySelector(stockSelector));
  • Which page are you scraping? It's hard to help without seeing the site itself and the exact data you're trying to get on it. Generally speaking, it's unusual to need to dip into React implementation details to find what you're after. – ggorlen May 05 '22 at 17:23

2 Answers

If I understand correctly (without the URL and a minimal reproducible example it is hard to guess), this is the issue: according to the docs, the various evaluate functions can transfer only serializable data (roughly, the data JSON can handle, with some additions). Your code returns a DOM element, which is not serializable (it has methods and circular references). Try to retrieve the data in the browser context and return only serializable data. For example:

const data = await page.evaluate(
  selector => document.querySelector(selector)
    .__reactEventHandlers$kq2rgk91p6.children[1].props.record.Stock,
  selector,
);

If the array in the .Stock property is serializable, you will get the data.
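
To get a feel for what "serializable" means here, a JSON round trip is a rough stand-in. This is only an approximation, not Puppeteer's actual transfer mechanism, and roundTrip and the sample objects are hypothetical names made up for the illustration:

```javascript
// Rough approximation only: Puppeteer's real transfer mechanism differs in details.
// roundTrip is a hypothetical helper, not part of any library.
function roundTrip(value) {
  try {
    return JSON.parse(JSON.stringify(value));
  } catch (err) {
    // Circular structures (DOM elements typically contain them) cannot be serialized.
    return undefined;
  }
}

// Made-up sample values for the illustration.
const plain = { Stock: [3, 5], label: 'ok' };
const withMethod = { Stock: [3, 5], refresh() {} };
const circular = { name: 'node' };
circular.self = circular;

console.log(roundTrip(plain));      // plain data survives unchanged
console.log(roundTrip(withMethod)); // the method is dropped
console.log(roundTrip(circular));   // undefined: serialization failed
```

This is why picking out plain data such as an array or a string inside the evaluate callback tends to work, while returning the element itself does not.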

vsemozhebuty
  • After posting this question, I've discovered that the numbers following the eventhandler are randomly generated on each session, making it hard for me to specify the property itself. Still in the process of figuring out how to do so with just the reactEventHandler as reference. – Invalid Sun May 25 '20 at 21:37
  • If this property is enumerable, you can try to iterate over all own properties to find one that begins with '__reactEventHandlers'. – vsemozhebuty May 25 '20 at 22:19
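
The prefix lookup suggested in the comment above could be sketched roughly as follows. It assumes the React key is an enumerable own property of the element and that the path children[1].props.record.Stock from the answer is still valid for the page in question. findKeyByPrefix and readStock are hypothetical names, and page and selector are supplied by the caller:

```javascript
// Hypothetical helper: return the first own enumerable key that starts with
// the given prefix, or undefined if there is none.
function findKeyByPrefix(obj, prefix) {
  return Object.keys(obj).find(key => key.startsWith(prefix));
}

// Sketch of the same lookup inside Puppeteer. The callback runs in the browser,
// so it cannot call findKeyByPrefix from Node and repeats the logic inline.
async function readStock(page, selector) {
  return page.evaluate((sel, prefix) => {
    const el = document.querySelector(sel);
    if (!el) return null;
    const key = Object.keys(el).find(k => k.startsWith(prefix));
    if (!key) return null;
    // Path copied from the answer above; adjust it to the actual component tree.
    return el[key].children[1].props.record.Stock;
  }, selector, '__reactEventHandlers');
}

// Local check of the helper with a made-up object shaped like the element.
const fakeElement = { id: 'row', __reactEventHandlers$abc123: { children: [] } };
console.log(findKeyByPrefix(fakeElement, '__reactEventHandlers'));
```

Because the random suffix is never written out, this should keep working when the suffix changes between sessions, as long as the prefix itself stays the same.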

I use this function to extract React props; it deals with the random characters at the end of the React event handler key. If you are not sure which childIndex to use, inspect the element with the React Developer Tools Chrome extension.

const extractProps = async (elementHandle, childIndex) => {
    // getProperties() resolves to a Map of property name -> JSHandle
    const properties = await elementHandle.getProperties()

    for (const [key, handle] of properties) {
        // The suffix after "__reactEventHandlers$" changes per session, so match on the prefix
        if (key.startsWith("__reactEventHandler")) {
            const children = await handle.getProperty("children")
            const child = await children.getProperty(childIndex.toString())

            // Resolves to a JSHandle for the child's props, not a serialized value
            return child.getProperty("props")
        }
    }
    // No React event handler key was found on the element
    return null
}

Usage:

const selector = ".some-class"
const elementHandle = await page.$(selector);
let reactProps = await extractProps(elementHandle, 1)
let prop1 = await reactProps.getProperty("prop1")
console.log(await prop1.jsonValue())
Handsome Greg