5

I need to get the text from a span tag and verify whether it equals "check".

How can I achieve this in puppeteer?

Below is the code I've written so far. Could anyone help me figure this out, please?

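const puppeteer = require("puppeteer");

// The original post does not show where the credentials come from.
// These environment variable names are hypothetical placeholders.
const username = process.env.FLIPKART_USERNAME;
const password = process.env.FLIPKART_PASSWORD;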

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    // "slowMo": 50,
    args: ["--start-fullscreen"],
    defaultViewport: null,
  });
  //Page
  const page2 = await browser.newPage();

  await page2.goto("https://www.flipkart.com");
  await page2.waitFor(2000);
  await page2.$x("//input[@class='_2zrpKA _1dBPDZ']").then(async (ele) => {
    await ele[0].type(username);
  });
  await page2.waitFor(2000);
  await page2.$x("//input[@type='password']").then(async (ele) => {
    await ele[0].type(password);
  });
  await page2.waitFor(2000);
  await page2
    .$x("//button[@class='_2AkmmA _1LctnI _7UHT_c']")
    .then(async (ele) => {
      await ele[0].click();
    });
  await page2.waitFor(2000);
  await page2.$x("//input[@class='LM6RPg']").then(async (ele) => {
    await ele[0].type("iPhone 11");
  });
  await page2.waitFor(2000);
  await page2.$x("//button[@class='vh79eN']").then(async (ele) => {
    await ele[0].click();
  });
  await page2.waitFor(2000);
  await page2.$x("//div[@class='col col-7-12']/div").then(async (ele) => {
    await ele[0].click();
  });
  await page2.waitFor(2000);
  let [element] = await page2.$x('//span[@class="_2aK_gu"]');
  let text = await page2.evaluate((element) => element.textContent, element);
  if (text.includes("Check")) {
    console.log("Check Present");
  }
  if (text.includes("Change")) {
    console.log("Change Present");
  }
})();

Emmanuel Neni
Rajesh G
  • Can you show us your code ? What have you tried/searched ? When you have text output, [don't take a picture but copy paste the output in your POST](https://unix.meta.stackexchange.com/questions/4086/psa-please-dont-post-images-of-text) The html can be copied as well with right click -> copy as outerHTML. – Gilles Quénot Jan 10 '20 at 17:25
  • I tried the following: await this.page.$x('//span[@class="_2aK_gu"]/text()') – Rajesh G Jan 10 '20 at 17:33
  • Thanks for posting the whole code – Gilles Quénot Jan 10 '20 at 17:52

3 Answers

4

// Get an ElementHandle for the first node matching the XPath.
// '//div[]' is the placeholder from the original answer; substitute a real expression.
const [getXpath] = await page.$x('//div[]');

// Read the element's text via innerText
const getMsg = await page.evaluate(name => name.innerText, getXpath);

// Log the text to the console
console.log(getMsg)
Amir Sikandar
  • Please don't post only code as an answer, but also provide an explanation of what your code does and how it solves the problem of the question. Answers with an explanation are usually more helpful and of better quality, and are more likely to attract upvotes. – Mark Rotteveel Aug 22 '20 at 10:04
  • `page.evaluate(name => name.innerText, getXpath);` can be written as `getXpath.evaluate(el => el.innerText);` (although `getXpath` isn't the clearest variable name, and I'm not sure why `name` was used to refer to a DOM element). – ggorlen Dec 31 '21 at 03:49
3

Here is complete code for getting the text of a div, or any other HTML element, using XPath:

const puppeteer = require("puppeteer");

async function scrape () {

    const browser = await puppeteer.launch({headless: false});
    const page =  await browser.newPage();
    await page.goto("https://twitter.com/elonmusk", {waitUntil: "networkidle2"})
    await page.waitForXPath('/html/body/div[1]/div/div/div[2]/main/div/div/div/div/div/div[2]/div/div/section/div/div/div[1]/div/div/article/div/div/div/div[2]/div[2]/div[1]/div/div/div[1]/div[1]/div/div[1]/a/div/div[1]/span/span');

    let [el] = await page.$x('/html/body/div[1]/div/div/div[2]/main/div/div/div/div/div/div[2]/div/div/section/div/div/div[1]/div/div/article/div/div/div/div[2]/div[2]/div[1]/div/div/div[1]/div[1]/div/div[1]/a/div/div[1]/span/span');

    // console.log()

    const names = await page.evaluate(name => name.innerText, el);
    console.log(names);


    await browser.close();
};

scrape();
Furqan Ali
1

You can get the text from the selected element like this:

await page.goto(url, {waitUntil: "networkidle2"});
await page.waitForXPath('//span[@class="_2aK_gu"]');
//assuming it's the first element
let [element] = await page.$x('//span[@class="_2aK_gu"]');
let text = await page.evaluate(element => element.textContent, element);

Note that page.$x returns an array of ElementHandles, so the code here assumes the target is the first match. I'd suggest choosing a more specific XPath than a class match, since many elements may share the same class.
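
The `Cannot read property 'textContent' of undefined` error reported in the comments is what you get when the XPath matches nothing, so the destructured `element` is `undefined`. A minimal sketch of a guard for that case follows. `getTextByXPath` is a hypothetical helper name, and the sketch assumes a Puppeteer version that still provides `page.$x`, as used throughout this thread.

```javascript
// Hypothetical helper: returns the text of the first XPath match, or null
// when nothing matches. Assumes `page` exposes `$x`, as in the Puppeteer
// versions used in this thread.
async function getTextByXPath(page, xpath) {
  const [handle] = await page.$x(xpath);
  if (!handle) {
    // No match: the element is not on this page (or not rendered yet).
    return null;
  }
  // Run the callback in the browser context against the matched element.
  return handle.evaluate((el) => el.textContent);
}
```

It would be called as `const text = await getTextByXPath(page, '//span[@class="_2aK_gu"]');`. A `null` result suggests the element is not present on the page being queried, which is worth checking before reading any property from it.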


For the condition:

if (text.includes("Check"))
    //do this
else if (text.includes("Change"))
    //do that
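
The branching can also be pulled into a small pure function, which makes it easy to check without launching a browser. This is only a sketch: `classifyPincodeText` and its return values are hypothetical names, and the "Check" and "Change" strings are the ones from the question.

```javascript
// Hypothetical helper: maps the span text to a label the caller can branch on.
// Matching is case-sensitive, like the includes() calls above.
function classifyPincodeText(text) {
  if (typeof text !== "string") {
    return "missing"; // e.g. the element was not found
  }
  const trimmed = text.trim();
  if (trimmed.includes("Check")) {
    return "check";
  }
  if (trimmed.includes("Change")) {
    return "change";
  }
  return "unknown";
}
```

The caller can then `switch` on the returned label, and the "missing" case keeps a failed lookup from being silently treated as either branch.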
mbit
  • I'm not using jest-puppeteer ! So could you please provide me a solution to get text without using jest puppeteer ? – Rajesh G Jan 12 '20 at 16:55
  • @RajeshG edited the answer, let me know if it's not clear. – mbit Jan 12 '20 at 19:51
  • Agreed ! Tried this earlier and I received the following error in my console...'Error: Evaluation failed: TypeError: Cannot read property 'textContent' of undefined at __puppeteer_evaluation_script__:2:34' – Rajesh G Jan 13 '20 at 05:21
  • @RajeshG probably the classname is a random string that changes so it can't find the element. can you provide more details on top of that snapshot? like page's url or html – mbit Jan 13 '20 at 05:25
  • URL: https://www.flipkart.com/apple-iphone-11-black-128-gb/p/itm06bac28995200?pid=MOBFKCTSYAPWYFJ5&lid=LSTMOBFKCTSYAPWYFJ580EM6T&marketplace=FLIPKART&srno=s_1_1&otracker=search&otracker1=search&fm=SEARCH&iid=c6b61ecd-a5d6-411b-baa8-3ad6b15c9ad0.MOBFKCTSYAPWYFJ5.SEARCH&ppt=sp&ppn=sp&ssid=gsr9ittk6o0000001578892687146&qH=f6cdfdaa9f3c23f3 Span text: "Change" / "Check pincode" – Rajesh G Jan 13 '20 at 05:32
  • Req:- If that particular span class contains 'Change' - then do some operation, else if contains 'Check' - then do some operation – Rajesh G Jan 13 '20 at 05:44
  • @RajeshG ran the code on the website and it worked fine. are you sure you're also waiting for the xpath using `await page.waitForXPath()`? – mbit Jan 13 '20 at 05:44
  • Yes..If I use wait for xpath it is throwing Timeout error – Rajesh G Jan 13 '20 at 05:48
  • @RajeshG it's just a simple if else, see my edited answer – mbit Jan 13 '20 at 05:49
  • @RajeshG I cannot reproduce the issue you're having as it's working fine on my side. set the `headless` to false see if the website is loading correctly, then inspect the element check if you can get the element using `$x('//span[@class="_2aK_gu"]')` in console. – mbit Jan 13 '20 at 05:53
  • Headless is set to false... And defaultview port is also set to null. – Rajesh G Jan 13 '20 at 06:03
  • @RajeshG did you check the element? were you able to get it with `$x()` I put? – mbit Jan 13 '20 at 06:07
  • Yes I'm able to get the element ! – Rajesh G Jan 13 '20 at 06:11
  • @RajeshG then there is something going on with your code, edit your question and post your code so I can take a look – mbit Jan 13 '20 at 06:12
  • @RajeshG I checked the relevant part and it works fine on my side on the url you provided. This is my suggestion: comment out the code up to the last part and run the code. if it works correctly, as it should, it means the issue is most probably in the previous parts (clicking/authentication) that doesn't lead to the correct page – mbit Jan 13 '20 at 06:58
  • Yeah ... Since the clicking upon the product , it opens in the new tab, it looks like the page is not getting retained. Any solution for this ? – Rajesh G Jan 13 '20 at 07:10
  • @RajeshG if the click opens a new tab it's a different problem. the reason it doesn't find the element is it looks for the element in the wrong tab/page. you need to look for the element in the opened tab. create a new question and I'll post an answer there since it's irrelevant to the topic here. – mbit Jan 13 '20 at 07:26
  • https://stackoverflow.com/questions/59712283/puppeteer-finds-the-element-in-the-wrong-tab – Rajesh G Jan 13 '20 at 07:49
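
For the new-tab case identified in the last comments, one common approach is to wait for the target that the click opens and then query that page instead of the original one. The following is a sketch under assumptions: `clickAndGetNewPage` is a hypothetical helper name, and it relies on `browser.waitForTarget`, `target.opener()`, and `target.page()` from the Puppeteer API. The linked follow-up question may settle on a different approach.

```javascript
// Hypothetical helper: clicks `handle` on `page` and resolves with the Page
// object for the tab that the click opened.
async function clickAndGetNewPage(browser, page, handle) {
  const openerTarget = page.target();
  // Start waiting before clicking so the new target is not missed.
  const [newTarget] = await Promise.all([
    browser.waitForTarget((target) => target.opener() === openerTarget),
    handle.click(),
  ]);
  return newTarget.page();
}
```

The XPath lookup for the span would then run against the returned page rather than `page2`. Note that the predicate matches any target opened by the original page, so this assumes the click opens exactly one new tab.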