
I am trying to scrape the results from a Quora search query using ImportXML.

The URL is of this form: https://www.quora.com/search?q=scrape%20Quora&time=year

I've tried using ImportXML, and can't get anything to work. As an example, I inspected the questions, and found they were inside a div with a class name of 'q-text puppeteer_test_question_title'. So I tried to import like this, but I just get #N/A:

=IMPORTXML("https://www.quora.com/search?q=scrape%20Quora&time=year","//div[@class='q-text puppeteer_test_question_title']")

This is clearly not working. Is there a fix, or is it just not possible (and if so, why)? Thank you.

Rubén
  • Does this answer your question? [Scraping data to google sheets from a website that uses JavaScript](https://stackoverflow.com/questions/74237688/scraping-data-to-google-sheets-from-a-website-that-uses-javascript) – Rubén Dec 29 '22 at 21:39

2 Answers

Quora (as of now) renders its content with JavaScript, and Google Sheets import formulas such as IMPORTXML do not support scraping JavaScript-rendered elements.
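One way to check this for a given page is to look at the raw HTML the server returns, since that is all an import formula sees. The sketch below is only an illustration: `rawHtmlHasClass` is a hypothetical helper, not part of Sheets or Apps Script, and it uses a simple regular expression rather than a real HTML parser.

```javascript
/**
 * Hypothetical helper: reports whether a class name appears in any
 * class="..." attribute of the raw HTML text.
 */
function rawHtmlHasClass(html, className) {
  // Matches class="..." or class='...' and captures the attribute value.
  const attrPattern = /class\s*=\s*(["'])(.*?)\1/g;
  let match;
  while ((match = attrPattern.exec(html)) !== null) {
    // A class attribute can hold several space-separated names.
    if (match[2].split(/\s+/).includes(className)) return true;
  }
  return false;
}
```

In Apps Script, the HTML could come from `UrlFetchApp.fetch(url).getContentText()`. If the helper returns false for a class that is visible in the browser inspector, the element is most likely added later by JavaScript, and an XPath query against the raw page will find nothing.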

player0

You can try to fetch the first 3 results this way (quickly written, could be improved):

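function myFunction() {
  // Return the response body even on HTTP errors, and do not follow redirects.
  var options = {
    'muteHttpExceptions': true,
    'followRedirects': false
  };
  var url = 'https://www.quora.com/search?q=scrape%20Quora&time=year';
  // Each inline result is assigned in the page source right after this marker.
  var marker = 'window.ansFrontendGlobals.data.inlineQueryResults.results["';
  var jsonStrings = UrlFetchApp.fetch(url, options).getContentText().split(marker);
  jsonStrings.forEach((jsonString, i) => {
    // Element 0 is the page content before the first marker, so skip it.
    if (i > 0) {
      // Log the value assigned to the result, up to the end of its line.
      console.log(jsonString.split('"] = ')[1].split('\n')[0]);
    }
  });
}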

Then parse the complex JSON inside each string. Note, however, that Quora sends the remaining results through asynchronous AJAX requests as you scroll down, so this approach only captures the initial ones.
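As a rough sketch of that parsing step, the function below takes the page source as a string and returns the results that parse as JSON. The marker comes from the snippet above; everything else is an assumption: `extractInlineResults` is a hypothetical helper, and the guesses that each value sits on one line, may end with a semicolon, and may be a JSON string that itself contains JSON should be checked against the real page source.

```javascript
// Marker taken from the snippet above; the payload format after it is an assumption.
const MARKER = 'window.ansFrontendGlobals.data.inlineQueryResults.results["';
const SEPARATOR = '"] = ';

/**
 * Hypothetical helper: pulls every inline result out of the page source
 * and returns the ones that parse as JSON.
 */
function extractInlineResults(html) {
  const results = [];
  // Element 0 is the text before the first marker, so drop it.
  html.split(MARKER).slice(1).forEach((chunk) => {
    const start = chunk.indexOf(SEPARATOR);
    if (start === -1) return; // marker without an assignment
    const afterKey = chunk.slice(start + SEPARATOR.length);
    // Keep the first line only and drop a trailing semicolon, if any.
    const literal = afterKey.split('\n')[0].trim().replace(/;$/, '');
    try {
      let value = JSON.parse(literal);
      // Assumption: the payload may be a JSON string that itself holds JSON.
      if (typeof value === 'string') value = JSON.parse(value);
      results.push(value);
    } catch (err) {
      // Not valid JSON: skip this chunk rather than abort the whole run.
    }
  });
  return results;
}
```

In Apps Script this could be called as `extractInlineResults(UrlFetchApp.fetch(url, options).getContentText())`. Skipping chunks that fail to parse keeps one malformed entry from discarding the rest.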

Mike Steelson
  • Thank you Mike. That actually did something! Thanks for your help - will now try to figure out how to use this. – Jack Samson Jul 09 '22 at 11:21