
I would like to scrape live data from this equity table and paste it into an Excel file.

I have tried Python's beautifulsoup4 package. However, the data doesn't reside directly in the page HTML; it seems to be loaded by JavaScript or something similar.

stevec
Evan Strom
  • There's a *Download in csv* option. – pogibas Jul 23 '18 at 12:57
  • I know that, but I want to extract live data from the table. The download CSV option is useless for that, as it only gives you historical data. – Evan Strom Jul 23 '18 at 13:07
  • @EvanStrom beautifulsoup (or R's rvest package) are great ways of scraping data from websites. However, after taking a look at the page html, I can see the data itself isn't in the page html. I suspect it's being loaded through some AJAX requests. I don't know how to scrape it but [this](https://stackoverflow.com/questions/260540/how-do-you-scrape-ajax-pages) may help. I don't want to discourage you, but I suspect it could be fairly difficult unless you know some javascript/AJAX/jQuery or have someone to help you – stevec Jul 23 '18 at 13:24

1 Answer


Here's how to do it:

Open the page in Chrome, then open the developer console and click on the 'Network' tab. Now refresh the page.

This tab shows you requests as they're made (you can see about 8 or so items).

Manual inspection gives us the one we want:

https://www.nseindia.com/live_market/dynaContent/live_watch/stock_watch/niftyStockWatch.json

This is the link where the data resides.

Now, to get it into a CSV (which can be opened in Excel), use R's rvest and jsonlite packages:

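library(rvest)     # read_html(), html_nodes(), html_text()
library(jsonlite)  # fromJSON()

url <- "https://www.nseindia.com/live_market/dynaContent/live_watch/stock_watch/niftyStockWatch.json"

# The response is JSON, not HTML; the HTML parser places the raw text in a <p> node,
# so pull that text back out before parsing it.
page_html <- read_html(url)
json_text <- html_text(html_nodes(page_html, "p"))

# Parse the JSON; the table rows are in the `data` element.
parsed <- fromJSON(json_text)
write.csv(parsed$data, "scrapedData.csv", row.names = FALSE)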
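
Since the question started from Python, here is a rough Python sketch of the same idea using only the standard library. It assumes the endpoint still returns JSON with the rows under a `data` key (as the `$data` access in the R code implies). The `User-Agent` header and the function names `fetch_payload`, `rows_to_csv` and `save_snapshot` are my own illustrative choices, not anything the site documents.

```python
import csv
import io
import json
import urllib.request

# URL taken from the answer above; whether it still serves data is not verified here.
URL = "https://www.nseindia.com/live_market/dynaContent/live_watch/stock_watch/niftyStockWatch.json"


def fetch_payload(url=URL, timeout=10):
    """Download the endpoint and parse the JSON body into a dict."""
    # Assumption: some servers reject requests without a browser-like User-Agent.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def rows_to_csv(payload):
    """Render payload["data"] (a list of flat dicts) as CSV text."""
    rows = payload["data"]
    # Collect column names in first-seen order so rows with extra keys still fit.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_snapshot(path="scrapedData.csv"):
    """Fetch once and write the table to a CSV file that Excel can open."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv(fetch_payload()))
```

Calling `save_snapshot()` once should produce a file comparable to the one the R code writes. Keeping the CSV conversion separate from the download makes it possible to check the conversion without touching the network.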

If you want this to be 'live' data, you can rerun the scrape at regular intervals (say, every 5 seconds).
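
One way to run the scrape on an interval is a simple polling loop. The Python sketch below is only an illustration: `poll`, `fetch` and `handle` are hypothetical names, where `fetch` stands for whatever downloads the JSON and `handle` for whatever writes it out. The 5-second default follows the figure above; it may be worth checking what request rate the site tolerates.

```python
import time


def poll(fetch, handle, interval_seconds=5.0, max_iterations=None, sleep=time.sleep):
    """Call fetch() repeatedly, passing each result to handle().

    A failed round is reported and skipped so one bad response does not
    stop the loop. Returns the number of rounds that succeeded.
    """
    successes = 0
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        iteration += 1
        try:
            handle(fetch())
            successes += 1
        except Exception as error:  # network errors, bad JSON, write failures
            print(f"round {iteration} failed: {error}")
        # Wait before the next request so the server is not hammered.
        if max_iterations is None or iteration < max_iterations:
            sleep(interval_seconds)
    return successes
```

With `max_iterations=None` the loop runs until interrupted. Passing a number is handy for a trial run, for example `poll(my_fetch, my_writer, max_iterations=3)` with your own two functions.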

stevec