
Is there any way to scrape data in R for:

General Information/Launch Date from this website: https://www.euronext.com/en/products/etfs/LU1437018838-XAMS/market-information

So far I have used the code below, but the generated XML file does not contain the information that I need:

library(rvest)
library(XML)

url <- "https://www.euronext.com/en/products/etfs/LU1437018838-XAMS/market-information"

# Save the page locally, then parse the saved file
download.file(url, destfile = "scrapedpage.html", quiet = TRUE)
content <- read_html("scrapedpage.html")

content1 <- htmlTreeParse("scrapedpage.html", error = function(...) {}, useInternalNodes = TRUE)
  • "the generated XML file does not contain Information that I Need" What information is that exactly? How does that differ from what you get? – camille Aug 09 '18 at 14:19
  • You could use xpathSApply to parse the data you need from the content variable. This will involve a bit of manual work to specify exactly which pieces of the page you require. – Kharoof Aug 09 '18 at 14:29
  • When you open the link, you can see General Information/Launch Date, and I need the information: 16 May 2017. But it is not shown in the XML file; that is what I mean. – Thang Do Aug 09 '18 at 15:06
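For reference, here is a minimal sketch of the xpathSApply approach suggested in the comment above, applied to the content1 document from the question. Note that the launch date itself is not in the downloaded static HTML (it is loaded via AJAX, as the answer below shows), so XPath queries against content1 can only reach nodes that actually exist in that file; the page title is used here purely as an illustration.

library(XML)

# Query a node that does exist in the static HTML, e.g. the page <title>,
# to show how xpathSApply extracts a specific element with an XPath expression
page_title <- xpathSApply(content1, "//title", xmlValue)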

1 Answer


What you are trying to scrape is in an AJAX object called factsheet (I don't know JavaScript, so I can't tell you more). Here is a solution to get what you want: find the URL of the data used by the JavaScript with the network analysis panel in your browser (look for the XHR requests). See here.

library(rvest)

# Read the AJAX factsheet fragment directly and pull the launch date
# out of the <strong> node that holds it
factsheet <- read_html("https://www.euronext.com/en/factsheet-ajax?instrument_id=LU1437018838-XAMS&instrument_type=etfs")
launch_date <- factsheet %>%
  html_nodes(xpath = "/html/body/div[2]/div[1]/div[3]/div[4]/strong") %>%
  html_text()
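If the absolute XPath ever breaks because Euronext changes the page layout, a label-based selector is an alternative. The sketch below is an assumption, not verified against the live page: it presumes the value sits in a <strong> element inside the same <div> as the "Launch Date" label, which matches the structure implied by the path above.

library(rvest)

# Hypothetical, more layout-tolerant variant: locate the <div> whose own text
# mentions "Launch Date" and read its <strong> child instead of hard-coding the path
factsheet <- read_html("https://www.euronext.com/en/factsheet-ajax?instrument_id=LU1437018838-XAMS&instrument_type=etfs")
launch_date <- factsheet %>%
  html_nodes(xpath = "//div[contains(text(), 'Launch Date')]/strong") %>%
  html_text(trim = TRUE)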