I had a look at this question, "Inputting NA where there are missing values when scraping with rvest", which has a great answer.
Goal: achieve the same result with XPath.
The example in that answer uses CSS selectors for the inner nodes:
library(rvest)
library(purrr)
xx <- read_html("https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&page=14")
xx %>% html_nodes(xpath = "/html/body/main/section[2]/div/article") %>%
  map_df(~list(title = html_nodes(.x, css = 'header h3 a') %>%
                 html_text() %>% {if(length(.) == 0) NA else .}, # replace length-0 elements with NA
               length = html_nodes(.x, css = 'a time') %>%
                 html_text() %>% {if(length(.) == 0) NA else .}))
Question: How can this be done with XPath?
I think the XPath for the title should actually be:
'/header/h3/a'
What I tried:
## XPath
xx <- read_html("https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&page=14")
xx %>% html_nodes(xpath = "/html/body/main/section[2]/div/article") %>%
  map_df(~list(title = html_nodes(.x, xpath = '/header/h3/a') %>%
                 html_text() %>% {if(length(.) == 0) NA else .}, # replace length-0 elements with NA
               length = html_nodes(.x, xpath = '/a/time') %>%
                 html_text() %>% {if(length(.) == 0) NA else .}))
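A minimal sketch for reproducing this without the live page. It assumes a made-up HTML snippet that only mimics the article/header/h3/a layout. The snippet and the helper `text_or_na` are hypothetical and are not taken from the linked answer. The sketch compares an XPath starting with `/`, which XPath evaluates from the document root even when called on a single node, with one starting with `.`, which is evaluated relative to the current node:

```r
library(rvest)
library(purrr)

# Hypothetical HTML that only mimics the page layout; the second article has no <time>
html <- '<html><body><main><section></section><section><div>
<article><header><h3><a href="#">Talk A</a></h3></header><a href="#"><time>00:10:00</time></a></article>
<article><header><h3><a href="#">Talk B</a></h3></header></article>
</div></section></main></body></html>'

articles <- read_html(html) %>%
  html_nodes(xpath = "/html/body/main/section[2]/div/article")

# Hypothetical helper: text of the matched nodes, or NA if nothing matched
text_or_na <- function(node, xpath) {
  txt <- html_text(html_nodes(node, xpath = xpath))
  if (length(txt) == 0) NA_character_ else txt
}

# XPath anchored at the document root, as in my attempt above
absolute <- map_df(articles, ~list(title  = text_or_na(.x, "/header/h3/a"),
                                   length = text_or_na(.x, "/a/time")))

# XPath relative to each <article> node (".//" searches its descendants)
relative <- map_df(articles, ~list(title  = text_or_na(.x, ".//header/h3/a"),
                                   length = text_or_na(.x, ".//a/time")))
```

On this toy input, `absolute` should contain only NA values, while `relative` should contain both titles and an NA only for the missing `<time>`. Whether the same relative paths fit the real page depends on its actual markup, which I have not re-checked here.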