
Is there a way within R to list (find) all links for a given webpage? I'd like to enter a URL and produce a directory tree of all links from that site. The purpose is to find the relevant sub-page to scrape.

Here is a link to a similar question on SO, but it has no R solution. Thanks.

There is a suggested solution using LinkChecker, but that runs under Python; is there something within R?

  • This isn't a good question for SO since it's not about programming. But you should check the web task view: http://cran.r-project.org/web/views/WebTechnologies.html - there should be all you need. – bdecaf May 14 '15 at 10:27

1 Answer


I think rvest can do what you are looking for...

library("rvest")
# some url with a bunch of links...
url <- "http://www.drudgereport.com"
url %>% html %>% html_nodes("a") %>% xml_attr("href")