I am trying to scrape data from a restricted-access website (which my institution's library gives me access to) using the XML package in R. The library gives me access using EZproxy.

library(XML)  # provides readHTMLTable()

base <- "the URL"
tabs <- readHTMLTable(base)  # parse every <table> on the page

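These lines give me the output below. The page returned is a "Cookies disabled" notice from Scopus rather than the data, so the request from R does not seem to be carrying session cookies, even though both browsers on my computer have cookies enabled. How do I successfully scrape data from the website? Thanks in advance!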

$`NULL`
                                                                                                                                                                          V1
1                                                                                                                                                                           
2                                                                                                                                                           Cookies disabled
3                                                                                                                                                                           
4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

$`NULL`
  V1 V2 V3
1         

$`NULL`
                V1
1 Cookies disabled

$`NULL`
  V1
1   
2   
3
user1389960
  • Maybe you'll want to look at using **RCurl** and then parsing the results with the **XML** package? – joran May 11 '12 at 17:41
  • Or use the new `httr` package, which provides easy-to-use wrappers around `RCurl` with options to authenticate: http://cran.r-project.org/web/packages/httr/index.html – Andrie May 11 '12 at 19:23
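
Following the RCurl suggestion in the comments, a minimal sketch of what that might look like is below. It assumes the "Cookies disabled" page appears because `readHTMLTable(base)` fetches the URL without keeping session cookies. The helper names (`make_cookie_handle`, `tables_from_html`, `scrape_tables`) are made up for illustration and are not part of either package. The EZproxy login itself is not shown, because the login URL and form fields depend on the institution.

```r
library(RCurl)
library(XML)

# Create a curl handle that keeps cookies in memory and follows redirects.
# cookiefile = "" switches on libcurl's cookie engine without reading a file.
make_cookie_handle <- function() {
  getCurlHandle(cookiefile = "", followlocation = TRUE)
}

# Parse every <table> in an HTML string into a list of data frames.
tables_from_html <- function(html) {
  doc <- htmlParse(html, asText = TRUE)
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Fetch a page through the cookie-aware handle and extract its tables.
# Reusing the same handle across requests keeps the session cookies.
scrape_tables <- function(url, handle = make_cookie_handle()) {
  html <- getURL(url, curl = handle)
  tables_from_html(html)
}
```

With this in place, `tabs <- scrape_tables(base)` would stand in for the original `readHTMLTable(base)` call. If the site still refuses the request, the proxy probably wants an authenticated session first; in that case the login request would have to be sent through the same handle before fetching the page.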