I'm trying to scrape journal article titles from Web of Knowledge using R and rvest, but I'm having trouble submitting the search form correctly. I want a list of all Econometrica articles from 1960-1970. I'm logged in automatically through my local university library's access.

When I run

library("rvest")
library("httr")
link = "http://isiknowledge.com/wos"
form = html_session(link) %>% html_form() # returns list of 6 forms
form[[4]] = set_values(form[[4]], # set values in form number 4
    product = "WOS",
    range = "ALL",
    action = "search",
    period = "Range Selection", 
    startYear = "1960",
    endYear = "1970",
    range = "ALL",
    'value(select1)' = "SO",
    'value(input1)' = "econometrica",
    formUpdated = "TRUE") 

submit_form(html_session(link), form = form)

I have two problems. First, it submits with '' and not "Econometrica". Second, I receive the following error message:

Error in if (!(submit %in% names(submits))) { : argument is of length zero

There's a Python alternative here, but the code has to be in R. Any help on how to make progress would be much appreciated.

1 Answer

I've had similar trouble with ISI pages, and the problem was that they at least sometimes design their forms with no submit buttons (submission is handled using JavaScript). I examined the link you posted, and that seems to be the case with the fourth form on that page (though I'm not sure if the search image serves as a submit button).
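One way to check this is to list the submit-like fields of the parsed form. The sketch below is only illustrative: `submit_like_fields` and `mock_form` are hypothetical names, the mock stands in for a single element of the list that `html_form()` returns, and it assumes each parsed field carries a `type` element. Which types count as submit controls (submit, image, button) is also my assumption.

```r
# Return the names of fields that look like submit controls.
# Assumes each field is a list with a `type` element.
submit_like_fields <- function(form) {
  is_submit <- function(field) {
    type <- field$type
    !is.null(type) && tolower(type) %in% c("submit", "image", "button")
  }
  names(Filter(is_submit, form$fields))
}

# Hypothetical stand-in for one parsed form with no submit control.
mock_form <- list(
  name = "search",
  method = "POST",
  url = "/search",
  fields = list(
    startYear = list(name = "startYear", type = "text", value = "1960"),
    endYear   = list(name = "endYear",   type = "text", value = "1970")
  )
)

found <- submit_like_fields(mock_form)
length(found)  # zero means there is nothing for the submit step to pick
```

If this comes back empty for the form you are filling in, the missing submit button is the likely cause of the error.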

If this is the problem, then my answer to the question "Submit form with no submit button in rvest" might provide the solution for your case.

In brief, you can inject a fake submit button into your local copy of the parsed form and then submit that. Details of how to do that are in the linked post.
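A minimal sketch of the injection idea, so you can see its shape without following the link. It assumes a pre-1.0 version of rvest (the one with `set_values()` and `submit_form()`), where each form field is a list of class "input". That layout is an internal detail rather than a documented API, so treat the element names as assumptions. `add_fake_submit` and `mock_form` are hypothetical names used for illustration.

```r
# Append a fake submit button to a parsed form.
# Assumption: fields are lists of class "input", as in rvest < 1.0.
add_fake_submit <- function(form) {
  fake_submit_button <- structure(
    list(
      name = NULL,
      type = "submit",   # marks the field as a submit control
      value = NULL,      # empty value, so it adds no request parameter
      checked = NULL,
      disabled = NULL,
      readonly = NULL,
      required = FALSE
    ),
    class = "input"
  )
  form[["fields"]][["submit"]] <- fake_submit_button
  form
}

# Hypothetical stand-in for one parsed form without a submit button.
mock_form <- list(
  name = "search",
  method = "POST",
  url = "/search",
  fields = list(
    startYear = structure(
      list(name = "startYear", type = "text", value = "1960"),
      class = "input"
    )
  )
)

patched <- add_fake_submit(mock_form)
field_names <- names(patched$fields)  # the original field plus "submit"
```

With the real page, you would apply this to the single form you filled in, for example `form[[4]]` after `set_values()`, and pass that one form to `submit_form()` rather than the whole list of forms.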

Tripartio