
I am trying to extract the unemployment rate data from this site. The form on the page has select tags for choosing the year range. I can extract the table for the default years, 2007 to 2017, but I am having trouble setting values for from_year and to_year. Here is the code I have so far:

library(rvest)

session = html_session("https://data.bls.gov/timeseries/LNS14000000")
form = read_html("https://data.bls.gov/timeseries/LNS14000000") %>%
  html_node("table form") %>%
  html_form()
set_values(form, from_year = 2000, to_year = as.numeric(format(Sys.Date(), "%Y"))) # nothing happened if I set the value for years
submit_form(session, form)

It doesn't work as expected.

ulfelder
Gejun
  • I don't think you can do this without using something like `RSelenium`. See here for an example... https://stackoverflow.com/questions/43307090/how-to-select-dropdown-box-using-rselenium/43307980#43307980 However, I notice in this case that bls has an API, so that is probably worth a look first... https://www.bls.gov/developers/ – Andrew Gustar Apr 24 '17 at 16:46

1 Answer


Thanks so much @Andrew!

I was able to extract the data through the BLS API, using the blsAPI package:

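library(rjson)
library(blsAPI)

# Request payload: the series ID and the range of years to fetch
uer1 <- list(
  'seriesid'=c('LNS14000000'),
  'startyear'=2000,
  'endyear'=2009)

# Second argument: API version; third argument: return a data frame
response <- blsAPI(uer1, 2, TRUE)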

The response looks like:

    year period periodName value    seriesID
1   2009    M12   December   9.9 LNS14000000
2   2009    M11   November   9.9 LNS14000000
3   2009    M10    October  10.0 LNS14000000
4   2009    M09  September   9.8 LNS14000000
5   2009    M08     August   9.6 LNS14000000
6   2009    M07       July   9.5 LNS14000000
...
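For plotting or further analysis it may help to turn the year and period columns into a proper date and sort the rows oldest first. The following is a minimal sketch, not part of the original answer. It rebuilds only the six rows shown above by hand so that it runs without a network call, and it assumes the columns may arrive as character or factor, which I have not verified for every version of blsAPI. The names `month` and `uer` are made up for illustration.

```r
# Sample rows copied from the output above; a real call returns more rows.
response <- data.frame(
  year       = c("2009", "2009", "2009", "2009", "2009", "2009"),
  period     = c("M12", "M11", "M10", "M09", "M08", "M07"),
  periodName = c("December", "November", "October",
                 "September", "August", "July"),
  value      = c("9.9", "9.9", "10.0", "9.8", "9.6", "9.5"),
  seriesID   = "LNS14000000",
  stringsAsFactors = FALSE
)

# Turn "M12" into the integer 12, then build a first-of-month date.
month <- as.integer(sub("^M", "", as.character(response$period)))
response$date <- as.Date(sprintf("%s-%02d-01",
                                 as.character(response$year), month))

# Assumption: values may be character or factor, so convert defensively.
response$value <- as.numeric(as.character(response$value))

# The rows shown above are newest first; reorder to oldest first.
uer <- response[order(response$date), c("date", "value")]
print(uer)
```

After this step `uer` holds one row per month with a Date column and a numeric rate, which is the shape most plotting functions expect.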

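Note that the API imposes query limits, such as how many requests can be made per day and how many years a single query can cover; see the BLS developer documentation for the current values.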
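Because a single query can only span a limited number of years, a longer range such as 2000 to 2017 would need to be split into several requests. Here is a rough sketch of the splitting step only. The helper `year_chunks` is hypothetical, and the default window of 10 years is an assumption that should be checked against the limits for the API version in use.

```r
# Hypothetical helper: split [start, end] into consecutive windows
# of at most `width` years. The default width of 10 is an assumption;
# check the BLS documentation for the actual per-query limit.
year_chunks <- function(start, end, width = 10) {
  starts <- seq(start, end, by = width)
  lapply(starts, function(s) {
    c(startyear = s, endyear = min(s + width - 1, end))
  })
}

chunks <- year_chunks(2000, 2017)
print(chunks)

# Each window could then be requested separately and the results stacked.
# Left commented out because it needs network access:
# parts <- lapply(chunks, function(ch) {
#   blsAPI(list(seriesid  = "LNS14000000",
#               startyear = ch[["startyear"]],
#               endyear   = ch[["endyear"]]), 2, TRUE)
# })
# response <- do.call(rbind, parts)
```

Keeping the splitting logic separate from the requests makes it easy to change the window size if the limits differ from what is assumed here.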


Gejun