
I'm having trouble accessing the Energy Information Administration's API through R (https://www.eia.gov/opendata/).

On my office computer, if I try the link in a browser it works, and the data shows up (the full url: https://api.eia.gov/series/?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json).

I am also successfully connected to Bloomberg's API through R, so R is able to access the network.

Since the API is working and not blocked by my company's firewall, and R is in fact able to connect to the Internet, I have no clue what's going wrong.

The script works fine on my home computer, but at my office computer it is unsuccessful. So I gather it is a network issue, but if somebody could point me in any direction as to what the problem might be I would be grateful (my IT department couldn't help).

library(XML)

api.key = "e122a1411ca0ac941eb192ede51feebe"
series.id = "PET.MCREXUS1.M"

my.url = paste("http://api.eia.gov/series?series_id=", series.id,"&api_key=", api.key, "&out=xml", sep="")

doc = xmlParse(file=my.url, isURL=TRUE) # yields error

Error msg:

No such file or directoryfailed to load external entity "http://api.eia.gov/series?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json"
Error: 1: No such file or directory2: failed to load external entity "http://api.eia.gov/series?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json"

I tried some other methods like read_xml() from the xml2 package, but this gives a "could not resolve host" error.
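
One way to narrow down a "could not resolve host" error is to check whether R itself can resolve the host name and which proxy settings it sees. The following is a minimal diagnostic sketch, not a fix; `host.of` and `diagnose` are hypothetical helper names, and it assumes the curl package is installed. On Windows, `curl::ie_get_proxy_for_url()` reports the proxy that the system settings would select for the URL, which can explain why a browser succeeds where R does not.

```r
# Hypothetical helper: extract the host name from a URL.
host.of <- function(url) {
  sub("^[a-z]+://([^/:?]+).*$", "\\1", url)
}

# Hypothetical helper: collect what R can see about name resolution and proxies.
diagnose <- function(url) {
  host <- host.of(url)
  list(
    host = host,
    # NULL here means the DNS lookup failed from within R
    dns = curl::nslookup(host, error = FALSE),
    # proxy environment variables visible to the R session
    env.proxy = Sys.getenv(c("http_proxy", "https_proxy")),
    # proxy chosen by the Windows system settings (skipped on other platforms)
    ie.proxy = if (.Platform$OS.type == "windows") {
      curl::ie_get_proxy_for_url(url)
    } else {
      NULL
    }
  )
}
```

Calling `diagnose(my.url)` at the office and at home and comparing the two results should show whether the DNS lookup or the proxy configuration differs.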

twistedqbit
  • Why not fetch the response with `httr` first and then parse it? Also, you request `json` but parse with `xmlParse`; is that intentional? Use this: `https://api.eia.gov/series/?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=xml` – NelsonGon Jul 02 '19 at 08:55
  • Try this: `res <- httr::GET(my.url); jsonlite::fromJSON(httr::content(res,"text"))` or this: `xml2::read_xml(httr::content(res,"text"))` – NelsonGon Jul 02 '19 at 09:02
  • Sorry, I just tried json as well as XML. Fixed. – twistedqbit Jul 02 '19 at 10:19

2 Answers


To get XML, set the out parameter in your URL to xml:

my.url = paste("http://api.eia.gov/series?series_id=", series.id,"&api_key=", 
               api.key, "&out=xml", sep="")

res <- httr::GET(my.url)
xml2::read_xml(res)

Or :

res <- httr::GET(my.url)
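# xmlParse() expects a file name or XML text, so pass the response body as text
XML::xmlParse(httr::content(res, "text"), asText = TRUE)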

Otherwise, with the URL as originally posted (i.e. &out=json):

res <- httr::GET(my.url)
jsonlite::fromJSON(httr::content(res, "text"))

or, for the XML URL, parse the response text directly:

xml2::read_xml(httr::content(res, "text"))

Please note that this answer only shows how to fetch the data. Whether the result is in the desired form depends on how you intend to process it.
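
If the request fails with a host-resolution error on a corporate network, one thing to try is sending it through the proxy explicitly with `httr::use_proxy()`. A minimal sketch, assuming a hypothetical proxy at `proxy.example.com:8080` (substitute the values your browser uses); `build.url` and `fetch.series` are illustrative names, not part of any package:

```r
# Hypothetical proxy host and port; substitute your organisation's values.
proxy.host <- "proxy.example.com"
proxy.port <- 8080

# Build the request URL from the same pieces used in the question.
build.url <- function(series.id, api.key, out = "xml") {
  paste0("http://api.eia.gov/series/?series_id=", series.id,
         "&api_key=", api.key, "&out=", out)
}

# Fetch the series through the proxy and return the response body as text.
fetch.series <- function(series.id, api.key, out = "xml") {
  res <- httr::GET(build.url(series.id, api.key, out),
                   httr::use_proxy(proxy.host, proxy.port))
  httr::stop_for_status(res)  # raise an R error on HTTP 4xx/5xx
  httr::content(res, as = "text", encoding = "UTF-8")
}
```

The returned text can then be passed to `xml2::read_xml()` or `jsonlite::fromJSON()`, depending on the `out` value requested.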

NelsonGon
  • Using httr::GET also runs into trouble: Error in curl::curl_fetch_memory(url, handle = handle) : Could not resolve host: api.eia.gov – twistedqbit Jul 02 '19 at 10:28
  • As said, the code to simply fetch the data is OK, since it works on my home computer. However, on my corporate desktop at the office I am never able to connect to the API for some reason. I can access the URL in a browser and R is connected to other online APIs, so I need some help figuring out why the connection fails. – twistedqbit Jul 02 '19 at 10:30
  • Sorry, I cannot reproduce. I'm on a Unix system although I don't think that makes any difference. Try using another api to test if you can access it. Perhaps you're under a firewall? – NelsonGon Jul 02 '19 at 10:31
  • See this answer: https://stackoverflow.com/questions/39285570/error-in-curlcurl-fetch-memoryurl-handle-handle-couldnt-connect-to-ser It just might help. – NelsonGon Jul 02 '19 at 10:32
  • I do have access to the Bloomberg API through R and I can open the URL in question in my browser. Can it still be a firewall issue, do you think? – twistedqbit Jul 02 '19 at 10:37
  • Try changing your proxy settings as shown here: https://stackoverflow.com/questions/6467277/proxy-setting-for-r – NelsonGon Jul 02 '19 at 10:39
  • Tried both suggestions. Both give the same errors. (For the second approach you suggested, I used >set http_proxy=http://staff-proxy.ul.ie.8080 in PowerShell and managed to set the desired proxy in R, but the original error message persists.) – twistedqbit Jul 02 '19 at 10:58
  • 1
    Sorry, I am unable to help further. Hopefully someone figures it out and helps you solve it. – NelsonGon Jul 02 '19 at 10:59

If it does not have to be XML output, you can also use the new eia package. (Disclaimer: I'm the author.)

Using your example:

remotes::install_github("leonawicz/eia")
library(eia)
x <- eia_series("PET.MCREXUS1.M")

This assumes your key is set globally (e.g., in .Renviron or previously in your R session with eia_set_key). But you can also pass it directly to the function call above by adding key = "yourkeyhere".

The result returned is a tidyverse-style data frame, one row per series ID and including a data list column that contains the data frame for each time series (can be unnested with tidyr::unnest if desired).

Alternatively, if you set the argument tidy = FALSE, it will return the list result of jsonlite::fromJSON without the "tidy" processing.

Finally, if you set tidy = NA, no processing is done at all and you get the original JSON string output for those who intend to pass the raw output to other canned code or software. The package does not provide XML output, however.
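
The options above can be sketched as follows. This assumes the eia and tidyr packages are installed and uses only the calls and arguments described in this answer (`eia_set_key`, `eia_series`, `tidy`); `fetch.modes` is a hypothetical wrapper name, and the `data` column name is taken from the description above.

```r
# Hypothetical wrapper: fetch one series in each of the output modes.
fetch.modes <- function(id, key) {
  eia::eia_set_key(key)           # set the key for this R session
  tidy.df <- eia::eia_series(id)  # tidy data frame, one row per series ID
  list(
    tidy     = tidy.df,
    unnested = tidyr::unnest(tidy.df, cols = data),  # expand the data list column
    parsed   = eia::eia_series(id, tidy = FALSE),    # list from jsonlite::fromJSON
    raw      = eia::eia_series(id, tidy = NA)        # original JSON string
  )
}
```

For example, `fetch.modes("PET.MCREXUS1.M", "yourkeyhere")` would return all four forms in one list for comparison.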

There are more comprehensive examples and vignettes at the eia package website I created.

leonawicz
  • Based on the domain/URL of your link(s) being the same as, or containing, your user name, you appear to have linked to your own site/a site you're affiliated with. If you do, you *must disclose that it's your site*. If you don't disclose affiliation, it's considered spam. See: [**What signifies "Good" self promotion?**](//meta.stackexchange.com/q/182212) and [the help center on self-promotion](//stackoverflow.com/help/promotion). Disclosure must be explicit, but doesn't need to be formal. When it's your own *personal* content, it can just be something like "on my site…", "on my blog…", etc. – Makyen Jul 18 '19 at 21:36
  • @Makyen For clarification, is the explicit disclaimer at the start of the answer insufficient? Reading those suggested links, my interpretation is that the answer stands well on its own. The supplemental link was to the complete library documentation, which would also later be revised after the software completes peer review and is hosted elsewhere. – leonawicz Jul 18 '19 at 21:52
  • It would be clearer if the disclaimer and the link were closer together, or if you just repeated the `eia` name near the link. As it is, there's really nothing connecting your first statement to the link at the bottom, and the link text doesn't say that it's a link to the `eia` package you authored. Basically, when reading your answer, it's not clear whether you're linking to the same thing you're discussing or to somewhere else that contains examples and vignettes written by someone else. Or, at least, I did not make that connection when I first quickly read your answer. – Makyen Jul 18 '19 at 21:58