
I am trying to download the XBRL data from the SEC site, using the finstr package in R.

The vignette references Apple financial statements from 2013-14. I am going after Abbott (CIK 1800) for mine. I've looked through the data records on the SEC site and the submission is in this folder:

https://www.sec.gov/Archives/edgar/data/1800/000110465920023904

The Apple XML file is named aapl-20140927.xml (the ticker symbol followed by the fiscal period end date). I opened the file in a browser and identified the relevant data.

The Abbott XML file that contains the same information is named abt-20191231x10k59d41b_htm.xml, again with the relevant data.
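For reference, the pieces of the URL seem to fit together as archive root, CIK, accession number (without dashes) and file name. A minimal sketch of that pattern follows; `edgar_instance_url` is a made-up helper name used only for illustration and is not part of finstr or XBRL.

```r
# Hypothetical helper (not from finstr or XBRL): joins the EDGAR archive root,
# the CIK, the accession number and the instance file name into one URL.
edgar_instance_url <- function(cik, accession, file_name) {
  paste("https://www.sec.gov/Archives/edgar/data",
        cik, accession, file_name, sep = "/")
}

# Values taken from the Abbott filing described above.
abt_url <- edgar_instance_url(
  cik       = "1800",
  accession = "000110465920023904",
  file_name = "abt-20191231x10k59d41b_htm.xml"
)
print(abt_url)
```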

Following the vignette, I've added this code:

library(XBRL)  # provides xbrlDoAll()

# Abbott 10-K instance documents from the filings made in 2020 and 2019
xbrl_url2020 <- "https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/abt-20191231x10k59d41b_htm.xml"
xbrl_url2019 <- "https://www.sec.gov/Archives/edgar/data/1800/000104746919000624/abt-20181231.xml"
old_o <- options(stringsAsFactors = FALSE)
# This is the call that fails
xbrl_data_abt2020 <- xbrlDoAll(xbrl_url2020)

This then returns:

Error in fileFromCache(file) : 
Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd': HTTP status was '404 Not Found'

I've read through other questions here and am not sure whether it's a schema issue, whether I've picked the wrong file (no other file in the folder contains all of the information), or whether it's something else.

I've also seen a comment saying that the Financial Statement Data Sets on the SEC site (https://www.sec.gov/dera/data/financial-statement-data-sets.html) contain all the relevant information. My concern with those data sets is that they hold the data as submitted rather than the ratified figures, so they may differ from the final published results.

Appreciate any help possible.

  • The package incorrectly treats absolute https URLs as relative. See https://stackoverflow.com/a/53666384 for more information and how to fix. – Matthew Wightman Sep 13 '20 at 10:05
  • Thank you but tried it and same result! – Foothill_trudger Sep 13 '20 at 20:23
  • What exactly did you try? Can you share the change that you made to the code? The error that you're seeing does look very much like it's treating an `https` URL as relative. – pdw Sep 15 '20 at 10:10
  • It wasn't complex! In the first line I merely took the 's' off: `xbrl_url2020 <- "http://www.sec.gov/Archives/edgar/data/1800/000110465920023904/abt-20191231x10k59d41b_htm.xml"` It appears, from another thread, that there is a caching problem, but I am trying to find out more. – Foothill_trudger Sep 16 '20 at 11:21
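The failing URL in the error message is consistent with the behaviour described in the first comment. The sketch below is an illustration only and is not the XBRL package's actual code: `resolve_naive` and `resolve_fixed` are hypothetical helpers. It assumes a resolver that recognises only `http://` as absolute, so an `https://` schema location gets appended to the filing folder. If that is what happens, it would also explain why changing only the instance URL to `http` makes no difference, since the schema reference inside the filing is still `https`.

```r
# Hypothetical resolver (assumption, not taken from the XBRL package):
# only "http://" is recognised as an absolute URL.
resolve_naive <- function(base_dir, schema_location) {
  if (startsWith(schema_location, "http://")) {
    schema_location
  } else {
    paste0(base_dir, "/", schema_location)
  }
}

# Same idea, but accepting both "http://" and "https://" as absolute.
resolve_fixed <- function(base_dir, schema_location) {
  if (grepl("^https?://", schema_location)) {
    schema_location
  } else {
    paste0(base_dir, "/", schema_location)
  }
}

# Folder of the filing and the schema location seen in the error message.
base_dir <- "https://www.sec.gov/Archives/edgar/data/1800/000110465920023904"
dei_xsd  <- "https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd"

naive_url <- resolve_naive(base_dir, dei_xsd)  # folder + "/" + absolute URL
fixed_url <- resolve_fixed(base_dir, dei_xsd)  # absolute URL left unchanged

print(naive_url)
print(fixed_url)
```

The first printed value has the same shape as the URL that returned the 404; the second is the schema location itself.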
