I have the following two XML links, each of which I want to pass as the `file` argument to `open()`:

table1 = 'https://www.sec.gov/Archives/edgar/data/1103804/000110380417000040/xslForm13F_X01/Form13fInfoTable.xml'
table2 = 'https://www.sec.gov/Archives/edgar/data/1103804/000110380417000040/Form13fInfoTable.xml'

What I have tried:

  • Using raw strings (r'https:// ...)
  • Excluding https:// in each of the path names (to get rid of the colon on a Windows system)
  • Using 'r' within open(), which should be unnecessary anyway because it is the default

There are a number of similar SO questions, but none of them offers a solution that gets rid of this error. The following call looks innocuous, yet I cannot get past the error:

d = open(table1, 'r')
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-110-07d32326a11e> in <module>()
----> 1 d = open(table1, 'r')

FileNotFoundError: [Errno 2] No such file or directory: 'www.sec.gov/Archives/edgar/data/1103804/000110380417000040/xslForm13F_X01/Form13fInfoTable.xml'
Brad Solomon

1 Answer

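The built-in `open` only reads paths on the local filesystem, which is why passing it a URL fails. To fetch a URL, this is the open function you want: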

import urllib          # Python 2
urllib.urlopen('http://example.com')

import urllib.request  # Python 3: the submodule must be imported explicitly
urllib.request.urlopen('http://example.com')
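For Python 3, here is a minimal sketch of how the same parsing code could accept either a URL or a local path. The helper names `open_source` and `parse_xml` are hypothetical, made up for this illustration, and the 30-second timeout is an arbitrary assumption. It also assumes the server answers a plain request; some sites reject requests that lack particular headers, which this sketch does not handle.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def open_source(source, timeout=30):
    """Return a binary file-like object for a URL or a local path.

    Hypothetical helper: anything with an http, https, or file scheme
    goes through urlopen; everything else is treated as a local path.
    """
    scheme = urlparse(source).scheme
    if scheme in ("http", "https", "file"):
        return urllib.request.urlopen(source, timeout=timeout)
    # A Windows path such as C:\data\t.xml parses with scheme "c",
    # so it correctly falls through to the built-in open().
    return open(source, "rb")


def parse_xml(source):
    """Parse XML from a URL or a local path and return the root element."""
    with open_source(source) as fh:
        return ET.parse(fh).getroot()
```

With this in place, `parse_xml(table2)` would attempt to download and parse the raw XML table from the question, while `parse_xml('local_copy.xml')` would read a file on disk. `table1` points at the XSL-rendered view, so it may not parse as plain XML.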
Carl Shiles
    I was following along blindly with [this](http://www.austintaylor.io/lxml/python/pandas/xml/dataframe/2016/07/08/convert-xml-to-pandas-dataframe/) walkthrough, which uses `open`. Thanks, and in 3.x this is `urllib.request.urlopen` – Brad Solomon Jun 18 '17 at 20:01