1

I'm having trouble opening a txt file from an external URL. The code below works fine when reading a downloaded copy of the txt file from my PC, for example:

URL='grst0120.txt'

But it doesn't work if I try to read the same txt file from an external site, for example:

URL='https://downloads.usda.library.cornell.edu/usda-esmis/files/xg94hp534/0c4841048/8w32rn389/grst0120.txt'

The code below opens a txt file from the USDA website and prints every line that contains the word "December". It works fine on a downloaded copy of the file on my PC, but I need another way to open the same file from the internet. I appreciate any help. Code:

import re

URL = "https://downloads.usda.library.cornell.edu/usda-esmis/files/xg94hp534/0c4841048/8w32rn389/grst0120.txt"

# The code fails with this external URL but it works fine if I download the txt file and 
# I change the URL pointing to my PC location, like, URL = "grst0120.txt". 

Stocks = []
LineNum = 0
pattern = re.compile("December", re.IGNORECASE)

with open (URL, 'rt') as myfile:
    for line in myfile:
        LineNum += 1
        if pattern.search(line) != None:
            Stocks.append((LineNum, line.rstrip('\n')))
for Stocks_found in Stocks:
    print("Line " + str(Stocks_found[0]) + ": " + Stocks_found[1])
Giorgos Myrianthous
Davide

2 Answers

3

open() does not accept URLs, only paths to local files. In Python 3.x you can use urllib.request instead:

import urllib.request

URL = "https://downloads.usda.library.cornell.edu/usda-esmis/files/xg94hp534/0c4841048/8w32rn389/grst0120.txt"

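# urlopen() returns a file-like object that yields bytes, not str
with urllib.request.urlopen(URL) as data:
    for line in data:
        # Decode before printing; UTF-8 is an assumption about the file's encoding
        print(line.decode("utf-8", errors="replace").rstrip("\n"))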
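To connect this back to the code in the question, here is a minimal sketch of the same "December" search run over the response from urlopen(). The helper names find_matching_lines and print_matches_from_url are made up for illustration, and UTF-8 is only an assumption about the file's encoding.

```python
import re
import urllib.request

URL = "https://downloads.usda.library.cornell.edu/usda-esmis/files/xg94hp534/0c4841048/8w32rn389/grst0120.txt"


def find_matching_lines(byte_lines, word="December", encoding="utf-8"):
    """Return (line_number, text) pairs for lines containing `word`, ignoring case.

    `byte_lines` is any iterable of bytes objects, such as the response
    returned by urllib.request.urlopen(). The encoding is an assumption.
    """
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    matches = []
    for line_num, raw in enumerate(byte_lines, start=1):
        # The response yields bytes, so decode before searching
        text = raw.decode(encoding, errors="replace").rstrip("\r\n")
        if pattern.search(text):
            matches.append((line_num, text))
    return matches


def print_matches_from_url(url, word="December"):
    """Fetch `url` and print every matching line with its line number."""
    with urllib.request.urlopen(url) as response:
        for line_num, text in find_matching_lines(response, word):
            print(f"Line {line_num}: {text}")
```

Calling print_matches_from_url(URL) should then print lines in the same "Line N: ..." format as the original script. The search is kept separate from the download so it can also be tried on any list of bytes without network access.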
Giorgos Myrianthous
0

One option is to use the urllib module to download the text file to a local folder and then open it from there.

https://stackabuse.com/download-files-with-python/

The use of urllib is explained quite well on that site. I'm sure there is a more efficient way to do this, but it is one way that works.
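A rough sketch of that download-then-open approach, using urllib.request.urlretrieve from the standard library. The function names download_to_file and lines_containing are hypothetical, as is the local file name in the usage note, and UTF-8 is again an assumed encoding.

```python
import urllib.request


def download_to_file(url, dest_path):
    """Save the resource at `url` to `dest_path` and return the local path."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def lines_containing(path, word="December", encoding="utf-8"):
    """Return (line_number, text) pairs from a local file, ignoring case."""
    needle = word.lower()
    with open(path, "rt", encoding=encoding, errors="replace") as f:
        return [
            (line_num, line.rstrip("\n"))
            for line_num, line in enumerate(f, start=1)
            if needle in line.lower()
        ]
```

For example, download_to_file(URL, "grst0120.txt") followed by lines_containing("grst0120.txt") would leave a copy of the file on disk, which the question's original open() code can then read unchanged.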

ShadowNovo