I am trying to accomplish a rather simple task: looping through all .csv files in a specific GitHub repository folder (the one at base_url in the code below).
The following minimal, complete, reproducible example should demonstrate the problem:
import pandas as pd, urllib, requests, os, glob
base_url = 'https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series'
# https://stackoverflow.com/questions/39065921/what-do-raw-githubusercontent-com-urls-represent
base_raw_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series'
#base_dir = os.listdir(base_url)
#base_raw_dir = os.listdir(base_raw_url)
# https://stackoverflow.com/questions/61036695/import-multiple-csv-files-from-github-folder-python-covid-19
csv_files = glob.glob(base_raw_url+'/*.csv')
print(csv_files)
[]
csv_files is an empty list, and both os.listdir() attempts result in:
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series'
How can I simply loop through the directory? Ultimately, I want the complete path (URL) of each .csv file.
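For context on the kind of result I am after: glob and os.listdir() only inspect the local filesystem, so neither can enumerate a remote folder. Below is a minimal sketch of one possible direction, assuming the GitHub REST "contents" endpoint returns a JSON list whose entries carry name, type, and download_url fields. The helper names csv_urls and list_csv_urls are hypothetical, the owner, repository, branch, and folder are read off the URL above, and the network call has not been verified against this repository.

```python
import json
import urllib.request

# Assumption: owner/repo/branch/folder are taken from base_url in the question.
API_URL = (
    "https://api.github.com/repos/CSSEGISandData/COVID-19/contents/"
    "csse_covid_19_data/csse_covid_19_time_series?ref=master"
)


def csv_urls(entries):
    """Return the download URL of every .csv file in a contents listing.

    `entries` is assumed to be the decoded JSON list from the contents
    endpoint: one dict per item with 'name', 'type' and 'download_url'.
    """
    return [
        entry["download_url"]
        for entry in entries
        if entry.get("type") == "file" and entry.get("name", "").endswith(".csv")
    ]


def list_csv_urls(api_url=API_URL):
    """Fetch the folder listing over HTTP and keep only the .csv URLs."""
    with urllib.request.urlopen(api_url) as response:
        return csv_urls(json.load(response))
```

Calling list_csv_urls() should then yield one URL per .csv file, each of which could be passed to pd.read_csv(). Unauthenticated requests to the GitHub API are rate limited, so this is meant for occasional use rather than a tight loop.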