I am trying to download data from the site below using python.

ftp://chain.physics.unb.ca/gps/data/nvd/

When I open the link in a browser, I am redirected to a login page.

I have the appropriate credentials to access the data. After entering them, I am taken to the data page.

Once I have access to this page, I can loop over the URLs and use the wget library to download the required data. The URL of a data file looks like ftp://chain.physics.unb.ca/gps/data/nvd/arvc/2017/03/arvc060B.17_.gz.

I believe I can take it from the data page, using a loop over the URLs to walk through the data. What I am struggling with is getting past the credential page to the data when using the FTP protocol.

Please advise.

  • Does this answer your question? [Python: download a file over an FTP server](https://stackoverflow.com/questions/11768214/python-download-a-file-over-an-ftp-server) – metatoaster Feb 04 '20 at 03:05
  • Thank you for your reply. I am not exactly sure where to place the username and password. Let's say my username is abcdef@gmail.com and the password is abcdef. Any suggestions on how the link to the following file would look? ftp://chain.physics.unb.ca/gps/data/nvd/arvc/2017/03/arvc060B.17_.gz. – chintan thakrar Feb 04 '20 at 03:12
  • If you read through all the comments and answers you will find what you want in that thread. – metatoaster Feb 04 '20 at 03:14
  • I ran the following code: `import wget`, then `file23 = "C:\\Users\\chint\\Documents\\arvc060B.17_.gz"`, then `wget('ftp://chintanthakrar2014@gmail.com:abcdefg@chain.physics.unb.ca/gps/data/nvd/arvc/2017/03/arvc060B.17_.gz', file23)`, but I encountered `TypeError: 'module' object is not callable`. I have changed the credentials for privacy purposes. – chintan thakrar Feb 04 '20 at 03:39
  • You just ran `import wget`, and it's telling you that the module is not callable. Perhaps you want to use `wget.download(...)` as [the documentation on the landing page on pypi](https://pypi.org/project/wget/) states? Also you might want to try the builtin `ftplib` module on the second answer in the linked thread instead? – metatoaster Feb 04 '20 at 04:08
  • Yup, `wget.download` worked perfectly. Thank you sooo much for your help. I highly appreciate it. – chintan thakrar Feb 04 '20 at 21:46
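
The comments above settle on `wget.download(...)` with the credentials embedded in the URL. A minimal sketch of building such a URL follows. `build_ftp_url` is a hypothetical helper, not part of `wget`, and the credentials are the placeholders from the comments. Because the username is an e-mail address, its `@` is percent-encoded here so that it cannot be confused with the `@` that separates the credentials from the host; some clients may tolerate the raw form, but encoding is the safer assumption.

```python
from urllib.parse import quote


def build_ftp_url(user, password, host, path):
    """Return an ftp:// URL with percent-encoded credentials (hypothetical helper)."""
    # safe="" makes quote() encode every reserved character, including "@" and ":".
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"ftp://{credentials}@{host}/{path.lstrip('/')}"


url = build_ftp_url(
    "abcdef@gmail.com",  # placeholder username from the comments
    "abcdef",            # placeholder password from the comments
    "chain.physics.unb.ca",
    "/gps/data/nvd/arvc/2017/03/arvc060B.17_.gz",
)
print(url)

# With the third-party wget package installed, the download itself would be:
# import wget
# wget.download(url, out="arvc060B.17_.gz")
```

The download call is left commented out so the snippet runs without network access or the `wget` package.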

1 Answer

Something like this, perhaps?

# Write the name, modification time and size of every file in one FTP folder to a text file.
import ftplib
from datetime import datetime

ftp = ftplib.FTP('ftp.yours.com', 'u_name', 'pswd')  # placeholder host and credentials

ftp.cwd('')            # put the folder of interest here; '' stays in the current one
ftp.retrlines('LIST')  # print the raw directory listing

filenames = []
ftp.retrlines('NLST', filenames.append)  # collect the bare names

ftp.voidcmd('TYPE I')  # binary mode; many servers refuse SIZE in ASCII mode

# One line per file: name:modified time:size in KB.
with open('C:\\your_path\\test.txt', 'w') as f:
    for filename in filenames:
        # MDTM replies look like "213 20170301123000"; drop the reply code.
        reply = ftp.sendcmd('MDTM ' + filename)
        modified = datetime.strptime(reply[4:].strip(), "%Y%m%d%H%M%S").strftime("%d %b %Y %H:%M:%S")
        size = ftp.size(filename) or 0  # size() returns None if the reply is not 213
        f.write("{}:{}:{:.2f} KB\n".format(filename, modified, size / 1024))

ftp.quit()

Or, maybe this?

import ftplib

# placeholder host and credentials
ftp = ftplib.FTP('ftp.anything.com', 'u_name', 'ps_wd')


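def get_dirs_ftp(folder=""):
    """Return the entries of `folder` that look like directories."""
    # Heuristic: any name without a dot is treated as a directory, so
    # extensionless files are misclassified and dotted folders are missed.
    return [item for item in ftp.nlst(folder) if "." not in item]


def get_all_dirs_ftp(folder=""):
    """Walk the tree level by level and return every directory found, sorted."""
    dirs = []
    new_dirs = get_dirs_ftp(folder)
    while new_dirs:
        dirs.extend(new_dirs)
        # Descend one level: list the sub-directories of each directory just found.
        old_dirs = new_dirs
        new_dirs = []
        for old_dir in old_dirs:
            new_dirs.extend(get_dirs_ftp(old_dir))
    dirs.sort()
    return dirs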

# Collect every directory below the login directory.
all_dirs = get_all_dirs_ftp()

# Directory names, each followed by the raw LIST lines for that directory.
dir_list = []

for dir_name in all_dirs:
    ftp.cwd('/' + dir_name + '/')
    print(dir_name)
    dir_list.append(dir_name)
    ftp.dir(dir_list.append)  # ftp.dir() passes each listing line to the callback

# You probably want to dump the results to a file...
with open('C:/your_path/filenames.csv', 'w') as out_file:
    for line in dir_list:
        out_file.write(line + "\n")

ftp.quit()
print('Done!!')
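
Neither listing above actually downloads a file, which is what the question asks for. The following is a minimal sketch using only the standard-library `ftplib`. The helper names `remote_path`, `download_file` and `download_days` are hypothetical, the credentials are placeholders, and the file-name pattern (station, day of year, a `B` suffix, two-digit year) is inferred from the single example path in the question, so it may not hold for every file on the server.

```python
import ftplib
import os
from datetime import date

HOST = "chain.physics.unb.ca"   # host from the question
BASE_DIR = "/gps/data/nvd"      # base directory from the question


def remote_path(station, day, suffix="B"):
    """Build a path such as /gps/data/nvd/arvc/2017/03/arvc060B.17_.gz.

    The naming pattern is an assumption inferred from one example path.
    """
    doy = day.timetuple().tm_yday  # day of year, 1-366
    name = f"{station}{doy:03d}{suffix}.{day:%y}_.gz"
    return f"{BASE_DIR}/{station}/{day:%Y}/{day:%m}/{name}"


def download_file(ftp, path, local_dir="."):
    """Download one remote file in binary mode and return the local path."""
    local_path = os.path.join(local_dir, os.path.basename(path))
    try:
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR " + path, fh.write)
    except ftplib.all_errors:
        # Do not leave an empty or partial file behind.
        if os.path.exists(local_path):
            os.remove(local_path)
        raise
    return local_path


def download_days(user, password, station, days, local_dir="."):
    """Log in once and fetch the file for each date, skipping missing files."""
    saved = []
    with ftplib.FTP(HOST, user, password) as ftp:
        for day in days:
            path = remote_path(station, day)
            try:
                saved.append(download_file(ftp, path, local_dir))
            except ftplib.error_perm as exc:  # e.g. a 550 "no such file" reply
                print(f"skipped {path}: {exc}")
    return saved


# Example call (placeholder credentials, needs network access):
# download_days("abcdef@gmail.com", "abcdef", "arvc", [date(2017, 3, 1)])
```

Passing the username and password to `ftplib.FTP` directly avoids having to encode them into a URL, and a single login is reused for every file in the loop.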
ASH