I am trying to read a CSV file from a folder on an FTP server. The file has 3072 rows, but when I run the code below, not all of them are read: some rows at the bottom are missing.

import ftplib
import pandas as pd

## FTP host name and credentials
ftp = ftplib.FTP('IP', 'username', 'password')

## Go to the required directory
ftp.cwd("Folder_Name")

names = ftp.nlst()
final_names= [line for line in names if '.csv' in line]

latest_time = None
latest_name = None

#os.chdir(filepath)

for name in final_names:
    
    time1 = ftp.sendcmd("MDTM " + name)
    if (latest_time is None) or (time1 > latest_time):
        latest_name = name
        latest_time = time1

file = open(latest_name, 'wb')

ftp.retrbinary('RETR '+ latest_name, file.write)

dat = pd.read_csv(latest_name)

The CSV file to be read from the FTP server looks like this:

[image not available]

The output of the code is:

[image not available]

Martin Prikryl
  • 188,800
  • 56
  • 490
  • 992

1 Answer

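Make sure you close the file before you try to read it. Until the file is closed, the last chunk of downloaded data may still be sitting in the write buffer rather than on disk, which is why rows at the bottom go missing. Call file.close(), or even better, use with: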

with open(latest_name, 'wb') as file:
    ftp.retrbinary('RETR '+ latest_name, file.write)

dat = pd.read_csv(latest_name)
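
To see why closing matters, here is a minimal sketch that does not involve FTP at all. It writes a small payload to a temporary file and compares the size on disk before and after close(). The helper name sizes_before_and_after_close and the file name demo.csv are made up for illustration, and the exact size seen before closing depends on the buffer size of your Python build.

```python
import os
import tempfile


def sizes_before_and_after_close(payload):
    """Return the on-disk size of a file before and after it is closed."""
    path = os.path.join(tempfile.mkdtemp(), "demo.csv")  # hypothetical file name
    file = open(path, "wb")
    file.write(payload)                  # data may still sit in Python's write buffer
    size_before = os.path.getsize(path)  # what another reader would see right now
    file.close()                         # flushes the buffer to disk
    size_after = os.path.getsize(path)
    return size_before, size_after


before, after = sizes_before_and_after_close(b"a,b\n1,2\n" * 10)
print(before, after)
```

If the first number is smaller than the second, anything reading the file before close() would have seen truncated data, which matches the symptom in the question.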

If you do not need to store the file on the local file system, and the file is not too large, you can download it to memory only, as described in:
Reading files from FTP server to DataFrame in Python
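
A minimal sketch of the in-memory approach, assuming an already connected ftplib.FTP object. The helper name download_to_memory is hypothetical; it only relies on FTP.retrbinary and io.BytesIO.

```python
from io import BytesIO


def download_to_memory(ftp, remote_name):
    """Download a remote file into an in-memory buffer and return it.

    `ftp` is assumed to be a connected ftplib.FTP instance
    (or any object with a compatible retrbinary method).
    """
    buffer = BytesIO()
    # retrbinary calls buffer.write once per received chunk
    ftp.retrbinary('RETR ' + remote_name, buffer.write)
    buffer.seek(0)  # rewind so the next reader starts at the first byte
    return buffer
```

With the names from the question, usage would be dat = pd.read_csv(download_to_memory(ftp, latest_name)). There is no file to close here, so the truncation problem cannot occur, but the whole file has to fit in memory.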


The pandas.read_csv documentation also states that it accepts FTP URLs directly,
so this should work as well:

pd.read_csv("ftp://username:password@example.com/remote/path/" + latest_name)
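
If the username or password contains characters such as @ or :, the URL above becomes ambiguous. The sketch below builds the URL with percent-encoded credentials using urllib.parse.quote. The helper name build_ftp_url is hypothetical, and whether the FTP client behind pd.read_csv decodes percent-encoded credentials is an assumption you should verify against your pandas and Python versions.

```python
from urllib.parse import quote


def build_ftp_url(host, username, password, remote_path):
    """Build an ftp:// URL, percent-encoding the credentials and the path."""
    user = quote(username, safe='')   # encode every reserved character
    pwd = quote(password, safe='')
    path = quote(remote_path)         # keeps '/' as the path separator
    return 'ftp://{}:{}@{}{}'.format(user, pwd, host, path)


url = build_ftp_url('example.com', 'username', 'p@ss:word', '/remote/path/file.csv')
print(url)
```

The resulting string can then be passed to pd.read_csv in place of the hand-assembled URL.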
Martin Prikryl