
I've been looking into this for a while, but I'm finding it hard to find any examples for my specific case. I want to grab all the CSVs in a folder on an FTP server, combine them, and then display them. I've been able to grab single files fine, but when working with multiple files and combining them I tend to get an error stating:

TypeError                                 Traceback (most recent call last)
<ipython-input-14-7b3417be9f4e> in <module>
     19    print (mycsvdir)
     20 
---> 21 csvfiles = glob.glob(os.path.join(mycsvdir , '*.csv'))
     22 dataframes = []
     23 for csvfile in csvfiles:

c:\users\xxx\appdata\local\programs\python\python37-32\lib\ntpath.py in join(path, *paths)
     74 # Join two (or more) paths.
     75 def join(path, *paths):
---> 76     path = os.fspath(path)
     77     if isinstance(path, bytes):
     78         sep = b'\\'

TypeError: expected str, bytes or os.PathLike object, not list

I combined all of them into a single file, and it shouldn't just be a list, so I'm guessing I've done something fundamentally wrong. Full code:

import glob
import os
import pandas as pd
import ftplib
from ftplib import FTP
def grabFile(ftp_obj, filename):
    localfile = open(filename, 'wb')
    ftp.retrbinary('RETR ' + filename, localfile.write, 1024)

ftp = FTP('f20-preview.xxx.com')
ftp.login(user='xxx', passwd = 'xxx')
ftp.cwd('/testfolder/')


mycsvdir = []
ftp.dir(mycsvdir.append)
files = []
for line in mycsvdir:
   print (mycsvdir)

csvfiles = glob.glob(os.path.join(mycsvdir , '*.csv'))
dataframes = []
for csvfile in csvfiles:
    df = pd.read_csv(csvfile)
    dataframes.append(df)

result = pd.concat(dataframes, ignore_index=True)

result.to_csv('all.csv', index=False)


data = pd.read_csv('all.csv') 
data.head()  

I'm relatively new to Python, and a lot of my experience comes from reading very old posts and lessons on the matter. I apologize for my naivete.

Hot Java

1 Answer

mycsvdir = []
...    
csvfiles = glob.glob(os.path.join(mycsvdir, '*.csv'))

mycsvdir is a list, but os.path.join expects a str, bytes or os.PathLike object as its first argument.

>>> root = 'a:\\b\\'
>>> f = 'foo.txt'
>>> os.path.join(root,f)
'a:\\b\\foo.txt'

With a list of file names, iterate over the list and create a path for each name.

>>> fnames = ['a.txt', 'b.txt', 'c.txt']
>>> for name in fnames:
    print(os.path.join(root,name))

a:\b\a.txt
a:\b\b.txt
a:\b\c.txt
>>> 

Related:
Using Python's ftplib to get a directory listing, portably
Python: How to get list of file and use wildcard in FTP directory?

There are a number of others; search for `python ftp get list of files` or `python ftp list files`.
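For illustration, a minimal sketch of that approach, assuming the placeholder host, credentials, and `/testfolder/` directory from the question; `nlst()` is used here instead of `dir()` because it returns bare file names that are easy to filter:

from ftplib import FTP

ftp = FTP('f20-preview.xxx.com')      # placeholder host from the question
ftp.login(user='xxx', passwd='xxx')   # placeholder credentials
ftp.cwd('/testfolder/')

# nlst() returns a plain list of file names, unlike the formatted
# listing lines that ftp.dir() produces.
remote_names = ftp.nlst()
csv_names = [name for name in remote_names if name.endswith('.csv')]

# Download each remote CSV to a local file of the same name.
for name in csv_names:
    with open(name, 'wb') as localfile:
        ftp.retrbinary('RETR ' + name, localfile.write)

ftp.quit()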

wwii
  • I've looked over them and I understand. So am I understanding correctly that you think the best path is to output a list of all files and then run a command for each? We'd really like to combine the content of each file and output that into a single file. I've read through both links and I'm able to output everything from the directory, but I still get the same error at the end. – Hot Java Aug 02 '19 at 15:57
  • Get a list of all the files in the FTP directory; I don't think you will need `glob` if you just issue a command to the FTP server and tell it which directory you want to look at. Once you get the list of file names, iterate over them, filter for CSV files (`filename.endswith('.csv')`), and retrieve those files from the server. Search for `python concatenate multiple csv files` or something similar - there are a plethora of SO Q&As regarding combining many CSV files into one file. – wwii Aug 02 '19 at 17:56
  • You can [catch the error](https://docs.python.org/3/tutorial/errors.html#handling-exceptions) and inspect/print/log the relevant *variables* in the except suite to see what is happening. If you are somehow passing a list to `os.path.join` you have to figure out why that is happening. Without example inputs/data it isn't easily testable for us. Please read [mcve]. – wwii Aug 02 '19 at 17:58
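Building on the comment above, a minimal sketch of combining the downloaded CSVs with pandas, assuming they were saved to the current working directory by the earlier snippet:

import glob

import pandas as pd

# Pick up every CSV that was downloaded to the current directory.
csvfiles = glob.glob('*.csv')

# Read each file into a DataFrame and stack them into a single frame.
dataframes = [pd.read_csv(csvfile) for csvfile in csvfiles]
result = pd.concat(dataframes, ignore_index=True)

result.to_csv('all.csv', index=False)
print(result.head())

Note that on a second run 'all.csv' itself will match '*.csv', so either write the combined file somewhere else or filter it out of the list first.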