1

I use the following lines of code to loop over different files in a folder:

import os
import pandas as pd

files_in_folder_1 = [os.path.join(path1, f) for f in os.listdir(path1) if os.path.isfile(os.path.join(path1, f))]

files_in_folder_2 = [os.path.join(path2, f) for f in os.listdir(path2) if os.path.isfile(os.path.join(path2, f))]

for file1, file2 in zip(files_in_folder_1, files_in_folder_2):
    with open(file1) as f1, open(file2) as f2:

        dftask = pd.read_csv(file2)
        dfresource = pd.read_csv(file1)

At the end of all operations I want to save the files in another directory under the same file names. How should I do that? I use this:

dftask.to_csv(r'path\file1.csv')
dfresource.to_csv(r'path\file2.csv')

However, with these lines the CSV file is overwritten on every iteration of the loop over all files.

What is the solution?

F1990

4 Answers

2

os.path.basename will give you the file name. Join it to the new directory path and save the file there:

new_dir = "path/to/dir/"
for file1, file2 in zip(files_in_folder_1, files_in_folder_2):
    dftask = pd.read_csv(file2)
    dfresource = pd.read_csv(file1)
    # work on df's  .......

    # save to new dir   
    dftask.to_csv(os.path.join(new_dir,os.path.basename(file2)))
    dfresource.to_csv(os.path.join(new_dir,os.path.basename(file1)))
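
As a minimal sketch of why this avoids the overwriting, the path mapping can be isolated in a small helper. The name output_path and the example folder and file names are hypothetical, chosen only for illustration:

```python
import os


def output_path(src_path, new_dir):
    """Return the path inside new_dir that keeps the file name of src_path."""
    return os.path.join(new_dir, os.path.basename(src_path))


# Hypothetical source paths, used only for illustration.
sources = [
    os.path.join("folder_1", "a.csv"),
    os.path.join("folder_1", "b.csv"),
]
targets = [output_path(p, "out") for p in sources]

# Every source file maps to its own target, so no iteration of the loop
# writes over the result of an earlier one.
assert len(set(targets)) == len(sources)
```

Note that to_csv does not create missing directories, so new_dir has to exist before saving; os.makedirs(new_dir, exist_ok=True) is one way to ensure that.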

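If you open the files with open() first, you can get the name from the file object's .name attribute. It holds the path that was passed to open(), so apply os.path.basename to it as well:

new_dir = "path/to/dir/"
for file1, file2 in zip(files_in_folder_1, files_in_folder_2):
    with open(file1) as f1, open(file2) as f2:
        # read_csv accepts an open file object as well as a path
        dftask = pd.read_csv(f2)
        dfresource = pd.read_csv(f1)

    # .name is still available after the files are closed
    dftask.to_csv(os.path.join(new_dir, os.path.basename(f2.name)))
    dfresource.to_csv(os.path.join(new_dir, os.path.basename(f1.name)))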
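
A short check of what .name holds may help here: for a file opened by path, it is the path passed to open(), not just the file name, so joining it directly onto another directory would not give the intended target. This is a sketch that uses a temporary directory; the file name tasks.csv is a made-up example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input file, created only for this demonstration.
    src = os.path.join(tmp, "tasks.csv")
    with open(src, "w") as f:
        f.write("id\n1\n")

    with open(src) as f:
        name_attr = f.name  # the path that was passed to open()

    # Strip the directory part to get the bare file name.
    base = os.path.basename(name_attr)
```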
Padraic Cunningham
0

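As far as I understand your question, you can use 'path/' + file1.split("/")[-1] to save all files under their original names. Here file1 is already a path string, so it has no .name attribute, and a trailing backslash inside a string literal such as 'path\' would escape the closing quote.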
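
Splitting on "/" only works when the path actually uses forward slashes. The sketch below uses the standard ntpath module (the Windows flavour of os.path) so that it behaves the same on any platform; the path itself is a hypothetical example:

```python
import ntpath

# Hypothetical Windows-style path, used only for illustration.
win_path = r"C:\data\folder_1\tasks.csv"

# There is no "/" in the string, so the split returns the whole path.
naive_name = win_path.split("/")[-1]

# ntpath.basename understands backslash separators.
safe_name = ntpath.basename(win_path)
```

In real code os.path.basename picks the right rules for the platform the script runs on.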

zuku
0

With the above code, when the loop exits, dftask and dfresource will each hold only the DataFrame from the last iteration of the loop.

If you want to consolidate all the DataFrames, create a list containing each individual DataFrame and concatenate them with pd.concat.

import os
import pandas as pd

files_in_folder_1 = [os.path.join(path1, f) for f in os.listdir(path1) if os.path.isfile(os.path.join(path1, f))]

files_in_folder_2 = [os.path.join(path2, f) for f in os.listdir(path2) if os.path.isfile(os.path.join(path2, f))]

dftask_list = []
dfresource_list = []
for file1, file2 in zip(files_in_folder_1, files_in_folder_2):
    with open(file1) as f1, open(file2) as f2:

        dftask_list.append(pd.read_csv(file2))
        dfresource_list.append(pd.read_csv(file1))

dftask = pd.concat(dftask_list)
dfresource = pd.concat(dfresource_list)

Note: You may need to reset index after this.

dftask = dftask.reset_index(drop=True)
dfresource = dfresource.reset_index(drop=True)
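
As an alternative sketch, pd.concat can renumber the rows in the same call through its ignore_index parameter. The two small frames below are made-up data standing in for the per-file DataFrames:

```python
import pandas as pd

# Made-up frames standing in for the DataFrames read from each file.
frames = [
    pd.DataFrame({"id": [1, 2]}),
    pd.DataFrame({"id": [3]}),
]

# ignore_index=True discards the per-file indexes and numbers rows from 0.
combined = pd.concat(frames, ignore_index=True)
```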
shanmuga
0

You can use os.path.split(). The function returns a tuple whose second element is the file name. Example:

f1name = os.path.split(file1)[1]
f2name = os.path.split(file2)[1]

Then you can use os.path.join() to join it with the other directory and get the resulting path. Example:

file1newpath = os.path.join(otherdir, os.path.split(file1)[1])
file2newpath = os.path.join(otherdir, os.path.split(file2)[1])

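Then you can use the above paths to save the files. In the question dftask is read from file2 and dfresource from file1, so each frame goes to the path derived from its own source file:

dftask.to_csv(file2newpath)
dfresource.to_csv(file1newpath)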

Demo for os.path.split() (run on Windows):

>>> import os.path
>>> os.path.split(r'C:\Users\temp\somedir\somefile.csv')[1]
'somefile.csv'
Anand S Kumar