merge 1000s of csv with same name in different subdirectories

Question

I have 1000 subdirectories (error1 to error1000), each containing three CSV files (rand.csv, run_error.csv, swe_error.csv). Each CSV has an index row. I need to merge the CSV files that share a filename, so that I end up with, e.g., rand_merge.csv containing the index row and 1000 rows of data.

I followed the post "Merge multiple csv files with same name in 10 different subdirectory", which gives me

KeyError: 'filename'

I can't figure out how to fix it, so any help is appreciated. Thanks.

Update: Here's the exact code, which came from the linked post above:

import pandas as pd
import glob

CONCAT_DIR = "./error/files_concat/"

# Use glob module to return all csv files under root directory. Create DF from this.
files = pd.DataFrame([file for file in glob.glob("error/*/*")], columns=["fullpath"])


# Split the full path into directory and filename
files_split = files['fullpath'].str.rsplit("\\", 1, expand=True).rename(columns={0: 'path', 1:'filename'})


# Join these into one DataFrame
files = files.join(files_split)


# Iterate over unique filenames; read CSVs, concat DFs, save file
for f in files['filename'].unique():
    paths = files[files['filename'] == f]['fullpath'] # Get list of fullpaths from unique filenames
    dfs = [pd.read_csv(path, header=None) for path in paths] # Get list of dataframes from CSV file paths
    concat_df = pd.concat(dfs) # Concat dataframes into one
    concat_df.to_csv(CONCAT_DIR + f) # Save dataframe

Please include the code that generates the error you are seeing. — Engineero, Jul 01 '19 at 16:39

score 0 · Answer 1 · answered Jul 01 '19 at 20:40

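I found my mistake. The separator passed to rsplit needed to be "/", not "\". My paths use "/", so splitting on "\" never split anything: the result had only one column, and the 'filename' column was never created, hence the KeyError.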

files_split = files['fullpath'].str.rsplit("/", n=1, expand=True).rename(columns={0: 'path', 1:'filename'})
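
The fix above still hard-codes a path separator, so it would break again on a system that uses the other one. A minimal sketch of a separator-independent variant is below, using pathlib to take the file name from each path. The function name `merge_by_filename` and the `_merge` suffix are illustrative choices, not from the linked post. It also assumes the "index row" in the question is a header row, so it reads each file with its header (unlike `header=None` in the code above) and writes that header once.

```python
from collections import defaultdict
from pathlib import Path

import pandas as pd


def merge_by_filename(root, out_dir):
    """Concatenate same-named CSVs found in the immediate subdirectories of root.

    Hypothetical helper: returns a dict mapping each original file name
    to the path of the merged file that was written.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group paths by file name; Path.name works with either path separator.
    groups = defaultdict(list)
    for path in sorted(root.glob("*/*.csv")):
        if path.parent.resolve() == out_dir.resolve():
            continue  # skip merged files left over from an earlier run
        groups[path.name].append(path)

    written = {}
    for name, paths in groups.items():
        # Assumption: the first row of every file is a header row.
        frames = [pd.read_csv(p) for p in paths]
        merged = pd.concat(frames, ignore_index=True)
        target = out_dir / f"{Path(name).stem}_merge.csv"
        merged.to_csv(target, index=False)  # header written once, no extra index column
        written[name] = target
    return written
```

With the layout from the question, this could be called as `merge_by_filename("error", "error/files_concat")`, which would write rand_merge.csv, run_error_merge.csv and swe_error_merge.csv into the output directory.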

zhl
