I have a folder that contains a variable number of files, and each file has a variable string in the name. For example:

my_file V1.csv
my_file V2.csv
my_file something_else.csv

I would need to:

  1. Load all the files whose names start with "my_file"
  2. Concatenate all of them into a single dataframe

Right now I do this with a separate pd.read_csv call for each file, and then concatenate the results.

This is not ideal: every time the files in the source folder change, I need to modify the script.

Is it possible to automate this process, so that it works even if the source files change?

Steven Rumbalski
Roberto Bertinetti
  • This question is already covered https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe – devesh Aug 14 '18 at 15:11

2 Answers

1

You can combine glob, pandas.concat and pandas.read_csv fairly easily. Assuming the CSV files are in the current working directory (usually the folder you run your script from):

import glob

import pandas as pd

df = pd.concat([pd.read_csv(f) for f in glob.glob('my_file*.csv')])
asongtoruin
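If the files are not in the working directory, the same idea can be wrapped in a small function. The following is only a sketch: the function name `load_matching_csvs` and its `folder` and `prefix` parameters are made up for illustration, and it assumes all matching files share the same columns. It uses pathlib instead of glob, sorts the paths so the row order is reproducible, and passes ignore_index=True so the combined dataframe gets a fresh 0..n-1 index.

```python
from pathlib import Path

import pandas as pd


def load_matching_csvs(folder, prefix="my_file"):
    """Read every CSV in `folder` whose name starts with `prefix` into one DataFrame.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # sorted() gives a reproducible order; the raw glob order is not guaranteed.
    paths = sorted(Path(folder).glob(f"{prefix}*.csv"))
    if not paths:
        # pd.concat raises ValueError on an empty list, so fail with a clearer message.
        raise FileNotFoundError(f"no files matching {prefix}*.csv in {folder}")
    # ignore_index=True renumbers the rows instead of repeating each file's own index.
    return pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)
```

Calling `load_matching_csvs("path/to/folder")` picks up whatever matching files exist at that moment, so the script does not need editing when files are added or removed.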
0
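import os

import pandas as pd

directory = "."  # assumption: the folder that contains the CSV files

frames = []
for filename in os.listdir(directory):
    if filename.startswith("my_file") and filename.endswith(".csv"):
        frames.append(pd.read_csv(os.path.join(directory, filename)))

df = pd.concat(frames, ignore_index=True)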