
I want to read CSV files in the order of the datetime in their names. For example:

a_20210101043036.csv
a_20210201043034.csv
a_20210302103422.csv

I need to read the files from the earliest date to the latest, then concatenate them with the original file and drop the duplicates. My problem is reading the CSV files in order of the datetime in their names. How can I do this?
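To make the ordering concrete, here is a minimal sketch of sorting the names by the embedded datetime. It assumes every name ends in an underscore, a 14-digit `YYYYMMDDHHMMSS` stamp, and `.csv`, as in the examples above; `STAMP` and `file_timestamp` are hypothetical names used only for illustration.

```python
import re
from datetime import datetime

# Assumed pattern: any prefix, an underscore, 14 digits (YYYYMMDDHHMMSS), then ".csv".
STAMP = re.compile(r"_(\d{14})\.csv$")


def file_timestamp(name):
    """Return the datetime embedded in a file name, or None if the name does not match."""
    match = STAMP.search(name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


names = [
    "a_20210302103422.csv",
    "a_20210101043036.csv",
    "a_20210201043034.csv",
]

# Skip names without a timestamp, then order the rest from oldest to newest.
dated = [n for n in names if file_timestamp(n) is not None]
ordered = sorted(dated, key=file_timestamp)
print(ordered)
```

Because the stamp is zero-padded, sorting the names as plain strings gives the same order when all files share one prefix; parsing the datetime only matters if the prefixes differ.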

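import os
import pandas as pd

# Existing data that the dated files are appended to.
df1 = pd.read_csv('test.csv', delimiter='|')

# os.listdir does not expand '~', so expand it explicitly.
folder = os.path.expanduser('~/Desktop/work/KINGFOOK/')

# The timestamp is zero-padded (YYYYMMDDHHMMSS), so sorting the names
# also orders the files from the earliest to the latest.
frames = [df1]
for filename in sorted(os.listdir(folder)):
    if filename.endswith('.csv'):
        # listdir returns bare names, so join them with the folder path.
        frames.append(pd.read_csv(os.path.join(folder, filename)))

# Concatenate once; the original loop overwrote df on every iteration.
df = pd.concat(frames, ignore_index=True)

# Keep the most recent row for each id.
df = df.drop_duplicates(subset='id', keep='last')
print(df)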
hang
  • Does this answer your question? [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) – Prayson W. Daniel Jun 04 '21 at 05:24
  • Look for the answers that use `pathlib`, as that is the modern Python approach. If you want the files sorted, collect all CSVs with `pathlib.Path(csv_directory).glob("*.csv")`, sort them, read them into data frames, then remove the duplicates. – Prayson W. Daniel Jun 04 '21 at 05:27
  • Thank you for your solution. I know I can write ```df2 = pd.read_csv('a_*.csv')```, but how can I load the data into df2 in order from the oldest file to the newest? – hang Jun 04 '21 at 05:40
  • Are all your files named like your example or are there variations? – RajeshM Jun 04 '21 at 06:21
  • Yes, all of them are named by datetime. – hang Jun 04 '21 at 06:57
  • Around 10,000 files. – hang Jun 04 '21 at 06:58
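
The `pathlib` approach suggested in the comments could be sketched roughly as follows. This is only an illustration under stated assumptions: the dated files are comma-delimited and share the columns of the pipe-delimited base file, every dated file matches `a_*.csv`, and `load_in_date_order` is a hypothetical helper name, not part of pandas.

```python
from pathlib import Path

import pandas as pd


def load_in_date_order(csv_directory, base_csv, key="id"):
    """Append every a_*.csv in csv_directory to base_csv, oldest first,
    and keep only the last row seen for each key."""
    # Start from the existing file, which is pipe-delimited in the question.
    frames = [pd.read_csv(base_csv, delimiter="|")]

    # Zero-padded timestamps sort chronologically as plain strings.
    paths = sorted(Path(csv_directory).expanduser().glob("a_*.csv"),
                   key=lambda p: p.name)
    for path in paths:
        # Assumption: the dated files use the default comma delimiter.
        frames.append(pd.read_csv(path))

    # A single concat avoids copying the growing frame on every iteration.
    combined = pd.concat(frames, ignore_index=True)

    # Rows from later files come later, so keep="last" keeps the newest one.
    return combined.drop_duplicates(subset=key, keep="last")
```

It would be called as `df = load_in_date_order('~/Desktop/work/KINGFOOK/', 'test.csv')`. With around 10,000 files, collecting the frames in a list and concatenating once is likely to be noticeably faster than calling `pd.concat` inside the loop.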
