Appending multiple CSV files and creating a new column with the filename in python

Question

I am trying to work with pandas library if there a way possible to make the filename as a column name for example, my files names are with dates.

stock_2019-10-11.csv,
stock_2019-11-11.csv.

I want to make 2 different columns with the filenames and get the append the values something I expect to get out a CSV file as :

   coulmns-primary_key, article_numerber,stock_2019-10-11,stock_2019-11-11

data-0   101,201,4,2
data-1   102,301,5,2

something like above, the new columns have values coming in from the CSV's merged.

import pandas as pd
import glob
import os
import sys
import csv

data = [] # pd.concat takes a list of dataframes
for csv in globbed_files(my directiry of files):
    frame = pd.read_csv(csv,encoding='utf_16',error_bad_lines=False,index_col=False)
    frame['filename'] = os.path.basename(csv)
    data.append(frame)

frame1 = pd.concat(data, ignore_index=True

)

Does this answer your question? [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) — shaik moeed, Nov 14 '19 at 09:55
@LIONELJOSEPH I meant that you will add to the question a sample of them, like you did for the output, so we can see the headers. And if headers are different between files, please explain the logic for each different name. — Aryerez, Nov 14 '19 at 11:28
@Aryerez all the files have the same columns , I tried setting the index not helpful though New_PID stock New_WAWI DY-120010771 1 120010771 DY-120010796 2 120010796 — LIONEL JOSEPH, Nov 14 '19 at 11:44

score 0 · Answer 1 · answered Nov 14 '19 at 11:51

Firstly add filename as column name toa particular file, then add each file to dataframe.write dataframe to csv (considering each file has a 1 column.Customize the column header as per your columns)

import pandas as pd

df=pd.DataFrame()
filenames=["C:/Users/sghungurde/Documents/server2.csv","C:/Users/sghungurde/Documents/server3.csv"]
i=0
while(i<len(filenames)):
    extracting filename from filepath 
    c1= (filenames[i].split("/")[4]).split(".")[0]
    reading csv file and assigning column name to header
    f1=pd.read_csv(filenames[i],names=[c1])
    adding file column to dataframe
    df[c1]=f1[c1]
    i+=1

print(df)
writing final df merging result to csv
df.to_csv("C:/Users/sghungurde/Documents/merge.csv",index=False)

output

     server2       server3
 209.10.31.50  609.10.31.50
 204.12.31.53  704.12.31.53
 203.12.31.53  903.12.31.53
 102.71.99.13  102.71.99.13

is it possible to add an index to the records? As we cannot add Index to a data frame — LIONEL JOSEPH, Nov 14 '19 at 12:41
while we dump data frame to csv ,index is automatically get applied,if we don't want we can set index=False — SRG, Nov 14 '19 at 17:15

Appending multiple CSV files and creating a new column with the filename in python

1 Answers1