I have written some code (thanks to) that groups by the column I need to keep as it is and sums the target columns:
import pandas as pd
import glob as glob
import numpy as np
#Read excel and Create DF
all_data = pd.DataFrame()
for f in glob.glob(r'C:\Users\Sarah\Desktop\IDPMosul\Data\2014\09\*.xlsx'):
    df = pd.read_excel(f,index_col=None, na_values=['NA'])
    df['filename'] = f
    data = all_data.append(df,ignore_index=True)
#Group and Sum
result = data.groupby(["Date"])["Families","Individuals"].agg([np.sum])
#Save file
file_name = r'C:\Users\Sarah\Desktop\U2014.csv'
result.to_csv(file_name, index=True)
I think the problem is here:
#Save file
file_name = r'C:\Users\Sarah\Desktop\U2014.csv'
result.to_csv(file_name, index=True)
The code gives me the result I want, but it only takes into account the last file it iterates through. I need to save the sums from all of the files. Thank you.
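A possible cause, assuming the three lines after the `for` are the loop body: `all_data.append(df)` returns a new frame and never changes `all_data`, so on every pass `data` is just the empty frame plus the current file, and only the last file survives. A minimal sketch of one way around this is to collect the frames in a list and combine them once with `pd.concat`. The helper names `read_frames` and `combine_and_sum` and the sample values are made up for illustration; only the column names come from the code above.

```python
import glob

import pandas as pd


def read_frames(pattern):
    """Read every workbook matching pattern, tagging rows with their file."""
    frames = []
    for f in glob.glob(pattern):
        df = pd.read_excel(f, index_col=None, na_values=["NA"])
        df["filename"] = f
        frames.append(df)  # keep every file, not just the last one
    return frames


def combine_and_sum(frames):
    """Stack all per-file frames, then sum the count columns per Date."""
    data = pd.concat(frames, ignore_index=True)
    return data.groupby("Date")[["Families", "Individuals"]].sum()


# In-memory stand-ins for two workbooks (hypothetical values).
a = pd.DataFrame({"Date": ["2014-09-01", "2014-09-02"],
                  "Families": [1, 2], "Individuals": [5, 9]})
b = pd.DataFrame({"Date": ["2014-09-01"],
                  "Families": [3], "Individuals": [7]})

result = combine_and_sum([a, b])
print(result)
```

With the real files this would be `combine_and_sum(read_frames(pattern)).to_csv(file_name, index=True)`, where `pattern` is the same glob string as in the question. Two differences from the original are worth noting: `.sum()` writes a single header row, whereas `.agg([np.sum])` adds a second `sum` level, and `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.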