I have CSV files (say 30 of them) and I want to calculate the average across all 30 files, using the corresponding values from each, and write the result to a new output.csv file.
Example CSV file (each file has 13 columns and 16 rows):
| Dataset | VALUE1 | VALUE2 |
|:---- |:------:| -----:|
| Name1 | 2.4 | 4.2 |
| Name2 | 3.5 | 9.3 |
| Name3 | 4.6 | 11.5 |
Now I have 30 CSV files like this, where the first row is a header and the first column contains string names.
What I want is, for every position in the table, to take the average of the 30 corresponding values (e.g. the VALUE1 entry in the Name1 row of every file) and write that average to the same position in the output file. This should be done for every position except the first row and the first column, since those contain strings.
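To illustrate what I mean (with placeholder file names, only to show the computation I am after, not code I am actually running):

import pandas as pd

# Placeholder file names just to illustrate; in reality there are 30 files.
paths = ['file1.csv', 'file2.csv', 'file3.csv']

# Use the first (string) column as the index so only the numeric columns take part.
frames = [pd.read_csv(p, index_col=0) for p in paths]

# Element-wise sum of corresponding cells, divided by the number of files.
mean_df = sum(frames) / len(frames)
mean_df.to_csv('output.csv')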
I have tried both pandas and NumPy, but so far no luck.
My code:
import pandas as pd
from pathlib2 import Path
import numpy as np

root = '../Dataset'

# collect every summary_x.csv found one level below the root directory
file_names_list = []
entries = Path(root)
for entry in entries.iterdir():
    if entry.is_dir():
        for file in entry.iterdir():
            if file.is_file() and file.name == 'summary_x.csv':
                file_names_list.append(file)
print(file_names_list)

# accumulate the element-wise sum of the numeric columns of every file
df_final = pd.DataFrame()
cols = [i for i in range(1, 13)]  # columns 1-12, i.e. everything except the first (name) column
for file_name in file_names_list:
    df = pd.read_csv(file_name, skiprows=0, usecols=cols)
    print(df)
    df_final = df_final.add(df.reset_index(), fill_value=0)

df_final.to_csv('output.csv')
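As far as I can tell this only accumulates the sum, so I assume the average would still need a final division by the number of files after the loop, something like:

df_final = df_final / len(file_names_list)  # turn the accumulated sum into a mean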
Edit: With the updated code the DataFrames are added, but the column order in the output is not the same as in the original files, and there are empty cells, I suppose because 0.0 was added many times.
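For reference, I wonder whether something along these lines would avoid the column-order and empty-cell problems. This is only a sketch reusing file_names_list from the code above, and it assumes every file has the same 'Dataset' names in its first column, as in the example:

frames = [pd.read_csv(f, index_col='Dataset') for f in file_names_list]
avg = pd.concat(frames).groupby(level=0, sort=False).mean()  # average corresponding rows, keeping the original row order
avg.to_csv('output.csv')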