I am trying to analyze mechanical computation results using Python and export them to Excel automatically. My raw data consists of various types of computation (each stored in a different text file). For each type, I have various load cases (loading conditions). For each load case, I output the results are various datapoints. Finally, at each datapoint, I need to store one vector result (with 3 components). So basically, the general structure is:
computation --> load --> datapoint --> component = value
After searching for similar questions, for instance:
- python: combine multiple files into a matrix with 1 and 0
- Construct pandas DataFrame from items in nested dictionary
- Python: how to export a dictionary of nested lists to Excel
and many others, it looks like the pandas library of Python seems the most appropriate, especially the dataframe and its ability to generate pivot_table or crosstab.
So far, I created a list of dictionary similar to this:
dd=[]
files = ["computation1.out","computation2.out"]
for files in files:
filename=file.split(".")[0]
with open(file,"r") as f:
for row in f:
...
#process file and extract
# loadnumber
# datapointnb
# component number icomp
# value
# Creating a new Python dictionary
mydict = {'Computation': filename,
'Load Case': RefValues[iref],
'Datapoint': EntValues[ient],
'Compo': icomp,
'Value': float(row)}
# Append it to the list
dd.append(mydict)
print(json.dumps(dd, sort_keys=True, indent=4))
df = pd.DataFrame.from_dict(dd)
print(df)
df.to_excel("myoutout.xlsx", sheet_name='Data', float_format="%.4f", merge_cells=False)
This generates something like this:
[
{
"Component": 1,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": 0.0004761905
},
{
"Component": 2,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": -9.333414e-18
},
{
"Component": 3,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": 0.0
},
{
"Component": 1,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 84,
"Value": 0.0009523809
},
...
]
Computation LoadCase Datapoint Component Value
0 computation1 1.0 57 1 4.761905e-04
1 computation1 1.0 57 2 -9.333414e-18
2 computation1 1.0 57 3 0.000000e+00
3 computation1 1.0 84 1 9.523809e-04
...
The conversion from the list of dictionaries to a dataframe and the export to Excel went well but I can't seem to create automatically pivot table or restructure the data in lines and columns:
|computation1 |computation2...
|-----------------------------------------------|---
|loadcase1 |loadcase2 |...|loadcase1...
|compo1 compo2 compo3 |compo1 compo2 compo3 |...|compo1...
-----------------------------------------------------------|---
datapoint1 |val1 val2 val3 |val4 val5 val6 |...|...
... |... |...
datapoint3 |val7 ... |...
-----------------------------------------------------------|---
I have two questions:
- Would this data structure (list of dictionaries) be the most efficient once the amount of data gets bigger and if not, what would you recommend? I also investigated nested dictionaries but seemed more complicated to export to Excel.
- How can I restructure my dataframe to export to Excel using the appropriate format?
Thank you for you help, Christophe