You can save your DataFrame in the JSON file format.
Consider the following simplified example:
import pandas as pd
df1 = pd.DataFrame([[0,1], [2,3]], columns=["first", "second"])
df2 = pd.DataFrame([[4,5,6], [7,8,9]])
df1["dataf"] = [df2, df2]
print("\nDataFrame df1:")
print("**************")
print(df1)
print("\ndataf column:")
print("***************")
print(df1["dataf"])
print("\ndataf column cell:")
print("********************")
print(df1["dataf"][0])
print("Type of dataf cells:", type(df1["dataf"][0]))
Out:
DataFrame df1:
**************
first second dataf
0 0 1 0 1 2
0 4 5 6
1 7 8 9
1 2 3 0 1 2
0 4 5 6
1 7 8 9
dataf column:
***************
0 0 1 2
0 4 5 6
1 7 8 9
1 0 1 2
0 4 5 6
1 7 8 9
Name: dataf, dtype: object
dataf column cell:
********************
0 1 2
0 4 5 6
1 7 8 9
Type of dataf cells: <class 'pandas.core.frame.DataFrame'>
Now save the DataFrame as JSON using pandas.DataFrame.to_json:
df1.to_json("test.json")
Loading our data back with pandas.read_json:
df1 = pd.read_json("test.json")
print("\nDataFrame df1:")
print("**************")
print(df1)
print("\ndataf column:")
print("***************")
print(df1["dataf"])
print("\ndataf column cell:")
print("********************")
print(df1["dataf"][0])
print("Type of dataf cells:", type(df1["dataf"][0]))
Out:
DataFrame df1:
**************
first second dataf
0 0 1 {'0': {'0': 4, '1': 7}, '1': {'0': 5, '1': 8},...
1 2 3 {'0': {'0': 4, '1': 7}, '1': {'0': 5, '1': 8},...
dataf column:
***************
0 {'0': {'0': 4, '1': 7}, '1': {'0': 5, '1': 8},...
1 {'0': {'0': 4, '1': 7}, '1': {'0': 5, '1': 8},...
Name: dataf, dtype: object
dataf column cell:
********************
{'0': {'0': 4, '1': 7}, '1': {'0': 5, '1': 8}, '2': {'0': 6, '1': 9}}
Type of dataf cells: <class 'dict'>
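The cells come back as dicts because JSON has no notion of a DataFrame: to_json writes each nested frame as a plain JSON object, and read_json has no way of knowing it should rebuild one. A quick way to check this is to read the file with the standard json module. This is only a sketch; the temporary path and the variable names are chosen for illustration.

```python
import json
import os
import tempfile

import pandas as pd

inner = pd.DataFrame([[4, 5, 6], [7, 8, 9]])
outer = pd.DataFrame([[0, 1], [2, 3]], columns=["first", "second"])
outer["dataf"] = [inner, inner]

# Write to a temporary directory instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "test.json")
outer.to_json(path)

# Read the file without pandas to see what was actually stored.
with open(path) as fh:
    raw = json.load(fh)

cell = raw["dataf"]["0"]  # first row of the "dataf" column
print(type(cell))
print(cell["0"])          # first column of the nested frame, keyed by row label
```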
We can convert the affected column back to DataFrames using pandas.Series.apply (df1["dataf"] is a Series):
df1["dataf"] = df1["dataf"].apply(lambda x: pd.DataFrame(x))
print("\nDataFrame df1:")
print("**************")
print(df1)
print("\ndataf column:")
print("***************")
print(df1["dataf"])
print("\ndataf column cell:")
print("********************")
print(df1["dataf"][0])
print("Type of dataf cells:", type(df1["dataf"][0]))
Out:
DataFrame df1:
**************
first second dataf
0 0 1 0 1 2
0 4 5 6
1 7 8 9
1 2 3 0 1 2
0 4 5 6
1 7 8 9
dataf column:
***************
0 0 1 2
0 4 5 6
1 7 8 9
1 0 1 2
0 4 5 6
1 7 8 9
Name: dataf, dtype: object
dataf column cell:
********************
0 1 2
0 4 5 6
1 7 8 9
Type of dataf cells: <class 'pandas.core.frame.DataFrame'>
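One thing the printed output hides: JSON object keys are always strings, so the rebuilt inner frames carry string labels ('0', '1', ...) rather than the original integers. If that matters, the labels can be cast back. The following is a minimal sketch, assuming the inner frames had integer row and column labels as in this example; the variable names and the temporary path are illustrative.

```python
import os
import tempfile

import pandas as pd

inner = pd.DataFrame([[4, 5, 6], [7, 8, 9]])
outer = pd.DataFrame([[0, 1], [2, 3]], columns=["first", "second"])
outer["dataf"] = [inner, inner]

path = os.path.join(tempfile.mkdtemp(), "test.json")
outer.to_json(path)
loaded = pd.read_json(path)

# Rebuild each cell; labels are strings here because JSON keys are strings.
rebuilt = loaded["dataf"].apply(lambda cell: pd.DataFrame(cell))
print(list(rebuilt.iloc[0].columns))

# Cast row and column labels back to int
# (assumption: the original labels were integers).
restored = rebuilt.apply(lambda frame: frame.rename(index=int, columns=int))
print(list(restored.iloc[0].columns))
```

With the labels restored, integer-based lookups such as restored.iloc[0][2] behave as they did before saving.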
The core code comes down to these three lines:
df1.to_json("test.json")
df1 = pd.read_json("test.json")
df1["dataf"] = df1["dataf"].apply(lambda x: pd.DataFrame(x))
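If several columns hold nested frames, the same steps can be wrapped in a small helper. The function below, load_nested_json, is a hypothetical name and not part of pandas; it is a sketch that assumes you know in advance which columns contain nested frames.

```python
import os
import tempfile

import pandas as pd


def load_nested_json(path, nested_columns):
    """Read a JSON file and rebuild the listed columns as DataFrames.

    Hypothetical helper: `nested_columns` must name the columns whose
    cells were DataFrames when the file was written.
    """
    df = pd.read_json(path)
    for col in nested_columns:
        df[col] = df[col].apply(lambda cell: pd.DataFrame(cell))
    return df


# Usage with the example data from above.
inner = pd.DataFrame([[4, 5, 6], [7, 8, 9]])
outer = pd.DataFrame([[0, 1], [2, 3]], columns=["first", "second"])
outer["dataf"] = [inner, inner]

path = os.path.join(tempfile.mkdtemp(), "test.json")
outer.to_json(path)

result = load_nested_json(path, ["dataf"])
print(type(result["dataf"].iloc[0]))
```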