I am trying to write a function that receives a list of data frames and checks whether those data frames have the same features (columns).
First, I need to read each data frame from the list, extract its column names, and store these names in another list. Finally, I compare these lists and return True if they are equal. I haven't reached the comparison step yet, because there is an issue with extracting the column names.
Here is the example I have tried:
# Basic libraries
import os
import pandas as pd
import numpy as np
def merge_df(lis):
    df_list = []
    j = 0
    for i in lis:
        name = "df" + str(j)
        print(name)
        name = pd.DataFrame(i)
        name = name.values.tolist()
        df_list.append(name)
        j += 1
    print(df_list)
data_dict = {'First': [100, 90, np.nan, np.nan],
             'Second': [30, 45, 56, np.nan],
             'Third': [np.nan, 40, 80, np.nan],
             'Forth': [30, 40, 50, np.nan]}
df1 = pd.DataFrame(data_dict)
data_dict2 = {'First': [100, 90, 4, 3],
              'Second': [30, 45, 56, 0],
              'Third': [np.nan, 40, 80, 5],
              'Forth': [30, 40, 50, np.nan]}
df2 = pd.DataFrame(data_dict2)
lis = [df1, df2]
# the size of lis is >= 2
merge_df(lis)
Since the two data frames have the same features (First, Second, Third, Forth), I expect the function to return True.
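The expected behaviour can be sketched roughly as follows. This is only a minimal sketch, not a fix for the function above: the helper name `have_same_columns` is hypothetical, and it assumes that "same features" means the same column names in the same order.

```python
import numpy as np
import pandas as pd


def have_same_columns(frames):
    """Return True if every data frame in `frames` has the same column names.

    Hypothetical helper; assumes that column order matters.
    """
    # Extract the column names of each data frame as a plain list of labels.
    column_lists = [list(df.columns) for df in frames]
    # Compare every list of names against the first one.
    return all(cols == column_lists[0] for cols in column_lists)


df1 = pd.DataFrame({'First': [100, 90, np.nan, np.nan],
                    'Second': [30, 45, 56, np.nan],
                    'Third': [np.nan, 40, 80, np.nan],
                    'Forth': [30, 40, 50, np.nan]})
df2 = pd.DataFrame({'First': [100, 90, 4, 3],
                    'Second': [30, 45, 56, 0],
                    'Third': [np.nan, 40, 80, 5],
                    'Forth': [30, 40, 50, np.nan]})

print(have_same_columns([df1, df2]))  # True
```

If column order should not matter, comparing `set(df.columns)` for each frame instead of the lists would be one option.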
I am sure the problem is in name = name.values.tolist(), because the data frame is treated as a string. The same applies to df_list.append(name).
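To illustrate what that line produces, here is a small sketch on a hypothetical two-column frame (`small` is made up for illustration and is not part of the example above). `.values.tolist()` yields the rows of cell values, whereas the column names are available through `.columns`.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
small = pd.DataFrame({'First': [100, 90], 'Second': [30, 45]})

rows = small.values.tolist()     # cell values, one inner list per row
names = small.columns.tolist()   # column labels

print(rows)   # [[100, 30], [90, 45]]
print(names)  # ['First', 'Second']
```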
So it is not surprising to get this error: DataFrame constructor not properly called!
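For reference, pandas raises this message as a `ValueError` when a plain string is passed to `pd.DataFrame` on its own. The sketch below reproduces it under the assumption that a string such as "df0" is what ends up in the constructor; whether that is what happens in the function above is not confirmed.

```python
import pandas as pd

name = "df" + str(0)  # a plain string, "df0", not a data frame

try:
    pd.DataFrame(name)   # assumption: the string itself reaches the constructor
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Passing the data frame object itself (for example the loop variable), rather than a string holding a variable name, avoids this error.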
So, are there any issues with this function that I have to take care of?