I have a list of NumPy arrays. For example, let's call this list_a:
[array([ 0. , -11.35190205, 11.35190205, 0. ]),
array([ 0. , 36.58012599, -36.58012599, 0. ]),
array([ 0. , -41.94408202, 41.94408202, 0. ])]
I have a list of lists containing the indices for each of the NumPy arrays in the list above. Let's call this list_b:
[['A_A', 'A_B', 'B_A', 'B_B'],
['A_A', 'A_D', 'D_A', 'D_D'],
['B_B', 'B_C', 'C_B', 'C_C']]
I want to create a pandas DataFrame from these objects, and I'm not sure how to do so without first creating a Series for each of the NumPy arrays in list_a with its associated index from list_b (i.e. list_a[0]'s index is list_b[0], etc.) and then calling pd.concat([s1, s2, s3, ...], axis=1) to get the desired DataFrame.
In the above case I can construct the desired DataFrame as follows:
import pandas as pd

s1 = pd.Series(list_a[0], index=list_b[0])
s2 = pd.Series(list_a[1], index=list_b[1])
s3 = pd.Series(list_a[2], index=list_b[2])
df = pd.concat([s1, s2, s3], axis=1)
0 1 2
A_A 0.000000 0.000000 NaN
A_B -11.351902 NaN NaN
A_D NaN 36.580126 NaN
B_A 11.351902 NaN NaN
B_B 0.000000 NaN 0.000000
B_C NaN NaN -41.944082
C_B NaN NaN 41.944082
C_C NaN NaN 0.000000
D_A NaN -36.580126 NaN
D_D NaN 0.000000 NaN
In my actual application the sizes of these lists are in the hundreds, so I don't want to create hundreds of Series objects and then concatenate them all (unless that is the only way to do it?).
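For reference, here is a minimal runnable sketch of the pattern I'm trying to scale, using the small example data above in place of the real hundreds of arrays; it still builds one Series per array inside the comprehension, which is exactly the part I'd like to avoid:

```python
import numpy as np
import pandas as pd

# Example data standing in for the real lists (hundreds of entries in practice)
list_a = [
    np.array([0.0, -11.35190205, 11.35190205, 0.0]),
    np.array([0.0, 36.58012599, -36.58012599, 0.0]),
    np.array([0.0, -41.94408202, 41.94408202, 0.0]),
]
list_b = [
    ['A_A', 'A_B', 'B_A', 'B_B'],
    ['A_A', 'A_D', 'D_A', 'D_D'],
    ['B_B', 'B_C', 'C_B', 'C_C'],
]

# One Series per array, then a single concat; pd.concat aligns the
# differing indices with an outer join, filling gaps with NaN.
df = pd.concat(
    [pd.Series(a, index=i) for a, i in zip(list_a, list_b)],
    axis=1,
)
print(df.shape)  # (10, 3)
```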
I've read through various posts on SO, such as Adding list with different length as a new column to a dataframe and convert pandas series AND dataframe objects to a numpy array, but haven't been able to find an elegant solution that avoids creating hundreds of Series objects to produce the desired DataFrame.