I have the following code:

sample_data = OrderedDict((df.name, df['col'].sample(n=3)) for df in test_cases[1:])
sample = pd.DataFrame(sample_data)

Which gives the following dataframe:

col1   col2
A      NaN
P      NaN
NaN    E
NaN    R
U      NaN
NaN    Y

How do I get the following dataframe instead?

 col1   col2
 A      E
 P      R
 U      Y
a1234

3 Answers

Another possible solution is to use dropna(), reset_index() and concat().

pd.concat([df[x].dropna().reset_index(drop=True) for x in df.columns], axis=1)

Code

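import pandas as pd
import numpy as np

# Rebuild the frame from the question: every row holds exactly one non-null value.
li = [['A', np.nan], ['P', np.nan], [np.nan, 'E'], [np.nan, 'R'], ['U', np.nan], [np.nan, 'Y']]
df = pd.DataFrame(li, columns=['col1', 'col2'])

# Drop the NaNs in each column, renumber the rows from 0, then place the columns side by side.
df2 = pd.concat([df[x].dropna().reset_index(drop=True) for x in df.columns], axis=1)
print(df2)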

Output

  col1 col2
0    A    E
1    P    R
2    U    Y
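
The index reset used above can also be applied one step earlier, before the samples are combined. The sketch below assumes the NaNs in the question come from sample() keeping each row's original index label, so that pd.DataFrame aligns the two samples on labels they do not share. The `frames` dict is a hypothetical stand-in for the question's `test_cases`, and `random_state` is only there to make the run repeatable.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question's test_cases.
frames = {
    "col1": pd.DataFrame({"col": list("ABCDEFG")}),
    "col2": pd.DataFrame({"col": list("HIJKLMN")}),
}

# sample() keeps the original index labels of the rows it picks.
# Resetting the index gives every sample the labels 0, 1, 2, so the
# columns line up when the DataFrame is built.
sample = pd.DataFrame({
    name: frame["col"].sample(n=3, random_state=0).reset_index(drop=True)
    for name, frame in frames.items()
})
print(sample)
```

With this change the combined frame has three rows and no NaN padding, so no clean-up step is needed afterwards.
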
Bitto

You can use a list comprehension to select the non-null values of each column and rebuild the dataframe:

pd.DataFrame([df.loc[df[col].notna(), col].values for col in df.columns]).T


    0   1
0   A   E
1   P   R
2   U   Y

Or

a = np.array([df.loc[df[col].notna(), col].values for col in df.columns]).T

pd.DataFrame(a, columns = df.columns)

    col1    col2
0   A       E
1   P       R
2   U       Y
Vaishali

If I understand correctly:

df.apply(lambda x : sorted(x,key=pd.isnull)).dropna()
Out[485]: 
  col1 col2
0    A    E
1    P    R
2    U    Y
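
A short illustration of why the sort works, shown on a plain list rather than the dataframe: pd.isnull returns False for real values and True for NaN, False sorts before True, and Python's sorted() is stable, so the non-null values keep their original order and move to the front. The `column` list here is just the col1 data from the question.

```python
import numpy as np
import pandas as pd

# The col1 values from the question, NaNs included.
column = ['A', 'P', np.nan, np.nan, 'U', np.nan]

# The key is False for values and True for NaN; the stable sort
# keeps 'A', 'P', 'U' in their original relative order.
ordered = sorted(column, key=pd.isnull)
print(ordered[:3])  # ['A', 'P', 'U']
```

The trailing NaNs are what the final dropna() removes.
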

If performance matters, check justify.
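
The helper mentioned above is not reproduced here. As a rough idea of what a vectorized alternative might look like, here is a minimal NumPy sketch under my own assumptions. The function name `push_non_null_up` is hypothetical, and this is not claimed to be the referenced implementation.

```python
import numpy as np
import pandas as pd

def push_non_null_up(df):
    """Hypothetical helper: move the non-null values of each column to the top."""
    values = df.to_numpy(dtype=object)
    mask = pd.notna(values)
    # A stable argsort on the inverted mask lists non-null positions first
    # and preserves their original order within each column.
    order = np.argsort(~mask, axis=0, kind="stable")
    out = np.take_along_axis(values, order, axis=0)
    return pd.DataFrame(out, index=df.index, columns=df.columns)

df = pd.DataFrame({
    "col1": ["A", "P", np.nan, np.nan, "U", np.nan],
    "col2": [np.nan, np.nan, "E", "R", np.nan, "Y"],
})

# Rows that are entirely null after the shift are dropped.
result = push_non_null_up(df).dropna(how="all")
print(result)
```

This avoids the per-column Python-level sort, which may help on wide or long frames, although that has not been measured here.
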

BENY