
If I have a pandas data frame like this:

      A   B   C   D   E   F   G   H 
 0    0   2   3   5  NaN NaN NaN NaN
 1    2   7   9   1   2  NaN NaN NaN
 2    1   5   7   2   1   2   1  NaN
 3    6   1   3   2   1   1   5   5
 4    1   2   3   6  NaN NaN NaN NaN

How do I move all of the numerical values to the end of each row and place the NaNs before them, so that I get a pandas data frame like this:

      A   B   C   D   E   F   G   H 
 0   NaN NaN NaN NaN  0   2   3   5
 1   NaN NaN NaN  2   7   9   1   2  
 2   NaN  1   5   7   2   1   2   1 
 3    6   1   3   2   1   1   5   5
 4   NaN NaN NaN NaN  1   2   3   6  
Zmann3000
  • Look at [this](https://stackoverflow.com/questions/44558215/python-justifying-numpy-array/44559180#44559180) answer from Divakar and use the `justify` function defined there: `pd.DataFrame(justify(df.values, np.nan, side='right'), columns=df.columns)` – anky Sep 08 '19 at 04:38
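
The idea behind the comment above (right-justifying the non-NaN values of each row with NumPy) can be sketched roughly as follows. This is a simplified illustration, not the `justify` function from the linked answer: the helper name `right_justify_nan` is hypothetical, and it assumes a purely numeric frame like the one in the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data from the question
df = pd.DataFrame(
    [
        [0, 2, 3, 5, nan, nan, nan, nan],
        [2, 7, 9, 1, 2, nan, nan, nan],
        [1, 5, 7, 2, 1, 2, 1, nan],
        [6, 1, 3, 2, 1, 1, 5, 5],
        [1, 2, 3, 6, nan, nan, nan, nan],
    ],
    columns=list("ABCDEFGH"),
)

def right_justify_nan(a):
    """Hypothetical helper: push the non-NaN values of each row to the right."""
    mask = ~np.isnan(a)              # True where a real value sits
    target = np.sort(mask, axis=1)   # False sorts before True, so True ends up on the right
    out = np.full(a.shape, np.nan)   # start from an all-NaN array
    # Boolean indexing walks both masks row by row, and each row has the
    # same number of True entries in both, so the value order is kept.
    out[target] = a[mask]
    return out

result = pd.DataFrame(
    right_justify_nan(df.to_numpy(dtype=float)),
    index=df.index,
    columns=df.columns,
)
print(result)
```

Because this works on the whole array at once instead of row by row, it may be faster on large frames, though that is worth measuring on your own data.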

2 Answers


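One-line solution (plus a line to restore the column names):

out = df.apply(lambda x: pd.concat([x[x.isna()], x[x.notna()]], ignore_index=True), axis=1)
out.columns = df.columns  # ignore_index=True replaces the column names with 0..n-1, so copy them back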
Manualmsdos

I think the best approach is to work row by row: write a function that does the job, then use apply or transform to run it on each row.

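import numpy as np
import pandas as pd

def movenan(x):
    # x is one row as a Series; keep its non-NaN values in their original order
    values = x.dropna()
    # one NaN for every value that was dropped
    nanarr = np.full(len(x) - len(values), np.nan)
    # NaNs first, then the values; ignore_index resets the labels to 0..n-1
    return pd.concat([pd.Series(nanarr), values], ignore_index=True)

# apply is used here; recent pandas versions may reject transform when the returned labels differ
ddf = df.apply(movenan, axis=1)
ddf.columns = df.columns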

Using your sample data, the resulting ddf is:

     A    B    C    D    E    F    G    H
0  NaN  NaN  NaN  NaN  0.0  2.0  3.0  5.0
1  NaN  NaN  NaN  2.0  7.0  9.0  1.0  2.0
2  NaN  1.0  5.0  7.0  2.0  1.0  2.0  1.0
3  6.0  1.0  3.0  2.0  1.0  1.0  5.0  5.0
4  NaN  NaN  NaN  NaN  1.0  2.0  3.0  6.0

The movenan function creates an array of NaN of the required length, drops the NaN values from the row, and concatenates the two resulting Series.
ignore_index=True is required because the values move to different columns, so their original column labels must not be preserved. As a side effect, the column names are lost and replaced by integers; the last line simply copies the column names back into the new dataframe.
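
To make the effect of ignore_index=True concrete, here is a small sketch on a single row. The row and the variable names are made up for illustration and are not part of the answer's code.

```python
import numpy as np
import pandas as pd

# One illustrative row with two values followed by two NaNs
row = pd.Series([2, 7, np.nan, np.nan], index=["A", "B", "C", "D"])

values = row.dropna()                                     # the non-NaN values, still labelled A and B
padding = pd.Series(np.full(len(row) - len(values), np.nan))  # one NaN per dropped entry

# With ignore_index=True the labels are discarded and renumbered from 0
shifted = pd.concat([padding, values], ignore_index=True)
labels_after_concat = list(shifted.index)

# Copy the original labels back, as the last line of the answer does for the whole frame
shifted.index = row.index
print(labels_after_concat)
print(shifted)
```

Without ignore_index=True, the concatenated Series would carry a mix of the padding's integer labels and the original A and B labels, so the values would not line up with the intended columns.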

Valentino