0

I have a pandas DataFrame with 20000+ values, as below. I want to efficiently rearrange df so that, in each row, the NaNs come after the run of values.

    IT1     IT2     IT3     IT4     IT5     IT6
0   qwe     NaN     NaN     rew     NaN     NaN
1   NaN     NaN     sdc     NaN     NaN     wer
2   NaN     NaN     NaN     NaN     NaN     NaN
3   asd     fsc     ws      zd      ews     df 
.....

to

    IT1     IT2     IT3     IT4     IT5     IT6
0   qwe     rew     NaN     NaN     NaN     NaN
1   sdc     wer     NaN     NaN     NaN     NaN     
2   NaN     NaN     NaN     NaN     NaN     NaN
3   asd     fsc     ws      zd      ews     df 
.....

So each row can have no values, like index 2, or all values, like index 3. Is there a way to efficiently rearrange my DataFrame df? Thanks in advance.

pcu
EJ Kang

2 Answers

1

One way, albeit a slow one, is to use apply, dropna, and tolist:

 # Drop NaNs per row, then restore the column labels (set_axis lost its inplace= argument in pandas 2.0)
 df.apply(lambda x: pd.Series(x.dropna().tolist()), axis=1)\
   .set_axis(df.columns, axis=1)

Output:

   IT1  IT2  IT3  IT4  IT5  IT6
0  qwe  rew  NaN  NaN  NaN  NaN
1  sdc  wer  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN  NaN  NaN
3  asd  fsc   ws   zd  ews   df
Scott Boston
1

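You can write a custom function that sorts a row (NaN values end up last), then replaces its index (the columns) with the columns in the original order. Simply apply it to the DataFrame row-wise. Note that the non-null values come back in sorted order, not in their original order.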

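def row_sort(s):
    # Sort the row's values; sort_values places NaN last by default
    s2 = s.sort_values()
    # Reattach the column labels in their original order
    s2.index = s.index
    return s2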

df.apply(row_sort, axis=1)
# returns:
   IT1  IT2  IT3  IT4  IT5  IT6
0  qwe  rew  NaN  NaN  NaN  NaN
1  sdc  wer  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN  NaN  NaN
3  asd   df  ews  fsc   ws   zd
James
  • It works perfectly, with only one problem. I want to keep the order and just drop the NaN values, but your code does not keep the original order. For example, in index 3, it should be in asd fsc ws zd ews df order. Thanks! – EJ Kang Dec 20 '17 at 06:57
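For the order-preserving case raised in the comment above, here is a minimal sketch of a vectorized alternative. It is not taken from either answer: the helper name `push_nans_right` is hypothetical, and the sketch assumes NumPy 1.15+ (for `np.take_along_axis`) and pandas 0.24+ (for `to_numpy`). The idea is to stable-sort each row of the boolean NaN mask, so non-null values move left without changing their relative order.

```python
import numpy as np
import pandas as pd


def push_nans_right(df):
    """Hypothetical helper: move non-null values to the left of each row,
    keeping their original left-to-right order."""
    values = df.to_numpy(dtype=object)
    mask = df.isna().to_numpy()
    # A stable argsort of the mask puts False (real values) before True (NaN)
    # and preserves the original order inside each group.
    order = np.argsort(mask, axis=1, kind="stable")
    shifted = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame(shifted, index=df.index, columns=df.columns)


# Sample data copied from the question
df = pd.DataFrame(
    [
        ["qwe", np.nan, np.nan, "rew", np.nan, np.nan],
        [np.nan, np.nan, "sdc", np.nan, np.nan, "wer"],
        [np.nan] * 6,
        ["asd", "fsc", "ws", "zd", "ews", "df"],
    ],
    columns=["IT1", "IT2", "IT3", "IT4", "IT5", "IT6"],
)

result = push_nans_right(df)
print(result)
```

Because this avoids a Python-level function call per row, it may scale better than `apply` on a frame of this size, though that is worth timing on the real data.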