2

I have dataframe not sequences. if I use len(df.columns), my data has 3586 columns. How to re-order the data sequences?

ID  V1  V10 V100 V1000 V1001 V1002 ...  V990 V991 V992 V993 V994
A   1   9.0 2.9  0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.0
B   1   1.2 0.1  3.0   0.0   0.0   0.0  1.0  0.0  0.0  0.0  0.0
C   2   8.6 8.0  2.0   0.0   0.0   0.0  2.0  0.0  0.0  0.0  0.0
D   3   0.0 2.0  0.0   0.0   0.0   0.0  3.0  0.0  0.0  0.0  0.0
E   4   7.8 6.6  3.0   0.0   0.0   0.0  4.0  0.0  0.0  0.0  0.0

I used this df = df.reindex(sorted(df.columns), axis=1) (based on this question Re-ordering columns in pandas dataframe based on column name) but still not working.

thank you

Arief Hidayat
  • 937
  • 1
  • 8
  • 19

1 Answers1

3

First get all columns without pattern V + number by filtering with str.contains, then sorting all another values by Index.difference, add together and pass to DataFrame.reindex - get first all non numeric non matched columns in first positions and then sorted V + number columns:

L1 = df.columns[~df.columns.str.contains('^V\d+$')].tolist()

L2 = sorted(df.columns.difference(L1), key=lambda x: float(x[1:]))

df = df.reindex(L1 + L2, axis=1)
print (df)
   ID   V1  V10  V100  V990  V991  V992  V993  V994  V1000  V1001  V1002
A   1  9.0  2.9   0.0   0.0   0.0   0.0   0.0   0.0    0.0    0.0    0.0
B   1  1.2  0.1   3.0   1.0   0.0   0.0   0.0   0.0    0.0    0.0    0.0
C   2  8.6  8.0   2.0   2.0   0.0   0.0   0.0   0.0    0.0    0.0    0.0
D   3  0.0  2.0   0.0   3.0   0.0   0.0   0.0   0.0    0.0    0.0    0.0
E   4  7.8  6.6   3.0   4.0   0.0   0.0   0.0   0.0    0.0    0.0    0.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252