1

I have a dataframe as shown below:

8964_real  8964_imag   8965_real  8965_imag   8966_real  8966_imag   8967_real  ...  8984_imag   8985_real  8985_imag   8986_real  8986_imag   8987_real  8987_imag
0  112.653120   0.000000  117.104887   0.000000  127.593406   0.000000  129.522106  ...   0.000000  125.423552   0.000000  127.888477   0.000000  136.160979   0.000000
1   -0.315831  16.363974   -2.083329  22.443628   -2.166950  15.026253    0.110502  ... -26.613220    8.454297 -35.000742   11.871405 -24.914035    7.448329 -16.370041
2   -1.863497  10.672129   -6.152232  15.980813   -5.679352  18.976117   -5.775777  ... -11.131600  -18.990022  -9.520732  -11.947319  -4.641286  -17.104710  -5.691642
3   -6.749938  14.870590  -12.222749  15.012352  -10.501423   9.345518   -9.103459  ...  -2.860546  -29.862724  -5.237663  -28.791194  -5.685985  -24.565608 -10.385683
4   -2.991405 -10.332938   -4.097638 -10.204587  -12.056221  -5.684882  -12.861357  ...   0.821902   -8.787235  -1.521650   -3.798446  -2.390519   -6.527762  -1.145998

I have to convert above dataframe such that values in columns "_real" should come under one column and values under "_imag" should come under another column

That is totally there should be two columns at the end , one for real and other for imag.What could be the most efficient way to do it?

I refer this link . But this is good for one column,but I need two. Another idea , I got was use regex to select columns containing "real" and do as said in above link (and similarly for imag) ,but felt it a bit round about.

Any help appreciated.

EDIT: For example, real should be like

real
112.653120
-0.315831
-1.863497
-6.749938
-2.991405
---------
117.104887
-2.083329
-6.152232
-12.222749
-4.097638
---------
127.593406
-2.166950
-5.679352
-10.501423
-12.056221

I have made a dotted line to make it clear

Fasty
  • 784
  • 1
  • 11
  • 34

1 Answers1

3

Create MultiIndex by split, so possible reshape by DataFrame.stack:

df.columns = df.columns.str.split('_', expand=True)

print (df.head(10))
         8964                   8965                   8966             \
         real       imag        real       imag        real       imag   
0  112.653120   0.000000  117.104887   0.000000  127.593406   0.000000   
1   -0.315831  16.363974   -2.083329  22.443628   -2.166950  15.026253   
2   -1.863497  10.672129   -6.152232  15.980813   -5.679352  18.976117   
3   -6.749938  14.870590  -12.222749  15.012352  -10.501423   9.345518   
4   -2.991405 -10.332938   -4.097638 -10.204587  -12.056221  -5.684882   

         8967       8984        8985                   8986             \
         real       imag        real       imag        real       imag   
0  129.522106   0.000000  125.423552   0.000000  127.888477   0.000000   
1    0.110502 -26.613220    8.454297 -35.000742   11.871405 -24.914035   
2   -5.775777 -11.131600  -18.990022  -9.520732  -11.947319  -4.641286   
3   -9.103459  -2.860546  -29.862724  -5.237663  -28.791194  -5.685985   
4  -12.861357   0.821902   -8.787235  -1.521650   -3.798446  -2.390519   

         8987             
         real       imag  
0  136.160979   0.000000  
1    7.448329 -16.370041  
2  -17.104710  -5.691642  
3  -24.565608 -10.385683  
4   -6.527762  -1.145998  

df = df.stack(0).reset_index(level=0, drop=True).rename_axis('a').reset_index()
print (df.head(10))
      a       imag        real
0  8964   0.000000  112.653120
1  8965   0.000000  117.104887
2  8966   0.000000  127.593406
3  8967        NaN  129.522106
4  8984   0.000000         NaN
5  8985   0.000000  125.423552
6  8986   0.000000  127.888477
7  8987   0.000000  136.160979
8  8964  16.363974   -0.315831
9  8965  22.443628   -2.083329

EDIT: For new structure of data is possible reshape values by ravel:

a = df.filter(like='real')
b = df.filter(like='imag')
c = a.columns.str.replace('_real', '').astype(int)
print (c)
Int64Index([8964, 8965, 8966, 8967, 8985, 8986, 8987], dtype='int64')

df = pd.DataFrame({'r':a.T.to_numpy().ravel(), 'i':b.T.to_numpy().ravel()}, 
                   index=np.tile(c, len(df)))
print (df.head(10))
               r          i
8964  112.653120   0.000000
8965   -0.315831  16.363974
8966   -1.863497  10.672129
8967   -6.749938  14.870590
8985   -2.991405 -10.332938
8986  117.104887   0.000000
8987   -2.083329  22.443628
8964   -6.152232  15.980813
8965  -12.222749  15.012352
8966   -4.097638 -10.204587
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    Could you please briefly explain, what you did! :) – Fasty Mar 12 '20 at 11:34
  • 1
    Hey need a small change @jazrael... I want to stack other column values in one column, in this answer, each row has been transposed and made into a column...! – Fasty Mar 12 '20 at 11:52
  • @RamShankerG - What are other columns? Can you be more specific, best change data in question? – jezrael Mar 12 '20 at 11:57
  • column 8965 real should be under column 8964 real , and similarly for other columns...columns should be stacked one below the other column by column – Fasty Mar 12 '20 at 12:00
  • refer to the link and see df.values.ravel! – Fasty Mar 12 '20 at 12:03
  • @RamShankerG - So is necessary convert to int and sorting? `df = df.stack(0).reset_index(level=0, drop=True).rename_axis('a').reset_index().assign(a = lambda x: x['a'].astype(int)).sort_values('a')` – jezrael Mar 12 '20 at 12:05
  • 1
    Now, its what I want, thanks again..btw just make "i" part to 'b' instead of 'a' :) – Fasty Mar 12 '20 at 12:15