Let's assume the following dataframe:

import pandas as pd
import numpy as np  
df = pd.DataFrame({'group1' : ['A', 'A', 'A', 'A',
                         'A', 'A', 'A', 'A'],
                   'group2' : ['A', 'A', 'A', 'A',
                         'A', 'A', 'A', 'A'],
                   'group3' : ['A', 'A', 'A', 'A',
                         'A', 'A', 'A', 'A'],
                   'group4' : ['A', 'A', 'A', 'A',
                         'A', 'A', 'A', 'A'],
                   'group5' : ['C', 'C', 'C', 'C',
                         'C', 'E', 'E', 'E'],
                   'group6' : ['C', 'C', 'C', 'C',
                         'C', 'E', 'E', 'E'],
                   'group7' : ['A', 'A', 'A', 'A',
                         'A', 'A', 'A', 'A'],
                   'time' : [-6,-5,-4,-3,-2,-6,-3,-4] , 
                   'col': [1,2,3,4,5,6,7,8]})

Now, I only wish to select certain column slices from the dataframe, and the first method I tried is concat:

a=df.iloc[:,0:2]
b=df.iloc[:,6:8]
df1=pd.concat([a,b],sort=False)
df1

The output I get from this code is the following

  group1 group2 group7  time
0      A      A    NaN   NaN
1      A      A    NaN   NaN
2      A      A    NaN   NaN
3      A      A    NaN   NaN
4      A      A    NaN   NaN
5      A      A    NaN   NaN
6      A      A    NaN   NaN
7      A      A    NaN   NaN
0    NaN    NaN      A  -6.0
1    NaN    NaN      A  -5.0
2    NaN    NaN      A  -4.0
3    NaN    NaN      A  -3.0
4    NaN    NaN      A  -2.0
5    NaN    NaN      A  -6.0
6    NaN    NaN      A  -3.0
7    NaN    NaN      A  -4.0

This seems like an odd result. But if I try np.r_ instead:

df.iloc[:, np.r_[0:2,6:8]]

The output is the correct one:

  group1 group2 group7  time
0      A      A      A    -6
1      A      A      A    -5
2      A      A      A    -4
3      A      A      A    -3
4      A      A      A    -2
5      A      A      A    -6
6      A      A      A    -3
7      A      A      A    -4
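
For reference, a minimal sketch of what np.r_ does with those two slices: it expands and joins them into a single integer array, which iloc then uses as a list of column positions. The variable name cols is only for illustration.

```python
import numpy as np

# np.r_ turns slice notation into one concatenated integer array
cols = np.r_[0:2, 6:8]
print(cols)  # [0 1 6 7]
```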

Is there a way to fix the concat output? And is np.r_ the best way to combine column slices of a dataframe, and if so, why?

Nicola

1 Answer

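Use axis=1. By default, pd.concat uses axis=0, which stacks the frames row-wise and aligns them on column labels. Since a and b share no column names, the missing cells are filled with NaN. With axis=1 the frames are placed side by side and aligned on the index: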

a=df.iloc[:,0:2]
b=df.iloc[:,6:8]
df1=pd.concat([a,b],sort=False, axis=1)

    group1  group2  group7  time
0   A       A       A       -6
1   A       A       A       -5
2   A       A       A       -4
3   A       A       A       -3
4   A       A       A       -2
5   A       A       A       -6
6   A       A       A       -3
7   A       A       A       -4
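
As a rough sketch of the difference between the two axes, and of how the axis=1 result relates to the np.r_ selection from the question, the snippet below uses a smaller, hypothetical frame (four rows, five columns) rather than the original df. The names stacked, side_by_side and picked are illustrative only.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame standing in for the question's df
df = pd.DataFrame({
    'group1': list('AAAA'),
    'group2': list('AAAA'),
    'group5': list('CCEE'),
    'group7': list('AAAA'),
    'time': [-6, -5, -4, -3],
})

a = df.iloc[:, 0:2]  # columns group1, group2
b = df.iloc[:, 3:5]  # columns group7, time

# Default axis=0: rows are stacked, columns are unioned, gaps become NaN
stacked = pd.concat([a, b], sort=False)

# axis=1: frames sit side by side, aligned on the shared index
side_by_side = pd.concat([a, b], axis=1)

# Same selection in one step via positional indexing with np.r_
picked = df.iloc[:, np.r_[0:2, 3:5]]

print(stacked.shape)       # (8, 4)
print(side_by_side.shape)  # (4, 4)
print(list(picked.columns) == list(side_by_side.columns))
```

For plain column selection from a single frame, the np.r_ form avoids creating intermediate frames, while concat with axis=1 is the more general tool when the pieces come from different frames.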
rafaelc