This answer applies only to dataframes whose columns all share the same dtype.
Selecting columns as a view is possible when the columns sit at a regular stride from one another, because they can then be picked with a slice inside .iloc. Any two columns can therefore always be selected this way, but three or more columns must be evenly spaced. In every case, we need to know the column IDs (positions) and the stride between them.
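As a rough sketch of the "column IDs and strides" requirement, the helper below turns a list of column positions into a slice when they are evenly spaced. The name positions_to_slice is hypothetical (it is not a pandas or NumPy function), and it assumes the positions are given in ascending order.

```python
def positions_to_slice(positions):
    """Return a slice selecting exactly `positions`, or None if they
    are not evenly spaced. Hypothetical helper, ascending order assumed."""
    positions = list(positions)
    if not positions:
        return None
    if len(positions) == 1:
        return slice(positions[0], positions[0] + 1, 1)
    step = positions[1] - positions[0]
    if step <= 0:
        return None  # this sketch only handles ascending positions
    for prev, cur in zip(positions, positions[1:]):
        if cur - prev != step:
            return None  # irregular spacing: no single slice exists
    # The stop is one past the last position so that it is included.
    return slice(positions[0], positions[-1] + 1, step)


two_cols = positions_to_slice([1, 4])       # two columns, stride 3
three_cols = positions_to_slice([1, 3, 5])  # three columns, stride 2
irregular = positions_to_slice([0, 1, 3])   # not evenly spaced
print(two_cols, three_cols, irregular)
```

The resulting slice can be passed as the column indexer, e.g. df.iloc[:, positions_to_slice(ids)], where ids could come from df.columns.get_indexer([...]).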
Let's try to understand these with the help of some sample cases.
Case #1 : Two columns starting at 0th col ID
In [47]: df1
Out[47]:
   a  b  c  d
0  5  0  3  3
1  7  3  5  2
2  4  7  6  8
In [48]: np.array_equal(df1.loc[:, ['a', 'b']], df1.iloc[:,0:2])
Out[48]: True
In [50]: np.shares_memory(df1, df1.iloc[:,0:2]) # confirm view
Out[50]: True
Case #2 : Two columns starting at 1st col ID and a stride of 3 columns
In [51]: df2
Out[51]:
   a0  a  a1  a2  b  c  d
0   8  1   6   7  7  8  1
1   5  8   4   3  0  3  5
2   0  2   3   8  1  3  3
In [52]: np.array_equal(df2.loc[:, ['a', 'b']], df2.iloc[:,1::3])
Out[52]: True
In [54]: np.shares_memory(df2, df2.iloc[:,1::3]) # confirm view
Out[54]: True
Case #3 : Three columns starting at 1st col ID and a stride of 2 columns
In [74]: df3
Out[74]:
   a0  a  a1  b  b1  c  c1  d  d1
0   3  7   0  1   0  4   7  3   2
1   7  2   0  0   4  5   5  6   8
2   4  1   4  8   1  1   7  3   6
In [75]: np.array_equal(df3.loc[:, ['a', 'b', 'c']], df3.iloc[:,1:6:2])
Out[75]: True
In [76]: np.shares_memory(df3, df3.iloc[:,1:6:2]) # confirm view
Out[76]: True
Selecting 4 columns from the same dataframe :
In [77]: np.array_equal(df3.loc[:, ['a', 'b', 'c', 'd']], df3.iloc[:,1:8:2])
Out[77]: True
In [78]: np.shares_memory(df3, df3.iloc[:,1:8:2])
Out[78]: True
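The view behaviour in the cases above mirrors how plain NumPy arrays behave, which can be checked without pandas. The sketch below assumes a single-dtype dataframe is backed by one 2-D array (the reason for the same-dtype restriction); arr is just a stand-in for that array, not something taken from the dataframes above.

```python
import numpy as np

# Stand-in for the values of a single-dtype frame: 3 rows, 7 columns.
arr = np.arange(21).reshape(3, 7)

strided = arr[:, 1::3]   # slice: columns 1 and 4
fancy = arr[:, [1, 4]]   # integer list: the same two columns

same_values = np.array_equal(strided, fancy)
strided_is_view = np.shares_memory(arr, strided)  # slicing gives a view
fancy_is_view = np.shares_memory(arr, fancy)      # list indexing copies
print(same_values, strided_is_view, fancy_is_view)
```

Both selections hold the same values, but only the sliced one shares memory with the original array.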