5

I am trying to select a subset of a DataFrame based on the columns of another DataFrame.

The DataFrames look like this:

    a   b   c   d
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15

   a  b
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9

I want to get all rows of the first DataFrame, but only for the columns that appear in both DataFrames. My result should look like this:

    a   b
0   0   1
1   4   5
2   8   9
3  12  13
jpp
j. DOE

2 Answers

6

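You can use pd.Index.intersection. Older pandas versions also accepted & as syntactic sugar for it, but that use of the operator was deprecated, and in pandas 2.0 and later & on an Index is an element-wise logical operation rather than a set intersection:

intersection_cols = df1.columns.intersection(df2.columns)
res = df1[intersection_cols]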
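
As a minimal sketch, here is the column-intersection approach applied to the frames from the question. The helper name `select_shared_columns` is hypothetical and only used for illustration; it is not part of pandas.

```python
import pandas as pd


def select_shared_columns(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return all rows of `left`, restricted to columns also present in `right`.

    Hypothetical helper for illustration; not a pandas function.
    """
    shared = left.columns.intersection(right.columns)
    return left[shared]


# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]],
    columns=["a", "b", "c", "d"],
)
df2 = pd.DataFrame(
    [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]],
    columns=["a", "b"],
)

res = select_shared_columns(df1, df2)
print(res)
```

Only the column labels of the second frame matter here; its row count and values are not used.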
jpp
3
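import pandas as pd

# Sample data from the question
data1 = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
data2 = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]

df1 = pd.DataFrame(data=data1, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(data=data2, columns=['a', 'b'])

# Keep only the columns of df1 that also exist in df2.
# Index.intersection is used instead of `&`, which no longer performs
# a set intersection on an Index in pandas 2.0 and later.
df1[df1.columns.intersection(df2.columns)]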
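
An alternative sketch, not taken from either answer: a boolean mask built with `Index.isin` selects the shared columns positionally, so the result keeps the column order of the first frame. The variable names `mask` and `res` are chosen here for illustration.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]],
    columns=["a", "b", "c", "d"],
)
df2 = pd.DataFrame(
    [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]],
    columns=["a", "b"],
)

# One boolean per column of df1: True where the label also exists in df2.
mask = df1.columns.isin(df2.columns)

# Select every row, and only the columns where the mask is True.
res = df1.loc[:, mask]
print(res)
```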
ram nithin