
I have two dataframes like the ones below, where column 0 holds ndarrays. I want to find the rows that the two dataframes have in common (their intersection).

a1 =

 0                   |  1
 [39]                |  6000000
 [49] [50] [51] [52] |  84100 
 [49]                |  95400
 [20]                |  65089

a2 =

 0                   |  1
 [49] [50] [51] [52] |  84100 
 [38] [50]           |  530400
 [52]                |  60611
 [20]                |  65089

expected output:

a3 =

 0                   |  1
 [49] [50] [51] [52] |  84100 
 [20]                |  65089

Any ideas would be appreciated.

yssefunc

1 Answer


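You should be able to build a boolean mask by comparing the two dataframes element-wise and reducing each row with the numpy.ndarray.all method. This requires both dataframes to have the same index and column labels, because == compares them position by position: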

import pandas as pd

a1 = pd.DataFrame({'a': [[0], [0, 1, 2], [3], [4]],
                   'b': [0, 1000, 2000, 3000]})
a2 = pd.DataFrame({'a': [[0], [0, 1, 2], [4], [6]],
                   'b': [0, 1000, 88000, 6000]})

# Keep the rows of a1 where every column equals the same row of a2
a3 = a1[(a1 == a2).values.all(axis=1)]

which returns:

     a            b
0      [0]        0
1   [0, 1, 2]   1000
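
Note that the mask above compares row i of a1 with row i of a2, while in the question the shared rows sit at different positions. A minimal sketch of an order-independent intersection follows, assuming the frames have exactly two columns labelled 0 and 1 and that column 0 holds one-dimensional (or column-vector) NumPy arrays. The helper names `row_keys` and `intersect_rows` are hypothetical, not part of pandas:

```python
import numpy as np
import pandas as pd

# Data from the question (assumed layout: arrays in column 0, integers in column 1).
a1 = pd.DataFrame({
    0: [np.array([39]), np.array([49, 50, 51, 52]), np.array([49]), np.array([20])],
    1: [6000000, 84100, 95400, 65089],
})
a2 = pd.DataFrame({
    0: [np.array([49, 50, 51, 52]), np.array([38, 50]), np.array([52]), np.array([20])],
    1: [84100, 530400, 60611, 65089],
})


def row_keys(df):
    """Build one hashable key per row; arrays themselves cannot be hashed."""
    return [(tuple(np.ravel(arr).tolist()), val) for arr, val in zip(df[0], df[1])]


def intersect_rows(left, right):
    """Return the rows of `left` whose (array, value) pair also occurs in `right`."""
    right_keys = set(row_keys(right))
    mask = np.array([key in right_keys for key in row_keys(left)], dtype=bool)
    return left.loc[mask].reset_index(drop=True)


a3 = intersect_rows(a1, a2)
print(a3)
```

Because the lookup goes through a set of keys, the two frames may have different lengths and different row orders.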
FChm
  • Thanks for the solution, but I got the error "Can only compare identically-labeled DataFrame objects". I am trying to figure out why. – yssefunc Mar 11 '19 at 07:26
  • I'm guessing your example doesn't reflect the true nature of your data. The error is probably being thrown because your two dataframes have different column names. – FChm Mar 11 '19 at 07:32
  • I am sure that the two dataframes have the same column names. – yssefunc Mar 11 '19 at 07:33
  • This may help: [Pandas can only compare identically labeled dataframe objects error](https://stackoverflow.com/questions/18548370/pandas-can-only-compare-identically-labeled-dataframe-objects-error). I still suspect there is a labelling issue. – FChm Mar 11 '19 at 07:38
  • I found that my dataframes have different shapes. I applied your code and got the same error. – yssefunc Mar 11 '19 at 07:49
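
The error discussed in these comments can be reproduced and diagnosed with a short check. The sketch below assumes a second frame with one row fewer (hypothetical data, not the asker's) and uses a hypothetical helper `label_report` to show which labels differ. The index counts as a label here, so frames of different lengths fail the comparison even when their column names match:

```python
import pandas as pd

a1 = pd.DataFrame({'a': [[0], [0, 1, 2], [3], [4]],
                   'b': [0, 1000, 2000, 3000]})
# Hypothetical second frame: same columns, but one row fewer.
a2 = pd.DataFrame({'a': [[0], [0, 1, 2], [4]],
                   'b': [0, 1000, 88000]})


def label_report(left, right):
    """Summarise whether two frames share column labels, index labels and shape."""
    return {
        "same_columns": left.columns.equals(right.columns),
        "same_index": left.index.equals(right.index),
        "shapes": (left.shape, right.shape),
    }


report = label_report(a1, a2)
print(report)

# The == operator refuses to compare frames whose labels differ.
try:
    a1 == a2
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

If `same_index` is False, an element-wise comparison is not applicable as written, and a key-based lookup or a merge is usually the better fit.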