I am writing a Python application that compares two DataFrames to identify differences. The line below raises an error when it compares NaN with a string or an int:

different = (a0 != a1)

Error:

TypeError: Cannot change data-type for object array

Code:

df0 = Excel1.parse(sheet)
df1 = Excel2.parse(sheet)
a0, a1 = (df0.fillna('0')).align(df1.fillna('0'))
different = (a0 != a1)
comp = a0[different].join(a1[different], lsuffix='_old', rsuffix='_new')
MaxU - stand with Ukraine
jay
See great answers [here](http://stackoverflow.com/questions/17095101/outputting-difference-in-two-pandas-dataframes-side-by-side-highlighting-the-d). – Parfait Feb 22 '17 at 03:10

1 Answer

Maybe convert the DataFrames into NumPy arrays using a0 = df0.values and a1 = df1.values; you will then have two matrices, a0 and a1. To find the cells that hold different values, you can use np.where(a0 != a1). You may want to clean the data first using np.isnan() or np.isinf() before doing the comparison.
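
A rough sketch of that approach, not a definitive implementation. The two small frames are hypothetical stand-ins for the parsed Excel sheets. np.isnan() only accepts numeric arrays, so the sketch fills missing values with fillna('0') as in the question before taking .values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for Excel1.parse(sheet) and Excel2.parse(sheet).
df0 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", np.nan, "c"]})
df1 = pd.DataFrame({"id": [1, 5, 3], "name": ["a", "b", np.nan]})

# Fill missing values first, then take the underlying arrays.
# Mixed int/string columns give object arrays.
a0 = df0.fillna("0").values
a1 = df1.fillna("0").values

# Element-wise comparison; np.where returns the row and column
# positions of every cell that differs.
rows, cols = np.where(a0 != a1)

# Collect (row position, column name, old value, new value) per mismatch.
diffs = [(int(r), df0.columns[c], a0[r, c], a1[r, c]) for r, c in zip(rows, cols)]
print(diffs)
```

This assumes both frames have the same shape and column order; if they don't, align them first as the question already does.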

It also doesn't appear that either array contains only integers. If that is the case, refer here to make sure the two arrays are of the same type before doing the comparison.
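
One way to get both sides to the same type is to cast everything to strings before comparing. This is only a sketch under that assumption, with hypothetical example frames; it may not match what the linked page suggests.

```python
import pandas as pd

# Hypothetical frames: same labels, but different column types.
df0 = pd.DataFrame({"qty": [1, 2, 3], "code": ["a", None, "c"]})
df1 = pd.DataFrame({"qty": ["1", "2", "4"], "code": ["a", "b", "c"]})

# Fill missing values, then cast every column to str so both
# sides have the same type before the element-wise comparison.
s0 = df0.fillna("0").astype(str)
s1 = df1.fillna("0").astype(str)

# Boolean frame: True where the two cells differ.
different = s0 != s1
print(different)
```

Note that a float column is rendered as '1.0' rather than '1' when cast to str, so numeric columns that contain NaN may need separate handling.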

ajmartin