I would like to keep the dtypes of the columns as int after the update for obvious reasons. Any ideas why this doesn't work as expected?

import pandas as pd

df1 = pd.DataFrame([
    {'a': 1, 'b': 2, 'c': 'foo'},
    {'a': 3, 'b': 4, 'c': 'baz'},
])

df2 = pd.DataFrame([
    {'a': 1, 'b': 8, 'c': 'bar'},
])

print('dtypes before update:\n%s\n%s' % (df1.dtypes, df2.dtypes))

df1.update(df2)

print('\ndtypes after update:\n%s\n%s' % (df1.dtypes, df2.dtypes))

The output looks like this:

dtypes before update:
a     int64
b     int64
c    object
dtype: object
a     int64
b     int64
c    object
dtype: object

dtypes after update:
a    float64
b    float64
c     object
dtype: object
a     int64
b     int64
c    object
dtype: object

Thanks to anyone who has some advice.

Brendan Maguire

1 Answer

This is a known issue: https://github.com/pydata/pandas/issues/4094. I think your only option currently is to call astype(int) on the affected columns after the update.
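As a rough sketch of that workaround, using the frames from the question: record the dtypes before the update and cast back afterwards. The name original_dtypes is just for illustration. My understanding (an assumption, not something the issue text confirms) is that the upcast comes from df2 being aligned to df1's index, which introduces NaN for the rows df2 does not have; the reindex call below only illustrates that alignment effect. Whether update itself still changes the dtypes may depend on your pandas version, but the cast back is harmless either way.

```python
import pandas as pd

df1 = pd.DataFrame([
    {'a': 1, 'b': 2, 'c': 'foo'},
    {'a': 3, 'b': 4, 'c': 'baz'},
])
df2 = pd.DataFrame([
    {'a': 1, 'b': 8, 'c': 'bar'},
])

# Illustration only: aligning df2 to df1's index adds a NaN row,
# and an integer column cannot hold NaN, so it becomes float64.
aligned = df2.reindex(df1.index)
print(aligned.dtypes)

# Workaround: remember the dtypes, update, then cast back.
original_dtypes = df1.dtypes.to_dict()
df1.update(df2)
df1 = df1.astype(original_dtypes)

print(df1.dtypes)
```

Note that astype will raise if an integer column still contains NaN after the update, so this only works when df1 itself has no missing values in those columns.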

JAB