When I use pandas.DataFrame.replace(dict) to convert user_id strings to integers, I receive:
"OverflowError: Python int too large to convert to C long".
Sample code:
    import pandas as pd

    x = {
        'user_id': ['100000715097692381911', '100003840837471130074'],
        'item_id': [1, 2],
    }
    dfx = pd.DataFrame(x)
    dfx['user_id'].replace(
        {
            '100000715097692381911': 0,
            '100003840837471130074': 1,
        },
        inplace=True,
    )
I don't understand why this is considered a duplicate. I think the problem is that pandas treats the str values as integers. I didn't load those big id numbers as integers but as strings. If I prepend a character to each 'user_id' string, like 's100000715097692381911', no OverflowError is raised.
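For comparison, here is a minimal sketch of the same conversion done with Series.map instead of replace. The name id_map is only illustrative, and this assumes every user_id appears as a key in the dict (ids missing from the dict would become NaN). As far as I can tell, map just looks each string up as a dict key, so the digit strings are never converted to numbers.

```python
import pandas as pd

dfx = pd.DataFrame({
    'user_id': ['100000715097692381911', '100003840837471130074'],
    'item_id': [1, 2],
})

# Illustrative lookup table: string id -> small integer code
id_map = {
    '100000715097692381911': 0,
    '100003840837471130074': 1,
}

# The ids are held as Python str objects, not as integers
assert all(isinstance(v, str) for v in dfx['user_id'])

# Look each string up as a dict key and assign the result back
dfx['user_id'] = dfx['user_id'].map(id_map)
print(dfx)
```

This does not explain why replace raises the error, but it may help narrow down whether the problem is specific to replace.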