In my daily work with Pandas, I often have to set the dtype of ID columns to 'object'. The following simple yet puzzling code illustrates the problem:
import pandas as pd
a = pd.DataFrame({'A': [12, 32, 34, 54, 65], 'B': [122, 32, 234, 54, 65], 'C': [12, 323, 34, 544, 653]}, dtype='object')
If I check the types of the columns:
In: a.dtypes
I get, as expected:
Out: A object
B object
C object
dtype: object
However, the type of a single element is surprising to me:
In: type(a.A.values[0])
Out: int
This is problematic when I merge two DataFrames: if the key columns do not hold values of the same type, the rows will not match (the integer 123456 does not match the string '123456').
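A minimal sketch of the mismatch I mean. The frames `left` and `right`, the key column `id`, and the values are hypothetical and only there for illustration; both key columns report dtype object, but one holds ints and the other holds strings:

```python
import pandas as pd

# Hypothetical frames: the same IDs, stored once as ints and once as strings.
# Both are created with dtype='object', so .dtypes looks identical.
left = pd.DataFrame({'id': [122, 32, 234], 'x': [1, 2, 3]}, dtype='object')
right = pd.DataFrame({'id': ['122', '32', '234'], 'y': [10, 20, 30]}, dtype='object')

# Inner merge on the key: 122 and '122' are different objects, so nothing matches.
mismatched = left.merge(right, on='id')
print(len(mismatched))  # 0

# Converting both keys to strings first makes the rows line up.
matched = left.astype({'id': str}).merge(right.astype({'id': str}), on='id')
print(len(matched))  # 3
```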
After some work I got the DataFrame to behave the way I expected (for more details, look here), using:
b = pd.DataFrame({'A':[12,32,34,54,65],'B':[122,32,234,54,65],'C':[12,323,34,544,653]}).astype(str)
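A quick check of the element types under the two constructions, as a sketch that uses only column A and a hypothetical `data` dict to keep it short:

```python
import pandas as pd

# Hypothetical reduced data: just column A from the example.
data = {'A': [12, 32, 34, 54, 65]}

a = pd.DataFrame(data, dtype='object')  # object dtype, values left untouched
b = pd.DataFrame(data).astype(str)      # values explicitly converted to strings

# Inspect the Python type of the first element in each frame.
type_a = type(a['A'].iloc[0])
type_b = type(b['A'].iloc[0])
print(type_a)  # still an integer type, as in the output above
print(type_b)  # <class 'str'>
```

So in `a` the values keep their original type, while in `b` they really are strings.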
Why is dtype='object' not enough to get string elements? Am I missing something?