I have a column in a dataframe that I need to join on. The column contains mixed data types, e.g.:
import pandas as pd
s = pd.Series([3985500, '3985500', 3985500.0, '3985500.0', '3985500A', '3985500B'])
I'm trying to convert everything that's numeric to int to ensure the key is found when joining. Whatever is string can remain string and the final column format is allowed to be string, as long as the floats are converted to int.
I have tried astype(), but it ignores floats and for some reason I keep on getting scientific notation (see index 2 and 3):
s.astype(int, errors='ignore')
0 3985500
1 3985500
2 3.9855e+06
3 3985500.0
4 3985500A
5 3985500B
dtype: object
I get pd.to_numeric to work on floats with a try-except:
try: int(pd.to_numeric(s[3]))
except ValueError: s[3]
3985500
dtype: int
However, as soon as I try it in a function it returns nothing:
def convert_to_int(cell):
    try: int(pd.to_numeric(cell))
    except ValueError: cell
convert_to_int(s[3])
Any idea why this is happening? There might be other workarounds, but why is it not working when it's in a function?
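A likely explanation, sketched here: at the interactive prompt the value of a bare expression is echoed, but inside a function the result is discarded unless it is returned, so the call evaluates to None. The sketch below is an illustration rather than a definitive fix. It swaps pd.to_numeric for the built-in float() so that it runs without pandas, and the name convert_no_return is made up for the comparison. It also assumes the keys are small enough to pass through a float without losing precision.

```python
def convert_no_return(cell):
    # Same shape as the function in the question: both branches compute a
    # value and then throw it away, so the function implicitly returns None.
    try:
        int(float(cell))
    except ValueError:
        cell


def convert_to_int(cell):
    # Return the converted value, or the original cell if it is not numeric.
    # float() stands in for pd.to_numeric here (an assumption of this sketch).
    try:
        return int(float(cell))
    except (TypeError, ValueError, OverflowError):
        return cell


values = [3985500, '3985500', 3985500.0, '3985500.0', '3985500A', '3985500B']

print(convert_no_return('3985500.0'))       # None
print([convert_to_int(v) for v in values])  # [3985500, 3985500, 3985500, 3985500, '3985500A', '3985500B']
```

With the Series from the question, the version that returns a value can be passed directly as s.apply(convert_to_int). The result stays object dtype because the non-numeric strings are kept.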
I wish to use this function with s.apply(). I have looked at a couple of similar posts: