I am working with a pandas DataFrame and need to convert a column to int type.
I use the following:
result_df['ftmSectionId'] = result_df['ftmSectionId'].astype('int')
The DataFrame has several million rows, and apparently some values cannot be converted to int (perhaps they include commas or periods). I get the error:
ValueError: invalid literal for int() with base 10: 'not'
According to this question, "How do I fix invalid literal for int() with base 10 error in pandas",
I could use:
data.Population1 = pd.to_numeric(data.Population1, errors="coerce")
That works.
But this way I still don't know why I got an error in the first place, since errors="coerce" silently replaces the offending values with NaN. Given the nature of the database I am working with, I would expect that column to contain only integers. How can I query the column to find out which values cannot be converted to int with the simple .astype('int') approach?
Thanks
Related questions that are not duplicates: "Unable to convert pandas dataframe column to int variable type using .astype(int) method". That question addresses the same problem, except that the asker already knows the column contains NaN and removes those rows. I don't know what the problem is here; my goal is not only to convert to int but to catch the problematic values.
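
To make "catch the problematic values" concrete, here is a minimal sketch of two ways the offending rows might be isolated. The sample data is invented for illustration (only 'not' comes from the actual error message), and the helper name fails_int is hypothetical. The first approach reuses pd.to_numeric with errors="coerce" and compares the result against the original column. The second calls int() on each value, which should match what .astype('int') does on an object column more closely, but is likely slower on several million rows.

```python
import pandas as pd

# Hypothetical sample standing in for result_df['ftmSectionId'];
# the real column would come from the database.
result_df = pd.DataFrame(
    {"ftmSectionId": pd.Series(["12", "7", "not", "3.5", "1,000", None], dtype=object)}
)
col = result_df["ftmSectionId"]

# Approach 1: coerce, then look at what turned into NaN.
as_num = pd.to_numeric(col, errors="coerce")

# Values that were present originally but could not be parsed as numbers.
unparseable = col[as_num.isna() & col.notna()]

# Values that parse as numbers but are not whole numbers.
non_integer = col[as_num.notna() & (as_num % 1 != 0)]


# Approach 2: test each value with int() directly.
def fails_int(value):
    """Return True if int(value) raises (hypothetical helper)."""
    try:
        int(value)
        return False
    except (ValueError, TypeError):
        return True


bad = col[col.map(fails_int)]

print(unparseable.tolist())
print(non_integer.tolist())
print(bad.value_counts(dropna=False))
```

Note that pd.to_numeric and int() do not accept exactly the same inputs (for example, to_numeric parses '3.5' while int('3.5') raises), which is why the first approach checks for non-whole numbers separately. Counting the distinct bad values with value_counts is usually enough to see what kind of garbage ended up in the column.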