
I am trying to convert a column of object dtype to float64.

Pandas version - 2.21

I used convert_objects() to coerce the values that cannot be converted to NaN, and it successfully converted my column to float64.

I want to know which rows/values prevented the conversion to float64. Is there a function out there that can do that?

For example :

col1

2015
2016
NaN
NaN
3005
i_am_a_string
4006
another_string
5008
4005

df['col1'].astype('float64')

This fails because the column contains string data that cannot be converted to float64.

My desired output: I want to see those strings

i_am_a_string
another_string
abhi655

2 Answers


If I am understanding the question correctly, I would use a try/except block similar to the following:

 try:
     # convert_objects() has been deprecated since pandas 0.21
     df = df.convert_objects(convert_numeric=True)
 except Exception as e:
     print(e)

Doing so would print any errors that occur.
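A caveat with wrapping the whole conversion in one try/except: the exception is raised at the first bad value, so it will not list every offender. Here is a minimal sketch that applies the same try/except idea per value instead. The helper name `find_unconvertible` and the sample data are hypothetical, not taken from the OP's code.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's column.
df = pd.DataFrame({'col1': [2015, 'i_am_a_string', 4006, 'another_string']})

def find_unconvertible(series):
    """Return the values in `series` that float() rejects (illustrative helper)."""
    bad = []
    for value in series:
        try:
            float(value)
        except (ValueError, TypeError):
            # ValueError: non-numeric string; TypeError: e.g. None
            bad.append(value)
    return bad

print(find_unconvertible(df['col1']))
# ['i_am_a_string', 'another_string']
```

This loops in Python, so it may be slow on large columns, but it catches only the two exception types that `float()` raises for bad input rather than a general `Exception`.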

Shane
  • You should almost never catch a general `Exception`... – Error - Syntactical Remorse Jul 18 '19 at 18:23
  • Correct; however, I am unsure what code the OP is using, so I cannot tell what output it gives. The only other option I can guess at would be to return the data from the function and, if it is not null, move on; if it is null, print the object and a message. – Shane Jul 18 '19 at 18:26
  • convert_objects() worked fine. I want to see the rows, possibly strings, that are preventing me from converting an object dtype to float64. – abhi655 Jul 18 '19 at 18:27
  • @abhi655 can you post some of your code? This may help so we can copy your code to see exactly what you see :) – Shane Jul 18 '19 at 18:29
  • I edited my original question with an example, thank you. – abhi655 Jul 18 '19 at 19:10

You can make a new column that is a copy of the original, convert the original column, and then compare the two. For example:

>>> import pandas as pd
>>> df = pd.DataFrame({'col1': [1, 'two', 3, 'four']})
>>> df['col2'] = df['col1']
>>> df['col1'] = pd.to_numeric(df['col1'], errors='coerce')
>>> df
   col1  col2
0   1.0     1
1   NaN   two
2   3.0     3
3   NaN  four
>>> row = df[pd.notna(df['col1'])]
>>> row
   col1 col2
0   1.0    1
2   3.0    3
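The filter above keeps the rows that converted. To see the rows that did not, as the question asks, one option is to select the values that are NaN after coercion but were not NaN before. This is a sketch using the question's sample data; the variable names `converted`, `failed` and `bad_values` are my own.

```python
import numpy as np
import pandas as pd

# Sample column from the question: numbers, real NaNs, and stray strings.
df = pd.DataFrame({'col1': [2015, 2016, np.nan, np.nan, 3005,
                            'i_am_a_string', 4006, 'another_string',
                            5008, 4005]})

# Coerce to numeric; anything unparseable becomes NaN.
converted = pd.to_numeric(df['col1'], errors='coerce')

# A value failed to convert if it is NaN now but was not NaN originally.
failed = converted.isna() & df['col1'].notna()
bad_values = df.loc[failed, 'col1']

print(bad_values.tolist())
# ['i_am_a_string', 'another_string']
```

Checking `df['col1'].notna()` as well keeps the NaN values that were already in the column out of the result.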
marc_s
hoan duc