How do I convert a DataFrame column containing strings and NaN values to floats? There is also another column whose values are a mix of strings and floats; how do I convert that entire column to floats?
-
DO NOT USE **`convert_objects`**. It is deprecated. Use `to_numeric` or `astype` instead. – Ted Petrou Nov 06 '17 at 16:33
7 Answers
NOTE: pd.convert_objects has now been deprecated. You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.
This is available in 0.11. It forces conversion (or sets the value to NaN). This works even when astype fails; it also operates series by series, so it won't convert, say, a column made up entirely of non-numeric strings.
In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))
In [11]: df
Out[11]:
A B
0 1.0 1.0
1 1 foo
In [12]: df.dtypes
Out[12]:
A object
B object
dtype: object
In [13]: df.convert_objects(convert_numeric=True)
Out[13]:
A B
0 1 1
1 1 NaN
In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]:
A float64
B float64
dtype: object
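Since convert_objects is deprecated, roughly the same result can be sketched with pd.to_numeric. This is a minimal example, assuming a pandas version that provides pd.to_numeric (0.17 or later); the frame mirrors the one above, and the name `converted` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": ["1.0", "1"], "B": ["1.0", "foo"]})

# Apply pd.to_numeric column by column; errors="coerce" turns
# unparseable values such as "foo" into NaN instead of raising.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted.dtypes)
```

Note that errors="coerce" silently discards anything that is not a number, so it is worth checking how many NaN values appear afterwards.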
-
Please note that this does not work for columns (at least multiindex); it works just for values in the dataframe. – denfromufa Apr 29 '15 at 17:20
-
Then you are doing something wrong. Converting a string to float is an explicit user action. – Jeff Apr 29 '15 at 18:55
-
`df['ColumnName'] = df['ColumnName'].convert_objects(convert_numeric=True)` You can convert just a single column. – Jack Jun 19 '16 at 15:01
-
convert_objects is deprecated in newer pandas. Use the data-type-specific converter pd.to_numeric instead. – Thomas Matthew Jul 23 '16 at 23:18
-
With pandas 0.22.0 `FutureWarning: convert_objects is deprecated. To re-infer data dtypes for object columns, use Series.infer_objects() For all other conversions use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.` – SyntaxRules May 04 '18 at 19:48
You can try df.column_name = df.column_name.astype(float). As for the NaN values, you need to specify how they should be converted, but you can use the .fillna method to do it.
Example:
In [12]: df
Out[12]:
a b
0 0.1 0.2
1 NaN 0.3
2 0.4 0.5
In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)
In [14]: df.a = df.a.astype(float).fillna(0.0)
In [15]: df
Out[15]:
a b
0 0.1 0.2
1 0.0 0.3
2 0.4 0.5
In [16]: df.a.values
Out[16]: array([ 0.1, 0. , 0.4])
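The same idea as a self-contained sketch, assuming an object column that mixes numeric strings with NaN. NaN is itself a float, so astype(float) leaves it in place; fillna is only needed if a different placeholder is wanted. The column name `a` follows the example above, and the fill value 0.0 is an arbitrary choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["0.1", np.nan, "0.4"]})

# NaN is already a float, so astype(float) keeps it as NaN.
as_float = df["a"].astype(float)

# Optionally replace the missing values afterwards.
filled = as_float.fillna(0.0)

print(filled.tolist())
```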

In newer versions of pandas (0.17 and up), you can use the to_numeric function. It converts individual columns (Series), and it can be applied column by column to convert a whole dataframe. It also lets you choose how to treat values that can't be converted to numbers:
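import pandas as pd
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)  # every value parses, so the result is float64
s = pd.Series(['apple', '1.0', '2', -3])
# errors='ignore' returns the input unchanged when parsing fails
# (this option is deprecated in recent pandas releases)
pd.to_numeric(s, errors='ignore')
# errors='coerce' replaces unparseable values ('apple') with NaN
pd.to_numeric(s, errors='coerce')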

-
To apply `pd.to_numeric` to a `DataFrame`, one can use `df.apply(pd.to_numeric)` as [explained in detail in this answer](https://stackoverflow.com/a/34844867/604687). – Ninjakannon Jan 05 '17 at 19:06
df['MyColumnName'] = df['MyColumnName'].astype('float64')

-
This does not work when converting from a String to a Float: `ValueError: could not convert string to float: 'date'` – Jack Jun 19 '16 at 14:56
-
@Jack do you know the workaround here? I'm running into this exact issue converting string to float. – Hatt Jun 14 '18 at 21:01
-
@Hatt I am facing the same issue. Did you find a solution for it? – Prakhar Jhudele May 21 '20 at 06:37
-
@Jack I'm not sure, but you seem to be mixing up date format and float. To convert to datetime: `df['date'] = pd.to_datetime(df['date'])` – Claude COULOMBE May 22 '20 at 14:50
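For the ValueError discussed in these comments, one way to see which entries block the conversion is to coerce and then compare against the original. A minimal sketch, assuming a Series `s` with made-up values; none of the data comes from the original posters.

```python
import numpy as np
import pandas as pd

s = pd.Series(["1.5", "date", "2", np.nan])

# Coerce: values that cannot be parsed become NaN.
numeric = pd.to_numeric(s, errors="coerce")

# Entries that were present originally but are NaN after coercion
# are the ones that could not be parsed as numbers.
bad = s[numeric.isna() & s.notna()]

print(bad.tolist())
```

Inspecting `bad` first makes it easier to decide whether those rows should be fixed, dropped, or left as NaN.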
You have to replace empty strings ('') with np.nan before converting to float, i.e.:
df['a'] = df.a.replace('', np.nan).astype(float)
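A small sketch of why the replacement matters, assuming a column `a` that contains one empty string. A direct astype(float) raises on '', while replacing it with np.nan first lets the cast go through. pd.to_numeric with errors='coerce' is shown as an alternative; it would also turn any other unparseable text into NaN, which may or may not be wanted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["1.5", "", "3"]})

# A direct cast fails because an empty string is not a valid float.
try:
    df["a"].astype(float)
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Replace empty strings with NaN, then cast.
cleaned = df["a"].replace("", np.nan).astype(float)

# Alternative: coerce anything unparseable (including "") to NaN.
coerced = pd.to_numeric(df["a"], errors="coerce")
```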

Here is an example
                                      GHI  Temp  Power  Day_Type
2016-03-15 06:00:00  -7.99999952505459e-7  18.3      0       NaN
2016-03-15 06:01:00  -7.99999952505459e-7  18.2      0       NaN
2016-03-15 06:02:00  -7.99999952505459e-7  18.3      0       NaN
2016-03-15 06:03:00  -7.99999952505459e-7  18.3      0       NaN
2016-03-15 06:04:00  -7.99999952505459e-7  18.3      0       NaN
But if these are all string values, as they were in my case, convert the desired columns to floats:
df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
df_inv_29['Power'] = df_inv_29.Power.astype(float)
Your dataframe will now have float values :-)
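The three assignments above can also be written as a single cast. A sketch, assuming a frame with the same column names holding numeric strings; the values are abbreviated from the example above and the name `cols` is made up for illustration.

```python
import pandas as pd

df_inv_29 = pd.DataFrame(
    {
        "GHI": ["-7.99999952505459e-7", "-7.99999952505459e-7"],
        "Temp": ["18.3", "18.2"],
        "Power": ["0", "0"],
    }
)

cols = ["GHI", "Temp", "Power"]

# astype accepts a {column: dtype} mapping, so several columns
# can be converted in one call.
df_inv_29 = df_inv_29.astype({c: float for c in cols})

print(df_inv_29.dtypes)
```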

import pandas as pd
# Raises ValueError if a value cannot be parsed; pass errors='coerce' to get NaN instead
df['a'] = pd.to_numeric(df['a'])

-
Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, **can you [edit] your answer to include an explanation of what you're doing** and why you believe it is the best approach? – Jeremy Caney Apr 20 '23 at 05:14