I need to convert all the values in a column to type float. However, the column contains missing values, which are displayed as `..` (shown in a screenshot in the original post).

I used the following code to convert all the values to type float.

df['col'].apply(lambda x : float(x) if x!=".." else np.nan)

But I got the following error message:

TypeError: float() argument must be a string or a number, not 'NoneType'

I would appreciate any recommendation.

Julieta M.
  • `..` appears to just be the way a value of `None` is *rendered* in the string representation of the column, not the actual value stored in the column. The error message tells you what value of `x` is actually used. – chepner Jul 23 '23 at 14:44
  • Why not handle missing values before conversion? This will ensure you have the same datatypes. – abdoulsn Jul 24 '23 at 07:14
  • For further help, please copy-paste the output of > print(df), well, a number of representative lines anyways, into your question, otherwise we risk getting nowhere. '..' might not be the only odd thing in your df. – OCa Jul 24 '23 at 09:05
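
As the comments point out, what a column displays and what it stores can differ. The following is a minimal sketch of how one might inspect the stored values. The dataframe is a hypothetical stand-in, since the question's real data was only posted as an image.

```python
import pandas as pd

# Hypothetical stand-in for the question's column (assumption: strings, None and '..').
df = pd.DataFrame({'col': ['-0.123', None, '..']})

# Count how many stored values there are of each Python type.
type_names = df['col'].map(lambda v: type(v).__name__)
type_counts = type_names.value_counts()
print(type_counts)

# Show the exact stored values, quotes included, rather than the rendered table.
raw_values = [repr(v) for v in df['col']]
print(raw_values)
```

If both `str` and `NoneType` show up in the counts, the column mixes real strings with `None`, and each needs handling before conversion.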

2 Answers

Well, your missing values are actually None. To skip them, just check whether a value is None.

I guess this will do the job:

df['col'].apply(lambda x : float(x) if x is not None and x != '..' else np.nan)

Or alternatively you can use this:

def convert(x):
    try:
        return float(x)
    # float(None) raises TypeError; float('..') raises ValueError
    except (TypeError, ValueError):
        return np.nan


df['col'].apply(convert)

Ashenguard

Editing and summing up

This wasn't clear from your input picture (please post data as text), but it appears your input dataframe in fact contains both None and '..':

  • The error "TypeError: float() argument must be a string or a number, not 'NoneType'" indicates the presence of None.
  • The error "ValueError: could not convert string to float: '..'" indicates the presence of some actual '..' strings.
  • Forget applying lambdas: pandas.DataFrame.astype knows to skip NoneType without raising an error. Just get rid of arbitrary strings like '..' first.
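
The two error messages above come from two different exception types, which is easy to check in plain Python. This is a small sketch that calls the built-in `float()` on the two kinds of problematic values; the variable names are illustrative only.

```python
# Reproduce both failure modes of float() on the problematic values.
errors = {}
for value in (None, '..'):
    try:
        float(value)
    except (TypeError, ValueError) as exc:
        # Record which exception class each value triggers.
        errors[repr(value)] = type(exc).__name__

print(errors)  # {'None': 'TypeError', "'..'": 'ValueError'}
```

Any hand-written converter therefore has to deal with both exception types, not just one of them.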

Adapted answer:

One-liner, for in-place conversion:

df['col'] = np.where(df['col'] == '..', None, df['col']).astype('float')

Alternatively, decomposing the steps simplifies the syntax (no need for numpy.where):

(0) A mock-up of your initial dataframe: (Please post as text instead of image)

df = pd.DataFrame({'col' : ['-0.123',None, '..']})
df
      col
0  -0.123
1    None
2      ..

(1) First, curate your data by replacing cells of value '..' with None:

df.loc[df['col'] == '..', 'col'] = None
df
      col
0  -0.123
1    None
2    None

(2) Then apply data type conversion:

df['col'] = df['col'].astype('float64')
df
      col
0  -0.123
1     NaN
2     NaN

Missing data are converted to 'not a number'. Resulting data type is as requested:

df.dtypes
col    float64
dtype: object

To go further: pandas offers more built-in column-wise datatype converters, which can be used instead of crafting your own lambda functions.
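
One such built-in converter is `pandas.to_numeric`. With `errors='coerce'`, it turns anything it cannot parse into NaN. The sketch below assumes the same mock-up dataframe as above; it is one possible approach, not necessarily the one the answer had in mind.

```python
import pandas as pd

# Mock-up column with a numeric string, a None and a '..' placeholder (assumption).
df = pd.DataFrame({'col': ['-0.123', None, '..']})

# errors='coerce' replaces unparseable entries with NaN instead of raising.
df['col'] = pd.to_numeric(df['col'], errors='coerce')

print(df['col'].dtype)
print(df)
```

Note that `errors='coerce'` silently converts every unparseable string, not only '..', so it is worth checking the raw values first if other odd entries might matter.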

OCa
  • Thank you. I tried your code, and again got the same error message: ValueError: could not convert string to float: '..' . Are there any suggestions on how to overcome this? – Julieta M. Jul 24 '23 at 03:46
  • Answer edited to account for the presence of BOTH None and '..' in the input dataframe. – OCa Jul 24 '23 at 11:26