When I try to convert an object-type column to float, I get `ValueError: could not convert string to float: 'Y'`:

import pandas as pd
import numpy as np

df_train = pd.read_csv('loan_prediction/train_u6lujuX_CVtuZ9i.csv')
df_train_y = df_train.iloc[:, 12].values

df_train_y.astype(float)
desertnaut

1 Answer

This might help you to find the non-numeric values in your data set.

First, create a data frame, and set certain elements of Column 12 to non-numeric values:

import numpy as np
import pandas as pd

nrows, ncols = (10, 15)
data = np.arange(nrows * ncols).reshape((nrows, ncols))
df = pd.DataFrame(data, dtype=object)  # object dtype lets a column hold both numbers and strings

df.iloc[2:5, 12] = 'x'

Second, extract column 12, and convert to numeric type:

df_2 = df.iloc[:, 12].copy()
df_2 = pd.to_numeric(df_2, errors='coerce')

Third, find the non-numeric values (with a Boolean mask):

mask = df_2.isna()
print(df[mask].iloc[:, 12])

2    x
3    x
4    x
Name: 12, dtype: object
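
Once the mask is available, it can also help to count the distinct offending values rather than list every row. The following is a minimal sketch of that idea; the Series `col` is a hypothetical stand-in for column 12, and its contents are made up for illustration:

```python
import pandas as pd

# Hypothetical column mixing numbers and strings, standing in for column 12.
col = pd.Series([1, 2, 'x', 'x', 'Y', 6], dtype=object)

# Coerce to numeric; entries that cannot be parsed become NaN.
converted = pd.to_numeric(col, errors='coerce')

# Keep entries that failed to parse but were not missing to begin with.
bad = col[converted.isna() & col.notna()]

# Count each distinct offending value.
offending_counts = bad.value_counts()
print(offending_counts)
```

If only one or two distinct strings show up, the column is probably categorical rather than numeric with a few bad entries.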
jsmart
  • Thanks jsmart. Using the above code, I found that the datatype is object throughout my 12th column. However, how can I convert that object data into integers or floats? – Sione Hoghen Jul 28 '20 at 13:14
  • If you execute `pd.to_numeric('12', errors='coerce')`, then you'll get 12 (as an integer). But if you execute `pd.to_numeric('abc', errors='coerce')`, then you'll get 'nan' (not a number), because 'abc' is not a string representation of a number. The `errors='coerce'` option returns a number (if the conversion was successful) or nan otherwise. – jsmart Jul 28 '20 at 13:20
  • The program returned nan for every row of Column 12, but I want all string values to be changed to numeric. – Sione Hoghen Jul 28 '20 at 17:35
  • Hi, can you post the first 10 or 20 lines of your data file? – jsmart Jul 28 '20 at 18:00
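
If every value in the column comes back as nan, the column may hold category labels rather than numbers written as strings. In that case `pd.to_numeric` has nothing to parse, and an explicit mapping is one possible way to get floats. The sketch below assumes the labels are 'Y' and 'N'; only 'Y' appears in the error message, so 'N', the Series `labels`, and the chosen codes are all assumptions for illustration:

```python
import pandas as pd

# Hypothetical label column; 'Y' comes from the error message, 'N' is assumed.
labels = pd.Series(['Y', 'N', 'Y', 'Y', 'N'])

# Explicit mapping from category to number (an illustrative choice of codes).
mapping = {'Y': 1.0, 'N': 0.0}

# Values missing from the mapping become NaN, so unexpected categories are easy to spot.
labels_float = labels.map(mapping).astype(float)
print(labels_float.tolist())
```

Whether 1/0 is the right encoding depends on what the column means in the actual data file.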