
I'm testing my data with an SVM classifier. My dataset consists of text, and I'm trying to convert it to float.

I have data that may look like this: dataset


# Transform as float

df.columns = df('columns').str.rstrip('%').astype('float') / 100.0


TypeError                                 Traceback (most recent call last)
<ipython-input-66-74921537411d> in <module>
      1 # Transform as float
----> 2 df.columns = df('columns').str.rstrip('%').astype('float') / 100.0
      3 

TypeError: 'DataFrame' object is not callable
Park
kope
  • `df('columns')` should be `df(['columns'])`? – Lei Yang Jan 26 '22 at 06:44
  • I already changed it to `df(['columns']) = df(['columns']).str.rstrip('%').astype('float') / 100.0`, and now I get `SyntaxError: cannot assign to function call` on line 2. – kope Jan 26 '22 at 06:46
  • I think you need [Apply function to each cell in DataFrame](https://stackoverflow.com/questions/39475978/apply-function-to-each-cell-in-dataframe) – Lei Yang Jan 26 '22 at 06:52
  • It doesn't work like this, you need to learn the basics of text representation in ML, like one-hot encoding. – Erwan Jan 27 '22 at 13:34
  • Your code is wrong; please be careful. – mohammad mobasher Feb 05 '22 at 16:47

1 Answer


Arbitrary text cannot be converted to float directly. In your dataset, all the columns appear to hold text values, and I can't tell whether rstrip('%') would leave plain numbers, because the values are long and truncated in the image.
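If a column holds genuinely categorical text rather than numbers, one option is the one-hot encoding suggested in the comments. Here is a minimal sketch using `pd.get_dummies`; the `color` column and its values are hypothetical and not taken from your dataset:

```python
import pandas as pd

# Hypothetical categorical text column; not taken from the asker's dataset.
df = pd.DataFrame({"color": ["red", "green", "red"]})

# One indicator column per distinct value, stored as floats
# so the result can be fed to a numeric model.
encoded = pd.get_dummies(df["color"], dtype=float)

print(encoded)
```

Whether this is appropriate depends on what the text represents; long free-form text usually needs a different representation than one-hot encoding.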

If a column's values become numeric once the trailing '%' is removed with rstrip('%'), you can convert that column. Also, you are indexing the DataFrame with (), not []. `df(...)` is parsed as a function call, which is why you get "'DataFrame' object is not callable". If the column's values are numeric, you can do what you want as follows:

df['columns'] = df['columns'].str.rstrip('%').astype('float') / 100.0

Here is a full code sample:


import pandas as pd

df = pd.DataFrame({
    'column_name': ['111%', '222%'],
})
# df looks like:
#   column_name
#0         111%
#1         222%

df['column_name'] = df['column_name'].str.rstrip('%').astype('float') / 100.0

print(df)
#   column_name
#0         1.11
#1         2.22
Park