Pandas - setting variables to numeric

Question

I am having difficulty converting some variables to numeric. (I have just started learning Python for data science and have minimal background.)

I tried these:

data["S2BQ1A25"] = data["S2BQ1A25"].convert_objects(convert_numeric=True)

df[["S2BQ1A16", "S2BQ1A25"]] = df[["S2BQ1A16", "S2BQ1A25"]].apply(pd.to_numeric)

data["S2BQ1A16"] = pandas.to_numeric(data["S2BQ1A16"] )

data["S2BQ1A16"] = pd.to_numeric(data["S2BQ1A16"])

I am using Anaconda with Spyder and Python 3.8. I imported pandas (1.0.5) and numpy (1.18.5).

Thanks in advance.

Edit: For S2BQ1A16 and S2BQ1A25 there were 4 choices: 1 (yes), 2 (no), 9 (unknown), and BL (NA, lifetime abstainer).

Errors I got, respectively:

  File "/home/nida/Desktop/p-projects/temp.py", line 18, in <module>
    data["S2BQ1A25"] = data["S2BQ1A25"].convert_objects(convert_numeric=True)

  File "/home/nida/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'convert_objects'

  File "/home/nida/Desktop/p-projects/temp.py", line 18, in <module>
    df[["S2BQ1A16", "S2BQ1A25"]] = df[["S2BQ1A16", "S2BQ1A25"]].apply(pd.to_numeric)

NameError: name 'df' is not defined

  File "/home/nida/Desktop/p-projects/temp.py", line 18, in <module>
    data["S2BQ1A16"] = pandas.to_numeric(data["S2BQ1A16"] )

  File "/home/nida/anaconda3/lib/python3.8/site-packages/pandas/core/tools/numeric.py", line 149, in to_numeric
    values = lib.maybe_convert_numeric(

  File "pandas/_libs/lib.pyx", line 1963, in pandas._libs.lib.maybe_convert_numeric

ValueError: Unable to parse string " " at position 0

  File "/home/nida/Desktop/p-projects/temp.py", line 18, in <module>
    data["S2BQ1A16"] = pd.to_numeric(data["S2BQ1A16"])

NameError: name 'pd' is not defined

Can you post sample data and what problem you are exactly facing? — bigbounty, Aug 01 '20 at 17:06
You seem to have different objects (`data` vs `df`) and it's unclear whether you're trying to apply some code without having imported `pandas` in the specific way (i.e. did you `import pandas` or `import pandas as pd`). There are several ways any of the above could fail, so it's difficult to say exactly what you need to do. Please create a [mcve] with sample data that reproduces your problem fully (take a look at https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — ALollz, Aug 01 '20 at 17:09
I did import pandas. It was like this in the lesson I am taking. Thanks. — pnida, Aug 01 '20 at 17:19
Take a look at `.astype()` and `.convert_dtypes()` in the docs: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html Both methods are defined for Series and DataFrames. The second traceback shows that you either never defined `df` or didn't mean to type `df` on that line — RichieV, Aug 01 '20 at 18:12

score 0 · Answer 1 · answered Aug 01 '20 at 18:19

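Try the astype method in pandas. It returns a new object rather than modifying the column in place, so assign the result back:

df["column-name"] = df["column-name"].astype('int64')

If you want to convert the datatype of multiple columns, you can pass the column names as a list:

df[["column-1","column-2"]] = df[["column-1","column-2"]].astype('int64')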

For numeric datatypes you could use:

int64 (64 bit integer datatype)
float64 (64 bit float datatype)
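
One caveat, given the traceback in the question: astype('int64') is likely to fail if the column contains blank strings such as " ". Here is a minimal sketch of that case, assuming the codes are stored as strings. The sample values are made up for illustration; only the column name comes from the question. It uses pd.to_numeric with errors="coerce" to turn blanks into NaN, then the nullable "Int64" dtype to keep integer codes alongside missing values.

```python
import pandas as pd

# Hypothetical sample: survey codes stored as strings, with one blank entry
data = pd.DataFrame({"S2BQ1A16": ["1", "2", " ", "9"]})

# A plain integer cast cannot parse the blank string, so it raises ValueError
try:
    data["S2BQ1A16"].astype("int64")
    astype_failed = False
except ValueError:
    astype_failed = True

# Coerce unparseable values to NaN; the result is a float64 Series
numeric = pd.to_numeric(data["S2BQ1A16"], errors="coerce")

# Optional: the nullable integer dtype keeps whole-number codes and marks blanks as <NA>
as_nullable_int = numeric.astype("Int64")

print(astype_failed)
print(as_nullable_int)
```

Whether blanks should become missing values or a code of their own depends on the codebook, so treat the coercion as one option rather than the only correct one.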

score 0 · Answer 2 · answered Aug 02 '20 at 14:52

Thank you all. I did find a way.

data["S2BQ1A16"] = data["S2BQ1A16"].apply(pandas.to_numeric,errors="coerce")
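
For reference, the same idea can probably be applied to both columns at once. This is a sketch with made-up sample values (only the column names come from the question) and it assumes pandas was imported with the pd alias. DataFrame.apply passes each column to pd.to_numeric and forwards errors="coerce", so blanks become NaN.

```python
import pandas as pd

# Hypothetical sample data; blanks stand in for the "BL" category
data = pd.DataFrame({
    "S2BQ1A16": ["1", "2", " ", "9"],
    "S2BQ1A25": ["2", " ", "1", "1"],
})

cols = ["S2BQ1A16", "S2BQ1A25"]

# Each column is converted as a whole Series; unparseable entries become NaN
data[cols] = data[cols].apply(pd.to_numeric, errors="coerce")

print(data.dtypes)
print(data)
```

For a single column, pd.to_numeric(data["S2BQ1A16"], errors="coerce") does the conversion in one vectorized call instead of going value by value.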

pnida
