0

I have four columns of type object in a Pandas (2.0.1) DataFrame that I want to convert to int.

Applying the following method:

cols = ['x1','x2','y1','y2']

df[cols] = df[cols].apply(pd.to_numeric)

# The same message is raised when trying to cast a single column:
df['x1'] = pd.to_numeric(df['x1'])

# The same message is also raised when using .astype():
df[cols] = df[cols].astype(int)

as described here: https://stackoverflow.com/a/28648923/6630397 raises the message:

/tmp/ipykernel_87959/2834796204.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
    
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[cols] = df[cols].apply(pd.to_numeric)

How can I properly (and rapidly) cast my four columns from object to int?

swiss_knight

3 Answers

2

A possible solution:

df[cols] = df[cols].astype('int')
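
A minimal sketch of how astype("int") behaves on object columns, assuming the columns hold integers stored as strings. The question does not show the actual data, so the sample values below are made up. It also shows what happens when a value cannot be parsed.

```python
import pandas as pd

# Hypothetical sample data: integers stored as strings (object dtype).
df = pd.DataFrame(
    {"x1": ["1", "2"], "x2": ["3", "4"], "y1": ["5", "6"], "y2": ["7", "8"]},
    dtype=object,
)
cols = ["x1", "x2", "y1", "y2"]

# Digit-only strings are parsed to integers; the columns are replaced.
df[cols] = df[cols].astype("int")

# A value that cannot be parsed makes astype raise instead of guessing.
bad = pd.Series(["1", "n/a"], dtype=object)
try:
    bad.astype("int")
    failed = False
except ValueError:
    failed = True
```

If the real columns may contain unparsable entries, it is probably worth checking them before casting rather than relying on the exception.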
PaulS
1

The SettingWithCopyWarning can be avoided by using the .loc indexer to select and modify the specific columns of the original DataFrame.

I would also pass downcast="integer" to pd.to_numeric(), which converts an integer result to the smallest integer dtype that can hold the values. Note that pd.to_numeric() still returns a float dtype if the column contains any non-integer values.

Code

cols = ["x1", "x2", "y1", "y2"]

df.loc[:, cols] = df[cols].apply(pd.to_numeric, downcast="integer")
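
A small sketch of what the downcast parameter does, using made-up sample values (the question does not show the data). It assigns the result to a new variable so that only the conversion itself is illustrated.

```python
import pandas as pd

# Hypothetical sample data: small integers stored as strings.
df = pd.DataFrame(
    {"x1": ["1", "2"], "x2": ["3", "4"], "y1": ["5", "6"], "y2": ["7", "8"]},
    dtype=object,
)
cols = ["x1", "x2", "y1", "y2"]

# downcast="integer" picks the smallest integer dtype that fits the values,
# so small values like these end up as int8 rather than int64.
converted = df[cols].apply(pd.to_numeric, downcast="integer")

# A fractional value cannot be represented as an integer, so the result
# stays float64 even though downcast="integer" was requested.
mixed = pd.to_numeric(pd.Series(["1", "2.5"], dtype=object), downcast="integer")
```

One caveat worth checking on your own version: since pandas 2.0, an assignment of the form df.loc[:, cols] = ... first tries to write into the existing columns in place, so object columns may keep their object dtype afterwards. Inspecting df.dtypes after the assignment will tell you whether the cast took effect.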
hlin03
0

I believe pandas is sometimes a little overeager to throw warnings, and there is nothing particularly wrong with your solution, but this one may be slightly cleaner:

df = df.astype({'x1': 'int', 'x2': 'int', 'y1': 'int', 'y2': 'int'})
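
The warning usually means that df was itself derived from another DataFrame (for example by slicing or filtering), so pandas cannot tell whether the assignment is meant to reach the parent. Below is a minimal sketch of that situation with one common remedy, an explicit .copy(). The parent frame raw and its values are hypothetical, since the question does not show how df was built.

```python
import pandas as pd

# Hypothetical parent frame with integers stored as strings.
raw = pd.DataFrame(
    {"x1": ["1", "2", "3"], "x2": ["4", "5", "6"],
     "y1": ["7", "8", "9"], "y2": ["10", "11", "12"]},
    dtype=object,
)
cols = ["x1", "x2", "y1", "y2"]

# An explicit copy makes df independent of raw, so assigning to its
# columns is unambiguous and does not touch the parent.
df = raw.iloc[:2].copy()
df[cols] = df[cols].astype("int")

# Alternative without column assignment: astype returns a new DataFrame.
# dict.fromkeys builds {"x1": "int", "x2": "int", ...} without repetition.
alt = raw.iloc[:2].astype(dict.fromkeys(cols, "int"))
```

Both variants leave raw unchanged. The astype form avoids assigning into a derived frame at all, which is why it sidesteps the warning.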
Matmozaur