4

I have a column of float values in a dataframe (a float Series). I want to convert all the values to integers, or at least round them so that there are no decimals.

Say the dataframe is df and the column is a. I tried this:

df['a'] = round(df['a']) 

I got an error saying this method can't be applied to a Series and only works on individual values.

Next I tried this:

for obj in df['a']: 
   obj =int(round(obj))

After this I printed df but there was no change. Where am I going wrong?

Chad S.
Data Enthusiast

4 Answers

7

The built-in round won't work here because it is being called on a pandas Series, which is array-like rather than a scalar value. Use the Series method pd.Series.round to operate on the whole Series, then change the dtype with astype:

In [43]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': np.random.randn(5)})
df['a'] = df['a'] * 100
df

Out[43]:
            a
0   -4.489462
1 -133.556951
2 -136.397189
3 -106.993288
4  -89.820355

In [45]:
df['a'] = df['a'].round(0).astype(int)
df

Out[45]:
     a
0   -4
1 -134
2 -136
3 -107
4  -90

Also, it's unnecessary to iterate over the rows when vectorised methods are available.

Also this:

for obj in df['a']: 
   obj =int(round(obj))

Does not mutate the individual cells in the Series. It operates on a copy of each value, which is why the df is not changed.
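To make the difference concrete, here is a minimal sketch with made-up sample values (the numbers are an assumption for illustration, not the asker's data). It runs the loop from the question, checks that the column is untouched, and then applies the vectorised form:

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({'a': [1.4, 2.6, -3.7]})

# Rebinding the loop variable only changes what the name `obj` refers to;
# nothing is written back into the Series.
for obj in df['a']:
    obj = int(round(obj))

unchanged = df['a'].tolist()  # still the original floats

# The vectorised form rounds the whole column and replaces it in one step.
df['a'] = df['a'].round(0).astype(int)
```

After this runs, unchanged still holds the original floats, while df['a'] holds integers.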

EdChum
  • this worked. thanks. So, here round is both a built in function and an attribute of Series array (which we are using here) – Data Enthusiast Oct 15 '15 at 22:01
  • technically no, it's a method for Series not an attribute but it could be passed as a function depending on the usage, best not to get too confused by this – EdChum Oct 15 '15 at 22:03
2

The code in your loop:

obj = int(round(obj))

Only changes which object the name obj refers to. It does not modify the data stored in the series. If you want to do this you need to know where in the series the data is stored and update it there.

E.g.

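for i, num in enumerate(df['a']):
    # write back by position with a single indexer to avoid chained assignment;
    # the column keeps its float dtype, so call astype(int) afterwards if needed
    df.iloc[i, df.columns.get_loc('a')] = int(round(num))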
Chad S.
  • I tried this but it did not help. Looks like we are just modifying the copy here and not the actual value – Data Enthusiast Oct 15 '15 at 22:04
  • This is called [chain-indexing](http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy) and sometimes doesn't work, you'd have to do `df['a'].iloc[i] = int(round(obj))` – EdChum Oct 15 '15 at 22:05
2

When converting a float column to integer, I found out using df.dtypes that the column I was trying to round was an object, not a float. round won't work on an object column, so I converted it to numeric first:

df['a'] = pd.to_numeric(df['a'])
df['a'] = df['a'].round(0).astype(int)

or as one line:

df['a'] = pd.to_numeric(df['a']).round(0).astype(int)
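If the column also holds entries that cannot be parsed as numbers, the one-liner above will raise. A possible variation, sketched here with made-up data, is to pass errors='coerce' to pd.to_numeric so bad entries become NaN, and to use the nullable 'Int64' dtype, since a plain int column cannot hold missing values. The sample strings are an assumption, and 'Int64' needs a reasonably recent pandas:

```python
import pandas as pd

# Hypothetical column of strings, one of which is not a number
df = pd.DataFrame({'a': ['1.4', '2.6', 'n/a']})

# errors='coerce' turns unparseable entries into NaN instead of raising
numeric = pd.to_numeric(df['a'], errors='coerce')

# The nullable 'Int64' dtype (capital I) stores the missing entry as <NA>
df['a'] = numeric.round(0).astype('Int64')
```

Whether to keep the missing entries or drop them first is a choice that depends on the data.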
Arthur D. Howland
1

If you specifically want to round up as your question states, you can use np.ceil:

import numpy as np
df['a'] = np.ceil(df['a']) 

See also Floor or ceiling of a pandas series in python?

Not sure there's much advantage to type converting to int; pandas and numpy love floats.
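As a small sketch of the above (the sample values are made up for illustration): np.ceil returns a float Series, so if integers are really wanted, astype(int) can be chained afterwards, assuming the column has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values
s = pd.Series([1.2, 2.5, -0.7])

rounded_up = np.ceil(s)          # every value moves towards +infinity; dtype stays float64
as_int = rounded_up.astype(int)  # optional conversion to an integer dtype
```

np.floor can be used the same way to round down.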

fantabolous