2

I have the following id from a huge dataframe with many ids. I picked this one in particular to show the problem:

                 id  year    anual_jobs     anual_wage
874180  20001150368  2010          10.5    1071.595917

After this I run:

df.anual_jobs= df.anual_jobs.round() 

I get this warning, but the code runs anyway:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[name] = value

My result is:

                 id  year    anual_jobs     anual_wage
874180  20001150368  2010          10.0    1071.595917

But I want anual_jobs to round to 11.0 instead of 10.0.

Lucas Dresl
  • Short answer, you created `df` by subsetting some other dataframe, assigning a view. Just do `df = df.copy()` and then it should be fine. – cs95 Dec 18 '17 at 17:38
  • @cᴏʟᴅsᴘᴇᴇᴅ: While the SettingWithCopyWarning is a problem and should be fixed, the assignment *is* actually modifying the dataframe. The question is about why it's rounding down instead of up. – user2357112 Dec 18 '17 at 17:43
  • @user2357112: oops, I thought the round was being applied to the wrong column and that it wasn't working. Okay, that's because numpy rounds to the closest even integer. Let me reopen, and you can answer it if you like. – cs95 Dec 18 '17 at 17:45

2 Answers

6

As @cᴏʟᴅsᴘᴇᴇᴅ pointed out, this is happening because numpy rounds half-values to the nearest even integer (see docs here and a more general discussion here), and pandas uses numpy for most of its numerical work. You can resolve this by rounding the "old-fashioned" way:

import numpy as np
df.anual_jobs = np.floor(df.anual_jobs + 0.5)

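or, on older pandas versions only (the `pd.np` alias was deprecated in pandas 1.0 and removed in 2.0):

import pandas as pd
df.anual_jobs = pd.np.floor(df.anual_jobs + 0.5)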
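To see the two rounding rules side by side, here is a minimal sketch. The sample values are made up for illustration (only 10.5 comes from the question), and `jobs`, `half_even` and `half_up` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; only 10.5 appears in the question.
jobs = pd.Series([0.5, 1.5, 2.5, 10.5, 11.5], name="anual_jobs")

# Series.round() follows NumPy: exact halves go to the nearest even integer.
half_even = jobs.round()

# Adding 0.5 and flooring sends exact halves upward instead.
half_up = np.floor(jobs + 0.5)

print(half_even.tolist())  # [0.0, 2.0, 2.0, 10.0, 12.0]
print(half_up.tolist())    # [1.0, 2.0, 3.0, 11.0, 12.0]
```

Note that for negative numbers `np.floor(x + 0.5)` rounds halves toward positive infinity (for example, -2.5 becomes -2.0), which may or may not be what you want.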

As @cᴏʟᴅsᴘᴇᴇᴅ pointed out, you can also resolve the slice assignment warning by making your dataframe a free-standing frame instead of a view on an older dataframe, i.e., execute the following at some point before you assign values into the dataframe:

df = df.copy()
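
As a rough illustration of where the copy fits, assuming `df` was produced by filtering a larger frame (the frame `big` and its values are invented for this sketch):

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame standing in for the "huge dataframe".
big = pd.DataFrame({
    "id": [1, 2, 3],
    "year": [2009, 2010, 2010],
    "anual_jobs": [3.5, 10.5, 7.5],
})

# Take the subset and copy it right away, so later assignments
# clearly target an independent frame.
df = big[big["year"] == 2010].copy()

# Assigning into the copy does not touch the parent frame.
df["anual_jobs"] = np.floor(df["anual_jobs"] + 0.5)

print(df["anual_jobs"].tolist())   # [11.0, 8.0]
print(big["anual_jobs"].tolist())  # [3.5, 10.5, 7.5]
```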
Matthias Fripp
1

If the problem is how half-integers are rounded, use the decimal module:

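from decimal import Decimal, ROUND_HALF_UP

# Build each Decimal from a string so the value is exact,
# then quantize to a whole number, rounding halves up.
print(Decimal("10.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP))
print(Decimal("10.2").quantize(Decimal("1"), rounding=ROUND_HALF_UP))

>> 11
>> 10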
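
One way this could be applied to a whole column is sketched below. The helper `round_half_up` and the sample values are hypothetical, and mapping a Python function over a column is much slower than a vectorized NumPy call on a huge dataframe.

```python
from decimal import Decimal, ROUND_HALF_UP

import pandas as pd


def round_half_up(value):
    """Round one float to a whole number, with halves going away from zero."""
    # str(value) gives the short decimal form, e.g. "10.5".
    rounded = Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return float(rounded)


# Hypothetical sample values; only 10.5 appears in the question.
df = pd.DataFrame({"anual_jobs": [10.5, 10.2, 2.5]})
df["anual_jobs"] = df["anual_jobs"].map(round_half_up)

print(df["anual_jobs"].tolist())  # [11.0, 10.0, 3.0]
```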
SamuelNLP