I have a DataFrame live (live births) with a column 'agepreg', which is a float with two decimal places. I'd like to create a new column 'agepreg_rounded' as an integer.

My naive approach:

live['agepreg_rounded'] = live['agepreg'].apply(lambda x: round(x,0))

This works, but raises a warning:

/usr/local/lib/python3.5/dist-packages/ipykernel/__main__.py:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I've tried multiple times to use the .loc syntax, but failed.

Can anyone set me straight?

Here's the sort of thing I'm tempted to write, but that is clearly wrong:

live['agepreg_rounded'] = live.loc[live['agepreg']].apply(lambda x: round(x,0))

Update: Where does live come from?

I'm following the ThinkStats2 book from O'Reilly, and the data comes from a file downloaded with the source material:

import nsfg
preg = nsfg.ReadFemPreg()
live = preg[preg.outcome == 1]
goose
  • What is your code above `live['agepreg_rounded'] = live['agepreg'].apply(lambda x: round(x,0))` ? – jezrael Mar 02 '18 at 12:41
  • @jezrael Sorry I'm not sure if I've mis-understood your question, but this is my attempt at creating a new column based on 'agepreg' but without the decimal. I realise now that technically I'm not trying to turn it into an integer, just round it. – goose Mar 02 '18 at 12:43
  • OK, how is `live` created? Maybe [this](https://stackoverflow.com/q/20625582) helps. – jezrael Mar 02 '18 at 12:45
  • @jezrael it gets created from a file from the book I'm following along. I've added a brief note to the question detailing this a little more. – goose Mar 02 '18 at 12:49
  • So you need `live = preg[preg.outcome == 1].copy()` – jezrael Mar 02 '18 at 12:49
  • @jezrael ah I see. The problem was in a different place to where I thought it was. If only the warning had specified a line, think this might be related to the version of ipython I'm using though. Thank you! – goose Mar 02 '18 at 12:52
  • You are welcome! – jezrael Mar 02 '18 at 12:53

1 Answer

I think you need to take a copy, and then use Series.round instead of apply:

live = preg[preg.outcome == 1].copy()
live['agepreg_rounded'] = live['agepreg'].round(0)

Without the copy, live is a filtered slice of preg. If you modify values in live later, the modifications do not propagate back to the original data (preg), and that is what pandas is warning about.
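Here is a minimal, self-contained sketch of the same idea. It assumes a small made-up DataFrame in place of nsfg.ReadFemPreg() (the values are invented for illustration), and the 'agepreg_int' column is a hypothetical extra for the case where a true integer dtype is wanted:

```python
import pandas as pd

# Hypothetical stand-in for nsfg.ReadFemPreg(); the values are made up.
preg = pd.DataFrame({
    "outcome": [1, 2, 1, 1],
    "agepreg": [33.16, 39.25, 14.33, 17.83],
})

# Take an explicit copy so that live is an independent DataFrame,
# not a filtered slice of preg.
live = preg[preg.outcome == 1].copy()

# Vectorised rounding; the result is still a float column.
live["agepreg_rounded"] = live["agepreg"].round(0)

# Optional: convert to an integer dtype. This assumes there are no
# missing values in 'agepreg'; astype(int) raises if NaN is present.
live["agepreg_int"] = live["agepreg"].round(0).astype(int)

print(live)
```

Because live is a copy, the new columns are added only to live, and preg is left unchanged.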

jezrael