I have a DataFrame live (live births) with a column 'agepreg', which is a float with two decimal places. I'd like to create a new integer column 'agepreg_rounded'.
My naive approach:
live['agepreg_rounded'] = live['agepreg'].apply(lambda x: round(x,0))
This works, but raises a warning:
/usr/local/lib/python3.5/dist-packages/ipykernel/__main__.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
I've tried several times to use the .loc syntax, without success.
Can anyone set me straight?
Here's the sort of thing I'm tempted to write, but it is clearly wrong:
live['agepreg_rounded'] = live.loc[live['agepreg']].apply(lambda x: round(x,0))
Update: where does live come from?
I'm following the ThinkStats2 book from O'Reilly, and the data comes from a file downloaded with the book's source material:
import nsfg
preg = nsfg.ReadFemPreg()
live = preg[preg.outcome == 1]
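For anyone who wants to reproduce this without the nsfg module, here is a minimal sketch of one thing I could try. It assumes the warning arises because live is a filtered slice of preg, and it takes an explicit copy before adding the column. The small DataFrame is a made-up stand-in for nsfg.ReadFemPreg(), and the integer cast assumes 'agepreg' has no missing values in the selected rows.

```python
import pandas as pd

# Hypothetical stand-in for nsfg.ReadFemPreg(); the values are made up.
preg = pd.DataFrame({
    "outcome": [1, 2, 1, 1],
    "agepreg": [33.16, 39.25, 14.33, 17.83],
})

# .copy() makes live an independent DataFrame instead of a slice of preg,
# so assigning a new column no longer touches (or appears to touch) preg.
live = preg[preg.outcome == 1].copy()

# Vectorized rounding, then a cast to int (assumes no NaN in agepreg).
live["agepreg_rounded"] = live["agepreg"].round().astype(int)

print(live)
```

With the copy in place, preg is left unchanged and only live gains the new column. I'm not certain this is the recommended pattern, so corrections are welcome.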