Z-score normalization for only one column that does not replace the column in pandas

Question

I've got a data set where income is one of many variables. I want to add a column immediately to the right of the income variable that is the z-score. I know there's a question on here about how to do this to all but one column or many columns, but I need it for the one column, and without replacing the values. This is probably the long way of doing it but I've extracted just the income column and then applied the z-score to it. However, I can't figure out how to rename the column "Norm_Income" and then put it back into the main data frame, right next to the income. Any help is greatly appreciated. Here's what I have (I know it's not much):

## HW Part 3:  Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
Norm_Income=Income.apply(zscore)
Norm_Income

Edit: This is so weird: this worked last night, but now I get an error. Here's my code:

## HW Part 3: Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
Income["Norm_Income"] = Income.apply(zscore)
bank_df=bank_df[["id","age","income","Norm_Income","children","gender","region","married","car","savings_acct","current_acct","mortgage","pep"]]
bank_df

Here's the new error (it's a long traceback that ends with):

KeyError: "['Norm_Income'] not in index"

score 0 · Answer 1 · answered Jan 29 '19 at 05:46


You already have the z-scores, so it's pretty straightforward to put them in the dataframe. Take a look at "Adding new column to existing DataFrame in Python pandas".

You just need:

Income["Norm_Income"] = Income.apply(zscore)

instead of your third line (the one that assigns Norm_Income=Income.apply(zscore)).
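As a quick sanity check on what zscore returns, here is a minimal sketch on made-up numbers (the real bank_df is not shown in the question, so the income values below are purely hypothetical). It compares scipy.stats.zscore with the textbook formula (x - mean) / std. Note that zscore uses the population standard deviation (ddof=0) by default, while pandas' std() defaults to ddof=1.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore

# Hypothetical stand-in for the income column; the real data is not shown.
income = pd.Series([20000.0, 35000.0, 50000.0, 65000.0, 80000.0], name="income")

# scipy.stats.zscore standardizes with the population std (ddof=0) by default.
z_scipy = zscore(income)

# Same computation by hand; pandas' std() defaults to ddof=1,
# so ddof=0 is passed explicitly to match scipy.
z_manual = (income - income.mean()) / income.std(ddof=0)

# True when both approaches agree to floating-point tolerance.
print(np.allclose(z_scipy, z_manual))
```

If you want the sample standard deviation instead, zscore accepts a ddof argument.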


Keatinge


This is so weird: this worked last night, but now I get an error. Here's my code: ## HW Part 3: Standardizing Income Attribute with Z-Score Normalization Income=pd.DataFrame(bank_df,columns=['income']) from scipy.stats import zscore Income["Norm_Income"] = Income.apply(zscore) bank_df=bank_df[["id","age","income","Norm_Income","children","gender","region","married","car","savings_acct","current_acct","mortgage","pep"]] bank_df. I can't get a screenshot to paste in this comment, but it's a long error that ends with: KeyError: "['Norm_Income'] not in index". Please help! – immaprogrammingnoob Jan 31 '19 at 00:42
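A likely explanation for that KeyError, sketched on a small made-up frame (the column values are hypothetical, and only a few of the real columns are included): pd.DataFrame(bank_df, columns=['income']) builds a separate DataFrame, so a column added to Income never shows up in bank_df, and selecting it from bank_df fails.

```python
import pandas as pd
from scipy.stats import zscore

# Hypothetical stand-in for bank_df with only a few of the real columns.
bank_df = pd.DataFrame({
    "id": [1, 2, 3],
    "age": [25, 40, 58],
    "income": [20000.0, 35000.0, 50000.0],
})

# This creates a new one-column DataFrame; it is not a view into bank_df.
Income = pd.DataFrame(bank_df, columns=["income"])
Income["Norm_Income"] = Income.apply(zscore)

print("Norm_Income" in Income.columns)   # the column lives on Income...
print("Norm_Income" in bank_df.columns)  # ...but bank_df never received it

# Selecting a column that bank_df does not have raises KeyError.
error_message = None
try:
    bank_df[["id", "age", "income", "Norm_Income"]]
except KeyError as exc:
    error_message = str(exc)
    print("KeyError:", exc)
```

Assigning the z-scores to bank_df itself, rather than to Income, avoids the problem.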

score 0 · Answer 2 · answered Jan 31 '19 at 04:35

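So please disregard my comment on the answer. I figured out code that worked in the context of my problem: the z-scores are assigned to bank_df directly instead of to the separate Income frame, so the new column exists by the time the columns are reordered.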

## HW Part 3:  Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
bank_df["norm_income"] = Income.apply(zscore)
bank_df["norm_income"]
bank_df=bank_df[["id","age","income","norm_income","children","gender","region","married","car","savings_acct","current_acct","mortgage","pep"]]
bank_df
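An alternative that avoids typing out every column name is DataFrame.insert, which places a new column at a given position. This is only a sketch on a made-up frame (the values and the reduced column set are assumptions, not the real bank_df):

```python
import pandas as pd
from scipy.stats import zscore

# Hypothetical stand-in for bank_df; the real frame has more columns.
bank_df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "age": [25, 40, 58, 33],
    "income": [20000.0, 35000.0, 50000.0, 65000.0],
    "children": [0, 2, 1, 3],
})

# Position immediately to the right of the "income" column.
position = bank_df.columns.get_loc("income") + 1

# insert() modifies bank_df in place and leaves "income" untouched.
bank_df.insert(position, "norm_income", zscore(bank_df["income"]))

print(bank_df.columns.tolist())
# ['id', 'age', 'income', 'norm_income', 'children']
```

Because the position is looked up from the column name, this keeps working if columns are added or reordered elsewhere in the frame.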
