I have a DataFrame with 5 rows and 5 columns, created with:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(25).reshape(5,5), index=list('abcde'), columns=list('vwxyz'))

I need to define a function called 'Standardizing' that works on a Series as follows:
Find the mean and standard deviation of the Series.
Subtract the mean from each value of the Series.
Divide the result by the standard deviation.
Then apply this function to each numerical column (int or float) of the DataFrame.

To achieve this, I tried two approaches:
1)

def Standardizing(y):
    mean = y.mean()
    sd = y.std()
    return y.map(lambda x: round(((x-mean)/sd),2))

df.apply(Standardizing)

2)

def Standardizing2(y):
    mean = y.mean()
    sd = y.std()
    return round((y-mean)/sd,2)

df.apply(Standardizing2)
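
Since the example DataFrame contains only floats, neither call above has to deal with the "numerical columns only" requirement. A minimal sketch of one way to handle it, assuming `select_dtypes` is an acceptable way to pick the int and float columns; the `standardize` helper and the extra `label` column are hypothetical and exist only for illustration:

```python
import numpy as np
import pandas as pd


def standardize(col):
    # Same arithmetic as Standardizing2: subtract the mean, divide by the std.
    return ((col - col.mean()) / col.std()).round(2)


df = pd.DataFrame(np.random.randn(25).reshape(5, 5),
                  index=list('abcde'), columns=list('vwxyz'))
df['label'] = list('pqrst')  # hypothetical non-numeric column

# Pick out the int/float columns and standardize only those.
numeric_cols = df.select_dtypes(include='number').columns
result = df.copy()
result[numeric_cols] = df[numeric_cols].apply(standardize)
print(result)
```

The non-numeric column is left untouched, while each numeric column ends up with a mean of roughly 0 and a standard deviation of roughly 1 (up to rounding).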

I understand why 1 works: y is a Series, and I use map to reach each element of it.
Just out of curiosity, I wrote 2, and that also works.
I do not understand why it works. I would really appreciate any input.

rAmAnA
  • A [Series](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#series) is like a NumPy ndarray. It *knows* that when you add/subtract/divide/multiply/... a scalar, you want it to [operate on all the elements of the Series](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#vectorized-operations-and-label-alignment-with-series). Your second method would be the preferred way. – wwii May 06 '18 at 03:09
  • ... Or simply `y = (y-y.mean())/y.std(); return y.round(2)`. – wwii May 06 '18 at 03:15
  • @coldspeed - though I do believe this is sort of the same concept, the scenario is very different and might be helpful for someone new to Python. – rAmAnA May 06 '18 at 08:10
  • Your question is a bit atypical. If my comment answered your question, it's likely there won't be any answers, so your question should be *closed*; marking it as a duplicate is a good way to do that. – wwii May 06 '18 at 14:09
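
The point made in the comments can be checked directly. The following is a small sketch, using a made-up four-element Series rather than the question's random data, that compares the element-by-element `map` version with the whole-Series arithmetic; it also relies on the built-in `round` delegating to `Series.round`, which is how `round((y-mean)/sd, 2)` ends up working on a Series:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=list('abcd'))
mean, sd = s.mean(), s.std()

# Approach 1: visit each element explicitly.
by_map = s.map(lambda x: (x - mean) / sd)

# Approach 2: arithmetic on the whole Series; the scalars mean and sd
# are applied to every element, as with a NumPy array.
vectorized = (s - mean) / sd

same = np.allclose(by_map, vectorized)
print(same)

# The built-in round() on a Series gives the same result as Series.round().
rounded_builtin = round(vectorized, 2)
rounded_method = vectorized.round(2)
print(rounded_builtin.equals(rounded_method))
```

Both approaches produce the same values; the vectorized form simply leaves the looping to pandas and NumPy.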
