Can someone tell me what None means in this code?
s = df.A.values[:, None]
I don't know what else I can write about this problem.
df.A.values returns the column df.A as a NumPy array. To see what None does, let's create a test DataFrame:
>>> # pandas.util.testing has been deprecated since pandas 1.0 and is absent from recent releases
>>> from pandas import util
>>> df = util.testing.makeDataFrame()
>>> df.head()
                   A         B         C         D
BMdjymcTHC -0.684721  1.622097 -2.525634  1.627290
0e7Mekvkf7  0.003399  0.152074 -0.095163 -0.276664
q0E6te3rF9  1.639105 -1.935913  1.733587 -0.729493
w7d1NGfq1p -0.496669 -1.182373 -0.950125  2.201667
RPqDHEGhxs -1.169309  0.608857 -0.748978  0.270510
Your code gives the following output:
>>> df.A.values[:, None]
array([[-0.68472066],
[ 0.00339929],
[ 1.63910531],
[-0.49666918],
[-1.16930896],
[ 0.18225299],
[ 0.88957142],
[ 0.97299314],
[ 0.67984743],
[ 1.11192848],
[-1.43273161],
[-0.59633832],
[ 0.81591342],
[ 1.26188783],
[ 0.08789735],
[-0.37412069],
[ 0.15285941],
[-0.14208735],
[ 0.37897237],
[ 0.49208469],
[ 0.86949863],
[-0.98972967],
[ 0.66001405],
[-1.69139314],
[ 1.18512158],
[ 1.47981638],
[ 1.21812138],
[ 0.82375357],
[-0.4896989 ],
[ 0.53701562]])
Let's check its shape:
>>> df.A.values[:, None].shape
(30, 1)
If you leave out None:
>>> df.A.values[:]
array([-0.68472066, 0.00339929, 1.63910531, -0.49666918, -1.16930896,
0.18225299, 0.88957142, 0.97299314, 0.67984743, 1.11192848,
-1.43273161, -0.59633832, 0.81591342, 1.26188783, 0.08789735,
-0.37412069, 0.15285941, -0.14208735, 0.37897237, 0.49208469,
0.86949863, -0.98972967, 0.66001405, -1.69139314, 1.18512158,
1.47981638, 1.21812138, 0.82375357, -0.4896989 , 0.53701562])
and the shape is:
>>> df.A.values[:].shape
(30,)
So [:, None] adds one more dimension (axis) to the NumPy array, turning the 1-D array of shape (30,) into a 2-D array of shape (30, 1) in my case. None in an index is the same object as numpy.newaxis. Your code is equivalent to:
>>> df.A.values.reshape(-1, 1)
>>> df.A.values.reshape(-1, 1).shape
(30, 1)
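If you want to check this without pandas, here is a minimal sketch using plain NumPy. The array `a` is a hypothetical stand-in for df.A.values (it is not the data from the session above), and the variable names are only for illustration:

```python
import numpy as np

# Hypothetical stand-in for df.A.values: a 1-D array with 30 elements.
a = np.arange(30, dtype=float)

# np.newaxis is an alias for None, so both index expressions do the same thing.
col_none = a[:, None]
col_newaxis = a[:, np.newaxis]

# reshape(-1, 1) produces the same column-shaped result.
col_reshape = a.reshape(-1, 1)

print(np.newaxis is None)                      # True
print(a.shape, col_none.shape)                 # (30,) (30, 1)
print(np.array_equal(col_none, col_reshape))   # True

# Putting None first inserts the new axis in front instead.
row = a[None, :]
print(row.shape)                               # (1, 30)
```

The position of None in the index decides where the new length-1 axis goes, which is why [:, None] gives a column and [None, :] gives a row.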