I am having a problem with .loc[].

If I do this:

import pandas as pd
x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

q = x.loc[:, 'a']
q += 2
print(x)
   a  b
0  2  0
1  3  1
2  4  2
3  5  3

As you can see, my operation on q is also applied to x, because x.loc[:, 'a'] does not return a copy.

If I do this instead:
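
As a rough check of whether q and x share the same underlying data, something like the sketch below could be used. It relies on NumPy's np.shares_memory; the variable name shares is just for illustration, and the result may differ between pandas versions, so no output is shown.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(list(zip(range(4), range(4))), columns=['a', 'b'])

# Select column 'a' with a full slice, as in the example above.
q = x.loc[:, 'a']

# True if the two underlying arrays overlap in memory.
# The answer can depend on the pandas version in use.
shares = np.shares_memory(q.to_numpy(), x['a'].to_numpy())
print(shares)
```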

import pandas as pd
x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

q = x.loc[x.index, 'a']
q += 2
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

As you can see, using x.index instead of : returns a copy, so my operation on q is not reflected in x. This feels very risky to me. Is it intended, or is it a bug that : does not behave like x.index?
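
For reference, the only way I know to make the result independent of which indexer is used is to ask for a copy explicitly. A minimal sketch, assuming an explicit .copy() is acceptable for the use case:

```python
import pandas as pd

x = pd.DataFrame(list(zip(range(4), range(4))), columns=['a', 'b'])

# .copy() guarantees q owns its data, whichever indexer is used.
q = x.loc[:, 'a'].copy()
q += 2

print(q.tolist())       # q holds the modified values
print(x['a'].tolist())  # x keeps its original values
```

With the explicit copy, the in-place addition only touches q, but it also gives up the speed advantage of not copying.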

Related: Jeff's answer on .loc has good insight, and so does the documentation.

Thanks for any help.

A note on speed: as expected, the : indexer is much faster than x.index, which is why I tend to use it:

import timeit
%timeit x.loc[:,'a']
10000 loops, best of 3: 25.1 µs per loop
%timeit x.loc[x.index,'a']
10000 loops, best of 3: 128 µs per loop
Steven G