I am having a problem with .loc[].
If I do this:
import pandas as pd
x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
a b
0 0 0
1 1 1
2 2 2
3 3 3
q = x.loc[:, 'a']
q += 2
print(x)
a b
0 2 0
1 3 1
2 4 2
3 5 3
As you can see, my operation on q is also applied to x, because x.loc[:, 'a'] does not return a copy.
But if I do this instead:
import pandas as pd
x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
a b
0 0 0
1 1 1
2 2 2
3 3 3
q = x.loc[x.index, 'a']
q += 2
print(x)
a b
0 0 0
1 1 1
2 2 2
3 3 3
As you can see, using x.index instead of : returns a copy, and my operation on q is not reflected in x. This feels very risky to me, and I am wondering whether it is intended, or a bug, that : does not behave like x.index.
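To make the concern concrete, here is a minimal sketch of the two spellings that I believe are unambiguous, assuming the goal is either an independent Series or a deliberate write to x. The names unchanged and changed are only illustrative, and since whether plain .loc gives a view or a copy may depend on the pandas version, the sketch avoids relying on either behaviour.

```python
import pandas as pd

x = pd.DataFrame({"a": range(4), "b": range(4)})

# Explicit copy: q owns its data, so modifying q should never touch x,
# regardless of whether .loc itself returned a view or a copy.
q = x.loc[:, "a"].copy()
q += 2
unchanged = x["a"].tolist()

# Explicit write-back: name the target on the left-hand side when the
# intent really is to modify x.
x.loc[:, "a"] += 2
changed = x["a"].tolist()

print(unchanged)
print(changed)
```

The extra .copy() costs an allocation, but it makes the intent visible at the call site instead of depending on which indexer was used.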
Related: Jeff's answer on .loc has good insight, and there is also the relevant section of the docs.
Thanks for your help.
A note on speed: of course the : indexer should be much faster than x.index, which is why I tend to use it:
import timeit
%timeit x.loc[:,'a']
10000 loops, best of 3: 25.1 µs per loop
%timeit x.loc[x.index,'a']
10000 loops, best of 3: 128 µs per loop
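The %timeit lines above only work inside IPython. As a rough sketch of the same comparison in a plain Python script, assuming the standard-library timeit module and an arbitrary loop count n (my choice, not part of the measurements above):

```python
import timeit

import pandas as pd

x = pd.DataFrame({"a": range(4), "b": range(4)})

n = 1000  # arbitrary number of repetitions for this sketch

# timeit.timeit returns the total time in seconds for n calls.
t_slice = timeit.timeit(lambda: x.loc[:, "a"], number=n)
t_index = timeit.timeit(lambda: x.loc[x.index, "a"], number=n)

print(f"x.loc[:, 'a']:       {t_slice / n * 1e6:.1f} µs per loop")
print(f"x.loc[x.index, 'a']: {t_index / n * 1e6:.1f} µs per loop")
```

The absolute numbers will differ by machine and pandas version, so only the relative difference is meaningful.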