There are already a couple of questions on SO relating to this, most notably this one. However, none of the answers seem to work for me, and quite a few links to the docs (especially on lexsorting) are broken, so I'll ask another one.
I'm trying to do something (seemingly) very simple. Consider the following MultiIndexed DataFrame:
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
# Two columns (labelled 0 and 1) of random values sharing the same MultiIndex
df = pd.concat([pd.Series(np.random.randn(8), index=index),
                pd.Series(np.random.randn(8), index=index)], axis=1)
Now I want to set all values in column 0 to some value (say np.nan) for the observations in category one. I've failed with:
df.loc(axis=0)[:, "one"][0] = 1 # setting with copy warning
and
df.loc(axis=0)[:, "one", 0] = 1
The second attempt yields either a warning about the length of the keys exceeding the length of the index, or one about a lack of lexsorting to sufficient depth.
What is the correct way to do this?
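For reference, one pattern that is often suggested for this kind of assignment is to select the rows and the column in a single .loc call, so that no intermediate copy is involved. The following is only a minimal sketch under a few assumptions: the frame is rebuilt with a seeded generator so the example is self-contained, the index is sorted first to avoid lexsort complaints, and the names idx, rng and mask are illustrative rather than taken from the question.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (seeded for repeatability).
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 2)), index=index)

# Slicing on an inner level needs a lexsorted index, so sort defensively.
df = df.sort_index()

# Option 1: rows and column in ONE .loc call.
# idx[:, 'one'] means "any value of 'first', 'one' for 'second'"; 0 is the column label.
idx = pd.IndexSlice
df.loc[idx[:, 'one'], 0] = np.nan

# Option 2: a boolean mask built from the level values (no sorting required).
mask = df.index.get_level_values('second') == 'one'
df.loc[mask, 1] = 0.0

print(df)
```

In the first option the row selector and the column label are separate arguments to .loc, which is what distinguishes it from df.loc(axis=0)[:, "one", 0], where 0 is read as a third index level. The mask variant is more verbose but does not depend on how the index is sorted.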