This SO answer comes close to what I am looking for, but I cannot apply it to a MultiIndex DataFrame
.
I define a hierarchically-indexed Dataframe
like this:
arrays = [["bar", "bar"], ["one", "two"]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
np.random.seed(0)
df = pd.DataFrame(np.random.randn(4, 2), index=["A", "B", "C", "D"], columns=index)
Output:
first bar foo
second one two one
A 1.764052 0.400157 None
B 0.978738 2.240893 None
C 1.867558 -0.977278 None
D 0.950088 -0.151357 None
I want to populate a new df['foo', 'one'] column as follows:
- I want the first row n to be 100.
- I want row n+1 to be: n * (df['bar', 'one'][n] + 1)
- And so on.
The final table should look like:
first bar foo
second one two one
A 1.764052 0.400157 100
B 0.978738 2.240893 276.4
C 1.867558 -0.977278 546.9
D 0.950088 -0.151357 1568.4
The main issue I am facing is accessing the row idx of a MultiIndex DataFrame
. df['foo', 'one'][0]=100
works but throws a warning: "A value is trying to be set on a copy of a slice from a DataFrame"
.
And df.loc[0, 'bar', 'one']
throws "Too many indexers"