I have a big DataFrame and an index array (perhaps obtained via groupby). I'd like to create a view into the original DataFrame and modify that view so that the original DataFrame is updated, in the following manner:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(8, 3), columns=list('ABC'), index=range(1, 9))
subdf = df.loc[[1, 3, 4], :]
subdf.loc[:, 'A'] = 100
When I create subdf = df.loc[[1, 3, 4], :], I believe I'm getting a view into the original DataFrame, since I used loc, per my interpretation of this answer and the technical documentation.
However, when I attempt to modify subdf, I see that the changes aren't propagated to df, indicating that subdf was a copy, not a view.
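A minimal sketch of how this can be checked, assuming np.shares_memory is a fair test for whether two objects are backed by the same buffer. The seed and the `original` snapshot are not part of my actual code; they are added here only to make the check reproducible:

```python
import numpy as np
import pandas as pd

# Seeded only so the illustration is reproducible (assumption, not in the original code)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 3)), columns=list('ABC'), index=range(1, 9))
original = df.copy()  # snapshot used to detect any change to df

subdf = df.loc[[1, 3, 4], :]

# Does subdf's column A share its buffer with df's column A?
shares = np.shares_memory(df['A'].to_numpy(), subdf['A'].to_numpy())

subdf.loc[:, 'A'] = 100

print(shares)               # whether the two columns share memory
print(df.equals(original))  # whether df is still unchanged after modifying subdf
```

If the first value is False and the second is True, subdf is an independent copy and the assignment never reached df.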
Now, I recognize I could use the index [1, 3, 4] in the original loc, i.e., I could achieve this with df.loc[[1, 3, 4], 'A'] = 100, but I'd like to create a separate variable to contain the view that I can then pass to functions that aren't aware that they're dealing with a subset of the data.
Questions: Are two chained loc calls not guaranteed to return a view? How can I achieve my goal of having a stand-alone variable for the view that I can then modify, and have those modifications reflected in the original?