I created a DataFrame using the compare function to compare two datasets from two different years. If the values in the sub-indexed columns (2021, 2023) for the main column (e.g. company A) are equal, both of the columns will display NaN.
From here I want to select a main column (e.g. company A), and only display the rows where one or both of the values in the sub-indexed columns (e.g. 2021, 2023 for company A) are not Null. (If both are Null, do not display.)
cols = pd.MultiIndex.from_tuples([('company A', '2021'), ('company A', '2023'), ('company B', '2021'), ('company B', '2023')])
df = pd.DataFrame([[np.nan,np.nan,np.nan,np.nan], [3,4,np.nan,np.nan], [np.nan,np.nan,np.nan,6]], columns=cols)
df
company A company B
2021 2023 2021 2023
0 NaN NaN NaN NaN
1 3.0 4.0 NaN NaN
2 NaN NaN NaN 6.0
Desired DataFrame:
company A company B
2021 2023 2021 2023
1 3.0 4.0 NaN NaN