Q1: What is the best practice for attaching meta information to a DataFrame? I know of the following coding practice:
import pandas as pd
df = pd.DataFrame([])
df.currency = 'USD'
df.measure = 'Price'
df.frequency = 'daily'
But as stated in the post "Adding meta-information/metadata to pandas DataFrame", this carries the risk of losing the information when applying functions such as "groupby, pivot, join or loc", as they may return "a new DataFrame without the metadata attached".
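To make the concern concrete, here is a minimal sketch of the loss described in that post. The column names and values are made up for illustration; only the attribute-assignment pattern comes from the snippet above.

```python
import pandas as pd

# Illustrative data (hypothetical tickers and prices)
df = pd.DataFrame({"ticker": ["AAPL", "AAPL", "MSFT"], "price": [90, 85, 20]})
df.currency = "USD"  # plain Python attribute; pandas does not track it

# Both operations return new DataFrame objects
subset = df.loc[df["price"] > 50]
grouped = df.groupby("ticker").sum()

print(hasattr(df, "currency"))       # attribute on the original object
print(hasattr(subset, "currency"))   # attribute on the .loc result
print(hasattr(grouped, "currency"))  # attribute on the groupby result
```

The attribute lives only on the original Python object, so anything that builds a new DataFrame starts without it.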
Is this still valid or has there been an update to meta information processing in the meantime? Is it good coding practice to subclass pandas for this purpose?
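For reference, the one built-in mechanism I am aware of is `DataFrame.attrs`, a dictionary that pandas documents as experimental. The sketch below only shows that it exists and survives `copy()`; I am assuming pandas 1.0 or later, and I make no claim about which other operations propagate it.

```python
import pandas as pd

df = pd.DataFrame({"price": [90, 85, 70]})

# attrs is a plain dict attached to the DataFrame (experimental in pandas)
df.attrs["currency"] = "USD"
df.attrs["measure"] = "Price"
df.attrs["frequency"] = "daily"

# copy() carries the dictionary over to the new object
copied = df.copy()
print(copied.attrs)
```

Whether this is reliable enough for operations like groupby, pivot or join is exactly what I am unsure about.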
Q2: What would be an alternative coding practice?
I do not think building a separate object is very suitable. Working with a MultiIndex does not convince me either. Let's say I want to divide a DataFrame of prices by a DataFrame of earnings. Working with MultiIndex columns would be very involved.
# define price DataFrame
p_index = pd.MultiIndex.from_tuples(
    [['Apple', 'price', 'daily'], ['MSFT', 'price', 'daily']])
price = pd.DataFrame([[90, 20], [85, 30], [70, 25]], columns=p_index)
# define earnings DataFrame
e_index = pd.MultiIndex.from_tuples(
    [['Apple', 'earnings', 'daily'], ['MSFT', 'earnings', 'daily']])
earnings = pd.DataFrame([[5000, 2000], [5800, 2200], [5100, 3000]],
                        columns=e_index)
price.divide(earnings.values, level=1, axis=0)
In the example above I do not even ensure that the company labels really match, because passing earnings.values divides purely by position. I would probably need to call pd.DataFrame.reindex() or similar first. In my view this cannot be good coding practice.
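For comparison, here is a rough sketch of what a label-aligned version of the division could look like. It assumes the measure level ('price' / 'earnings') is the only level that differs between the two frames, so dropping it leaves matching (company, frequency) column labels. The reversed column order in `earnings_shuffled` is a hypothetical case I added to show the alignment.

```python
import pandas as pd

p_index = pd.MultiIndex.from_tuples(
    [("Apple", "price", "daily"), ("MSFT", "price", "daily")])
price = pd.DataFrame([[90, 20], [85, 30], [70, 25]], columns=p_index)

e_index = pd.MultiIndex.from_tuples(
    [("Apple", "earnings", "daily"), ("MSFT", "earnings", "daily")])
earnings = pd.DataFrame([[5000, 2000], [5800, 2200], [5100, 3000]],
                        columns=e_index)

# Hypothetical: same data, but with the companies in the opposite order
earnings_shuffled = earnings.iloc[:, ::-1]

# Drop the measure level (position 1) so both frames share column labels,
# then divide the DataFrames themselves so pandas aligns on those labels
ratio = price.droplevel(1, axis=1) / earnings_shuffled.droplevel(1, axis=1)

print(ratio)
```

This keeps the company labels in play, but the measure information ('price', 'earnings') is dropped along the way, which is again the kind of metadata I would like to preserve.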
Is there a straightforward solution to the problem of handling meta information in that context that I don't see?
Thank you in advance