I am trying to merge several time series dataframes into one very large dataframe with a MultiIndex.
Suppose I have these DataFrames.
In [1]: dates = pd.DatetimeIndex(["2019-1-1", "2019-1-2", "2019-1-3"], name="Date")
In [2]: df_a = pd.DataFrame(np.random.randn(3, 2), columns=['Col1', 'Col2'], index=dates)
In [3]: df_b = pd.DataFrame(np.random.randn(3, 2), columns=['Col1', 'Col2'], index=dates)
In [4]: df_c = pd.DataFrame(np.random.randn(3, 2), columns=['Col1', 'Col2'], index=dates)
In [5]: df_a
Out[5]:
Col1 Col2
Date
2019-01-01 1.317679 -1.201769
2019-01-02 -0.991833 0.626420
2019-01-03 0.549733 1.942215
Now, I've created the scafolding for the dataframe that I want. It looks like this.
In [6]: stock_symbols = ["A", "B", "C"]
In [7]: index = pd.MultiIndex.from_product([dates, stock_symbols], names=["Date", "Script"])
In [8]: df = pd.DataFrame(columns=['Col1', 'Col2'], index=index)
In [9]: df
Out[9]:
Col1 Col2
Date Script
2019-01-01 A NaN NaN
B NaN NaN
C NaN NaN
2019-01-02 A NaN NaN
B NaN NaN
C NaN NaN
2019-01-03 A NaN NaN
B NaN NaN
C NaN NaN
How do I specify to Pandas that values from df_a are to be appended to the appropriate index position?
I figured I'd have to use a .join()
but since the values of Script don't occur in the DataFrames I don't know what to do.
Please help.