I'm working with time series Pandas dataframe where for each company there are several observations indexed by date. For each observation there are multi values. I've been experimenting with pandas MultiIndex vs stack/unstack and have gotten close to desired result but am stuck:
Here's what I've tried -- full code to reproduce:
import pandas.util.testing as tm; tm.N = 3
import numpy as np
def unpivot(frame):
N, K = frame.shape
data = {'value' : frame.values.ravel('F'),
'variable' : np.asarray(frame.columns).repeat(N),
'date' : np.tile(np.asarray(frame.index), K)}
return pd.DataFrame(data, columns=['date', 'variable', 'value'])
df = unpivot(tm.makeTimeDataFrame())
df['topic'] = df.groupby('variable').cumcount()+1
df['company'] = np.random.choice(['Apple Inc', 'Google', 'Nike Corp'], len(df))
This is what I start with:
date variable value topic company
0 2000-01-03 A -1.250926 1 Apple Inc
1 2000-01-04 A -0.807887 2 Google
2 2000-01-05 A 0.302593 3 Nike Corp
3 2000-01-03 B 0.502474 1 Nike Corp
4 2000-01-04 B 0.974968 2 Apple Inc
5 2000-01-05 B 0.698478 3 Google
6 2000-01-03 C 0.871437 1 Nike Corp
7 2000-01-04 C 0.235059 2 Apple Inc
8 2000-01-05 C -1.827321 3 Google
9 2000-01-03 D -0.764457 1 Google
10 2000-01-04 D 0.014109 2 Nike Corp
11 2000-01-05 D -0.966902 3 Google
My goal is something like this:
topic 1 2 3
company variable date
Apple Inc A 2000-01-03 -1.2 NaN NaN
B 2000-01-04 NaN 0.9 NaN
C 2000-01-04 NaN 0.2 NaN
Google A 2000-01-04 NaN -0.8 NaN
B 2000-01-05 NaN NaN 0.69
And I've tried this but 'date' is not correct:
df.set_index(['company', 'variable', 'date'], inplace=False).unstack()
value topic \
date 2000-01-03 2000-01-04 2000-01-05 2000-01-03 2000-01-04
company variable
Apple Inc A -1.250926 NaN NaN 1.0 NaN
B NaN 0.974968 NaN NaN 2.0
C NaN 0.235059 NaN NaN 2.0
Google A NaN -0.807887 NaN NaN 2.0
B NaN NaN 0.698478 NaN NaN
C NaN NaN -1.827321 NaN NaN
D -0.764457 NaN -0.966902 1.0 NaN
Nike Corp A NaN NaN 0.302593 NaN NaN
B 0.502474 NaN NaN 1.0 NaN
C 0.871437 NaN NaN 1.0 NaN
D NaN 0.014109 NaN NaN 2.0
date 2000-01-05
company variable
Apple Inc A NaN
B NaN
C NaN
Google A NaN
B 3.0
C 3.0
D 3.0
Nike Corp A 3.0
B NaN
C NaN
D NaN