I have a dataset and need to add two more series of data which I am making up. The dataset row count will expand by multiplying all the existing rows by the new series length.
e.g.
- original frame = 10
- week frame = 2
- session frame = 2
- final frame will be (10*2)*2=40
I put together some test code which works but it is ugly. Can someone suggest a better way please.
code
df_testing = pd.DataFrame({'week': [1,2,3]}).T
df_testing2 = pd.DataFrame({'data': ['aa', 'bb', 'cc']})
two_frames = pd.concat([df_testing, df_testing2]).ffill()[1:]
two_frames = pd.melt(two_frames, id_vars='data', value_vars=[0,1,2])
df_sessions = pd.DataFrame({'Session': ['2020-Jun', '2020-Mar']}).T
final = pd.concat([df_sessions, two_frames]).ffill()[1:].rename(columns={'value': 'week'})
final = pd.melt(final, id_vars=['data', 'week'], value_vars=[0,1])
final.sort_values(by=['week', 'data']).drop(columns=['variable'])
output
data | week | value | |
---|---|---|---|
0 | aa | 1.0 | 2020-Jun |
1 | aa | 1.0 | 2020-Mar |
2 | bb | 1.0 | 2020-Jun |
3 | bb | 1.0 | 2020-Mar |
4 | cc | 1.0 | 2020-Jun |
5 | cc | 1.0 | 2020-Mar |
6 | aa | 2.0 | 2020-Jun |
7 | aa | 2.0 | 2020-Mar |
8 | bb | 2.0 | 2020-Jun |
9 | bb | 2.0 | 2020-Mar |
10 | cc | 2.0 | 2020-Jun |
11 | cc | 2.0 | 2020-Mar |
12 | aa | 3.0 | 2020-Jun |
13 | aa | 3.0 | 2020-Mar |
14 | bb | 3.0 | 2020-Jun |
15 | bb | 3.0 | 2020-Mar |
16 | cc | 3.0 | 2020-Jun |
17 | cc | 3.0 | 2020-Mar |