I have a dataframe that contains various data points over multiple years for each unit. The unit number is in the first column, named 'Unit', and the year is in the second column, named 'Year'.
For visualisation, here is a mini version of my dataset. The real one is 55 columns by 700,000 rows.
import random
import pandas as pd
# Two columns of random integers between 1 and 100
col3 = [random.randrange(1, 101) for _ in range(14)]
col4 = [random.randrange(1, 101) for _ in range(14)]
d = {'Unit': [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6],
     'Year': [2014, 2015, 2016, 2017, 2015, 2016, 2017, 2017, 2014, 2015, 2014, 2015, 2016, 2017],
     'col3': col3, 'col4': col4}
df = pd.DataFrame(data=d)
With this dataset I want to look at the ratios between col3 and col4 within a year, and at how the values change between years. For this reason I want to make a three-dimensional dataframe, which places the year on an additional axis instead of keeping it as a variable in my 2D frame.
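To make the goal concrete, here is a minimal sketch of one option I can think of: keep the data 2D, but move 'Year' into a second column level with `unstack`, and only drop to a NumPy array if a true 3D block is needed. The names `wide`, `growth`, `cube` and the `ratio` column are made up for illustration, and I am assuming there is at most one row per Unit/Year pair. I am not sure this is the right approach.

```python
import random
import pandas as pd

# Same toy data as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    'Unit': [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6],
    'Year': [2014, 2015, 2016, 2017, 2015, 2016, 2017,
             2017, 2014, 2015, 2014, 2015, 2016, 2017],
    'col3': [random.randrange(1, 101) for _ in range(14)],
    'col4': [random.randrange(1, 101) for _ in range(14)],
})

# Ratio within a year: plain column arithmetic on the long frame.
df['ratio'] = df['col3'] / df['col4']

# Put Year on its own axis: rows are units, columns are (variable, year).
# Unit/Year combinations that do not exist in the data become NaN.
wide = df.set_index(['Unit', 'Year'])[['col3', 'col4', 'ratio']].unstack('Year')

# Ratio between years: e.g. col3 in 2015 relative to col3 in 2014, per unit.
growth = wide['col3'][2015] / wide['col3'][2014]

# Optional true 3D block with axes (unit, variable, year).
n_units = wide.shape[0]
n_vars = wide.columns.get_level_values(0).nunique()
n_years = wide.columns.get_level_values('Year').nunique()
cube = wide.to_numpy().reshape(n_units, n_vars, n_years)

print(wide)
print(growth)
print(cube.shape)
```

With the toy data, units that lack a given year (for example Unit 3, which only has 2017) simply get NaN in those positions, so the missing years stay visible instead of being silently dropped.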
Does anyone have tips on how to do this? Is this a good approach, or is there a better alternative? Any suggestions are welcome.
Jen