I'm about to learn python. I hope this is not a stupid question but I could really need some help with this:
I have the following DataFrame "df_sales_2016_us" with 2935 entries. This is the head.
Gender Country Size (US) year Month
7610 Female United States 8.0 2016 1
7613 Female United States 9.0 2016 1
7617 Male United States 9.5 2016 1
7618 Female United States 10.5 2016 1
7619 Male United States 8.5 2016 1
"Month" contains values from 1 to 12 for each month. "Size (US") contains 16 different shoe size from 6.0 to 15
I would now like to create a new DataFrame looking like this:
- The columns are the Month from 1 to 12 while the rows are the different shoe sizes from 6.0 to 15.
- The single values should be the number of shoes for each size sold in each month.
How can I achieve this? This a table created with the value "0" just to clearify my goal.
1 2 3 4 5 6 7 8 9 10 11 12
6.0 0 0 0 0 0 0 0 0 0 0 0 0
6.5 0 0 0 0 0 0 0 0 0 0 0 0
7.0 0 0 0 0 0 0 0 0 0 0 0 0
7.5 0 0 0 0 0 0 0 0 0 0 0 0
8.0 0 0 0 0 0 0 0 0 0 0 0 0
8.5 0 0 0 0 0 0 0 0 0 0 0 0
9.0 0 0 0 0 0 0 0 0 0 0 0 0
9.5 0 0 0 0 0 0 0 0 0 0 0 0
10.0 0 0 0 0 0 0 0 0 0 0 0 0
10.5 0 0 0 0 0 0 0 0 0 0 0 0
11.0 0 0 0 0 0 0 0 0 0 0 0 0
11.5 0 0 0 0 0 0 0 0 0 0 0 0
12.0 0 0 0 0 0 0 0 0 0 0 0 0
13.0 0 0 0 0 0 0 0 0 0 0 0 0
14.0 0 0 0 0 0 0 0 0 0 0 0 0
15.0 0 0 0 0 0 0 0 0 0 0 0 0
I tried to create the following DataFrame df_test, but I am out of options to proceed and think this is a totally wrong approach. (Here as an example only for "Month" "1"). Is there any option to get only the specific value needed for each column/row?
df_test = pd.DataFrame({'1':[df_sales_2016_us["Month"]==1],
'2':[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'3':[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'4': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'5': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'6': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'7': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'8': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'9': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'10': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'11': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
'12': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
}, index=[df_sales_2016_us["Size (US)"].unique()])
df_test.sort_index()
The result for the way above is of course the following table as my approach does not get the single values needed but just informs, if "Month" is "1". But I have no idea how to go on from here or how to switch to another way to solve this. If someone would have an idea for this I would be very grateful. Thank you so much!
1 2 3 4 5 6 7 8 9 10 11 12
6.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
6.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
7.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
7.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
8.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
8.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
9.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
9.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
10.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
10.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
11.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
11.5 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
12.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
13.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
14.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0
15.0 7617 True 7619 True 7629 True 7... 0 0 0 0 0 0 0 0 0 0 0