1

I have a time-series DataFrame of sales, I need to calculate the average share of sales for each day of the week. Below is an example of what I want:

  • df1: time series dataframe for sales.
  • share_sales_week_of_day - wanted DataFrame. For Mondays I have calculated real value. Here 28 and 42 = total week sales for every week from dataframe, 1 and 8 = sales on Mondays.

Code:

df1 = pd.DataFrame(index = ['2011-01-31', '2011-02-01', '2011-02-01', '2011-02-02', '2011-02-03', '2011-02-04', '2011-02-05', '2011-02-06', '2011-02-07', '2011-02-08', '2011-02-09', '2011-02-10', '2011-02-11', '2011-02-12'], 
                   data = [1,2,3,4,5,6,7,  8,8,8,3,4,5,6], columns = ['sales'])

share_sales_week_of_day = pd.DataFrame(data = {'d_1' : [0.5*(1.0/28 + 8.0/42)],'d_2' : [0], 'd_3' : [0], 'd_4' : [0], 'd_5' : [0], 'd_6' : [0], 'd_7' : [0]})

Can you explain how can I calculate shares for whole data....

smci
  • 32,567
  • 20
  • 113
  • 146
Roman Kazmin
  • 931
  • 6
  • 18
  • 1) Before you can compute what ***day-of-week*** a date is ('Monday','Tuesday', etc), you need to first convert dates from *strings* like `'2011-01-31'` to actual ***datetimes***. What function in pandas does that? When you read the pandas documentation, what does it tell you? – smci May 22 '20 at 23:47
  • 2) Then, after you've converted your index to datetimes, see [Get weekday/day-of-week for Datetime column of DataFrame](https://stackoverflow.com/questions/28009370/get-weekday-day-of-week-for-datetime-column-of-dataframe) – smci May 22 '20 at 23:51
  • 1
    3) Then, which pandas function will convert a ('long-form') dataframe of `date,sales` entries into a ***table*** where day-of-week is in the ***columns***, and the entries are the sum (aggregate) of each day's `sales` values? Again, what does the pandas documentation tell you? – smci May 22 '20 at 23:53

1 Answers1

1

Let us do pd.crosstab

df.index=pd.to_datetime(df.index) 

s=pd.crosstab(index=df1.index.strftime('%U'),columns=df1.index.weekday,values=df1.sales.values,aggfunc='sum',normalize='index').mean()
col_0
0    0.113095
1    0.184524
2    0.107143
3    0.136905
4    0.166667
5    0.196429
6    0.095238
dtype: float64

s.to_frame().T
col_0         0         1         2         3         4         5         6
0      0.113095  0.184524  0.107143  0.136905  0.166667  0.196429  0.095238
BENY
  • 317,841
  • 20
  • 164
  • 234