
I have a dataframe with a column of multiple stores and each store's sales per date, from March to the present.

         Date                Sales          Store   
0         20/05/2020         581             A      
1         19/05/2020         408             A      
2         18/05/2020         262             A      
3         17/05/2020         0               A  
4         16/05/2020         1063            A  
...
595       12/05/2020         245             Z  
596       11/05/2020         13              Z  
597       10/05/2020         165             Z  
598       09/05/2020         240             Z  
599       08/05/2020         163             Z  
600   rows × 3 columns

I am trying to sum the sales across all stores for each individual date, e.g. total sales for all stores on 12/05/2020 = x. The problem is the way the data is laid out in the dataframe, which makes it difficult to simply use sum(): store A is listed first with every date from March to the present, then store B with the same daily dates from March until today, and so on.

I extracted the unique dates from the dataframe and converted them to an array. I don't work with Python, pandas or NumPy very often, so I struggle to get the syntax right. I want to create an array of the total sales per individual date, i.e. all sales from all stores for each day from 01/03/2020 until today, 25/05/2020. This is my attempt, and I would appreciate it if readers could help me with the syntax.

total_sales_per_date = []

for i in unique_dates:

    for i in csv_list:
        int a
        int temp

        if csv_list.date[i] == unique_dates[j]:

        temp = list.sales[i]

        a = a + temp

        if i == rows.Length

        a.append(total_sales_per_date)

My goal is to create two arrays of equal size and shape, e.g.:

 unique_dates.shape = (142, 1)
 total_sales_per_date.shape = (142, 1)

All suggestions, tips, examples and advice will be much appreciated.

chris1234

1 Answer


Can you try df.groupby(['Date'])['Sales'].sum()? It sums Sales over all stores for each date, regardless of the order the rows are in. Regarding the shape of the result, print(df.groupby(['Date'])['Sales'].sum().reset_index().shape[0]) should print the same number as print(df['Date'].nunique()).
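
A minimal sketch of the groupby approach on a small made-up sample, in case it helps. The four rows below are invented for illustration and only mimic the layout in the question. I am assuming the Date column holds day-first strings, so I parse them first so that the result sorts chronologically. The variable names unique_dates and total_sales_per_date are taken from the question; the reshape to a single column is only there to match the (n, 1) shape asked for.

```python
import pandas as pd

# Invented sample rows in the same long layout as the question:
# one row per store per date.
df = pd.DataFrame({
    "Date": ["20/05/2020", "19/05/2020", "20/05/2020", "19/05/2020"],
    "Sales": [581, 408, 100, 50],
    "Store": ["A", "A", "B", "B"],
})

# Assumption: dates are day-first strings (dd/mm/yyyy). Parsing them
# makes the grouped result sort by actual date, not by text.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Sum Sales across all stores for each date.
totals = df.groupby("Date")["Sales"].sum().reset_index()

# Two column arrays of equal shape, one row per unique date.
unique_dates = totals["Date"].to_numpy().reshape(-1, 1)
total_sales_per_date = totals["Sales"].to_numpy().reshape(-1, 1)

print(totals)
print(unique_dates.shape, total_sales_per_date.shape)
```

Because both arrays come from the same grouped frame, row i of total_sales_per_date always belongs to row i of unique_dates, so there is no need to loop over the dates by hand.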

XXavier