
I have the following dataset (there are many rows):

        NUM     POS   SKU   STORE   FOR        DATE     PRICE   QTD DEV
1   93591601    10  37350   HC01    8740    2017-01-02  76.00   1.0 0.0
2   93591701    20  37350   HC01    8740    2017-01-02  83.49   1.0 0.0

3   93592369    20  37350   HC01    8740    2017-01-04  92.90   1.0 0.0
4   93592440    20  37350   HC01    8740    2017-01-04  88.85   1.0 0.0
5   93592697    20  37350   HC01    8740    2017-01-04  78.38   1.0 0.0

What I am trying to do is group by ('SKU', 'STORE', 'DATE') and aggregate the remaining columns:

  • some of them using sum,

  • others calculating the mean

  • others keeping the last row of the group.

In Python (pandas) I can do this with:

df = df.groupby(['SKU', 'STORE', 'DATE']).agg({'PRICE': 'mean',
                                             'QTD': 'sum',
                                             'DEV': 'last',
                                             'FOR': 'last',
                                             }).reset_index()



        NUM     POS   SKU   STORE   FOR        DATE     PRICE   QTD DEV
1   93591601    10  37350   HC01    8740    2017-01-02  79.74   2.0 0.0
2   93591701    20  37350   HC01    8740    2017-01-04  86.71   3.0 0.0

How can I do this in SQL?

Supposing that the table name is DT:

SELECT 
MEAN(PRICE),
SUM(QTD)
FROM DT
GROUP BY 'SKU', 'STORE', 'DATE'

How do I get the last row's value for each group?

Cesar

1 Answer

    SELECT
        SKU, STORE, DATE,
        AVG(PRICE),
        SUM(QTD),
        MAX("FOR"),  -- FOR is a reserved word in most dialects; quote it as your database requires
        MAX(DEV)
    FROM DT
    GROUP BY SKU, STORE, DATE

EDIT: As suggested, I replaced MEAN with AVG, which works in every database I know of.
Note that MAX returns the largest FOR and DEV in each group, not the values from the last row. If you want the FOR and DEV values that belong to the latest row (for example, the one with the highest date), you can adapt the approach from this question:
Select first row in each GROUP BY group?
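
For illustration, here is a minimal sketch of one way to get "last row" values without MAX. It is not the method from the linked question: it aggregates each group, then joins back to the row with the highest NUM in that group. It assumes NUM increases with row order (SQL tables have no inherent order, so some ordering column is needed) and that the date column is named DATE as in the sample table header. The query is run through Python's built-in sqlite3 module on the five sample rows only so that it can be tried as-is; the in-memory database and the `aggregate` helper are made up for this example.

```python
import sqlite3

# The five sample rows from the question:
# NUM, POS, SKU, STORE, FOR, DATE, PRICE, QTD, DEV
ROWS = [
    (93591601, 10, 37350, "HC01", 8740, "2017-01-02", 76.00, 1.0, 0.0),
    (93591701, 20, 37350, "HC01", 8740, "2017-01-02", 83.49, 1.0, 0.0),
    (93592369, 20, 37350, "HC01", 8740, "2017-01-04", 92.90, 1.0, 0.0),
    (93592440, 20, 37350, "HC01", 8740, "2017-01-04", 88.85, 1.0, 0.0),
    (93592697, 20, 37350, "HC01", 8740, "2017-01-04", 78.38, 1.0, 0.0),
]

# Assumption: the row with the highest NUM is the "last" row of its group.
QUERY = """
SELECT g.SKU, g.STORE, g."DATE", g.PRICE, g.QTD, t.DEV, t."FOR"
FROM (
    SELECT SKU, STORE, "DATE",
           AVG(PRICE) AS PRICE,
           SUM(QTD)   AS QTD,
           MAX(NUM)   AS LAST_NUM
    FROM DT
    GROUP BY SKU, STORE, "DATE"
) AS g
JOIN DT AS t
  ON  t.SKU = g.SKU
  AND t.STORE = g.STORE
  AND t."DATE" = g."DATE"
  AND t.NUM = g.LAST_NUM
ORDER BY g.SKU, g.STORE, g."DATE"
"""


def aggregate(rows):
    """Load rows into an in-memory table DT and run the grouped query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE DT (NUM INTEGER, POS INTEGER, SKU INTEGER, STORE TEXT, '
        '"FOR" INTEGER, "DATE" TEXT, PRICE REAL, QTD REAL, DEV REAL)'
    )
    con.executemany("INSERT INTO DT VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


result = aggregate(ROWS)
for row in result:
    print(row)
```

If your table has a better ordering column (a timestamp, an auto-increment id), use it in place of NUM. On databases with window functions, ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) is a common alternative to the join.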