Aggregating Columns with conditions

Question

I have the following dataset (there are many rows):

        NUM     POS   SKU   STORE   FOR        DATE     PRICE   QTD DEV
1   93591601    10  37350   HC01    8740    2017-01-02  76.00   1.0 0.0
2   93591701    20  37350   HC01    8740    2017-01-02  83.49   1.0 0.0

3   93592369    20  37350   HC01    8740    2017-01-04  92.90   1.0 0.0
4   93592440    20  37350   HC01    8740    2017-01-04  88.85   1.0 0.0
5   93592697    20  37350   HC01    8740    2017-01-04  78.38   1.0 0.0

What I am trying to do is group by ('SKU', 'STORE', 'DATA') and aggregate the rows:

some of them using sum,
others calculating the mean,
others keeping the last row of the group.

In Python I can do this with pandas:

df = df.groupby(['SKU', 'STORE', 'DATA']).agg({'PRICE': np.mean,
                                             'QTD':np.sum,
                                             'DEV':'last',
                                             'FOR':'last',
                                             }).reset_index()



        NUM     POS   SKU   STORE   FOR        DATE     PRICE   QTD DEV
1   93591601    10  37350   HC01    8740    2017-01-02  79.74   2.0 0.0
2   93591701    20  37350   HC01    8740    2017-01-04  86.71   3.0 0.0

How can I do this using SQL?

Supposing that the table name is DT:

SELECT 
MEAN(PRICE),
SUM(QTD)
FROM DT
GROUP BY 'SKU', 'STORE', 'DATA'

How do I get the last row value from each group?

Do you mean you want to use the last row, or the maximum from that column? If the former, look up RANK, ROW_NUMBER, or similar windowed functions, so that the last row can then be selected via a WHERE clause or similar. – drcoding, Jun 05 '19 at 15:22
You are grouping by three constant values (strings), not by column names. – , Jun 05 '19 at 15:40

Arkadiusz Raszeja · Answer 1 · 2019-06-05T15:34:50.660

-1

    SELECT
        SKU, STORE, DATA,
        AVG(PRICE),
        SUM(QTD),
        MAX(FOR),
        MAX(DEV)
    FROM DT
    GROUP BY SKU, STORE, DATA
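One caveat worth checking: MAX(FOR) returns the largest value in each group, which matches the last row only when the values happen to increase with row order. Here is a minimal sketch using Python's built-in sqlite3 module. The rows are made up for illustration (the FOR value 9000 and the NUM values 1 and 2 are hypothetical), and "FOR" is quoted on the assumption that it may be treated as a keyword by the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE DT (NUM INTEGER, SKU INTEGER, STORE TEXT, DATA TEXT, "FOR" INTEGER)'
)
conn.executemany(
    "INSERT INTO DT VALUES (?, ?, ?, ?, ?)",
    [
        (1, 37350, "HC01", "2017-01-02", 9000),  # hypothetical earlier row
        (2, 37350, "HC01", "2017-01-02", 8740),  # last row of the group
    ],
)

# MAX picks the largest value in the group, regardless of row order.
max_for = conn.execute(
    'SELECT MAX("FOR") FROM DT GROUP BY SKU, STORE, DATA'
).fetchone()[0]
print(max_for)  # 9000, while the last row of the group holds 8740
```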

EDIT: As suggested, I replaced MEAN with AVG (which works in every database provider I know).
If you want the values of FOR and DEV that correspond to the highest date or similar, you can adapt the solution from "Select first row in each GROUP BY group?".
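For the "last row of the group" part, one possible approach is the ROW_NUMBER window function mentioned in the comments on the question. The sketch below runs it through Python's built-in sqlite3 module on the five sample rows from the question. It rests on several assumptions that are not in the original post: the date column is named DATA (as in the GROUP BY), "last" means the row with the highest NUM within each group, the POS column is left out for brevity, and the SQLite build supports window functions (version 3.25 or later). Other DBMSs may need different quoting for "FOR".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE DT (NUM INTEGER, SKU INTEGER, STORE TEXT, "FOR" INTEGER, '
    "DATA TEXT, PRICE REAL, QTD REAL, DEV REAL)"
)
rows = [
    (93591601, 37350, "HC01", 8740, "2017-01-02", 76.00, 1.0, 0.0),
    (93591701, 37350, "HC01", 8740, "2017-01-02", 83.49, 1.0, 0.0),
    (93592369, 37350, "HC01", 8740, "2017-01-04", 92.90, 1.0, 0.0),
    (93592440, 37350, "HC01", 8740, "2017-01-04", 88.85, 1.0, 0.0),
    (93592697, 37350, "HC01", 8740, "2017-01-04", 78.38, 1.0, 0.0),
]
conn.executemany("INSERT INTO DT VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# rn = 1 marks the last row of each group (assumed: highest NUM is last).
# The CASE expressions pick DEV and FOR from that row only, while AVG and
# SUM still run over every row of the group.
query = """
SELECT SKU, STORE, DATA,
       AVG(PRICE) AS PRICE,
       SUM(QTD)   AS QTD,
       MAX(CASE WHEN rn = 1 THEN DEV END)   AS DEV,
       MAX(CASE WHEN rn = 1 THEN "FOR" END) AS "FOR"
FROM (
    SELECT DT.*,
           ROW_NUMBER() OVER (
               PARTITION BY SKU, STORE, DATA ORDER BY NUM DESC
           ) AS rn
    FROM DT
) AS ranked
GROUP BY SKU, STORE, DATA
ORDER BY DATA
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This should produce one row per (SKU, STORE, DATA) group, mirroring the pandas `agg` call with 'last' for DEV and FOR. If a different column defines the row order, only the ORDER BY inside the window needs to change.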

edited Jun 05 '19 at 15:34

answered Jun 05 '19 at 15:05


How did you assume `Mean` would work without knowing the DBMS, and in which DBMS will this query work? – Ven Jun 05 '19 at 15:07
The author used it in his example :) – Arkadiusz Raszeja Jun 05 '19 at 15:08
You still didn't answer my question: in which DBMS will `mean` work? – Ven Jun 05 '19 at 15:09
Ah, you are right. I assumed the query from the post worked. Sorry – Arkadiusz Raszeja Jun 05 '19 at 15:11
And does this query now ensure the OP gets the last row? – Ven Jun 05 '19 at 15:17
The last rows of FOR and DEV. MAX and MIN both work for numerical and alphabetical order. – Arkadiusz Raszeja Jun 05 '19 at 15:20
You are making an effort, and I appreciate it. At least try to get some sample data into your query; it has syntax errors all over the place. Try executing it with sample data in a temp table before posting. – Ven Jun 05 '19 at 15:30
It actually worked in my database, with comparable data. – Arkadiusz Raszeja Jun 05 '19 at 15:34
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/194503/discussion-between-arkadiusz-raszeja-and-ven). – Arkadiusz Raszeja Jun 05 '19 at 15:35
