If, as your comment states:
if there were duplicates then I would like to take the min(date_received), min(expiry_date), min(quantity)
then just use the MIN function:
SELECT batch_number, MIN(date_received), MIN(expiry_date), MIN(prod_code), MIN(quantity)
FROM scheme.stquem
GROUP BY batch_number;
The problem with this approach shows up with sample data like the following:
batch_number  date_received  expiry_date  prod_code  quantity
-------------------------------------------------------------
1             2015-08-01     2015-09-01   p1         5
1             2015-08-02     2015-08-08   p1         3
1             2015-08-02     2015-08-09   p0         1
You will get fields from different rows: date_received from the first row, expiry_date from the second, and prod_code and quantity from the third. This is valid, and aggregates are useful, but I am not sure it is what you want.
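To see this for yourself, here is a small self-contained sketch that runs the MIN query against the three sample rows above. The rows are supplied inline with a VALUES table constructor (SQL Server 2008 or later), and sample_rows is only an illustrative stand-in for scheme.stquem. I am assuming the date columns are of type DATE and quantity is an integer.

```sql
-- sample_rows is a hypothetical stand-in for scheme.stquem,
-- populated with the three sample rows shown above.
SELECT  batch_number,
        MIN(date_received) AS date_received,
        MIN(expiry_date)   AS expiry_date,
        MIN(prod_code)     AS prod_code,
        MIN(quantity)      AS quantity
FROM    (VALUES
            (1, CAST('20150801' AS DATE), CAST('20150901' AS DATE), 'p1', 5),
            (1, CAST('20150802' AS DATE), CAST('20150808' AS DATE), 'p1', 3),
            (1, CAST('20150802' AS DATE), CAST('20150809' AS DATE), 'p0', 1)
        ) AS sample_rows (batch_number, date_received, expiry_date, prod_code, quantity)
GROUP BY batch_number;
```

This should return a single row (1, 2015-08-01, 2015-08-08, p0, 1), a combination of values that does not exist in any one source row.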
Instead, I imagine you would want to rank your rows within each batch_number, which you can do using ROW_NUMBER():
SELECT batch_number,
       date_received,
       expiry_date,
       prod_code,
       quantity,
       RowNumber = ROW_NUMBER()
                   OVER(PARTITION BY batch_number
                        ORDER BY date_received, expiry_date, prod_code, quantity)
FROM scheme.stquem;
This would give you an output of:
batch_number  date_received  expiry_date  prod_code  quantity  RowNumber
------------------------------------------------------------------------
1             2015-08-01     2015-09-01   p1         5         1
1             2015-08-02     2015-08-08   p1         3         2
1             2015-08-02     2015-08-09   p0         1         3
Then you just need to put the above query into a subquery, and select only the first row for each partition:
SELECT batch_number, date_received, expiry_date, prod_code, quantity
FROM (  SELECT batch_number,
               date_received,
               expiry_date,
               prod_code,
               quantity,
               RowNumber = ROW_NUMBER()
                           OVER(PARTITION BY batch_number
                                ORDER BY date_received, expiry_date, prod_code, quantity)
        FROM scheme.stquem
     ) AS t
WHERE t.RowNumber = 1;
So here you get all the fields from the single row with the minimum date_received (ties are broken by expiry_date, then prod_code, then quantity), rather than values picked from unrelated records.
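If you want to check this without touching the real table, the following sketch runs the same ROW_NUMBER() logic against the three sample rows. It is written with CTEs instead of a subquery purely for readability. The names sample_rows and ranked are illustrative, not part of your schema, and I am again assuming DATE columns and an integer quantity.

```sql
WITH sample_rows AS
(   -- Hypothetical stand-in for scheme.stquem, holding the three sample rows.
    SELECT batch_number, date_received, expiry_date, prod_code, quantity
    FROM (VALUES
            (1, CAST('20150801' AS DATE), CAST('20150901' AS DATE), 'p1', 5),
            (1, CAST('20150802' AS DATE), CAST('20150808' AS DATE), 'p1', 3),
            (1, CAST('20150802' AS DATE), CAST('20150809' AS DATE), 'p0', 1)
         ) AS v (batch_number, date_received, expiry_date, prod_code, quantity)
),
ranked AS
(   -- Number the rows within each batch, earliest date_received first.
    SELECT batch_number, date_received, expiry_date, prod_code, quantity,
           ROW_NUMBER() OVER(PARTITION BY batch_number
                             ORDER BY date_received, expiry_date, prod_code, quantity) AS RowNumber
    FROM sample_rows
)
SELECT batch_number, date_received, expiry_date, prod_code, quantity
FROM ranked
WHERE RowNumber = 1;
```

This should return the first sample row intact (1, 2015-08-01, 2015-09-01, p1, 5), since every column comes from the same record.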