0

I've seen so many questions and answers on this subject, but I"m having trouble understanding what I'm doing wrong exactly. The query below gets all the information I need, but if it has multiple rows for when the r_num are the same and the r_order for each r_num are the same, then I only one result for each r_num. Right now though this call gives me errors. Can someone please explain to me how Distinctworks, and how I can use it in this query to achieve my goal?

SELECT DISTINCT po_num, r_num, r_order, vendor, order_date, received_by, received_date FROM(
    SELECT p.id as po_num, r.id as r_num, r.rec_order as r_order, v.name as vendor, p.order_date, r.received_by, r.received_date
    FROM Parts.dbo.po as p INNER JOIN
    Parts.dbo.vendor as v ON v.id = p.vendor__id INNER JOIN
    Parts.dbo.receipts as r ON r.id = 'R-'+CAST(p.id AS varchar)
    GROUP BY p.id, r.id, r.rec_order, v.name, p.order_date, r.received_by, r.received_date) as tbl
GROUP BY r_num, r_order

Here is the error I get

Msg 8120, Level 16, State 1, Line 1 Column 'tbl.po_num' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Mark Hill
  • 1,769
  • 2
  • 18
  • 33

1 Answers1

1

Unless you're doing aggregations, you don't need to have both a DISTINCT and a GROUP BY.

SELECT DISTINCT po_num, r_num, r_order, vendor, order_date, received_by, received_date 
FROM
  (
    SELECT p.id as po_num, r.id as r_num, r.rec_order as r_order, v.name as vendor, p.order_date, r.received_by, r.received_date
    FROM Parts.dbo.po as p INNER JOIN
    Parts.dbo.vendor as v ON v.id = p.vendor__id INNER JOIN
    Parts.dbo.receipts as r ON r.id = 'R-'+CAST(p.id AS varchar)
    GROUP BY p.id, r.id, r.rec_order, v.name, p.order_date, r.received_by, r.received_date
  ) as tbl

If you're trying to get distinct r_num and r_order values, you need to either group by / select only those two columns:

SELECT r_num, r_order
FROM
  (
    SELECT p.id as po_num, r.id as r_num, r.rec_order as r_order, v.name as vendor, p.order_date, r.received_by, r.received_date
    FROM Parts.dbo.po as p INNER JOIN
    Parts.dbo.vendor as v ON v.id = p.vendor__id INNER JOIN
    Parts.dbo.receipts as r ON r.id = 'R-'+CAST(p.id AS varchar)
    GROUP BY p.id, r.id, r.rec_order, v.name, p.order_date, r.received_by, r.received_date
  ) as tbl
GROUP BY r_num, r_order

Or, if your data can handle it, use an aggregation on the extra columns:

SELECT MAX(po_num) AS po_num, r_num, r_order, MAX(vendor) AS vendor -- etc.
FROM
  (
    SELECT p.id as po_num, r.id as r_num, r.rec_order as r_order, v.name as vendor, p.order_date, r.received_by, r.received_date
    FROM Parts.dbo.po as p INNER JOIN
    Parts.dbo.vendor as v ON v.id = p.vendor__id INNER JOIN
    Parts.dbo.receipts as r ON r.id = 'R-'+CAST(p.id AS varchar)
    GROUP BY p.id, r.id, r.rec_order, v.name, p.order_date, r.received_by, r.received_date
  ) as tbl
GROUP BY r_num, r_order
AHiggins
  • 7,029
  • 6
  • 36
  • 54
  • So the only time then I would need to actually use the DISTINCT keyword is if like I'm trying to find the SUM of certain columns, or a function where multiple results can occur? – Mark Hill Nov 06 '14 at 16:21
  • Not quite: use `GROUP BY` when you need to find a `SUM()` (or any other aggregate). As for DISTINCT, you only need it when you have multiple, completely identical rows in your query - it applies to all columns being returned. I generally find that in most cases where that happens, I'm doing something wrong with my JOINs ... – AHiggins Nov 06 '14 at 16:23
  • Ah, that makes sense, so to sum it all up, make sure my JOINS are correct and as concise as possible before going anywhere else. – Mark Hill Nov 06 '14 at 16:24
  • It certainly can't hurt :) – AHiggins Nov 06 '14 at 16:26