-1

When I'm running my SQL query I got unwanted duplicates. In my query I will list (5) latest products added as the example below.

A product can have assigned multiple categories like this example:

- Product A, Category "men > clothes > pants"
- Product A, Category "women > clothes > pants"
- Product B, Category "men > clothes > shirts"
- Product B, Category "women > clothes > shirts"

Running my SQL query will give me the output:

- Product A
- Product A
- Product B
- Product B

What I want is just:

- Product A (Category "men > clothes > pants")
- Product B (Category "men > clothes > shirts")

Product table
ProductID
ProductName
Category join table

JoinID
ProductID
Category3ID
Category3 table
Category3ID
Category2ID
Cat3Name
Category2 table
Category2ID
Category1ID
Cat2Name
Category1 table
Category1ID
Cat1Name

SELECT TOP 5 * FROM Product P
INNER JOIN Product_Category_Join PCJ ON (P.ProductID = PCJ.ProductID)
INNER JOIN Category3 C3 ON (PCJ.Category3ID = C3.Category3ID)
INNER JOIN Category2 C2 ON (C3.Category2ID = C2.Category2ID)
INNER JOIN Category1 C1 ON (C2.Category1ID = C1.Category1ID)
ORDER BY P.Date DESC
Maduro
  • 713
  • 5
  • 23
  • 44
Espen S.
  • 43
  • 6

1 Answers1

2

A simple way to remove duplicates is to use a CTE(common-table-expression) and the ROW_NUMBER function. It has the advantage that you can select all columns if you want:

WITH CTE AS
(
    SELECT  *, RN = ROW_NUMBER() OVER (PARTITION BY P.ProductID ORDER BY P.Date DESC)
    FROM Product P
    INNER JOIN Product_Category_Join PCJ ON (P.ProductID = PCJ.ProductID)
    INNER JOIN Category3 C3 ON (PCJ.Category3ID = C3.Category3ID)
    INNER JOIN Category2 C2 ON (C3.Category2ID = C2.Category2ID)
    INNER JOIN Category1 C1 ON (C2.Category1ID = C1.Category1ID)
)
SELECT TOP 5 * 
FROM CTE
WHERE RN = 1
ORDER BY Date DESC
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • Thanks @Tim Schmelter. I recon it should be `WITH CTE AS`? I got the following error: "The column 'ProductID' was specified multiple times for 'CTE'." – Espen S. Oct 29 '15 at 13:51
  • 1
    @EspenS.: yes. And you should list all columns `SELECT P.Col1,C2.Col2,C2.Col3` instead of using `SELECT *`. – Tim Schmelter Oct 29 '15 at 13:54
  • Your solution solved my query problem! Your CTE example provided me with new useful insight. – Espen S. Oct 29 '15 at 14:12