2

Ok, so I have an intermediate table for mapping suppliers from one system to another. There are a bunch of instances where they want to map multiple suppliers from the old system to the new system. So my map table looks like

legacy      new
12345       1
12346       1
54321       2
9876        3
9875        3

When I join the product table to the supplier table for the old system it's pulling multiple suppliers (12345, 12346). So when I join to the map table to get the new id, I'm getting a duplicate row since it's joining both legacy suppliers to the same new one.

There are cases where a product has multiple suppliers (12346, 54321) that map to diff new ids, which is good/valid.

Assume the product IDs are the same between systems, code looks something like this.

SELECT *
  FROM products p
    INNER JOIN suppliers s -- Find the supplier(s) for the product
      ON s.pid = p.pid
    LEFT JOIN supplier_map sm -- Get the new supplier IDs
      ON sm.legacy_id = s.sid
    LEFT JOIN NEWSYS.dbo.product_supplier ps  -- Checks to see if item/supplier combo already exists in new system
      ON ps.item_id = p.pid
      AND ps.supplier_id = sm.new_id

  WHERE ps.item_id IS NULL -- We only want new item/supplier combos

So how can I construct the join so that I only get one result if the new_id is the same for the same product (ie how do I eliminate the dupe?)

Running on MS SQL Server 13.0.4001.0

Dizzy49
  • 1,360
  • 24
  • 35
  • 3
    I think you just want `select distinct`, but you should be careful about the columns you choose. If that doesn't work, you should provide sample data and desired results. – Gordon Linoff Aug 22 '19 at 16:59
  • 1
    If the rows are complete duplicates then I agree you want select distinct. If they are close to duplicates but have different values in that legacy column, distinct won't help. You could use a group by and MAX(legacy) for a really dirty solution. Can you maybe modify the table so that one row is considered 'primary' for that one to many relationship? Then you could add `AND Primary = 1' to whatever join needs to only select a single legacy ID and get a lot more control over the results. You can even add a table constraint so that each [new] value only has one primary flag. – Cody McPheron Aug 22 '19 at 17:27
  • 1
    Google "join to top 1 row" while I look for the appropriate dupe to flag. – Tab Alleman Aug 22 '19 at 17:44
  • 1
    @TabAlleman Nope, not a dupe. If it was joining to top 1, that would be easy. The issue is that it's joining two unique ids on the left to a single id on the right (so one for each). If it was the other way around I could join to top. – Dizzy49 Aug 22 '19 at 19:20
  • 1
    Well...you join to the top 1, just not top 1 of the join-key. You have two rows with `new_id=1` and you only want one of them, so you JOIN to a derived table that only contains the TOP 1 for each `new_id`, and join to the `legacy_id` associated with each of those `new_id`. Same idea, nothing new. – Tab Alleman Aug 22 '19 at 19:57

0 Answers0