1

I need a query that will return all related alias id's from either column. Shown here are some alias customer ids, among thousands of other rows. If the input parameter to a query is id=7, I need a query that would return 5 rows (1,5,7,10,22). That is because they are all aliases of one-another. For example, 22 and 10 are indirect aliases of 7.

CustomerAlias
--------------------------
AliasCuID   AliasCuID2 
--------------------------
1           5      
1           7      
5           7      
10          5      
22          1      

Here is an excerpt from the customer table.

Customer
----------------------------------
CuID    CuFirstName    CuLastName
----------------------------------
 1      Mike           Jones
 2      Fred           Smith
 3      Jack           Jackson
 4      Emily          Simpson
 5      Mike           Jones
 6      Beth           Smith
 7      Mike           jones
 8      Jason          Robard
 9      Emilie         Jiklonmie
 10     Michael        jones
 11     Mark           Lansby
 12     Scotty         Slash
 13     Emilie         Jiklonmy
 22     mike           jones

I've been able to come close, but I cannot seem to select the indirectly related aliases correctly. Given this query:

SELECT DISTINCT Customer.CuID, Customer.CuFirstName, Customer.CuLastName
FROM  Customer  WHERE  
 (Customer.CuID = 7) OR (Customer.CuID IN
  (SELECT AliasCuID2
    FROM CustomerAlias AS CustomerAlias_2
    WHERE (AliasCuID = 7))) OR (Customer.CuID IN
  (SELECT AliasCuID
  FROM  CustomerAlias AS CustomerAlias_1
  WHERE (AliasCuID2 = 7)))

Returns 3 out of 5 of the desired ids of course. This lacks the indirectly related aliased id of 10 and 22 in the result rows.

1   Mike    Jones
5   Mike    Jones
7   Mike    jones

* Based on suggestions below, I am trying a CTE hierarchical query.

I have this now after following some suggestions. It works for some, as long as the records in the table reference enough immediate ids. But, if the query uses id=10, then it still comes up short, just by the nature of the data.

DECLARE @id INT
SET @id = 10;

 DECLARE @tmp TABLE ( a1 INT, a2 INT, Lev INT );

WITH Results (AliasCuID, AliasCuID2, [Level]) AS (
   SELECT AliasCuID,
          AliasCuID2,
          0 as [Level]
     FROM CustomerAlias
    WHERE AliasCuID = @id OR AliasCuID2 = @id
   UNION ALL
   -- Recursive step
   SELECT a.AliasCuID,
          a.AliasCuID2,
          r.[Level] + 1 AS [Level]
     FROM CustomerAlias a
     INNER JOIN Results r ON a.AliasCuID = r.AliasCuID2 )

    INSERT INTO @tmp
        SELECT * FROM Results;


WITH Results3 (AliasCuID, AliasCuID2, [Level]) AS (
   SELECT AliasCuID,
          AliasCuID2,
          0 as [Level]
     FROM CustomerAlias
    WHERE AliasCuID = @id OR AliasCuID2 = @id
   UNION ALL
   -- Recursive step
   SELECT a.AliasCuID,
          a.AliasCuID2,
          r.[Level] + 1 AS [Level]
     FROM CustomerAlias a
     INNER JOIN Results3 r ON a.AliasCuID2 = r.AliasCuID )

    INSERT INTO @tmp
        SELECT * FROM Results3;


  SELECT DISTINCT a1 AS id FROM @tmp
  UNION ALL
  SELECT DISTINCT a2 AS id FROM @tmp
  ORDER BY id

Note that this is a simplified the query to just give a list of related ids.

---
id
---
5
5
7
10

But, it is still unable to pull in ids 1 and 22.

Mike Lemke
  • 23
  • 4

2 Answers2

0

This is not an easy problem to solve unless you have some idea of the depth of your search (https://stackoverflow.com/a/7569520/1803682) - which it looks like you do not - and take a brute force approach to it.

Assuming you do not know the depth you will need to write a stored proc. I followed this approach for a nearly identical problem: https://dba.stackexchange.com/questions/7147/find-highest-level-of-a-hierarchical-field-with-vs-without-ctes/7161#7161

UPDATE If you don't care about the chain of how the alias's were created - I would run a script recursively to make them all refer to a single (master?) record. Then you can easily do the search and it will be quick - not a solution if you care about how the alias's get traversed though.

Community
  • 1
  • 1
Matthew
  • 9,851
  • 4
  • 46
  • 77
0

I created a SQL Fiddle for SQL Server 2012. Please let me know if you can or cannot access it.

My thought here was that you'd want to just keep checking the left and right branches recursively, separately. This logic probably falls apart if the relationships bounce between left and right. You could set up a third CTE to reference the first two, but joining on left to right and right to left, but ain't nobody got time for that.

The code is below as well.

CREATE TABLE CustomerAlias
(
  AliasCuID INT,
  AliasCuID2 INT
)
GO

INSERT INTO CustomerAlias
SELECT 1,5
UNION SELECT 1, 7
UNION SELECT 5, 7
UNION SELECT 10, 5
UNION SELECT 22, 1
GO

DECLARE @Value INT
SET @Value = 7


; WITH LeftAlias AS
(
  SELECT AliasCuID
    , AliasCuID2
  FROM CustomerAlias
  WHERE AliasCuID2 = @Value

    UNION ALL

  SELECT a.AliasCuID
    , a.AliasCuID2
  FROM CustomerAlias a
  JOIN LeftAlias b
    ON a.AliasCuID = b.AliasCuID2
)
, RightAlias AS
(
  SELECT AliasCuID
    , AliasCuID2
  FROM CustomerAlias
  WHERE AliasCuID = @Value

    UNION ALL

  SELECT a.AliasCuID
    , a.AliasCuID2
  FROM CustomerAlias a
  JOIN LeftAlias b
    ON a.AliasCuID2 = b.AliasCuID
)
SELECT DISTINCT A
FROM
(
  SELECT A = AliasCuID
  FROM LeftAlias

  UNION ALL

  SELECT A = AliasCuID2
  FROM LeftAlias

  UNION ALL

  SELECT A = AliasCuID
  FROM RightAlias

  UNION ALL

  SELECT A = AliasCuID2
  FROM RightAlias
) s
ORDER BY A
Nick Vaccaro
  • 5,428
  • 6
  • 38
  • 60