9

question:

I have the following (directed) graph: Graph

And this table:

CREATE TABLE [dbo].[T_Hops](
    [UID] [uniqueidentifier] NULL,
    [From] [nvarchar](1000) NULL,
    [To] [nvarchar](1000) NULL,
    [Distance] [decimal](18, 5) NULL
) ON [PRIMARY]

GO

And this content:

      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'A'              ,'E'              ,10.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'E'              ,'D'              ,20.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'A'              ,'B'              ,5.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'B'              ,'C'              ,10.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'C'              ,'D'              ,5.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'A'              ,'F'              ,2.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'F'              ,'G'              ,6.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'G'              ,'H'              ,3.00000              );   
      INSERT INTO [dbo].[T_Hops]             ([UID]             ,[From]             ,[To]             ,[Distance])       VALUES             (newid()              ,'H'              ,'D'              ,1.00000              );   

Now I can query the best connection from point x to point y like this:

WITH AllRoutes 
(
     [UID]
    ,[FROM]
    ,[To]
    ,[Distance]
    ,[Path]
    ,[Hops]
)
AS
(
    SELECT 
         [UID]
        ,[FROM]
        ,[To]
        ,[Distance]
        ,CAST(([dbo].[T_Hops].[FROM] + [dbo].[T_Hops].[To]) AS varchar(MAX)) AS [Path]
        ,1 AS [Hops]
      FROM [dbo].[T_Hops]
      WHERE [FROM] = 'A'

    UNION ALL


    SELECT 
         [dbo].[T_Hops].[UID]
        --,[dbo].[T_Hops].[FROM]
        ,Parent.[FROM]
        ,[dbo].[T_Hops].[To]
        ,CAST((Parent.[Distance] + [dbo].[T_Hops].[Distance]) AS [decimal](18, 5)) AS distance
        ,CAST((Parent.[Path] + '/' + [dbo].[T_Hops].[FROM] + [dbo].[T_Hops].[To]) AS varchar(MAX)) AS [Path]
        ,(Parent.[Hops] + 1) AS [Hops]
     FROM [dbo].[T_Hops]

    INNER JOIN AllRoutes AS Parent 
            ON Parent.[To] = [dbo].[T_Hops].[FROM] 

)

SELECT TOP 100 PERCENT * FROM AllRoutes


/*
WHERE [FROM] = 'A' 
AND [To] = 'D'
AND CHARINDEX('F', [Path]) != 0 -- via F
ORDER BY Hops, Distance ASC
*/

GO

Now I want to create an undirected graph, for that I can, for example, also get the path from D to A

I start with a most simple change, and just ad the reverse direction for HD.

INSERT INTO [dbo].[T_Hops]
           ([UID]
           ,[From]
           ,[To]
           ,[Distance])
     VALUES
           (newid() --<UID, uniqueidentifier,>
           ,'D' --<From, nvarchar(1000),>
           ,'H' --<To, nvarchar(1000),>
           ,1 --<Distance, decimal(18,5),>
           )
GO

Now, as expected, my query throws an exception:

Infinite recursion / max recursion level (100) exceeded

Because the number of possible connections is now infinite.

Now in Oracle you do the same thing with "connect by prior" instead of a tree. And if a cycle problem (infinite recursion) is possible, you just add NOCYCLE to CONNECT BY PRIOR, making it "CONNECT BY NOCYCLE PRIOR"

Now in MS-SQL, I fixed that behaviour by adding:

AND Parent.[Path] NOT LIKE '%' + [dbo].[T_Hops].[FROM] + '/%'

to the inner join clause, essentially emulating NOCYCLE.

However, as LIKE is basically strstr (or worse strcasestr), and thus maximally slower than checking an array of parent elements, I am extremely worried about performance.

After all, this is just an example, and I intend to basically add data for an entire country. So the end result would potentially be extremely slow.

Anybody else has a better (=faster) method of how to replace NOCYCLE in MS SQL ?

Or is this the point where I simply have no other option but switching to Oracle (for doing this in an acceptable speed) ?

Note: Any temp tables (large amount of data) solution will be slower, because the temp tables will be swapped to the HardDisk when there is not enough RAM (absolute certainty).

Same goes for any solution using functions and table-valued functions.

Jerry Nixon
  • 31,313
  • 14
  • 117
  • 233
Stefan Steiger
  • 78,642
  • 66
  • 377
  • 442
  • Note to self: also asked here: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/32069da7-4820-490a-a8b7-09900ea1de69 – Stefan Steiger Aug 19 '11 at 15:11
  • Note to self: Solution for PostGre: http://stackoverflow.com/questions/25058906/nocycle-in-postgres – Stefan Steiger Feb 13 '15 at 09:57
  • You seem skilled at the language. Why not use a procedure that implements the SQL equivelant of one of many well-documented all-pair shortest paths algorithms, instead of using kinda gimmicky recursion? My guess is that this would improve performance much better than trying to optimize a query of a less performant algorithm. – George Menoutis May 31 '19 at 10:11

1 Answers1

1

To improve select performance store possible paths between nodes in a permanent table

TABLE T_Hops_Path
(
  FromNode,
  ToNode,
  HopCount,
  TotalDistance
)

If your tree structure does not change often you can write a stored procedure that generates this table every N hours.

alexm
  • 6,854
  • 20
  • 24
  • And create a stored procedure that creates that table with a datetime postfix, writing the latest n postfix into a table which your mentioned stored procedure can use to create a dynamic SQL string. Hmm, big bricolage, but it could work. – Stefan Steiger Sep 13 '11 at 08:27