3

We have a nested structure of tasks in which every task can contain other tasks. Order of tasks in a task is important and is defined by the Sequence field starting at zero. Here is my table structure:

USE [MyDB]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Relation](
    [PK_ID] [int] IDENTITY(1,1) NOT NULL,
    [SourceEntityId] [uniqueidentifier] NOT NULL,
    [TargetEntityId] [uniqueidentifier] NOT NULL,
 CONSTRAINT [PK_Relation] PRIMARY KEY CLUSTERED 
(
    [PK_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TaskTable1](
    [Id] [uniqueidentifier] NOT NULL,
    [Title] [nvarchar](max) NULL,
    [SequenceId] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TaskTable2](
    [Id] [uniqueidentifier] NOT NULL,
    [Title] [nvarchar](max) NULL,
    [SequenceId] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO
SET IDENTITY_INSERT [dbo].[Relation] ON 

GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (1, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'5b266fd1-cbc8-c16a-91c4-5675a35c9ecf')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (2, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'e499ca68-8103-b8ec-06ba-110fa3f6eb5b')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (4, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'645ad2eb-df10-0d5b-0526-408aad45a145')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (5, N'785227d1-393c-ae18-02e5-03ab08d577af', N'5655aeb7-b8b5-dca9-38af-37687c668c14')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (6, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'030cdefc-0e45-01e6-e2a5-a69e303bda4b')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (7, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'0375c7a1-8cc5-a4c8-151c-966e4af83f73')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (8, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'785227d1-393c-ae18-02e5-03ab08d577af')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (9, N'030cdefc-0e45-01e6-e2a5-a69e303bda4b', N'8324bba9-252f-bef8-c018-8b86491e2361')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (10, N'030cdefc-0e45-01e6-e2a5-a69e303bda4b', N'f1cbe8a3-3285-4cf0-096d-aad0327bdb0b')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (11, N'dab00c89-961c-84dd-bb43-cffd18e63594', N'0189f0af-5045-a498-2d70-99187bf3f0ae')
GO
INSERT [dbo].[Relation] ([PK_ID], [SourceEntityId], [TargetEntityId]) VALUES (12, N'785227d1-393c-ae18-02e5-03ab08d577af', N'ffecd091-c17b-ee5f-a64d-54ea9ff65aa9')
GO

SET IDENTITY_INSERT [dbo].[Relation] OFF
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'5b266fd1-cbc8-c16a-91c4-5675a35c9ecf', N'First', 0)
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'e499ca68-8103-b8ec-06ba-110fa3f6eb5b', N'Second', 1)
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'0189f0af-5045-a498-2d70-99187bf3f0ae', N'Fourth', 3)
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'0375c7a1-8cc5-a4c8-151c-966e4af83f73', N'Sixth', 5)
GO
INSERT [dbo].[TaskTable2] ([Id], [Title], [SequenceId]) VALUES (N'030cdefc-0e45-01e6-e2a5-a69e303bda4b', N'Fifth', 4)
GO
INSERT [dbo].[TaskTable2] ([Id], [Title], [SequenceId]) VALUES (N'785227d1-393c-ae18-02e5-03ab08d577af', N'Seventh', 6)
GO
INSERT [dbo].[TaskTable2] ([Id], [Title], [SequenceId]) VALUES (N'645ad2eb-df10-0d5b-0526-408aad45a145', N'Third', 2)
GO
INSERT [dbo].[TaskTable2] ([Id], [Title], [SequenceId]) VALUES (N'8324bba9-252f-bef8-c018-8b86491e2361', N'sub1', 0)
GO
INSERT [dbo].[TaskTable2] ([Id], [Title], [SequenceId]) VALUES (N'f1cbe8a3-3285-4cf0-096d-aad0327bdb0b', N'sub2', 1)
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'ffecd091-c17b-ee5f-a64d-54ea9ff65aa9', N'sub 1', 0)
GO
INSERT [dbo].[TaskTable1] ([Id], [Title], [SequenceId]) VALUES (N'5655aeb7-b8b5-dca9-38af-37687c668c14', N'sub 2', 1)
GO

To get the tasks in order with their child tasks right beneath their parents, I tried the following query to no avail:

;With TaskCTE 
AS
(
    select R.SourceEntityId AS ParentTask_Id, R.TargetEntityId AS Task_Id , cast(null as uniqueidentifier) AS ParentTask, 0 AS Level
     , ROW_NUMBER() OVER (ORDER BY (SELECT 100)) / power(10.0,0) as x
     from Relation R
        where (R.SourceEntityId = 'DAB00C89-961C-84DD-BB43-CFFD18E63594')
    UNION ALL
    select R1.SourceEntityId , R1.TargetEntityId, TaskCTE.Task_Id  , Level + 1 
    , x + ROW_NUMBER() OVER (ORDER BY (SELECT 100)) / power(10.0,level+1)
    from Relation R1
        INNER JOIN TaskCTE
            ON R1.SourceEntityId = TaskCTE.Task_Id  
)

select ParentTask_Id, Task_Id, ParentTask, Level 
, COALESCE(TT1.Title, TT2.Title) AS Title
, COALESCE(TT1.SequenceId, TT2.SequenceId) AS SequenceId
, x
from TaskCTE
LEFT OUTER JOIN TaskTable1 TT1 
ON TaskCTE.Task_Id = TT1.Id
LEFT OUTER JOIN TaskTable2 TT2
ON TaskCTE.Task_Id = TT2.Id

order by level , SequenceId

If you follow the structure of required output (shown in below image), the sequence ** column along with the **Level column must determine the sort order.

Thanks in advance

required output

Edit: My query output which is wrong: enter image description here

Yasser Sobhdel
  • 611
  • 8
  • 26
  • 1
    Good of you to include an [mcve]. However, when I execute your code, the order is as expected (not like in your picture). (On sql server 2014, but I doubt the version will matter) – HoneyBadger Feb 07 '18 at 10:52
  • My code is giving wrong output, the correct output must have a proper ordering of Level and sequenceId, in case the record has child tasks. My calculated 'X' column fails to honor the rule. – Yasser Sobhdel Feb 07 '18 at 10:55
  • I can only say my results are ordered perfectly fine, I don't really understand what you mean about column `x`, but `ROW_NUMBER() OVER (ORDER BY (SELECT 100))` is underterministic: you are ordering by a contant. – HoneyBadger Feb 07 '18 at 11:02
  • You must be very careful when using `row_number` in recursive part. It may not work as you expect. In any case it's better if you provide a desired output – uzi Feb 07 '18 at 11:02
  • @HoneyBadger: can you please post the results? is it completely like the picture I have uploaded, since in my lab, I can see the record with sequenceId 3 is not placed correctly. – Yasser Sobhdel Feb 07 '18 at 11:09
  • I can explain why you are getting a semi-sorted results, for brevity I have reproduced the tables and I have inserted rows in order, that is why you are getting such a result. please change the order of rows inserted while inserting records in the table. thanks – Yasser Sobhdel Feb 07 '18 at 11:11
  • Tables have no inherit order, and the order in which rows are inserted is not preserved in any way. That cannot be the issue. – HoneyBadger Feb 07 '18 at 11:14
  • Alright, see my results in the **Edit** section – Yasser Sobhdel Feb 07 '18 at 11:17
  • sequenceId 3 Seems correctly placed in the order in your results. It's Level 1/SequenceId 0 which looks misplaced. – HoneyBadger Feb 07 '18 at 11:17
  • Yep, you are right, please look at the edit section. thanks – Yasser Sobhdel Feb 07 '18 at 11:20
  • Those are the same results I get, what's wrong with it? The ordering is exactly as expected isn't it? – HoneyBadger Feb 07 '18 at 11:52
  • No, it isn't expected result. the expected result is the first image. As you mentioned earlier the last 4 rows are misplaced. – Yasser Sobhdel Feb 07 '18 at 12:35
  • do you mean that output is correct but ordering is not correct.can you please paste the correct output ?Thanks.hurry up. – KumarHarsh Feb 08 '18 at 03:58

3 Answers3

1

If your problem is that sequence field in other table rather than relation table, then why do not you join them before running recursion? But it likely will be slower than your initial query. Here's a sample

with cte as (
    select 
        r.SourceEntityId, r.TargetEntityId, t.SequenceId, 0 k
    from 
        Relation r
        join (
            select * from TaskTable1
            union all
            select * from TaskTable2
        ) t on r.TargetEntityId = t.id


    ---------------------------------------        
    union all select * from cte where k = 1
    ---------------------------------------
)
, rcte as (
    select
        SourceEntityId, TargetEntityId, ParentTask = cast(null as uniqueidentifier)
        , SequenceId, rn = cast(row_number() over (order by SequenceId) as varchar(8000)), 1 step
    from
        cte
    where
        SourceEntityId = 'DAB00C89-961C-84DD-BB43-CFFD18E63594'
    union all
    select
        a.TargetEntityId, b.TargetEntityId, a.SourceEntityId, b.SequenceId
        , cast(concat(a.rn, '.', row_number() over (partition by b.SourceEntityId order by b.SequenceId)) as varchar(8000))
        , step + 1
    from
        rcte a
        join cte b on a.TargetEntityId = b.SourceEntityId
)
select
    *
from
    rcte
order by rn

I have not included your X column, I can not get what are trying to calculate. Also, in your expected output values of ParentTask and ParentTask_Id are same. Should be so?

uzi
  • 4,118
  • 1
  • 15
  • 22
  • Thanks for the answer. This produces correct result. I will wait for more effective query and then mark as answer. – Yasser Sobhdel Feb 08 '18 at 16:43
  • Sure. No problem. Up to you. How much does it perform worse? There is one trick that can speed up recursion. But that is not guaranteed. Can you post your final query? – uzi Feb 08 '18 at 17:36
  • I am afraid that I don't have access to main data and I have omitted many details for brevity. Let me run it on main DB and I will let you know. Thanks any way and I am all ears for every kind of performance-boosting trick :) – Yasser Sobhdel Feb 08 '18 at 17:49
  • You have not posted your query. So I made changes to my query with a trick I am talking about. Look at the end of CTE named cte – uzi Feb 09 '18 at 04:01
1

I am using same query as @Uzi with minor correction.I am having same doubts as him.@Yasser should clearly show what output is desire in proper output and remove unnecessary columns.

if row_number only purpose is to order record then why convert it to varchar(8000).Also you can avoid expensive Row_number all together.

Take advantage of PK_ID instead of expensive row_number,even if PK_ID is not in sequence in this case.

If performance is big issue then user should mention number of rows in 3 TABLE AND WHAT OTHER FILTER be applied IN WHERE CONDITION ?

Why data type is uniqueidentifier ?Will it solve the purpose if it is INT? Read this

Check this query,

     WITH cte
    AS (
        SELECT r.PK_ID
            ,r.SourceEntityId
            ,r.TargetEntityId
            ,t.SequenceId,0 k
        FROM #Relation r
        JOIN (
            SELECT id
                ,SequenceId
            FROM #TaskTable1

            UNION ALL

            SELECT id
                ,SequenceId
            FROM #TaskTable2
            ) t ON r.TargetEntityId = t.id

             ---------------------------------------        
        --union all select * from cte where k = 1
        ---------------------------------------
        )
        ,rcte
    AS (
        SELECT SourceEntityId
            ,TargetEntityId
            ,ParentTask = cast(NULL AS UNIQUEIDENTIFIER)
            ,SequenceId
            , rn = cast(row_number() over (order by SequenceId) as decimal(3,1))
            --, rn = cast( SequenceId+1 as decimal(3,1))--**
            ,1 step
        FROM cte
        WHERE SourceEntityId = 'DAB00C89-961C-84DD-BB43-CFFD18E63594'

        UNION ALL

        SELECT a.TargetEntityId
            ,b.TargetEntityId
            ,a.SourceEntityId
            ,b.SequenceId

                ,cast((a.rn+(b.SequenceId/10.0)) as  decimal(3,1))
            ,step + 1
        FROM rcte a
        JOIN cte b ON a.TargetEntityId = b.SourceEntityId

        )

    SELECT *

    FROM rcte
    ORDER BY rn
--**
--SELECT *

--FROM rcte
--ORDER BY rn,st

-- 2nd Edit,

I understand that there is no way of changing database. In that case it is very logical to create index view where task table id is Clustered index.

  select id, SequenceId from #TaskTable1
  union all
  select id, SequenceId from #TaskTable2



     Create nonclustered index NCI_Relation_SourceID on Relation([SourceEntityId])
 Create nonclustered index NCI_Relation_TargetEntityId on Relation([TargetEntityId])

you can once try this combination,

Remove PK_ID as clustered index and make TargetEntityId as clustered index.

you can once try creating view on this query,

SELECT r.PK_ID
        ,r.SourceEntityId
        ,r.TargetEntityId
        ,t.SequenceId
    FROM #Relation r
    JOIN (
        SELECT id
            ,SequenceId
        FROM #TaskTable1

        UNION ALL

        SELECT id
            ,SequenceId
        FROM #TaskTable2
        ) t ON r.TargetEntityId = t.id
KumarHarsh
  • 5,046
  • 1
  • 18
  • 22
  • Thanks for the answer, I may answer your questions as well: row_number or the "x" column is merely used for sorting. I have simplified the situation yet the guid primary key is what I am facing in the main database. I am trying to use this query as a stored procedure where the hard-coded guid is the sole parameter. Size of task tables are somewhat 300000 rows per table and increasing with a rate of 20000 per month. – Yasser Sobhdel Feb 09 '18 at 05:12
  • Thanks for edit but your query does not provide the required result. – Yasser Sobhdel Feb 10 '18 at 12:29
  • you can comment rn and uncomment "--**" part. – KumarHarsh Feb 10 '18 at 18:11
0

by adding a new column named Hierarchy in CTE expression and sorting outcome according to this value could solve your requirement

Here is the modified CTE query

;With TaskCTE AS
(
    select 
    R.SourceEntityId AS ParentTask_Id, 
    R.TargetEntityId AS Task_Id , cast(null as uniqueidentifier) AS ParentTask, 0 AS Level
     , ROW_NUMBER() OVER (ORDER BY (SELECT 100)) / power(10.0,0) as x
     ,CAST( ROW_NUMBER() OVER (ORDER BY R.SourceEntityId) as varchar(max)) Hierarchy
     from Relation R
        where (R.SourceEntityId = 'DAB00C89-961C-84DD-BB43-CFFD18E63594')


    UNION ALL

    select R1.SourceEntityId , R1.TargetEntityId, TaskCTE.Task_Id  , Level + 1 
    , x + ROW_NUMBER() OVER (ORDER BY (SELECT 100)) / power(10.0,level+1)
    ,CAST(Hierarchy + ':' + CAST(ROW_NUMBER() OVER (ORDER BY R1.SourceEntityId) as varchar(max)) as varchar(max)) as Hierarchy
    from Relation R1
        INNER JOIN TaskCTE
            ON R1.SourceEntityId = TaskCTE.Task_Id  
)

select ParentTask_Id, Task_Id, ParentTask, Level 
, COALESCE(TT1.Title, TT2.Title) AS Title
, COALESCE(TT1.SequenceId, TT2.SequenceId) AS SequenceId
, x
,Hierarchy
from TaskCTE
LEFT OUTER JOIN TaskTable1 TT1 
ON TaskCTE.Task_Id = TT1.Id
LEFT OUTER JOIN TaskTable2 TT2
ON TaskCTE.Task_Id = TT2.Id

order by Hierarchy 

Please note that I have added Hierarchy column and its value is calculated using a ROW_NUMBER() function which creates a unique integer value for each task

You can find implemantation of this hierarchy query with SQL CTE at refereced tutorial I hope it helps

I am also adding the output as screenshot here to show how data is sorted according to Hierarchy

Although childs are listed after parents, it does not one-to-one match with your desired outcome as I could see

enter image description here

Eralper
  • 6,461
  • 2
  • 21
  • 27
  • I am afraid that I can't see any difference between your output and my code when ordering on X column. I have run your code and row with SequenceId 3 is not correctly placed. – Yasser Sobhdel Feb 07 '18 at 12:31
  • Did you check the referenced tutorial, because there is an implementation for similar sorting there. I might haved used a wrong column for Hierarchy field.. – Eralper Feb 07 '18 at 13:50
  • Thank you for pointing out the tutorial. I saw it, here is my problem: every tutorial assumes you have an "Ordinal" or "Sequence" field along with hierarchical data, yet my case is different. I have relations in a different table and "Sequence" field in others, and when I use the CTE with recursion, I reach an structure which needs another recursion to complete. Since nested CTE is not supported, I have asked for help to find if there is any kind of query to solve this issue? – Yasser Sobhdel Feb 07 '18 at 14:01