1

I have managed to build a recursive query which returns rows for the selected Id and all its children. This works absolutely fine for the ultimate parent, but I need it also to work correctly when the passed Id is that of one of the children, showing just the child and its children if any. Currently it still returns other child rows of the ultimate parent plus the passed child row displays twice...

As with a previous issue I have to do this using the sub-query format, because other TSQL based database engines than SQL Server may be used that do not support CTE or the WITH clause.

Desired Outcome:

Using Id 2, the correct data is returned: 2, 3, 4, 6, 7. Using Id 6, it should return 6, 7 only. Currently the query returns 6,3,4, 6,7.

Data:

ProjectId   ProjectName                             ParentId
1           Test Project                            -1
2           Test Project 2                          0
3           Test Project 2 Sub Project 1            2
4           Test Project 2 Sub Project 2            2
5           Test Project 3                          -1
6           Test Project 2 Sub Sub Project 1        3
7           Test Project 2 Sub Sub Sub Project 1    6

Query:

DECLARE @PROJECTID BIGINT = 2;

SELECT *
FROM            
(
    SELECT *
    FROM ProjectCostingProjects pcp
    WHERE pcp.[ProjectId] = @PROJECTID 
    UNION ALL
    SELECT  pcp2.*
    FROM    ProjectCostingProjects pcp2
    JOIN    ProjectCostingProjects pcp
    ON     pcp2.ParentID = pcp.ProjectId
);

Any advice or suggestions gratefully received.

oldcoder
  • 342
  • 2
  • 12
  • 1
    Should not ID 2 return, 2, 3, 6, 7 and 4? – devzero Aug 29 '17 at 21:34
  • @devzero: Thanks for that, you are absolutely correct. I did not include all columns in the question to make it more digestible. In my test data Id 4 is marked as inactive, and my full query has additional code to exclude inactive records. I have edited the answer to reflect your correction. – oldcoder Aug 30 '17 at 09:44

2 Answers2

2

So... the recursive common table expression is not merely a function of using union all, it is using the self reference of the common table expression in the second part of the union all. One does not simply replicate this recursive operation on another RDBMS by trying to put it a subquery/derived table.

If you want a recursive hierarchy traversal in SQL Server, the best option is to use the recursive common table expression.

declare @projectid bigint = 6;

;with cte as (
  select *
  from ProjectCostingProjects pcp
  where pcp.[ProjectId] = @projectid 
  union all
  select  pcp2.*
  from    ProjectCostingProjects pcp2
    inner join cte 
      on pcp2.Parentid = cte.ProjectId
)
select *
from cte;

rextester demo: http://rextester.com/XON59636

returns:

+-----------+--------------------------------------+----------+
| ProjectId |             ProjectName              | ParentId |
+-----------+--------------------------------------+----------+
|         6 | Test Project 2 Sub Sub Project 1     |        3 |
|         7 | Test Project 2 Sub Sub Sub Project 1 |        6 |
+-----------+--------------------------------------+----------+

The query you have now will (after adding an alias) return whatever row @ProjectID is, and 3,4,6,7. Because what you've written will return whichever row equals @ProjectID, and all rows that have a parent (that isn't 0 or -1), which are rows with ProjectId 3,4,6,7.

rextester demo with @ProjectId = null : http://rextester.com/VQU71307

SqlZim
  • 37,248
  • 6
  • 41
  • 59
  • Thanks for the input and the effort you went to in order to provide it. Unfortunately in the question I very clearly pointed out that I am unable to use CTE or WITH because the query needs to be portable to other database engines that support only a subset of TSQL. The information even has its own paragraph! As a result your posting is not an answer to the question posed... – oldcoder Aug 30 '17 at 09:38
  • @oldcoder Please tag your question with the actual database you are using. If you are using SQL Server Compact Edition please tag your question with `sql-server-ce` instead of `sql-server`. – SqlZim Aug 30 '17 at 12:15
  • I AM using SQL Server. However, as the question clearly stated the same query may need to be used on other database engines that use a subset of TSQL including but not limited to VistaDB, SQLite, etc. Therefore the question is tagged absolutely correctly, and the requirements are clearly included in the question ready for any potential responder to read and understand. It really could not be any clearer, could it? It specifically states *NOT* CTE and *NOT* WITH in the requirements within the question... – oldcoder Aug 30 '17 at 13:38
  • @oldcoder SQLite does not use a subset of TSQL, it uses its own subset of ANSI SQL + proprietary features, and it does support [Recursive Common Table Expressions using **`with`**](https://sqlite.org/lang_with.html). VistaDB, however, does not support recursive queries, so you will need a different solution, such as what devzero has posted. – SqlZim Aug 30 '17 at 15:02
  • Thank you for the clarification - I was not aware that SQLite supports those language options and I am always pleased to learn something new. I was fully aware that VistaDB does not have that support, but my point was to give examples of other engines that could potentially be used and for which the final query must be suitable. VistaDB therefore seems to be a great example of such an engine but SQLite not so good. So VistaDB is clearly the ideal one to discuss. Again, I am grateful for your efforts, but they do not address the problem at hand. – oldcoder Aug 30 '17 at 15:15
0

Here is an answer that uses a while loop instead.

Solution 1, using while loop and a temp table

/* Populating the temp table with the data */
DECLARE @recurse TABLE
(projectId INT,parent int);

INSERT INTO @recurse (projectId,parent) VALUES (1,-1);
INSERT INTO @recurse (projectId,parent) VALUES (2,-0);
INSERT INTO @recurse (projectId,parent) VALUES (3,2);
INSERT INTO @recurse (projectId,parent) VALUES (4,2);
INSERT INTO @recurse (projectId,parent) VALUES (5,-1);
INSERT INTO @recurse (projectId,parent) VALUES (6,3);
INSERT INTO @recurse (projectId,parent) VALUES (7,6);

DECLARE @recurse2 TABLE
(projectId INT,parent INT);

--Start by inserting the root element
INSERT INTO @recurse2 ( projectId, parent)
SELECT * FROM @recurse WHERE projectId = 2

--insert elements until all children have all children
WHILE EXISTS (SELECT * FROM @recurse WHERE parent IN (SELECT projectId FROM @recurse2) AND projectId NOT IN (SELECT projectId FROM @recurse2) )
BEGIN
    INSERT INTO @recurse2
    (
        projectId,
        parent
    )
    SELECT * FROM @recurse WHERE parent IN (SELECT projectId FROM @recurse2) AND projectId NOT IN (SELECT projectId FROM @recurse2)
END 

SELECT * FROM @recurse2

Solution 2 For performance you can create a intermidate table with the result from the while loop. This intermidiate table can be updated on an intervall, or as part of the creating an element in the main table. This could be either a part of your business logic or as a DB trigger.

Here is the code I would write if you wanted this as a scheduled job:

-- Populating the temp table with the data 
DECLARE @recurse TABLE
(projectId INT,parent int);

INSERT INTO @recurse (projectId,parent) VALUES (1,-1);
INSERT INTO @recurse (projectId,parent) VALUES (2,-0);
INSERT INTO @recurse (projectId,parent) VALUES (3,2);
INSERT INTO @recurse (projectId,parent) VALUES (4,2);
INSERT INTO @recurse (projectId,parent) VALUES (5,-1);
INSERT INTO @recurse (projectId,parent) VALUES (6,3);
INSERT INTO @recurse (projectId,parent) VALUES (7,6);

DECLARE @recurse2 TABLE
(projectId INT,parent INT, lvl int);

--Start by inserting all elements root at lvl 0
INSERT INTO @recurse2 ( projectId, parent, lvl ) SELECT projectId, parent, 0 FROM @recurse 
SELECT * FROM @recurse2

--insert elements until we have all parents for all elements
WHILE EXISTS (SELECT * FROM @recurse2 a WHERE lvl IN (SELECT TOP 1 lvl FROM @recurse2 b WHERE a.projectId = b.projectId ORDER BY lvl DESC) AND a.parent > 0 )
BEGIN
    --Insert the next parent for all elements that does not have a their top level parent allready
    INSERT INTO @recurse2 ( projectId, parent , lvl )
    SELECT  projectId, 
            (SELECT b.parent FROM @recurse b WHERE b.projectId = a.parent), 
            lvl + 1 
    FROM @recurse2 a WHERE lvl IN (SELECT TOP 1 lvl FROM @recurse2 b WHERE a.projectId = b.projectId ORDER BY lvl DESC) AND a.parent > 0 
END 

--Find all children to an element
SELECT * FROM @recurse2 WHERE parent = 2

The big advantage of solution #2 is that the performance should be REALY good for reads, probably even better than CTE's. Also it works equally well for reading from bottom to top, as top to bottom.

devzero
  • 2,510
  • 4
  • 35
  • 55
  • This appears to be *extremely* non-performant. My original query with two other comparison queries embedded in the same script has a total run time of 0.47 milliseconds. I set yours off on its own (no extra queries) well over a minute ago and it has still not returned even though there are only 7 records! If it doesn't return soon I will have to kill the process to escape it... The only change I made was to the column types which are BIGINT in my test data. Any ideas why? – oldcoder Aug 30 '17 at 09:51
  • OK. Performant on SQL Server, non-performant on other database engines. I will pass the query on to the engine developers for investigation, which will hopefully result in an improvement in due course, but for the meantime I need an alternative solution... – oldcoder Aug 30 '17 at 09:56
  • This was a problem with at least one other database engine, and has been resolved by the developers. Therefore this is a great answer, and I have accepted it with thanks. – oldcoder Oct 01 '17 at 09:13