2

I'm experimenting with SQL Common Table Expressions using a sample hierarchy of cities, countries and continents and which have been visited and which haven't.

The table t_hierarchy looks like this:

The hierarchy table

(NOTE: The visited column is deliberately NULL for non-cities because I want this to be a dynamically calculated percentage.)

I have then used the following SQL to create a recursive result set based on the data in t_hierarchy:

WITH myCTE (ID, name, type, parentID, visited, Depth)
 AS
 (
    Select ID, name, type, parentID, visited, 0 as Depth From t_hierarchy where parentID IS NULL
    UNION ALL
    Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID, t_hierarchy.visited, Depth + 1 
    From t_hierarchy 
    inner join myCte on t_hierarchy.parentID = myCte.ID
 )

Select ID, name, type, parentID, Depth, cnt.numDirectChildren, visited
FROM myCTE
LEFT JOIN (
          SELECT  theID = parentID, numDirectChildren = COUNT(*)
          FROM    myCTE
          GROUP BY parentID
        ) cnt ON cnt.theID = myCTE.ID

 order by ID

The result looks like this:

cte result

What I would like to do now, which I am struggling with, is to create a column, e.g. visitedPercentage to show the percentage of cities visited for each 'level' of the hierarchy (treating cities differently to countries and continents). To explain, working our way up the 'tree':

  • Madrid would be 100% because it has been visited (visited = 1)
  • Barcelona would be 0% because it has not been visited (visited = 0)
  • Spain would therefore be 50% because it has 2 direct children and one is 100% and the other is 0%
  • Europe would therefore be 50% because Spain is 50%, France is 100% (Paris has been visited), and Germany is 0% (Berlin has not been visited)

I hope this makes sense. I kind of want to say "if it's not a city, work out the visitedPercentage of THIS level based on the visitedPercentage of all direct children, otherwise just show 100% or 0%. Any guidance is much appreciated.


UPDATE: I've managed to progress it a bit further using Daniel Gimenez's suggestion to the point where I've got France 100, Spain 50 etc. But the top level items (e.g. Europe) are still 0, like this:

enter image description here

I think this is because the calculation is being done after the recursive part of the query, rather than within it. I.e. this line:

SELECT... , visitPercent = SUM(CAST visited AS int) / COUNT(*) FROM myCTE GROUP BY parentID

is saying "look at the visited column for child objects, calculate the SUM of the values, and show the result as visitPercent", when it should be saying "look at the existing visitPercent value from the previous calculation", if that makes sense. I've no idea where to go from here! :)

Community
  • 1
  • 1
valoukh
  • 541
  • 1
  • 8
  • 19
  • Hey @valoulkh I don't have time to work on this, but you'll need a second recurisive CTE that uses myCTE to come up with all the related cities in each level of the hierachy. `SELECT ID, parentID FROM... UNION ALL SELECT ID, [parent's parentID] FROM newCTE INNER JOIN myCTE` – Daniel Gimenez Dec 10 '15 at 15:31
  • Not a problem, thanks for your help so far. I'll try and follow your logic :) – valoukh Dec 10 '15 at 15:35

1 Answers1

2

I think I've done it, using 2 CTE's. In the end it was easier to get the total number of descendents for each level (children, grandchildren etc) and use that to calculate the overall percentage.

That was painful. At one point typing 'CATS' instead of 'CAST' had me puzzled for about 10 minutes.

with cte1 (ID,parentID,type,name,visited,Lvl) as (
    select t.ID, t.parentID, t.type, t.name, t.visited, 0 as [Lvl]
    from t_hierarchy t
    where t.parentID is not null
    union all
    select c.ID, t.parentID, c.type, c.name, c.visited, c.Lvl + 1
    from t_hierarchy t
        inner join cte1 c on c.parentID = t.ID
    where t.parentID is not null
),
cte2 (ID,name,type,parentID,parentName_for_reference,visited,Lvl) as (
    Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID, p.name as parentName_for_reference, t_hierarchy.visited, 0 as Lvl
        From t_hierarchy
        left join t_hierarchy p ON p.ID = t_hierarchy.parentID
        where t_hierarchy.parentID IS NULL
    UNION ALL
    Select t_hierarchy.ID, t_hierarchy.name, t_hierarchy.type, t_hierarchy.parentID,p.name as parentName_for_reference, t_hierarchy.visited, Lvl + 1 
    From t_hierarchy
    inner join cte2 on t_hierarchy.parentID = cte2.ID
    inner join t_hierarchy p ON p.ID = t_hierarchy.parentID
)

select cte2.ID,cte2.name,cte2.type,cte2.parentID,cte2.parentName_for_reference,cte2.visited,cte2.Lvl
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST(cnt.totalDescendents as varchar) END AS totalDescendents
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST(COALESCE(cnt2.totalDescendentsVisited,0) as varchar) END AS totalDescendentsVisited
,CASE WHEN type = 'city' THEN 'N/A' ELSE CAST((CAST(ROUND(CAST(COALESCE(cnt2.totalDescendentsVisited,0) as float)/CAST(cnt.totalDescendents as float),2) AS numeric(36,2))*100) as varchar) END as asPercentage
from cte2
left JOIN (
     SELECT  theID = parentID, COUNT(*) as totalDescendents
     FROM cte1
     WHERE type = 'city'
     GROUP BY parentID
  ) cnt ON cnt.theID = cte2.ID
 left JOIN (
     SELECT  theID = parentID, COUNT(*) as totalDescendentsVisited
     FROM cte1
     WHERE type = 'city' AND visited = 1
     GROUP BY parentID
  ) cnt2 ON cnt2.theID = cte2.ID
ORDER BY ID

enter image description here

These posts were helpful:

Community
  • 1
  • 1
valoukh
  • 541
  • 1
  • 8
  • 19