4

I believe the Oracle function FIRST_VALUE is what I need to be using based on these two questions: SQL - How to select a row having a column with max value
Oracle: Taking the record with the max date

I have 3 tables that represent people associated with organizations. Each organization may have a parent org, where ORG.PARENT is a foreign key to ORG.ID (so the table refers to itself). A person may be associated with more than one group.

PERSON

ID    NAME
----------
1     Bob

ORG

ID    NAME        PARENT
------------------------
1     A           (null)
2     A-1              1
3     A-2              1
4     A-3              1
5     A-1-a            2
6     A-1-b            2
7     A-2-a            3
8     A-2-b            3

PERSON_TO_ORG

PERSON_ID  ORG_ID
-----------------
    1        1
    1        3

I want to list the groups a person is associated with so I used this query:

SELECT NAME, ID, sys_connect_by_path(NAME, '/') AS path
FROM org
START WITH ID IN
(SELECT org_id FROM person_to_org WHERE person_id=1)
connect by prior org.ID = org.parent;

...which gives me:

NAME    ID    PATH
------------------
A-2     3     /A-2
A-2-a   8     /A-2/A-2-a
A-2-b   9     /A-2/A-2-b
A       1     /A
A-1     2     /A/A-1
A-1-a   5     /A/A-1/A-1-a
A-1-b   6     /A/A-1/A-1-b
A-2     3     /A/A-2
A-2-a   8     /A/A-2/A-2-a
A-2-b   9     /A/A-2/A-2-b
A-3     4     /A/A-3

Notice how A-2 appears twice, as it should. I don't want a group to appear twice, however. I want a group to only appear at its lowest level in the tree, i.e. at its highest level value. Here is how I've tried using FIRST_VALUE with no luck - I still get A-2 (and others) appearing twice:

SELECT id, name, path, first_value(lev) OVER
(
PARTITION BY ID,NAME, path ORDER BY lev DESC
) AS max_lev FROM
(SELECT NAME, ID, sys_connect_by_path(NAME, '/') AS path, LEVEL as lev
FROM org START WITH ID IN
(SELECT org_id FROM person_to_org WHERE person_id=1)
connect by prior org.ID = org.parent);

This seems similar to the FIRST_VALUE example in Pro Oracle SQL but I can't seem to make it work no matter how I tweak the parameters.

How can I return only the rows where a given group has its highest level value (i.e. farthest down in the tree)?

Community
  • 1
  • 1
Paul
  • 19,704
  • 14
  • 78
  • 96

3 Answers3

3

As also said in one of the threads you refer to, analytics are not the most efficient way to go here: you need to aggregate to filter out the duplicates.

SQL> SELECT id
  2       , max(name) keep (dense_rank last order by lev) name
  3       , max(path) keep (dense_rank last order by lev) path
  4    FROM ( SELECT NAME
  5                , ID
  6                , sys_connect_by_path(NAME, '/') AS path
  7                , LEVEL as lev
  8             FROM org
  9            START WITH ID IN (SELECT org_id FROM person_to_org WHERE person_id=1)
 10          connect by prior org.ID = org.parent
 11         )
 12   group by id
 13  /

        ID NAME  PATH
---------- ----- --------------------
         1 A     /A
         2 A-1   /A/A-1
         3 A-2   /A/A-2
         4 A-3   /A/A-3
         5 A-1-a /A/A-1/A-1-a
         6 A-1-b /A/A-1/A-1-b
         7 A-2-a /A/A-2/A-2-a
         8 A-2-b /A/A-2/A-2-b

8 rows selected.

Regards,
Rob.

PS: Here is some more information about the LAST aggregate function: http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions071.htm#sthref1495

Rob van Wijk
  • 17,555
  • 5
  • 39
  • 55
  • What is the purpose of `max(name)` and `max(path)`? Are they just so `keep` can be used? – Paul Jan 28 '12 at 17:35
  • max (or min) in this construct are only relevant in case more than one record is ordered last. Here it's impossible to be in the same tree AND have the same level, so you could use min here instead. – Rob van Wijk Jan 28 '12 at 17:55
  • Great...thanks for the excellent solution. I learned a lot by figuring out how it worked, too. – Paul Jan 28 '12 at 17:56
2

What about this (untested)

SELECT 
    SELECT id, 
           name, 
           path
FROM (            
    SELECT id, 
           name, 
           path, 
           row_number() over (partition by id,name order by lev desc) as rn
    FROM (
       SELECT NAME, 
              ID, 
              sys_connect_by_path(NAME, '/') AS path, 
              LEVEL as lev
       FROM org 
       START WITH ID IN (SELECT org_id FROM person_to_org WHERE person_id=1)
       connect by prior org.ID = org.parent
    )
) 
where rn = 1
Paul
  • 19,704
  • 14
  • 78
  • 96
1

You should partition only OVER (PARTITION BY ID,NAME ORDER BY lev DESC) not ID,NAME, path

Edit: And probably you want first_value(path), not first_value(lev)

Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
  • That's closer - the `first_value(path)` column now contains the correct value for all rows for a given ID, but I still have duplicate IDs. – Paul Jan 28 '12 at 16:26
  • This is what you asked. Initialy, I posted a query that remove duplicates, but in one minute I removed it because I understood that you don't want that. The query was the same as a_horse_with_no_name. :) – Florin Ghita Jan 28 '12 at 19:19