2

I have following data

Table1

id   col1    col2       col3
----------------------------------
1    abc   01/01/2012    -
1    abc   01/01/2012    A
2    abc   01/01/2012    -
2    abc   01/02/2012    -
3    abc   01/02/2012    -
3    xyz   01/01/2012    -
4    abc   01/02/2012    -
4    xyz   01/01/2012    -
4    xyz   01/02/2012    -

following is order to evaluate -

if(col1 is false) then evaluate col2 if(col2 is false) then col3:

Col1 - xyz has first preference from all values in this column
col2 - min date
col3 - not '-' or min(col3)

I want to return only one row for each id, if col1 fails go to col2, if this fails then go to col3 condition. From above table result should be

 id   col1    col2       col3
----------------------------------
 1    abc     01/01/2012  A
 2    abc     01/01/2012  -
 3    xyz     01/01/2012  -
 4    xyz     01/01/2012  -

I tried using dense rank but it didn't help. I'm not sure how to perform this logic using any available function or sql logic.

for col1 - if more than one row for same code or xyz code then fail
for col2 - if more than one row with same min date then fail 
           [use this only if col1 condition fails]
rstech
  • 41
  • 1
  • 2
  • 6
  • :what do you mean by `preference order for each column:` ? can you please explain this ,and on what grounds condition fails specific to column? – Gaurav Soni Jul 12 '12 at 17:43
  • updated. I want to check if each column has only one row for specified condition, if it fails then check next column. – rstech Jul 12 '12 at 17:55
  • like above what do you mean by "preference" cos by the above example you whant group by ID, max(col1), min(col2), max(col3) – Jester Jul 12 '12 at 17:55
  • i want to evaluate col2 condition only when col1 condition fails. – rstech Jul 12 '12 at 17:58
  • check rows with id - 3 in table1 and my desired result row id - 3 – rstech Jul 12 '12 at 18:01
  • xyz is sample text i can have zyx code but i still want xyz (max will return zyx) – rstech Jul 12 '12 at 18:07

2 Answers2

8

You can specify many conditions to order by in your analytic function

SELECT *
  FROM (SELECT id,
               col1,
               col2,
               col3,
               dense_rank() over (partition by id
                                      order by (case when col1 = 'xyz' 
                                                     then 1 
                                                     else 0 
                                                 end) desc,
                                               col2 asc,
                                               col3 asc) rnk
          FROM your_table)
 WHERE rnk = 1

I'm assuming that you want dense_rank given that you used the dense_rank tag. You don't talk about how you want to handle ties or whether ties are even possible, so it's not clear from the question itself whether you want to use the rank, dense_rank, or row_number analytic functions. If you are only ever fetching the highest ranking row per id, rank and dense_rank will behave identically and will return multiple rows if there are ties for first place. row_number will always return a single row by arbitrarily breaking the tie. If you want to fetch rows other than the first row per id, then you'll need to think about ties and you'll get different behavior from rank and dense_rank. If two rows are tied for first, dense_rank will assign the third row a rnk of 2 while rank will assign it a rnk of 3.

This seems to work for the sample data you posted

SQL> ed
Wrote file afiedt.buf

  1  with x as (
  2  select 1 id, 'abc' col1, to_date('01/01/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  3  select 1 id, 'abc' col1, to_date('01/01/2012', 'MM/DD/YYYY') col2, 'A' col3 from dual union all
  4  select 2 id, 'abc' col1, to_date('01/01/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  5  select 2 id, 'abc' col1, to_date('01/02/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  6  select 3 id, 'abc' col1, to_date('01/02/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  7  select 3 id, 'xyz' col1, to_date('01/01/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  8  select 4 id, 'abc' col1, to_date('01/02/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
  9  select 4 id, 'xyz' col1, to_date('01/01/2012', 'MM/DD/YYYY') col2, null col3 from dual union all
 10  select 4 id, 'xyz' col1, to_date('01/02/2012', 'MM/DD/YYYY') col2, null col3 from dual
 11  )
 12  SELECT *
 13    FROM (SELECT id,
 14                 col1,
 15                 col2,
 16                 col3,
 17                 dense_rank() over (partition by id
 18                                        order by (case when col1 = 'xyz'
 19                                                       then 1
 20                                                       else 0
 21                                                   end) desc,
 22                                                 col2 asc,
 23                                                 col3 asc) rnk
 24            FROM x)
 25*  WHERE rnk = 1
SQL> /

        ID COL COL2      C        RNK
---------- --- --------- - ----------
         1 abc 01-JAN-12 A          1
         2 abc 01-JAN-12            1
         3 xyz 01-JAN-12            1
         4 xyz 01-JAN-12            1
Justin Cave
  • 227,342
  • 24
  • 367
  • 384
  • this will not work. I want to use col2 rank only when i dont find any one row for col1 rank 1. – rstech Jul 12 '12 at 18:03
  • i tried dense rank and it didn't help me much, I'm not looking for sql that uses dense_rank only. i'm trying to figure out if there is another way to implement this. – rstech Jul 12 '12 at 18:10
  • @rstech - Have you tried the query I posted? It certainly seems to work for me. It will only use `col2` to break the ties in the `col1` condition. – Justin Cave Jul 12 '12 at 18:13
  • stupid me! i thought that will not work, but it works on my original data. I somehow was in wrong notion that it will not rank correctly if i use all columns in one order by clause. Thanks – rstech Jul 12 '12 at 18:17
  • My only comment is that dense_rank() and rank() return the same results for rank = 1, so the answer should be edited to reflect this. ("It is not clear from the question whether you want to use rank() or row_number()"). – Gordon Linoff Jul 12 '12 at 18:20
  • @GordonLinoff both rank and dense_rank will work. I need only rows with rank 1. dense_rank will not skip numbers after ties which is ok when i partition by id. Let me know if i missed something or i'm wrong – rstech Jul 12 '12 at 18:29
  • @rstech . . . I like your answer and even upvoted it. The way that one sentence is phrased, though, it sounds like dense_rank() and rank() will do different things. In fact, they do the same thing in this case, so I find that statement almost possibly misleading. This is just feedback on the phrasing, not on the answer itself. – Gordon Linoff Jul 12 '12 at 18:32
  • @GordonLinoff - Updated my answer to discuss the differences between the three analytic functions and how they handle ties. – Justin Cave Jul 12 '12 at 18:49
0
with tmp(id, col1, col2, col3, col1b, col3b) as
(select distinct id, col1, col2, col3,
        case when col1 = 'xyz' then '0' else '1' || col1 end,
        case when col3 = '-' then '1' else '0' || col3 end
from Table1)

select t1.id, t1.col1, t1.col2, t1.col3 
from tmp t1
left join tmp t2 on t1.id = t2.id
                and t1.col1b > t2.col1b
left join tmp t3 on t1.id = t3.id
                and t1.col1b = t3.col1b
                and t1.col2 > t3.col2
left join tmp t4 on t1.id = t4.id
                and t1.col1b = t4.col1b
                and t1.col2 = t4.col2
                and t1.col3b > t4.col3b
where t2.id is null
  and t3.id is null
  and t4.id is null
Set
  • 391
  • 1
  • 2