0

I'm looking to run an UPDATE, and a subsequent DELETE if necessary, query on my table (let's call it MY_TABLE) that would merge all rows in the following way.

Input Table

ID  LowerRange  UpperRange  Attribute
1   10          20          A
2   20          30          A
3   40          50          A
4   15          35          B

Output table

ID  LowerRange  UpperRange  Attribute
1   10          30          A
3   40          50          A
4   15          35          B

Notice how...

  1. Rows 1 & 2 of the Input Table are merged into Row 1 of the Output Table because their ranges overlap and they have the same Attribute.
  2. Row 3 of the Input Table is not merged with Rows 1 & 2 because their ranges don't overlap, despite them having the same Attribute.
  3. Row 4 of the Input Table is not merged with Rows 1 & 2 because they don't have the same Attribute, despite having overlapping ranges.

All rows in TABLE would be merged where their ranges overlap and they have the same Attribute.

Let me know if you have any questions. Any help would be greatly appreciated.

Thanks, Stephen.

Stephen Walsh
  • 815
  • 4
  • 18
  • 34
  • Could you have the case where the next row overlaps the previous row for the same attribute? E.g. for attribute C, rows with (lower_range, upper_range) of (10, 20) and (15, 23). Or would the lower_range always been the same or greater than the previous row's upper_range? – Boneist Feb 24 '16 at 16:18

2 Answers2

1

Here's one way (assuming that you could have overlapping rows where the start range of the current row is less than or equal to the end range of the previous row):

with sample_data as (select 1 id, 10 lower_range, 20 upper_range, 'A' attribute from dual union all
                     select 2 id, 20 lower_range, 30 upper_range, 'A' attribute from dual union all
                     select 3 id, 40 lower_range, 50 upper_range, 'A' attribute from dual union all
                     select 4 id, 15 lower_range, 35 upper_range, 'B' attribute from dual union all
                     select 5 id, 45 lower_range, 55 upper_range, 'A' attribute from dual union all
                     select 6 id, 16 lower_range, 34 upper_range, 'B' attribute from dual)
select min(id) id,
       min(lower_range) lower_range,
       max(upper_range) upper_range,
       attribute
from   (select id,
               lower_range,
               upper_range,
               attribute,
               sum(diff) over (partition by attribute order by lower_range, upper_range) grp
        from   (select id,
                       lower_range,
                       upper_range,
                       attribute,
                       case when lag(upper_range, 1, lower_range) over (partition by attribute order by lower_range, upper_range) >= lower_range then 0 else 1 end diff
                from   sample_data))
group by attribute, grp;

        ID LOWER_RANGE UPPER_RANGE ATTRIBUTE
---------- ----------- ----------- ---------
         1          10          30 A        
         3          40          55 A        
         4          15          35 B        

If your rows only overlap when the previous upper_range is the same as the current lower_range, then just remove the > from the case statement.

What this does is see if the lower_range of the current row is greater than or equal to the previous row's upper_range. If it is, then we set the result to be 0, otherwise we'll set it to be 1 (which indicates that there is a gap between the two rows).

Next, we then perform a cumulative sum across all the rows per attribute. This then will have the same result for rows that overlap, and will increase by 1 every time it comes across a gap.

Now we can use this along with the attribute column to group the rows and find their min/max ranges along with the min(id).

Boneist
  • 22,910
  • 1
  • 25
  • 40
  • Yes, you're correct in your assumption that one row overlaps with another if it's lower range is less than or equal to the other's, once it's upper range is greater than or equal to the other's. This seems to do exactly what I want and thank you very much for your help so far. The only thing left before I mark this as the accepted answer... ... How would I do the initial 'with sample data as' part for all the rows in my table 'MY_TABLE'? – Stephen Walsh Feb 25 '16 at 08:14
  • 1
    You wouldn't need the sample_data subquery; you'd use your table instead in the main SQL statement. I only created the sample_data subquery to mimic a table with data in, so that I could test the query works. – Boneist Feb 25 '16 at 08:40
  • Ah yes, I just tried that a few minutes ago and seen that it works. Thank you very much. There is just one slight difference that I noticed after running the query on my table which I didn't make clear in the original post. Say if two rows both have the same attribute, and one row's upper range is 9 while the other row's upper range is 10. These should also be merged. I think this is done by just making it >= MSISDN_LOWER-1 but you might want confirm? – Stephen Walsh Feb 25 '16 at 08:52
  • What would the lower_range be for these two rows? – Boneist Feb 25 '16 at 09:13
  • Apologies that's a typo, I meant when one's upper range is 9 and the other's LOWER range is 10, so they essentially follow each other consecutively. Then they should be merged. – Stephen Walsh Feb 25 '16 at 09:19
  • 1
    yeah, `>= lower_range - 1` should work for your purposes. – Boneist Feb 25 '16 at 09:38
0

Check this out: Merge data in two row into one

Community
  • 1
  • 1
brokensax
  • 59
  • 3
  • apologies, maybe I should have made it explicit: the above is just a simplified example. My table actually contains thousands of rows with lots of different LowerRanges, UpperRanges and Attributes. But what I want to do is the same throughout the table. – Stephen Walsh Feb 24 '16 at 13:55
  • what is the datatype of the lowerrange. It looks like it is text because there is a space between the values... Pending on your answer maybe a case statement would work after you can split the values. CASE WHEN Col BETWEEN col1 AND col2 THEN X ELSE 'NOTHING' END AS UpperRange – codeMonger123 Feb 24 '16 at 13:55
  • This question has already been answered earlier. Check out the link in my edited answer. – brokensax Feb 24 '16 at 13:58