2

Say I have an array of arrays in Ruby,

[[100,300], 
 [400,500]]

that I'm building by adding successive lines of CSV data.

What's the best way, when adding a new subarray, to test if the range covered by the two numbers in the subarray is covered by any previous subarrays?

In other words, each subarray comprises a linear range (100-300 and 400-500) in the example above. If I want an exception to be thrown if I tried to add [499,501], for example, to the array because there would be overlap, how could I best test for this?

Phrogz
  • 296,393
  • 112
  • 651
  • 745
mbm
  • 1,902
  • 2
  • 19
  • 28

2 Answers2

9

Since your subarrays are supposed to represent ranges, it might be a good idea to actually use an array of ranges instead of an array of array.

So your array becomes [100..300, 400..500].

For two ranges, we can easily define a method which checks whether two ranges overlap:

def overlap?(r1, r2)
  r1.include?(r2.begin) || r2.include?(r1.begin)
end

Now to check whether a range r overlaps with any range in your array of ranges, you just need to check whether overlap?(r, r2) is true for any r2 in the array of ranges:

def any_overlap?(r, ranges)
  ranges.any? do |r2|
    overlap?(r, r2)
  end
end

Which can be used like this:

any_overlap?(499..501, [100..300, 400..500])
#=> true

any_overlap?(599..601, [100..300, 400..500])
#=> false

Here any_overlap? takes O(n) time. So if you use any_overlap? every time you add a range to the array, the whole thing will be O(n**2).

However there's a way to do what you want without checking each range:

You add all the ranges to the array without checking for overlap. Then you check whether any range in the array overlaps with any other. You can do this efficiently in O(n log n) time by sorting the array on the beginning of each range and then testing whether two adjacent ranges overlap:

def any_overlap?(ranges)
  ranges.sort_by(&:begin).each_cons(2).any? do |r1,r2|
    overlap?(r1, r2)
  end
end
sepp2k
  • 363,768
  • 54
  • 674
  • 675
  • 2
    Isn't this sufficient? r1.include?(r2.begin) || r2.include?(r1.begin) – steenslag Dec 21 '10 at 16:26
  • The advice about sorting the range before checking overlaps is good, but in this particular case, where the array keeps growing, it would seem very handy a bisect-like array. I found this, though not sure it's be the best implementation available: http://codeidol.com/other/rubyckbk/Arrays/Making-Sure-a-Sorted-Array-Stays-Sorted/ – tokland Dec 21 '10 at 16:40
  • @tokland: Unless I missed something, the implementation you linked (or anything else that is based on arrays) needs `O(n)` time to insert an item and maintaining sortedness. So adding all items would take `O(n**2)` time. If we want a solution that fails as early as possible and runs in `O(n log n)` time, we'd need a tree-based datastructure. – sepp2k Dec 21 '10 at 16:47
  • @sepp2k: well, as I said I didn't check the code, my point was that having a permanently ordered array may come handy. I guess there must be a good implementation out there (with the same performance of python's heapq, for example). – tokland Dec 21 '10 at 16:57
  • @tokland: Well, python's `heapq` is, as the name suggests, a heap, which is a tree. I don't think it's possible to implement a sorted array with `O(log n)` insertion time (where by sorted array I mean an array where each element is less than or equal to the one that comes directly after it in memory). As I said a tree-based datastructure (like e.g. the `rbtree` gem, which implements red-black-trees for ruby), could do this efficiently. – sepp2k Dec 21 '10 at 17:06
  • @sepp2k: Yeah, my wording (and link) was not precise; as you say an efficient implementation of this has be tree-based. So read "ordered array" as "something that seems an array and is always sorted". In the same way python's heapq is a tree in heart but it "seems" a list when you use it. – tokland Dec 21 '10 at 17:10
  • Note that ActiveSupport implements Range#overlaps? (plural) exactly the same way as shown above for #overlap? (non-plural) shown above http://rubydoc.info/docs/rails/2.3.8/ActiveSupport/CoreExtensions/Range/Overlaps – adzdavies Nov 06 '11 at 21:51
  • @azdavies: I'm not sure what you mean by plural vs. non-plural here. The rails method compares one range to one other range, not to multiple ranges, so I'd consider it a "singular" method. If you're talking about the method name: `overlaps?` is meant as a verb in 3rd person singular here, not a plural noun. – sepp2k Nov 06 '11 at 23:09
  • @sepp2k you are correct. Just pointing out the method name - in case you're using Rails it's nice to know it's already defined. You could potentially call it in place of "overlap?" within your "any_overlap?(ranges)" shown above. – adzdavies Nov 06 '11 at 23:20
  • Since you have sorted the ranges, by beginning, I believe that you no longer need to use the custom overlap function. Simply use `r1.cover?(r2.begin)` . The only way the second part of the `||` would ever be true is if `r2.begin` is less than `r1.begin` – ndw Nov 06 '13 at 22:19
0

Use multi_range and call overlaps? method to check whether there is any overlaps:

MultiRange.new([100..300, 400..500]).overlaps?(499..501)

Note that you may need to transform your input from array of arrays to array of ranges before using it. Possible working example is:

arrays = [[100, 300], [400, 500]]
subarray = [499, 501]

ranges = arrays.map{|a, b| a..b }
MultiRange.new(ranges).overlaps?(subarray[0].. subarray[1])
# => true
khiav reoy
  • 1,373
  • 13
  • 14