7

so simply put, this is what I am trying to do:

I have a collection of Range objects that are contiguous (non overlapping, with no gaps between them), each containing a start and end int, and a reference to another object obj. These ranges are not of a fixed size (the first could be 1-49, the second 50-221, etc.). This collection could grow to be quite large.

I am hoping to find a way to look up the range (or more specifically, the object that it references) that includes a given number without having to iterate over the entire collection checking each range to see if it includes the number. These lookups will be performed frequently, so speed/performance is key.

Does anyone know of an algorithm/equation that might help me out here? I am writing in Java. I can provide more details if needed, but I figured I would try to keep it simple.

Thanks.

Raedwald
  • 46,613
  • 43
  • 151
  • 237
IgnisFatuus
  • 1,998
  • 1
  • 12
  • 8

3 Answers3

4

If sounds like you want to use a TreeMap, where the key is the bottom of the range, and the value is the Range object.

Then to identify the correct range, just use the floorEntry() method to very quickly get the closest (lesser or equal) Range, which should contain the key, like so:

    TreeMap<Integer, Range> map = new TreeMap<>();
    map.put(1, new Range(1, 10));
    map.put(11, new Range(11, 30));
    map.put(31, new Range(31, 100));

    // int key = 0; // null
    // int key = 1; // Range [start=1, end=10]
    // int key = 11; // Range [start=11, end=30]
    // int key = 21; // Range [start=11, end=30]
    // int key = 31; // Range [start=31, end=100]
    // int key = 41; // Range [start=31, end=100]
    int key = 101; // Range [start=31, end=100]
    // etc.

    Range r = null;
    Map.Entry<Integer, Range> m = map.floorEntry(key);
    if (m != null) {
        r = m.getValue();
    }
    System.out.println(r);

Since the tree is always sorted by the natural ordering of the bottom range boundary, all your searches will be at worst O(log(n)).

You'll want to add some sanity checking for when your key is completely out of bounds (for instance, when they key is beyond the end of the map, it returns the last Range in the map), but this should give you an idea how to proceed.

azurefrog
  • 10,785
  • 7
  • 42
  • 56
  • Perfect, this is exactly what I was looking for. Would there be any drawback to using `map.floorEntry(key)` rather than doing: `map.get(key)`, _check for null_ `map.lowerEntry(key)`? – IgnisFatuus Jul 28 '14 at 21:48
  • Oh, good point that would be more readable, let me update the example code. – azurefrog Jul 28 '14 at 21:49
  • I can confirm that this approach is practical because I have had the same problem as the OP and solved in in this manner. – Raedwald Jul 28 '14 at 22:36
1

Assuming that you lookups are of utmost importance, and you can spare O(N) memory and approximately O(N^2) preprocessing time, the algorithm would be:

  • introduce a class ObjectsInRange, which contains: start of range (int startOfRange) and a set of objects (Set<Object> objects)
  • introduce an ArrayList<ObjectsInRange> oir, which will contain ObjectsInRange sorted by the startOfRange
  • for each Range r, ensure that there exist ObjectsInRange (let's call them a and b) such that a.startOfRange = r.start and b.startOfRange = b.end. Then, for all ObjectsInRange x between a, and until (but not including) b, add r.obj to their x.objects set

The lookup, then, is as follows:

  • for integer x, find such i that oir[i].startOfRange <= x and oir[i+1].startOfRange > x
  • note: i can be found with bisection in O(log N) time!
  • your objects are oir[i].objects
kzagar
  • 355
  • 2
  • 4
  • Thanks for the answer kzagar! I'm going with azurefrog's solution for ease of implementation. Would give you an upvote for good answer, but it won't let me do that yet... – IgnisFatuus Jul 28 '14 at 21:55
  • Yes, azurefrog's is easier to implement. But my solution also works for cases where the ranges would overlap (not a requirement for your case, though). – kzagar Jul 29 '14 at 06:47
0

If the collection is in order, then you can implement a binary search to find the right range in O(log(n)) time. It's not as efficient as hashing for very large collections, but if you have less than 1000 or so ranges, it may be faster (because it's simpler).

Ypnypn
  • 933
  • 1
  • 8
  • 21