14

Basically I have a 2xN array of ints to ints, which indicates from which position to which position of an objects location. Then I have a second array of ints and I want to find which ints land on which object. So for example:

First array

A: 0 - 500

B: 501 - 900

C: 901 - 1055

D: 1056 - 9955 etc.

Second array: 1, 999, 3, 898, 55, 43, 1055, 593, 525, 3099, etc.

This should return the A, C, A, B, A, A, C, B, B, D, etc.

What I am trying to figure out is if there is a way to hash the first array, using some hash functions, such that when hashing the second array I will get a collision if it falls within the range of an object. Any ideas how to do this or if its possible?

Thanks!

ElfsЯUs
  • 1,188
  • 4
  • 14
  • 34

4 Answers4

6

You can use a NavigableMap.

Example Code:

NavigableMap<Integer, String> map = new TreeMap<Integer, String>();
map.put(0, "A");
map.put(501, "B");
map.put(901, "C");
map.put(1056, "D");

System.out.println(map.floorEntry(1).getValue());
System.out.println(map.floorEntry(999).getValue());
System.out.println(map.floorEntry(3).getValue());
System.out.println(map.floorEntry(898).getValue());

Output:

A C A B

Thiago Falcao
  • 4,463
  • 39
  • 34
3

You can use some data structure like an interval tree to store the first array.

http://en.wikipedia.org/wiki/Interval_tree

Then when you go through the second array, you can simply query the tree for a matching interval. This way, you will require O(log n) time to query each element of the second array.

0

You can't use hashing, but you can sort the interval end-points and do a bisection search.

Something like this (in python, but hopefully it makes sense to you):

endpoints = [501, 901, 1056, 9956]
for x in [1, 999, 898, 55, 43, 1055, 593, 525, 3099]:
    print x, 'ABCD'[bisect.bisect_left(endpoints, x)]
  • Maybe I should have explained a little more... but the first array is actually a stream. So I don't have all the ranges from the start, simply too many of them. What I want to do is check if int in index i of the second array already exists in the hashmap. If the int falls within a range that have already seen then return that object, else keep reading in from the stream until I find the correct rane (in the mean time adding the ones I've read from the stream to the hashmap for future use). I know I could turn this to a O(logn) by using a tree... but trying to see if I can't do better. – ElfsЯUs Dec 11 '11 at 09:32
  • And am not sure that is the case... if the range was constant then this would be an easy problem to solve... but with changing range size it makes it a lot harder, but am still not convinced that this is impossible to do. – ElfsЯUs Dec 11 '11 at 09:39
  • An array uses significantly less memory than a hashmap. Also, hashing doesn't work the way you want: you can't hash an interval and then use it to test if a point lies within. –  Dec 11 '11 at 09:42
  • Suppose hash(0) is K. Then consider any range [-N, N]. Clearly 0 is in this, so it has to hash to the same. But then each point in this interval has to hash also to the same. The conclusion is that the hash function has to be constant. –  Dec 11 '11 at 09:45
  • Am not sure your prof is conclusive, if K is zero then we can easily divise a way to hash all values to 0. If we simplify this problem and say that we know the size of the range then it can be easily solved by rounding the position to the nearest range max and hashing for that... thus if we have range [-N, N] where N-(-N) = M, then our hash is min(i/M). Thus, being able to hash for a range is not impossible. What might be impossible is to hash for an unknown range. But am not convinced of that. – ElfsЯUs Dec 11 '11 at 10:11
0

You problem seems closely related to BWT decoding.

If i do understand your problem correctly, you receive the first array as a stream.

Then, if you have the second array in memory, you simply need to build the "reverse array" of it. So that, for example :

Second array: 1, 9, 3, 8, 5, 4, 2, 6, 7

Becomes

Inverse second array : 1, 7, 3, 6, 5, 8, 9, 4, 2

So, now, on receiving your stream, you know immediately where to place each character.

Cyan
  • 13,248
  • 8
  • 43
  • 78