3

I want to create a large range map that will map the keys accoriding their numbers to buckets , for instance :

            NavigableMap<String, String> map = new TreeMap<>();

            map.put("var2#" + 0L,   "out0");      // "var2#0..100        => out0
            map.put("var2#" + 100L, "out1");    // "var2#100..200"     => out1
            map.put("var2#" + 200L, "out2");    // "var2#200..300"     => out2
            map.put("var2#" + 300L, "out3");    // "var2#300..+"       => out3

That means if a new key will arrive with value res should be "var2#150" ==> "out1"

What I tried to do , is to use sorted map , everything is working with range of the numbers inside the map

something like :

String out1 = map.floorEntry("var2#" + 150L).getValue(); //out1 , works!

, but if send var2#2000 , instead to get res "out3" , I got "out2" , and so on...

String res = map.floorEntry("var2#" + 2000L).getValue(); 
Syso(res)  ==> out2 , BUT I expected result => "out3"
// because it is bigger that the range.

P.S:

It is very large map with prefix of some "string" and comes after typed
 long number . Eg. "var1#100, var1#200 , ...bla1#1000 , bla5#2000....

Another issue - when I have same long value on different keys I expect to do 1st match on the string and then on the number ...

    map.put("var1#" + 200L, "out0");
    map.put("var2#" + 200L, "out1");
    map.put("var3#" + 200L, "out2");
    map.put("var4#" + 200L, "out3");

    String out1 = map.floorEntry("var2#" + 150L).getValue();
    System.out.println("====> " + out1); //expected  out1 , because only match of "var2
    String out3 = map.floorEntry("var2#" + 250L).getValue(); //expected  out1 , because only match of "var2
    System.out.println("====> " + out3);" ....

Any suggestions please , maybe some algorithm ?

VitalyT
  • 1,671
  • 3
  • 21
  • 49

4 Answers4

2

One way to compare the prefix string-wise, followed by comparing the suffix numerically, would be:

public static int compareParts(String a, String b) {
    String[] aa = a.split("#", 2), ba = b.split("#", 2);
    int c = aa[0].compareTo(ba[0]);
    return c != 0? c: Integer.compare(Integer.parseInt(aa[1]), Integer.parseInt(ba[1]));
}

But since comparison methods might get called very often, even a single lookup may involve multiple comparisons, it is worth investigating some time to improve the performance, even if the code will look more complicated:

public static int compareParts(String a, String b) {
    final int aLen = a.length(), bLen = b.length(), l = Math.min(aLen, bLen);
    int ix = 0;
    stringPart: {
        for(; ix < l; ix++) {
            char aCh = a.charAt(ix), bCh = b.charAt(ix);
            int cmp = Character.compare(aCh, bCh);
            if(cmp != 0)
                return aCh == '#'? -1: bCh == '#'? +1: cmp;
            if(aCh == '#') break stringPart;
        }
        return 0;
    }
    // number part
    int aIx = ix+1, bIx = aIx;
    while(aIx < aLen && a.charAt(aIx)=='0') aIx++;
    while(bIx < bLen && b.charAt(bIx)=='0') bIx++;
    int cmp = Integer.compare(aLen-aIx, bLen-bIx);
    for(; cmp == 0 && aIx < aLen; aIx++, bIx++) {
        cmp = Character.compare(a.charAt(aIx), b.charAt(bIx));
    }
    return cmp;
}

This does only a single pass over the string. First, it iterates over the first characters of the string like String.compareTo would do, stopping at the first mismatching character or the '#' character. If only one string encountered the '#' the other has a longer prefix and we have to consider that for the result.

Only when both strings have the same prefix, the numerical part after the '#' gets processed. Instead of a doing a full integer parsing, we skip all leading zeros, if there are some. Then, if the length of the remaining significant part differs, it already indicates which number is larger. Only if the significant parts have the same length, we need to iterate them. But we can compare the digits literally without needing to interpret them as numbers in that case, as the iteration order is already from most significant digit to lowest significant digit.

Either method can be used like

NavigableMap<String, String> map = new TreeMap<>(MyClass::compareParts);

map.put("var2#" + 0L,   "out0");
map.put("var2#" + 100L, "out1");
map.put("var2#" + 200L, "out2");
map.put("var2#" + 300L, "out3");

String out1 = map.floorEntry("var2#" + 150L).getValue();
System.out.println("out1 = "+out1);
String out3 = map.floorEntry("var2#" + 2000L).getValue();
System.out.println("res = "+out3);
Holger
  • 285,553
  • 42
  • 434
  • 765
  • Hi @Holger , I have some issue that your suggestion isn't working , if u can have a look it will be great , thanks :) – VitalyT Jun 14 '18 at 05:11
  • @VitalyT When you use `floorEntry`, you request an entry equal or smaller, so `floorEntry("var2#"+150L)` can never end up at the entry of `"var2#"+200L`, as `200L` is neither equal nor smaller than `150L`. Since there is no smaller key with the `"var2#"` prefix, the closest key is `"var1#"+200L`. The prefix *has* precedence. But still, a comparator imposes a total order and `floorEntry` will invariably return an entry with an equal or smaller key according to that order. You can’t expect it to suddenly return an entry with a bigger key based on an additional condition. – Holger Jun 14 '18 at 07:29
  • thank's , I see what you're saying, but if I need to search first by "var2#" and after to map into backet range , how would you change the code , cause it works perfect until I got a bug with scenario like described :( – VitalyT Jun 14 '18 at 09:04
  • You could use `String out1 = map.subMap("var2#0", true, "var3#0", false).get("var2#150");` to enforce the prefix, then, the result will be `null` as there’s no smaller key. Or resort to higher key if no smaller one with that prefix exists, like `Map.Entry e = map.subMap("var2#0", true, "var2#150", true).lastEntry(); if(e == null) e = map.subMap("var2#0", true, "var3#0", false).firstEntry(); String out1 = e != null? e.getValue(): null;`. It may still result in `null` if no key with that prefix exists. – Holger Jun 14 '18 at 11:03
  • I mean inside your suggestion code snippet , 'int compareParts(String a, String b) ' ... – VitalyT Jun 14 '18 at 11:33
  • That’s impossible. As already said, a comparator has to impose a total order. The order cannot change depending on what key you’re looking up. All library code, including `TreeMap`, relies on the comparator adhering to the contract. When you ask for an equal or smaller key, you will get an equal or smaller key. – Holger Jun 14 '18 at 11:39
  • what about , 'NavigableMap map = new TreeMap<>( Comparator.comparing((String p1) -> p1.split("#")[0]) .thenComparingLong(p1 -> Long.parseLong(p1.split("#")[1])));' , it could work ? I know not with the best performance – VitalyT Jun 14 '18 at 11:41
  • That is doing exactly the same as my code. Note my answer’s first code example, which does exactly the same as your suggestion, except for performing `String.split` only once per value. I intentionally posted both variants to document the code’s logic, which is exactly the same. Perhaps you still don’t understand what “*total order*” means. Between all elements accepted by the comparator, there is a relationship `a ≤ b ≤ c ≤ d ≤ e ≤ f ≤ …`, etc, and if you ask for `d` and it isn’t in the map, you will get `c`, never `e`, regardless of what these elements are made of and how they are compared. – Holger Jun 14 '18 at 11:49
  • in my snippet , I'm trying to order twice , once by string when p1.split("#")[0]) and then after by number p1.split("#")[1]) , I don't understand where the 1st part of the order it in your solution (sorry) – VitalyT Jun 14 '18 at 13:54
  • In the first example, you see a `compareTo` invocation on the first part, followed by invoking `Integer.compare` on the second part, only applied if the result of the first comparison is zero. That’s exactly the logic of `thenComparing`. In the end, these technical details do not matter. ☛Re-read my previous comment. Try to understand what “*total order*” means. Asks yourself, which order your comparator is supposed to produce. Write down all keys of the map in exactly that order. Insert the lookup key at the right position in that sequence. Look what’s to the left of it. That’s `floorEntry`. – Holger Jun 14 '18 at 14:03
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173157/discussion-between-vitalyt-and-holger). – VitalyT Jun 14 '18 at 14:10
1

The problem is that the TreeMap is using String for comparison. So it sorts alphabetically, and var2#2000 comes between var2#200 and var2#300. You should either specify a custom comparator, or use Long or Integer as the key for the TreeMap. So, this should work:

NavigableMap<Long, String> map = new TreeMap<>();
map.put(0L,   "out0");      // "var2#0..100        => out0
map.put(100L, "out1");    // "var2#100..200"     => out1
map.put(200L, "out2");    // "var2#200..300"     => out2
map.put(300L, "out3");    // "var2#300..+"       => out3
Hari Menon
  • 33,649
  • 14
  • 85
  • 108
  • Upvoted your answer because it makes sens to keep _only_ the meaningful part as key. It's more readable and also avoid errors in the future. – Mincong Huang May 31 '18 at 05:07
  • If `"var2#"` is constant in those keys, we can definitely use a numeric value. But if the idea is to provide a range for `var1`, `var2`, ... `varN`... this will need something like a `Map>` – AxelH May 31 '18 at 05:15
  • @ AxelH , var2 is not const , it is very large map with prefix of some "string" and comes after typed long number . Eg. "var1#100, var1#200 , ...bla1#1000 , bla5#2000...." – VitalyT May 31 '18 at 05:17
  • Please add this information in your questin @VitalyT, this is important. – AxelH May 31 '18 at 05:22
1

You can extract the 2nd part of the key and use it as comparator for the navigable map:

Comparator.comparingLong(key -> Long.parseLong(key.split("#")[1]))

So:

NavigableMap<String, String> map =
    new TreeMap<>(Comparator.comparingLong(key -> Long.parseLong(key.split("#")[1])));

map.put("var2#" + 0L,   "out0");    // "var2#0..100        => out0
map.put("var2#" + 100L, "out1");    // "var2#100..200"     => out1
map.put("var2#" + 200L, "out2");    // "var2#200..300"     => out2
map.put("var2#" + 300L, "out3");    // "var2#300..+"       => out3

assertThat(map.floorEntry("var2#" + 150L).getValue()).isEqualTo("out1");
assertThat(map.floorEntry("var2#" + 2000L).getValue()).isEqualTo("out3");
Mincong Huang
  • 5,284
  • 8
  • 39
  • 62
1

I would split the key to get a Map of range per variable:

Map<String, Ranges> map;

Where we implementsRanges as we need to map the value and the result, like proposed by Hari Menon.

class Ranges {

    NavigableMap<Long, String> map = new TreeMap<>();

    public String setFloor(long l, String s){
        return map.put(l, s);
    }

    public String getFloor(long l){
        return map.floorEntry(l).getValue();
    }
}

That will be simple to populate :

Map<String, Ranges> map = new HashMap<>();

Ranges r = new Ranges();
r.setFloor(0L, "out1");
r.setFloor(100L, "out2");   
map.put("var1", r);

r = new Ranges();
r.setFloor(0L, "out3");
r.setFloor(100L, "out4");
map.put("var2", r);

System.out.println(map.get("var1").getFloor(50L));
System.out.println(map.get("var2").getFloor(150L));

out1
out4

We can use a NavigableMap instead of the HashMap but I didn't see the point here.

Please note that this isn't NPE safe, this was not secured to keep the solution short and readable.

AxelH
  • 14,325
  • 2
  • 25
  • 55