1

I have a number generator that issues a constant stream of numbers.

For each number, I want to translate that value into the highest threshold it has passed.

For example, the thresholds could be 0, 50, 100, 175, 250, 300, 400, 550.

For the number 55, I would then want to return 50, as that is the highest threshold lower than the value.

Similarly, if the number is 549 I want to return 400.

Etc.

Restrictions: No decimals, no negative numbers.

Also, the values in the threshold list do not follow any meaningful pattern.

Assuming threshold is a list with values sorted in ascending order with the index being L, my current algorithm is something along this:

function findTreshold(value: number) {
  switch(true) {
    case value >= threshold[L-1]:
      return threshold[L-1];
    case value >= threshold[L-2]:
      return threshold[L-2];
    ...
    default:
     return threshold[0];
  }
}

The solution needs to run in Javascript (ES6+). The example above is written in Typescript, in case anyone's curious about the switch(true) syntax.

But I'm curious to see if there's a more effective algorithm around. This function is called continuously so computation time is important.

earl3s
  • 2,393
  • 1
  • 23
  • 24
Lars Holdaas
  • 691
  • 1
  • 4
  • 24

7 Answers7

2

The following answer uses Binary Search. Since, the thresholds array is sorted, you need not search it entirely. It is sufficient to search the required range alone.

function findThreshold (array, target, low = 0, high = array.length - 1) {
  if (low > high) {
    return array[high]
  }
  const mid = Math.floor((low + high) / 2)

  if (target < array[mid]) {
    return findThreshold(array, target, low, mid - 1)
  } else if (target > array[mid]) {
    return findThreshold(array, target, mid + 1, high)
  } else {
    return array[mid]
  }
}

let thresholds = [0, 50, 100, 175, 250, 300, 400, 550];

console.log(findThreshold(thresholds,50));
console.log(findThreshold(thresholds,271));
console.log(findThreshold(thresholds,549));
console.log(findThreshold(thresholds,9999));

The time complexity is O(log n) . It does not use any additional space except the stack space for the recursion.

Phenomenal One
  • 2,501
  • 4
  • 19
  • 29
  • This is an excellent answer! Both in terms of algorithm and demonstration. Pretty much exactly what I was looking for. – Lars Holdaas May 11 '21 at 01:17
1

Well actually you can use find to solve this. Very simple an clean.

    let startTime, endTime;

    function start() {
      startTime = window.performance.now();
    };

    function end() {
      endTime = window.performance.now();
      const timeDiff = endTime - startTime; //in ms
      console.log(timeDiff + " milliseconds");
    }

    const thresholds = [0, 50, 100, 175, 250, 300, 400, 550];
    thresholds.reverse();
    const valuetofind = 145;
    start();
    console.log(thresholds.find(v => valuetofind  >= v))
    end();
    start();
    console.log(thresholds.find(v => 255  >= v))
    end();
    start();
    console.log(thresholds.find(v => 202  >= v))
    end();

console.log("Someone else suggestion");
// someone elses suggestion
  function findThreshold (array, target, low = 0, high = array.length - 1) {
    if (low > high) {
      return array[high]
    }
    const mid = Math.floor((low + high) / 2)

    if (target < array[mid]) {
      return findThreshold(array, target, low, mid - 1)
    } else if (target > array[mid]) {
      return findThreshold(array, target, mid + 1, high)
    } else {
      return array[mid]
    }
  }

  let thresholds2 = [0, 50, 100, 175, 250, 300, 400, 550];
  start();
  console.log(findThreshold(thresholds2,50));
  end();
  start();
  console.log(findThreshold(thresholds2,271));
  end();
  start();
  console.log(findThreshold(thresholds2,549));
  end();
  start();
  console.log(findThreshold(thresholds2,9999));
  end();
  

It might be tempting to assume that a handcrafted loop is faster than a find or simular, but usually it's not. But if it beats your switch/case.. Well you have to make the measurements yourself I assume.

Edit: Actually it seems that other solutions are quicker, added a testbed included, and result is that the other solution is slightly faster.

for loop with break vs find() in JavaScript

Griffin
  • 785
  • 5
  • 13
  • Thanks for a good answer! I appreciate the effort of researching to compare to other solutions haha. Yeah this is also a nice usage of ES6 syntax imo, but again would run at O(n) as far as I can see – Lars Holdaas May 11 '21 at 01:23
1

Most performant will most likely be a map - there are more ES6ish ways to do it - but the idea is you make a map so you are always doing a direct look up vs doing even a single iteration. (you can expand the map building to include direct use cases, like neg numbers or whatever)

let thresholds = [0, 50, 100, 175, 250, 300, 400, 550];
let valueMap = {};

//boiler to build valueMap
for(let index = 0; index < thresholds.length; index++){
    if(!thresholds[index+1]){break;}
    for(let i = thresholds[index]; i < thresholds[ index + 1]; i++){
        valueMap[i] = thresholds[index];
    }
}
valueMap[thresholds[thresholds.length-1]] = thresholds[thresholds.length-1];



//actual function to use
function findTreshold(num){
    return valueMap[num] || thresholds[thresholds.length-1];
}

console.log(findTreshold(50));
console.log(findTreshold(271));
console.log(findTreshold(549));
console.log(findTreshold(9999));
Phenomenal One
  • 2,501
  • 4
  • 19
  • 29
Kyle
  • 1,463
  • 1
  • 12
  • 18
  • Pretty creative solution! And yeah this should def. be faster than iterating. My only concern is the memory footprint? I can see this map growing to be quite big. I'll have to consider this solution for the tradeoff of computation time vs memory usage, but I like it a lot. – Lars Holdaas May 11 '21 at 01:27
1

The most performant method probably would be to put the thresholds in a balanced binary search tree, assuming that the threshold list does not change nearly as often as it is being used.

Below is a balanced binary search tree (you may search for "AVL tree" to find the algorithm for building it) of your example threshold list:

Binary Tree of OPs threshold list

Suppose the input is 299. The highest threshold found so far is -MAX_INT (a.k.a. the lowest number you can represent with your number type). The first search node is the root node (with a value of 175).

  1. 299 > 175 => highest_threshold := 175; current node := right child (with a value of 300)
  2. 299 < 300 => keep highest_threshold; current node := left child (250)
  3. 299 > 250 => highest_threshold := 250; no right child available, therefore: stop, return highest_threshold.

This approach has a search complexity of O(log n), while the other answers I see (up until now) do have a search complexity of O(n). But building the search tree has a complexity of O(n log n), so you should only use it if you can cache the tree in memory.

EDIT: Maruthi Adithya correctly points out that you can use a sorted array as if it was a balanced binary search tree, so you get the build phase for the price of sorting (it's still O(n log n), but you may need to sort the array anyway)! Also, Maruthi Adithya provided the code for the algorithm that I describe here by example; our answers nicely complement each other.

Also, this approach does work with negative and floating point numbers, too.

orithena
  • 1,455
  • 1
  • 10
  • 24
  • 1
    interesting - so the approach I've got with the cached map is best, if I'm reading this right, - but at the cost of space. That algorithm you've got there is pretty slick, I think I'd use it as the 'engine' to build the in mem cache until I hit a possibly predefined limit – Kyle May 10 '21 at 21:02
  • 1
    This is a very good answer! I would mark this as correct if there was not another answer with a similar approach with a full demonstration given as well. But anyway exactly what I was looking for! – Lars Holdaas May 11 '21 at 01:18
  • @LarsHoldaas Yeah, Maruthi Aditha beat me with the full implementation code, so you can take my answer as an explanation of how his code works (except that he implicitely built the tree by doing addressing magic on a sorted array, which I forgot that you can do that ... so you can ignore my remark about building AVL trees). – orithena May 11 '21 at 14:20
  • @Kyle I just read your algorithm... "at the cost of space" is a definitive understatement. "At the cost of your process being killed by the OOM killer" is more like it -- your space complexity goes up to 2 times the threshold _range_ (not the threshold count, as in my algorithm). And since you use a map instead of an array, finding the map key (a process that is hidden in the javascript engine) costs at least O(log m), with m being the threshold range. I'm sorry to say that your map approach is a whole order of magnitude slower than mine, and uses several orders of magnitude more memory. – orithena May 11 '21 at 18:20
  • @orithena lookup of keys in Map is O(log m)? TIL. If that's the case then yeah sorry Kyle but that solution is not really viable xD – Lars Holdaas May 12 '21 at 01:19
  • @LarsHoldaas Since a map is a collection of arbitrary key objects that point to value objects, you will need to search for the key object (instead of calculating its position, like in the special case where the keys are natural numbers starting with 0: an array). Best you can do is using the hash value of the key object as search key, which transforms your search complexity to a different O(log n), with n being the length of the hash this time. – orithena May 12 '21 at 09:06
  • @LarsHoldaas Complicating things in JS is the fact that arrays are dynamic (i.e. you can add more elements at arbitrary indices and the array grows automagically -- unlike in C), and you may end up with a sparse array (implemented as a map) if you fill the array from the wrong end (see [here for details](https://medium.com/@kevinzifancheng/how-are-arrays-implemented-in-javascript-be925a7c8021)) -- so even if you use an array, the JS engine might decide that it's actually a map with an access complexity of O(log n), depending of how you initialize and fill the array. – orithena May 12 '21 at 09:09
  • @orithena For sure the map could be a hog, no argument there, but "finding the map key (a process that is hidden in the javascript engine) costs at least O(log m), with m being the threshold range " ... I thought under the hood was a direct hash lookup... not just iterating the entire object... that seems to defeat the purpose of using the dictionary in the first place. – Kyle May 12 '21 at 17:53
  • @Kyle The "direct hash lookup" is the part that takes O(log m), and it cannot take O(1) because you _have to_ search for the hash value (Unless you use an array that reserves space for _every possible hash value_ -- and that would be way too much space, even if the hash value is only 32bits long). In arrays, you can use a simple multiplication to find the memory position of a value by index, but that only works for bounded indices in 0..n. – orithena May 13 '21 at 19:22
1

While this may not match some of the other answers for speed in the case where the number of thresholds increases to many thousands, I'm guessing it will hold it's own for lower number, and it certainly is fine for the handful in the example supplied. And the code is quite simple:

const findThreshold = (ts, thresholds = [...ts] .reverse()) => (n) =>
  thresholds .find (t => t < n) || ts [0]

const foo = findThreshold ([0, 50, 100, 175, 250, 300, 400, 550])

console .log (
  [55, 549, -20, 174, 175, 176, 700] .map (n => `${n} => ${
    foo(n)
  }`) .join('\n')
)
.as-console-wrapper {max-height: 100% !important; top: 0}

We curry a function that takes the list of thresholds, reverses a clone of that list and returns a function which takes a number of returns the first threshold less than the number.

If your thresholds weren't sorted to begin, then the .reverse() call should be replaced with .sort ((a, b) => b - a).

Scott Sauyet
  • 49,207
  • 4
  • 49
  • 103
  • This is a pretty creative solution! I like it a lot, but I went with another solution that had an O(log n) runtime. As far as I can see this one will be O(n) runtime for worst cases. – Lars Holdaas May 11 '21 at 01:21
  • Yes, it will be `O(n)`, and the performance can be really important if you have a large list of thresholds or if this is run in a tight loop. But I usually worry about small optimizations for the code that is demonstrably a hot-spot in my application. But there are a number of good -- and varied -- answers here. – Scott Sauyet May 11 '21 at 01:48
0

I'd imagine a for-loop in all it's simplicity would be easier to maintain and run faster. But you'd have to run performance tests to figure that out and it may even depend on which JS-engine it runs in (at this level compiler optimisation probably outweighs coding style)

So, I'd solve it as such:

function findTH(v) {
  for (let i = threshold.length - 1; i >= 0; --i) {
    if (v >= threshold[i]) {
      return threshold[i];
    }
  }
}
Windgazer
  • 161
  • 1
  • 6
0

Not sure if it would help with performance but if you have alot of thresholds I assume slicing out the diffrent digits and create a multidimensional array could be the fastes way

let thresholds = [0, 50, 100, 175, 250, 300, 400, 550];
let thresholdsHundreds = [];
thresholdsHundreds[0] = [50,0];
thresholdsHundreds[1] = [175,100,50];
thresholdsHundreds[2] = [250,175];
thresholdsHundreds[3] = [300,250];
thresholdsHundreds[4] = [400,300];
thresholdsHundreds[5] = [550,400];
let valueToFind = 425;
let valueToFindH = Math.floor(valueToFind/100);
thresholdsHundreds[valueToFindH].find(v => valuetofind  >= v)

Using my "proven to be inferior find" to show it as it's least code, and it's just boiler plate, but I think you will get the idea. Highest value from previous array "slot" included to not have to jump between array "slots". You can give it as many layers of arrays as make sense.

Griffin
  • 785
  • 5
  • 13