
This is NOT this question -- How to find best matching element in array of numbers? -- because the linked question deals with the "closest" element even if that element is greater than the element in question. My question below deals only with elements that are less than the variable in question. Here's my question:

I'm using Ruby 2.4. I have an array of numbers. They are always ordered and unique:

[1, 5, 8, 12, 16, 17]

Given another number in a variable, how do I figure out the element from the array that is closest to my variable without going over? For example, if my variable contained "11", then given the above example, the answer would be "8".

Community
  • We'd like to see what you tried, rather than just start throwing out code. It's better for us to correct your mistake, so it applies directly to what you've written, than for us to write code that won't match your variables and forces you to shoehorn it into place. – the Tin Man Jan 09 '17 at 17:22
  • Does "closest without going over" mean that you want the match if the value equals an element in the array? Providing some expected output would help. – the Tin Man Jan 09 '17 at 17:44

5 Answers


I'd use:

n = 10
[1, 5, 8, 12, 16, 17].select { |i| i <= n }.last # => 8

Thinking about it, the fastest way is to use bsearch_index since the input is already sorted.

n = 11 
ary = [1, 5, 8, 12, 16, 17]
ary[ary.bsearch_index { |i| n < i } - 1] # => 8

bsearch_index would be slower if the hit occurs early in ary, but in big arrays it'll quickly pull ahead compared to looking at each element.

require 'fruity'

n = 11 
ary = [1, 5, 8, 12, 16, 17]

compare do 
  last { ary.select { |i| i <= n }.last }
  bsearch_index { ary[ary.bsearch_index { |i| n < i } - 1] }
  Bustikiller { ary.reverse.find { |i| i <= n } }
  engineerDave1 { ary.take_while {|x| x <= n}[-1] }
  engineerDave2 { ary.reduce(nil) {|a,x| a = x if x <= n; a} }
end

# >> Running each test 8192 times. Test will take about 1 second.
# >> bsearch_index is faster than engineerDave1 by 10.000000000000009% ± 10.0%
# >> engineerDave1 is faster than last by 2x ± 0.1
# >> last is similar to engineerDave2
# >> engineerDave2 is similar to Bustikiller

Increasing the size of the array:

require 'fruity'

n = 999 
ary = (0..1000).to_a

compare do 
  last { ary.select { |i| i <= n }.last }
  bsearch_index { ary[ary.bsearch_index { |i| n < i } - 1] }
  Bustikiller { ary.reverse.find { |i| i <= n } }
  engineerDave1 { ary.take_while {|x| x <= n}[-1] }
  engineerDave2 { ary.reduce(nil) {|a,x| a = x if x <= n; a} }
end

# >> Running each test 4096 times. Test will take about 17 seconds.
# >> bsearch_index is faster than Bustikiller by 3x ± 1.0
# >> Bustikiller is faster than engineerDave1 by 21x ± 1.0
# >> engineerDave1 is faster than last by 30.000000000000004% ± 10.0%
# >> last is faster than engineerDave2 by 10.000000000000009% ± 10.0%

The problems with the simple use of bsearch_index are noted in the comments below. I think it'd be worth looking into using it but wrapping it with some logic to fix the problems mentioned. I'll make this a community answer so anyone coming up with the rest of the code can add it.
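For instance, here is a sketch of such a wrapper handling the two boundary cases raised in the comments (every element greater than n, and every element less than or equal to n); the method name `floor_element` is my own, not from any answer here:

```ruby
# Return the largest element of the sorted array `ary` that is <= n,
# or nil if every element is greater than n.
def floor_element(ary, n)
  idx = ary.bsearch_index { |i| n < i }
  return ary.last if idx.nil? # every element is <= n
  return nil if idx.zero?     # every element is > n
  ary[idx - 1]
end

ary = [1, 5, 8, 12, 16, 17]
floor_element(ary, 11) # => 8
floor_element(ary, 0)  # => nil
floor_element(ary, 99) # => 17
```

This keeps the O(log n) lookup while avoiding the `NoMethodError` and the spurious `max` result.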

the Tin Man
  • This should be `i < n` in case same value as `n` is in the array. – Sagar Pandya Jan 09 '17 at 17:39
  • The question is somewhat unclear. "closest without going over" implies equal. – the Tin Man Jan 09 '17 at 17:43
  • Nice compact `bsearch_index` call, but this will return `max` if none is found, since in that case index `0` is found and thus element `-1` is returned. – akuhn Jan 09 '17 at 18:18
  • Indeed. If every element is greater than n, it returns `max`, if every element is smaller, it raises `NoMethodError: undefined method '-' for nil:NilClass`. – Eric Duminil Jan 09 '17 at 18:21
  • Hmm. Good points. `bsearch` and `bsearch_index` are a good starting point though. I'll have to think about how to contain the boundary problems. I made the answer a community wiki, so anyone coming up with solutions can add them. – the Tin Man Jan 09 '17 at 18:23
  • You need `n <= i` in your bsearch BTW. – Eric Duminil Jan 09 '17 at 19:01
[1, 5, 8, 12, 16, 17].reverse.find{|n| n <= 11}
# => 8
Bustikiller
  • Although it works, there is a caveat on memory consumption if this were a large array. This is why reverse_each exists: https://github.com/JuanitoFatas/fast-ruby#enumerablereverseeach-vs-enumerablereverse_each-code – engineerDave Jan 09 '17 at 17:30
  • Use `arr.reverse_each.find { ... }` – akuhn Jan 09 '17 at 18:00

You can use binary search with the Array#bsearch method introduced in Ruby 2.0.

Alas, the call looks a bit ugly since we need to do a reverse search, but binary search has O(log n) complexity and is the fastest you can get:

arr = [1, 5, 8, 12, 16, 17]
index = (1..arr.size).bsearch { |i| arr[-i] <= 11 }
index ? arr[-index] : nil
# => 8

If your array is short, just use find as proposed by Bustikiller:

[1, 5, 8, 12, 16, 17].reverse_each.find { |n| n <= 11 }

Why? For small n there is no difference between O(n) and O(log n), and as always, readability comes before premature optimization.

akuhn

No reverse or reverse_each!¹

def biggest_but_do_not_go_over(arr, nbr_to_not_go_over)
  return nil if arr.empty? || arr.first > nbr_to_not_go_over
  enum = arr.to_enum
  n = nil # anything
  loop do
    n = enum.next
    return n if nbr_to_not_go_over < enum.peek
  end
  n
end

arr = [1, 5, 8, 12, 16, 17]

biggest_but_do_not_go_over(arr,  0) #=> nil
biggest_but_do_not_go_over(arr,  1) #=>  1
biggest_but_do_not_go_over(arr,  7) #=>  5
biggest_but_do_not_go_over(arr, 16) #=> 16
biggest_but_do_not_go_over(arr, 17) #=> 17
biggest_but_do_not_go_over(arr, 18) #=> 17

A StopIteration exception is raised when the enumerator `enum` has generated all elements of `arr` and `enum.peek` is executed. `Kernel#loop` handles that exception by breaking out of the loop.

¹ Not better, just different.

Cary Swoveland

"Always ordered and unique" would imply to me that you might be faced with a large array, so it's best to operate on as few items as possible.

[1, 5, 8, 12, 16, 17].take_while {|x| x <= 11}[-1]

or a reduce function with an accumulator

[1, 5, 8, 12, 16, 17].reduce {|a,x| x <= 11 ? (a = x) : (break a)}

Benchmarks when the number being searched for is 500000 out of 10000000. I also added the Tin Man's awesome bsearch_index approach to the benchmarks; it would be the way to go. Obviously I didn't write mine with much optimization, just quick and dirty code. His should be the accepted answer, IMO.

Rehearsal -------------------------------------------------
reverse         0.200000   0.000000   0.200000 (  0.209630)
take_while      0.010000   0.000000   0.010000 (  0.007428)
reduce          0.010000   0.000000   0.010000 (  0.012950)
bsearch_index   0.000000   0.000000   0.000000 (  0.001008)
---------------------------------------- total: 0.220000sec

                    user     system      total        real
reverse         0.200000   0.000000   0.200000 (  0.199569)
take_while      0.010000   0.000000   0.010000 (  0.007247)
reduce          0.010000   0.000000   0.010000 (  0.012440)
bsearch_index   0.000000   0.000000   0.000000 (  0.000006)

Memory benchmarks when the number being searched for is 500000 out of 10000000:

Calculating -------------------------------------
             reverse    26.667M memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
          take_while    40.000  memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
              reduce     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
       bsearch_index     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
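The benchmark script itself isn't shown; a minimal `Benchmark.bmbm` version of the timing run above might look like this (the labels, array size, and target value come from the text; the exact report bodies are my assumption):

```ruby
require 'benchmark'

n   = 500_000
ary = (0..10_000_000).to_a

# bmbm runs a rehearsal pass first, then the measured pass,
# matching the two tables of output above.
Benchmark.bmbm do |x|
  x.report('reverse')       { ary.reverse.find { |i| i <= n } }
  x.report('take_while')    { ary.take_while { |i| i <= n }[-1] }
  x.report('reduce')        { ary.reduce { |a, i| i <= n ? (a = i) : (break a) } }
  x.report('bsearch_index') { ary[ary.bsearch_index { |i| n < i } - 1] }
end
```

Note how `reverse` has to copy and then scan most of the array, while `bsearch_index` touches only O(log n) elements, which accounts for the gap in both the timing and the memory numbers.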
engineerDave
  • `take_while` is a bad choice for large arrays because it creates a copy, and your `reduce` code scans the entire array rather than breaking when found. Just pointing these out because you advertise your solutions as fit for large data. – akuhn Jan 09 '17 at 17:53
  • @akuhn take_while is one simple solution, but in case it's a large array I quickly did the reduce. Of course reduce would traverse the array, but with minimal memory impact. The assumption is sorted and unique, hence why I wrote it that way. It could obviously be further improved with something like a bisect to find the sweet spot and then proceed from there. – engineerDave Jan 09 '17 at 18:13