(Not #dig!) How to determine a key exists in a deeply nested Ruby Hash?

Question

Is there an "easy" way, short of hand-writing the kind of nested Hash/Array traversal performed by Hash#dig, that I can determine if a key is present in a deeply nested Hash? Another way to ask this is to say "determine if any value is assigned".

There is a difference between a key having nothing assigned and it having an explicit nil assigned, especially if the Hash was constructed with a missing-key default value other than nil!

h = { :one => { :two => nil }}
h.dig(:one, :two).nil? # => true; but :two *is* present; it is assigned "nil". 
h[:one].key?(:two) # => true, because the key exists

h = { :one => {}}
h.dig(:one, :two).nil? # => true; :two *is not* present; no value is assigned.
h[:one].key?(:two) # => FALSE, because the key does not exist
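
As a small illustration of the default-value point, here is a sketch (the variable names are made up for this example) showing that #dig can return a non-nil value for a key that was never assigned, while key? still reports what is actually stored:

```ruby
# Hypothetical data: the inner Hash uses 0, not nil, as its missing-key default.
inner = Hash.new(0)
outer = { :one => inner }

p outer.dig(:one, :two)    # => 0, although :two was never assigned
p outer[:one].key?(:two)   # => false
p inner.size               # => 0, the lookup did not store anything
```

So a nil check on the result of #dig can mislead in both directions: nil may be an assigned value, and a non-nil result may be only a default.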

https://stackoverflow.com/questions/1820451/ruby-style-how-to-check-whether-a-nested-hash-element-exists could be one way. https://stackoverflow.com/questions/15031412/search-for-key-in-a-nested-hash-in-rails is another. The short answer is no, there isn't. What's your use-case? — Cassandra S., Mar 21 '21 at 21:27
The first ultimately suggests #dig and both otherwise hand-code Ruby iterators (which in those cases don't handle Arrays, so aren't functionally equivalent to #dig anyway). Use case is a complex nested Hash/Array inbound payload recursively iterated via a schema object providing all possible mappings to a linear attribute set, with a path array maintaining position. I could hand-write a sort of "#dig?", and it'd be elegant in-place, but I'll most likely just redesign the iterator method to be less elegant but also less maintenance-costly rather than do that. — Andrew Hodgkinson, Mar 21 '21 at 21:35
To be a little more specific: It's a SCIM v2 implementation that draws on prior but all-cases incomplete work in ScimEngine, ScimRails and SCIM Query Filter Parser to provide a more comprehensive solution. The question at hand arises from PUT semantics described by https://tools.ietf.org/html/rfc7644#section-3.5.1 where I wish to maintain the "MAY be assumed to not be asserted by the client" behaviour. We will be releasing this work under an MIT licence once feature-complete and tested. — Andrew Hodgkinson, Mar 21 '21 at 21:39
Let's make your question more precise. If `h = { :one => { :two => { :four => nil }, :three => { :five => nil } } }` you might ask if there is a nested hash for which, say, `:four` is a key. You could use recursion to confirm there is such a key, and if desired produce a sequence of keys that drills down to it. But that is not what you are asking. Your question might be "Does `h` have a key `:one`, whose value is a hash that has a key `:two`, whose value is a hash that has a key `:four`?". You could easily translate that to code, using `dig` or not... — Cary Swoveland, Mar 22 '21 at 01:20
...To use `dig`, `g = h.dig(:one, :two); g.is_a?(Hash) && g.key?(:four)`. — Cary Swoveland, Mar 22 '21 at 01:21
@CarySwoveland just stumbled across this question and I welcome your critique of my proposal — engineersmnky, Jun 12 '21 at 03:49
@engineersmnky, I'll have a look tomorrow (but note the `dig` family is a triumvirate, as there is [OpenStruct#dig](https://ruby-doc.org/stdlib-2.7.0/libdoc/ostruct/rdoc/OpenStruct.html#method-i-dig) also). — Cary Swoveland, Jun 12 '21 at 05:24

score 3 · Answer 1 · answered Mar 21 '21 at 21:53

If you are purely checking for the existence of a key, you can combine dig and key?: dig down to the parent Hash, then call key? with the last key in your series of keys.

input_hash = {
  hello: {
    world: {
      existing: nil,
    }
  }
}

# Used !! to make the result boolean

!!input_hash.dig(:hello, :world)&.key?(:existing) # => true
!!input_hash.dig(:hello, :world)&.key?(:not_existing) # => false
!!input_hash.dig(:hello, :universe)&.has_key?(:not_existing) # => false
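
One way to package this up, sketched under the assumption that every path element except the last addresses a Hash or Array. The helper name nested_key? is made up here and is not a core method. The is_a?(Hash) check avoids a NoMethodError when the parent turns out to be something other than a Hash:

```ruby
# Hypothetical helper: true only if the parent found at `path` is a Hash holding `key`.
def nested_key?(hash, *path, key)
  parent = path.empty? ? hash : hash.dig(*path)
  parent.is_a?(Hash) && parent.key?(key)
end

input_hash = { hello: { world: { existing: nil } } }

p nested_key?(input_hash, :hello, :world, :existing)     # => true
p nested_key?(input_hash, :hello, :world, :not_existing) # => false
p nested_key?(input_hash, :hello, :universe, :existing)  # => false
p nested_key?(input_hash, :hello)                        # => true
```

Note that #dig itself still raises TypeError if an intermediate value (an Integer, say) does not respond to #dig.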

Rein Avila

Yep, in absence of anything "better" in the core library, that's a decent approach. The only thing it might trip up on would be checking for value presence if the terminating node was an array index (not an issue under my specific use case). – Andrew Hodgkinson Mar 21 '21 at 22:13
Works even without `!!` – zekromWex May 17 '22 at 11:29
Use `!!` only if you want a strict boolean rather than a possible `nil`. If any of the keys passed to `.dig(...)` is not present, the expression evaluates to `nil`. – Rein Avila May 18 '22 at 04:55

engineersmnky · Answer 2 · 2021-06-14T12:58:34.000

Inspired by your core extension suggestion, I updated the implementation a bit to better mimic that of #dig:

requires 1+ arguments
raises TypeError if an intermediate value is not nil, does not respond to dig?, and there are additional arguments to be "dug"

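module Diggable
  def dig?(arg, *args)
    return self.member?(arg) if args.empty?
    if val = self[arg] and val.respond_to?(:dig?)
      val.dig?(*args)
    else
      val.nil? ? false : raise(TypeError, "#{val.class} does not have a #dig? method")
    end
  end
end

[Hash, Struct, Array].each { |klass| klass.send(:include, Diggable) }

class Array
  def dig?(arg, *args)
    # Valid indices run from -size to size - 1 (negative ones count from the end).
    return arg.between?(-self.size, self.size - 1) if args.empty?
    super
  end
end

class Struct
  def dig?(arg, *args)
    # Struct inherits Enumerable#member?, which tests values, so check member names here.
    # Only Symbol and String names are handled, not Integer positions.
    return self.members.include?(arg.to_sym) if args.empty?
    super
  end
end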

if defined?(OpenStruct)
  class OpenStruct
    def dig?(arg,*args)
      self.to_h.dig?(arg,*args)
    end
  end
end

Usage

Foo = Struct.new(:a)

hash = {:one=>1, :two=>[1, 2, 3], :three=>[{:one=>1, :two=>2}, "hello", Foo.new([1,2,3]), {:one=>{:two=>{:three=>3}}}]}

hash.dig? #=> ArgumentError
hash.dig?(:one) #=> true
hash.dig?(:two, 0) #=> true
hash.dig?(:none) #=> false
hash.dig?(:none, 0) #=> false
hash.dig?(:two, -1) #=> true
hash.dig?(:two, 10) #=> false
hash.dig?(:three, 0, :two) #=> true
hash.dig?(:three, 0, :none) #=> false
hash.dig?(:three, 2, :a) #=> true
hash.dig?(:three, 3, :one, :two, :three, :f) #=> TypeError

Very nicely done, informative and educational. My only quibble is the inclusion of the optional `self.`'s, but I recognize that is a stylistic issue. — Cary Swoveland, Jun 13 '21 at 03:00
@CarySwoveland some of the `self` references are required e.g. `self[arg]`. Generally speaking I use `self`, as I have here, for extending functionality where the context of the method being called would otherwise be "unknown" to the reader and may be mistaken for an undefined local variable. — engineersmnky, Jun 14 '21 at 13:05

Fravadona · Answer 3 · 2021-04-04T20:17:49.137

Here is a concise way of doing it which works with nested Array and Hash (and any other object that responds to fetch).

def deep_fetch? obj, *argv
  argv.each do |arg|
    return false unless obj.respond_to? :fetch
    obj = obj.fetch(arg) { return false }
  end
  true
end

obj = { hello: [ nil, { world: nil } ] }
deep_fetch? obj, :hell # => false
deep_fetch? obj, :hello, 0 # => true
deep_fetch? obj, :hello, 2 # => false
deep_fetch? obj, :hello, 0, :world # => false
deep_fetch? obj, :hello, 1, :world # => true
deep_fetch? obj, :hello, :world # => TypeError (no implicit conversion of Symbol into Integer)

The previous code raises an error when accessing an Array element with a non-Integer index (just like Array#dig), which is not always the behavior one is looking for. The following code returns false in that case too, but the blanket rescue is not good practice, because it also hides unrelated errors:

def deep_fetch? obj, *argv
  argv.each { |arg| obj = obj.fetch(arg) } and true rescue false
end

obj = { hello: [ nil, { world: nil } ] }
deep_fetch? obj, :hell # => false
deep_fetch? obj, :hello, 0 # => true
deep_fetch? obj, :hello, 2 # => false
deep_fetch? obj, :hello, 0, :world # => false
deep_fetch? obj, :hello, 1, :world # => true
deep_fetch? obj, :hello, :world # => false
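
A middle ground, sketched here under the assumption that only Hash and Array nodes (or nil) are expected, is to rescue just the exceptions fetch is known to raise for them. The method name deep_fetch_narrow? and the exact exception list are choices made for this example:

```ruby
# Hypothetical variant: rescue only the errors a failed lookup is expected to raise.
#   KeyError      - Hash#fetch on a missing key
#   IndexError    - Array#fetch on an out-of-range index
#   TypeError     - Array#fetch with a non-Integer index
#   NoMethodError - a node (such as nil) that has no fetch method
def deep_fetch_narrow?(obj, *path)
  path.each { |step| obj = obj.fetch(step) }
  true
rescue KeyError, IndexError, TypeError, NoMethodError
  false
end

obj = { hello: [ nil, { world: nil } ] }
p deep_fetch_narrow?(obj, :hell)             # => false
p deep_fetch_narrow?(obj, :hello, 0)         # => true
p deep_fetch_narrow?(obj, :hello, 2)         # => false
p deep_fetch_narrow?(obj, :hello, 0, :world) # => false
p deep_fetch_narrow?(obj, :hello, 1, :world) # => true
p deep_fetch_narrow?(obj, :hello, :world)    # => false
```

Anything outside that list still propagates to the caller.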

A neat and compact approach, tho I'd hesitate to use that in anything too performance-sensitive, since raising and rescuing exceptions is a costly form of flow control in Ruby. — Andrew Hodgkinson, Mar 22 '21 at 22:31
"rescue is not good practice" Actually this is exactly how [Matz recommended](https://bugs.ruby-lang.org/issues/11762#note-17) handling `#dig` errors if you want to safeguard from a corrupted tree. — engineersmnky, Jun 12 '21 at 04:15

Andrew Hodgkinson · Answer 4 · 2021-03-22T22:34:22.097

For reference - taking the unusual step of answering my own question ;-) - here's one of several ways I could solve this if I just wanted to write lots of Ruby.

def dig?(obj, *args)
  arg = args.shift()

  return case obj
    when Array
      if args.empty?
        arg >= 0 && arg < obj.size
      else
        dig?(obj[arg], *args)
      end
    when Hash
      if args.empty?
        obj.key?(arg)
      else
        dig?(obj[arg], *args)
      end
    when nil
      false
    else
      raise ArgumentError
  end
end

Of course, one could also have opened up classes like Array and Hash and added #dig? to those, if you prefer core extensions over explicit methods:

class Hash
  def dig?(*args)
    arg = args.shift()

    if args.empty?
      self.key?(arg)
    else
      self[arg]&.dig?(*args) || false
    end
  end
end

class Array
  def dig?(*args)
    arg = args.shift()

    if args.empty?
      arg >= 0 && arg < self.size
    else
      self[arg]&.dig?(*args) || false
    end
  end
end

...which would raise NoMethodError rather than ArgumentError if the #dig? arguments led to a non-Hash/Array node.

Obviously it would be possible to compress those down into more cunning / elegant solutions that use fewer lines, but the above has the benefit of IMHO being pretty easy to read.

In the scope of the original question, though, the hope was to lean more on anything Ruby has out-of-the-box. We've collectively acknowledged early-on that there is no single-method solution, but the answer from @AmazingRein gets close by reusing #dig to avoid recursion. We might adapt that as follows:

def dig?(obj, *args)
  last_arg = args.pop()
  obj      = obj.dig(*args) unless args.empty?

  return case obj
    when Array
      last_arg >= 0 && last_arg < obj.size
    when Hash
      obj.key?(last_arg)
    when nil
      false
    else
      raise ArgumentError
  end
end

...which isn't too bad, all things considered.

# Example test...

hash = {:one=>1, :two=>[1, 2, 3], :three=>[{:one=>1, :two=>2}, "hello", {:one=>{:two=>{:three=>3}}}]}

puts dig?(hash, :one)
puts dig?(hash, :two, 0)
puts dig?(hash, :none)
puts dig?(hash, :none, 0)
puts dig?(hash, :two, -1)
puts dig?(hash, :two, 10)
puts dig?(hash, :three, 0, :two)
puts dig?(hash, :three, 0, :none)
puts dig?(hash, :three, 2, :one, :two, :three)
puts dig?(hash, :three, 2, :one, :two, :none)
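
For reference, the results one would expect from that test can be checked with the sketch below. It restates the dig-based helper under the made-up name path_present? and assumes an exclusive upper bound on Array indices, with negative indices treated as absent:

```ruby
# Hypothetical stand-in for the dig-based helper, so this block runs on its own.
def path_present?(obj, *args)
  last_arg = args.pop
  obj      = obj.dig(*args) unless args.empty?

  case obj
  when Array then last_arg >= 0 && last_arg < obj.size
  when Hash  then obj.key?(last_arg)
  when nil   then false
  else raise ArgumentError
  end
end

hash = {:one=>1, :two=>[1, 2, 3], :three=>[{:one=>1, :two=>2}, "hello", {:one=>{:two=>{:three=>3}}}]}

p path_present?(hash, :one)                           # => true
p path_present?(hash, :two, 0)                        # => true
p path_present?(hash, :none)                          # => false
p path_present?(hash, :none, 0)                       # => false
p path_present?(hash, :two, -1)                       # => false
p path_present?(hash, :two, 3)                        # => false (one past the end)
p path_present?(hash, :two, 10)                       # => false
p path_present?(hash, :three, 0, :two)                # => true
p path_present?(hash, :three, 0, :none)               # => false
p path_present?(hash, :three, 2, :one, :two, :three)  # => true
p path_present?(hash, :three, 2, :one, :two, :none)   # => false
```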

It's not that weird to write an answer to your own question, and I understand your point. That said, IMHO, `Array#dig` has a flaw which is to raise an error when using a "non-implicitly-convertible-to-integer" index. This is inconsistent with its behavior in other cases. For example, why would `[].dig(:key)` raise an error but `[].dig(0,:key)` not? — Fravadona, Mar 23 '21 at 09:47
Yeah, depending on the requirements and how error-tolerant you wanted it to be, you might not want #dig. In my case I think I'm happy as-is, since attempting to call `[]` on an Array with a non-numeric value raises an error normally. I'm expecting the given path to be structurally valid, but I just don't know if all parts of that structure are present. — Andrew Hodgkinson, Mar 23 '21 at 19:47
Posted an answer based on your concept of modifying core classes. It should act the same way that `dig` does as far as arguments and errors are concerned but it returns a boolean instead. Thanks for the fun project. — engineersmnky, Jun 13 '21 at 00:21
