Is it possible to write a JMESPath expression that returns a list of object names where a specific subproperty value is set? In the example below I'd like to get a list of all hostnames where fileexists.stat.exists is set to true.

My goal is to use the Ansible hostvars structure to get a list of all hosts where a specific file is present.

{
"hostvars": {
    "oclab1n01.example.org": {
        "fileexists": {
            "changed": false, 
            "failed": false, 
            "stat": {
                "exists": false
            }
        }
    }, 
    "oclab1n02.example.org": {
        "fileexists": {
            "changed": false, 
            "failed": false, 
            "stat": {
                "exists": true
            }
        }
    }, 
    "oclab1n03.example.org": {
        "fileexists": {
            "changed": false, 
            "failed": false, 
            "stat": {
                "exists": true
            }
        }
    }
} }

In this example I'd like to get the following output

["oclab1n02.example.org", "oclab1n03.example.org"]
Ilmar Kerm

1 Answer

Short answer (TL;DR)

Yes, this is possible, but it is extremely cumbersome because, at least for JMESPath, the source dataset is poorly normalized for this kind of general-purpose query.

Context

  • JMESPath query language
  • querying object properties for deeply nested objects

Problem

  • How to construct a JMESPath query with filter expressions
  • The goal is to filter on objects with arbitrarily nested object properties

Solution

  • This can be done with JMESPath, but the operation will be cumbersome
  • One problematic issue: the source dataset is poorly normalized for this kind of JMESPath query
  • In order to construct the JMESPath query, we have to assume all the primary object keys are known before the query is written
  • In this specific example, we have to know in advance that there are exactly three hostnames, and what they are. This is unfavorable if we want the flexibility to handle an arbitrary number of hostnames
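
Because the keys must be known when the query is written, one workaround is to generate the query text from whatever hostnames are present. The following is a minimal sketch in Python, not part of the original answer: the helper name build_query is hypothetical, it assumes the same fileexists.stat.exists layout as the question, and it assumes hostnames contain no backtick characters (those would need extra escaping inside a JMESPath literal).

```python
import json


def build_query(hostnames):
    """Build one multi-select hash per hostname, then filter and project."""
    parts = []
    for name in hostnames:
        # json.dumps yields a double-quoted string, usable both as a
        # quoted identifier and (inside backticks) as a JSON literal.
        quoted = json.dumps(name)
        parts.append(
            '{"hostname": `%s`, "fileexists_stat_exists": '
            "hostvars.%s.fileexists.stat.exists}" % (quoted, quoted)
        )
    return (
        "[" + ", ".join(parts) + "]"
        " | [?fileexists_stat_exists == `true`]"
        " | [*].hostname"
    )


query = build_query(["oclab1n01.example.org", "oclab1n02.example.org"])
print(query)
```

The resulting string has the same shape as a hand-written query with one entry per host, so it still grows with the number of hosts.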

Example

The following (way-too-huge) JMESPath query ...

  [
    {
      "hostname": `oclab1n01.example.org`
      ,"fileexists_stat_exists":  @.hostvars."oclab1n01.example.org".fileexists.stat.exists
    }
    ,{
      "hostname": `oclab1n02.example.org`
      ,"fileexists_stat_exists":  @.hostvars."oclab1n02.example.org".fileexists.stat.exists
    }
    ,{
      "hostname": `oclab1n03.example.org`
      ,"fileexists_stat_exists":  @.hostvars."oclab1n03.example.org".fileexists.stat.exists
    }
  ]|[? @.fileexists_stat_exists == `true`]|[*].hostname

returns the following desired result

  [
    "oclab1n02.example.org",
    "oclab1n03.example.org"
  ]

Pitfalls

  • One major pitfall with this use-case is the source dataset is poorly normalized for this kind of query
  • A more flattened data structure would be easier to query
  • Consequently, if possible, a better approach would be to flatten the source dataset before running jmespath queries against it
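
The flattening step suggested above could be sketched as follows. This is an illustration in Python under stated assumptions, not something from the original answer: the function name flatten_hostvars and the output field names (filechanged, filefailed, filestat_exists) are hypothetical choices, and a host without a stat result is treated as not having the file.

```python
def flatten_hostvars(data):
    """Turn {"hostvars": {hostname: record}} into a list of flat records."""
    rows = []
    for hostname, record in data["hostvars"].items():
        fileexists = record.get("fileexists", {})
        rows.append({
            "hostname": hostname,
            "filechanged": fileexists.get("changed"),
            "filefailed": fileexists.get("failed"),
            # Assumption: a missing stat result counts as "file not present".
            "filestat_exists": fileexists.get("stat", {}).get("exists", False),
        })
    return {"hostvars": rows}


data = {
    "hostvars": {
        "oclab1n01.example.org": {
            "fileexists": {"changed": False, "failed": False, "stat": {"exists": False}}
        },
        "oclab1n02.example.org": {
            "fileexists": {"changed": False, "failed": False, "stat": {"exists": True}}
        },
        "oclab1n03.example.org": {
            "fileexists": {"changed": False, "failed": False, "stat": {"exists": True}}
        },
    }
}

flat = flatten_hostvars(data)
# Equivalent in plain Python to filtering the list on filestat_exists.
present = [row["hostname"] for row in flat["hostvars"] if row["filestat_exists"]]
print(present)
```

Once the data is a list of records, the number of hosts no longer has to be known when the filter is written.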

Alternate example with a different original dataset

If the original data were organized as a list of objects, instead of a set of nested objects within objects, it would be easier to search, sort and filter the list without having to know in advance how many hostname entries are involved.

{"hostvars": [
    {"hostname":"oclab1n01.example.org"
      ,"fileexists":        true
      ,"filechanged":       false
      ,"filefailed":        false
      ,"filestat_exists":   false
      ,"we_can_even_still_deeply_nest":{"however":
           {"im_only_doing":"it here","to":"prove a point"}
         }
     }
    ,{"hostname":"oclab1n02.example.org"
      ,"fileexists":        true
      ,"filechanged":       false
      ,"filefailed":        false
      ,"filestat_exists":   true
     }
    ,{"hostname":"oclab1n03.example.org"
      ,"fileexists":        true
      ,"filechanged":       false
      ,"filefailed":        false
      ,"filestat_exists":   true
     }
  ]
}

The above re-normalized dataset can now be easily queried

hostvars|[? @.filestat_exists == `true`]|[*].hostname
dreftymac
  • Why do you think that a "key -> record" relationship is poorly normalized? It seems very idiomatic in configuration management software such as salt and ansible, yet I do find that most jinja filters and, as I've learned today, JMESPath work great for `[val1, val2, val3] | map(f) -> [f(val1), f(val2), f(val3)]`, but not so much for `{key1: val1, key2: val2, key3: val3} | map(f) -> {key1: f(val1), key2: f(val2), key3: f(val3)}`. Is this somehow fundamentally hard to implement? – LLlAMnYP Jun 11 '21 at 11:56
  • @LLlAMnYP **//Why [...] a "key -> record" relationship is poorly normalized//** Only for this specific context, not in general. As you mentioned, it is a routinely encountered data pattern. The point here is to work with JMESPath rather than fight against it. **//Is this somehow fundamentally hard to implement//** Not fundamentally. The issue is whether the designer(s) chose to make the mapping data type iterable, just like the list type. It is just a design decision. Feel free to think of it like relational databases, where rows (a list) are easier to iterate than columns (a mapping). – dreftymac Jun 11 '21 at 12:15