Given this input:

[
  {
    "Id": "cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b",
    "Names": [
      "condescending_jones",
      "loving_hoover"
    ]
  },
  {
    "Id": "186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa",
    "Names": [
      "foo_data"
    ]
  },
  {
    "Id": "a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19",
    "Names": [
      "jovial_wozniak"
    ]
  },
  {
    "Id": "76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623",
    "Names": [
      "bar_data"
    ]
  }
]

I'm trying to construct a jq filter that returns the Ids of all objects whose inner Names array has no entry containing "data", with the output newline-separated. For the above data, the output I'd like is:

cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b
a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19

I think I'm somewhat close with this:

(. - select(.Names[] contains("data"))) | .[] .Id

but the select filter is not correct and it doesn't compile (I get the error: syntax error, unexpected IDENT).

syntagma
Abe Voelker

3 Answers

Very close! In your select expression, you have to use a pipe (|) before contains.

This filter produces the expected output.

. - map(select(.Names[] | contains("data"))) | .[].Id
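
For reference, here is a minimal sketch of running that filter end to end. It assumes jq is installed and that the sample input from the question is held in a shell variable (the names `input` and `ids` are only for illustration). The -r option prints each Id as a raw string, one per line, rather than as a quoted JSON string.

```shell
# Sample data from the question, compacted. The variable name is illustrative.
input='[
  {"Id": "cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b",
   "Names": ["condescending_jones", "loving_hoover"]},
  {"Id": "186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa",
   "Names": ["foo_data"]},
  {"Id": "a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19",
   "Names": ["jovial_wozniak"]},
  {"Id": "76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623",
   "Names": ["bar_data"]}
]'

# Subtract the objects that have a name containing "data" from the full array,
# then print the Id of each remaining object as a raw string.
ids=$(printf '%s' "$input" |
  jq -r '. - map(select(.Names[] | contains("data"))) | .[].Id')

printf '%s\n' "$ids"
```

Array subtraction removes every element equal to one in the right-hand array, so it does not matter if select emits the same object more than once here.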

The jq Cookbook has an example of the syntax.

Filter objects based on the contents of a key

E.g., I only want objects whose genre key contains "house".

$ json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"genre": "dubstep"}]'
$ echo "$json" | jq -c '.[] | select(.genre | contains("house"))'
{"genre":"deep house"}
{"genre":"progressive house"}

Colin D asks how to preserve the JSON structure of the array, so that the final output is a single JSON array rather than a stream of JSON objects.

The simplest way is to wrap the whole expression in an array constructor:

$ echo "$json" | jq -c '[ .[] | select( .genre | contains("house")) ]'
[{"genre":"deep house"},{"genre":"progressive house"}]

You can also use the map function:

$ echo "$json" | jq -c 'map(select(.genre | contains("house")))'
[{"genre":"deep house"},{"genre":"progressive house"}]

map unpacks the input array, applies the filter to every element, and creates a new array. In other words, map(f) is equivalent to [.[]|f].
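
A quick way to convince yourself of that equivalence is to run both forms on the same input and compare the results. This is only a sketch, assuming jq is installed; it reuses the genre data from above, and the variable names are illustrative.

```shell
json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"genre": "dubstep"}]'

# Same filter (.genre) applied once through map and once through [.[] | f].
with_map=$(echo "$json" | jq -c 'map(.genre)')
with_brackets=$(echo "$json" | jq -c '[.[] | .genre]')

echo "$with_map"
echo "$with_brackets"
```

Both commands print the same single array of the three genre strings.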

Iain Samuel McLean Elder

Here is another solution, which uses any/2:

map(select(any(.Names[]; contains("data"))|not)|.Id)[]

With the sample data and the -r option, it produces:

cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b
a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19
jq170727
  • Exactly what I was looking for - why does this work with a semi-colon `.Names[] ; contains()` and not with a pipe `.Names[] | contains()`? – Matt Mar 12 '18 at 18:33
  • Ah, it's the `any(generator; condition)` form. I found that without using `any()` I would end up with duplicates in my results if `select()` matched more than once on the same object. – Matt Mar 12 '18 at 18:40
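
The duplicate behaviour described in the comments can be reproduced with a small sketch. It assumes jq 1.5 or later (for any/2) and uses made-up two-object input with short Ids, purely for illustration. When `.Names[]` is piped straight into the condition, select sees one boolean per name and emits the object once for every true value; any/2 collapses the generator into a single boolean first.

```shell
# Hypothetical input: object "a" has two names, neither containing "data".
input='[{"Id":"a","Names":["x_one","x_two"]},{"Id":"b","Names":["foo_data"]}]'

# Piped form: the condition is evaluated once per name, so "a" is emitted twice.
piped=$(printf '%s' "$input" |
  jq -r '.[] | select(.Names[] | contains("data") | not) | .Id')

# any/2 form: one boolean per object, so "a" is emitted once.
with_any=$(printf '%s' "$input" |
  jq -r '.[] | select(any(.Names[]; contains("data")) | not) | .Id')

printf 'piped:\n%s\n' "$piped"
printf 'any/2:\n%s\n' "$with_any"
```

The piped form prints `a` twice, while the any/2 form prints it once.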

The following jq expression, which combines map and select, produces the intended outcome:

aws ecr describe-images \
  --registry-id <aws_account_id> \
  --repository-name <ecr_repository_name> \
  --region <aws_region> \
  --no-cli-pager \
  --filter tagStatus=TAGGED \
| jq '.imageDetails | map(select(.imageTags[] | contains("version_tag")))'