Using yq (or any other tool), how can I return the full YAML path of an arbitrary line number?

E.g. with this file:

a:
  b:
    c: "foo"
  d: |
    abc
    def

I want to get the full path of line 2; it should yield a.b.c. Line 0 → a, line 4 → a.d (multiline support), etc.

Any idea how I could achieve that?

Thanks

Julien Tanay
  • line number need not be an accurate representation in a grammar that supports new lines (e.g. in YAML), e.g. what if there are empty lines between each of the fields? – Inian Nov 04 '22 at 14:09
  • In this context, we don't care. Given a specific line, it should deduce the current key and then the full path. I'll amend with a multiline example. – Julien Tanay Nov 04 '22 at 14:32

2 Answers

I have coded two solutions that differ slightly in their behaviour (see the remarks below).

Both use the YAML processor mikefarah/yq.

I also tried to solve the problem with kislyuk/yq, but it is not suitable because input_line_number only works in combination with the --raw-input option.

Version 1

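FILE='sample.yml'
# LINE is 1-based: the first line of the file is line 1
export LINE=1
# Select every node that starts exactly on LINE
yq e '[..
       | select(line == env(LINE))
       | {"line": line,
          "path": path | join("."),
          "type": type,
          "value": .}
      ]' "$FILE"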

Remarks

  • LINE=3 returns two results, because two nodes start on line 3:
    1. the map at path 'a.b', which begins with the key 'c'
    2. the string value 'foo' of key 'c'.
  • LINE=5 does not return a match, because the multiline text node starts on line 4.
  • The results are wrapped in an array, as multiple nodes can be returned.
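
Note that the question counts lines from 0 (its line 2 is the one holding c: "foo"), whereas the LINE values used here count from 1 (LINE=3 for the same line). A minimal sketch of the conversion, assuming the caller supplies a zero-based number; the helper name to_yq_line is hypothetical and not part of yq:

```shell
# Hypothetical helper: convert a zero-based line number (as used in the
# question) to the one-based numbering used by the LINE values in this answer.
to_yq_line() {
  echo $(( $1 + 1 ))
}

# Line 2 in the question corresponds to LINE=3 here.
export LINE="$(to_yq_line 2)"
echo "$LINE"
```

The exported LINE can then be used unchanged with the yq command above.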

Output for LINE=1

- line: 1
  path: ""
  type: '!!map'
  value:
    a:
      b:
        c: "foo"
    d: |-
      abc
      def

Output for LINE=2

- line: 2
  path: a
  type: '!!map'
  value:
    b:
      c: "foo"

Output for LINE=3

- line: 3
  path: a.b
  type: '!!map'
  value:
    c: "foo"
- line: 3
  path: a.b.c
  type: '!!str'
  value: "foo"

Output for LINE=4

- line: 4
  path: d
  type: '!!str'
  value: |-
    abc
    def

Output for LINE=5

[]

Version 2

FILE='sample.yml'
export LINE=1
# Stop if LINE is past the end of the file
if [[ $(wc -l < "$FILE") -lt $LINE ]]; then
  echo "$FILE has fewer than $LINE lines"
  exit 1
fi
# Collect every node that starts on or before LINE, then keep the last one
yq e '[..
       | select(line <= env(LINE))
       | {"line": line,
          "path": path | join("."),
          "type": type,
          "value": .}
      ]
      | sort_by(.line, .type)
      | .[-1]' "$FILE"

Remarks

  • At most one node is returned, even if several nodes start on the selected line, so the result does not have to be wrapped in an array. Which node of a line is returned is controlled by the sort_by function, which can be adapted to your own needs. Here, text nodes are preferred over maps because "!!map" sorts before "!!str" and the last element (.[-1]) is taken.
  • LINE=3 returns only the text node on line 3 (not the node of type "!!map").
  • LINE=5 returns the multiline text node starting on line 4.
  • LINE=99 does not return the last multiline text node of sample.yml, because the number of lines in the file is checked in bash beforehand.

Output for LINE=1

line: 1
path: ""
type: '!!map'
value:
  a:
    b:
      c: "foo"
  d: |-
    abc
    def

Output for LINE=2

line: 2
path: a
type: '!!map'
value:
  b:
    c: "foo"

Output for LINE=3

line: 3
path: a.b.c
type: '!!str'
value: "foo"

Output for LINE=4

line: 4
path: d
type: '!!str'
value: |-
  abc
  def

Output for LINE=5

line: 4
path: d
type: '!!str'
value: |-
  abc
  def
jpseng

Sharing my findings since I've spent too much time on this.


As @Inian mentioned, line numbers won't necessarily be accurate.

yq does provide the line operator, but I was not able to find a decent way of mapping an input line number to it.

That said, if you're sure the input file will not contain any multi-line values, you could do something like this:

  1. Use awk to get the key of your input line, e.g. 3 --> c
    This assumes the key itself never contains ":"; the field separator can be changed if needed to work around this

    export searchKey="$(awk -F':' 'FNR == 3 { gsub(/ /,""); print $1 }' ii)"

  2. Use yq to recursively (..) walk all nodes and build each path using (path | join("."))

    yq e '.. | (path | join("."))' ii

  3. Filter the paths from step 2 with a regex, keeping only those that end in the key from step 1 (strenv(searchKey))

    yq e '.. | (path | join(".")) | select(match(strenv(searchKey) + "$"))' ii

  4. Print the path if it's found


Some examples from my local machine, where the input file is named ii and both the awk and yq commands are wrapped in a bash function:

$ function getPathByLineNumber () {
    key=$1
    export searchKey="$(awk -v key="$key" -F':' 'FNR == key { gsub(/ /, ""); print $1 }' ii)"
    yq e '.. | (path | join(".")) | select(match(strenv(searchKey) + "$"))' ii
}
$ yq e . ii
a:
  b:
    c: "foo"
$ getPathByLineNumber 1
a
$ getPathByLineNumber 2
a.b
$ getPathByLineNumber 3
a.b.c
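
The suffix match in step 3 returns every path that ends in the key, so a key name that occurs in several places yields several candidates. One possible way around this, and around the multi-line restriction, is to track indentation directly. The following is only a sketch with awk alone, not a YAML parser: the function name yaml_path_at is hypothetical, and it assumes block-style mappings with space indentation, no sequences or flow collections, and keys that contain no ":".

```shell
# Hypothetical helper: print the dotted key path enclosing a given line.
# Usage: yaml_path_at LINE FILE   (LINE is 1-based)
# Assumes block-style mappings, space indentation, no sequences or flow
# collections, and keys that contain no ":".
yaml_path_at() {
  awk -v target="$1" '
    {
      text = $0
      sub(/^ +/, "", text)                 # strip leading spaces
      indent = length($0) - length(text)   # indentation width of this line

      # Blank lines, comments and block scalar content keep the current path.
      is_plain = (text != "" && text !~ /^#/)
      if (is_plain && !(in_block && indent > block_indent)) {
        in_block = 0
        colon = index(text, ":")
        if (colon > 1) {
          # Drop keys that are not ancestors of this line.
          while (depth > 0 && ind[depth] >= indent) depth--
          depth++
          ind[depth] = indent
          key[depth] = substr(text, 1, colon - 1)
          # Remember when the value is a block scalar (| or >).
          if (substr(text, colon + 1) ~ /^ *[|>]/) {
            in_block = 1
            block_indent = indent
          }
        }
      }

      if (NR == target) {
        path = ""
        for (i = 1; i <= depth; i++) path = path (i > 1 ? "." : "") key[i]
        print path
        exit
      }
    }
  ' "$2"
}
```

With the file from the question saved as sample.yml, yaml_path_at 3 sample.yml prints a.b.c and yaml_path_at 5 sample.yml prints a.d (line numbers counted from 1).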
0stone0