I have a config.yaml file which contains, among other values, the following list of Kafka brokers, which I want to remove from the config using a bash script.

kafka.brokers:
    - "node003"
    - "node004"

I am doing this currently by invoking vi from inside the script using the command:

vi $CONF_BENCHMARK/config.yaml -c ":%s/kafka.brokers:\(\n\s*-\s".*"\)*/kafka.brokers:/g" -c ':wq!'

I understand that sed is a more appropriate tool to accomplish the same task but when I try to translate the above regex to sed, it does not work.

sed -i -e "s/kafka.brokers:\(\n\s*-\s".*"\)*/kafka.brokers:/g" $CONF_BENCHMARK/config.yaml

What am I doing wrong?

jaywalker
  • `sed is a more appropriate tool` -> No, a YAML parser is a more appropriate tool :-) YAML is pretty complex, and not at all suited to being modified ad hoc with shell scripts and the like. – Martin Tournoij Dec 20 '16 at 00:31

4 Answers

awk to the rescue!

sed is line-based; this should work:

$ awk 's{if(/\s*-\s*"[^"]*"/) next; else s=0} /kafka.brokers:/{s=1}1' file

Explanation

if(/\s*-\s*"[^"]*"/) next if the line matches the list-item pattern, skip to the next line
s{if(/\s... check the pattern only if s is set
/kafka.brokers:/{s=1} when the header is seen, set s
1 shorthand for printing the line (if it was not skipped)
s{... else s=0} if s was set but the pattern did not match, reset s
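
The same flag-based idea can be tried on a small sample. This is only a sketch: the input is hypothetical, the variable name in_list is mine, and [ \t] is used in place of \s on the assumption that the awk at hand may not be gawk.

```shell
# Hypothetical sample input; not the asker's real config.
input='some.key: 1
kafka.brokers:
    - "node003"
    - "node004"
other.key: 2'

# in_list is 1 while we are directly below the kafka.brokers: header.
result=$(printf '%s\n' "$input" | awk '
  in_list && /^[ \t]*-[ \t]*"[^"]*"[ \t]*$/ { next }   # drop a list item
  { in_list = 0 }                                      # any other line ends the list
  /^kafka\.brokers:/ { in_list = 1 }                   # header seen: start skipping
  { print }
')
printf '%s\n' "$result"
```

The header line itself is kept and only the quoted list items directly below it are dropped, which mirrors the replacement in the question.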

karakfa

As others have pointed out, you need to be explicit to get sed to work across multiple lines.

The real answer is to use awk; karakfa has provided a beautiful answer. But for educational purposes, here is a sed answer:

sed  '
  /kafka.brokers/ {
    :a
    $be
    N
    /\n[[:space:]]*-[[:space:]]"[^\n]*"[^\n]*$/ba
    s/\n.*\(\n\)/\1/
    P;D
    :e
    s/\n.*//
  }
' input

Basically, sed keeps appending lines to the pattern space, starting at kafka.brokers, until \n[[:space:]]*-[[:space:]]"[^\n]*"[^\n]*$ no longer matches.

This leaves the pattern space with one trailing non-matching line in it, i.e.:

kafka.brokers:\n    - "node003"\n    - "node004"\nother stuff$

Replacing the text matched by \n.*\(\n\) with a single newline leaves the following pattern space:

kafka.brokers:\nother stuff$

P;D prints the first line of the pattern space and then restarts the cycle with the remaining pattern space. This lets the script handle input such as:

kafka.brokers:
    - "node003"
    - "node004"
kafka.brokers:
    - "node005"
more_input
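
To check the behaviour on that input, the script can be run as in the sketch below. It assumes GNU sed, since the script relies on [^\n] meaning "any character except newline" inside a bracket expression; other sed implementations may read that bracket differently.

```shell
# Assumes GNU sed: [^\n] is treated as "any character except newline".
input='kafka.brokers:
    - "node003"
    - "node004"
kafka.brokers:
    - "node005"
more_input'

result=$(printf '%s\n' "$input" | sed '
  /kafka.brokers/ {
    :a
    $be
    N
    /\n[[:space:]]*-[[:space:]]"[^\n]*"[^\n]*$/ba
    s/\n.*\(\n\)/\1/
    P;D
    :e
    s/\n.*//
  }
')
printf '%s\n' "$result"
```

Both kafka.brokers: headers should survive with their list items removed, followed by more_input.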
Andreas Louv

Your Vim pattern matches across multiple lines, but sed works line-by-line. (That is, it first tries to match your pattern against kafka.brokers: and fails, then it tries to match - "node003", and so on.) Your instinct to use something other than Vim was right, but sed probably isn't the best tool for the job here.
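
The line-by-line behaviour is easy to see on a toy input. In this minimal sketch (the two-line input is made up for illustration), a pattern containing \n never matches, because each cycle's pattern space holds a single line with its newline stripped:

```shell
# Hypothetical two-line input; sed processes one line per cycle.
result=$(printf 'a\nb\n' | sed 's/a\nb/X/')

# The substitution never fires: sed sees "a" and then "b", never both together.
printf '%s\n' "$result"
```

The output is the input unchanged, which is the same reason the sed command in the question leaves config.yaml untouched.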

This answer addresses the problem of matching multi-line patterns with sed in more detail.

My personal recommendation would be to use a scripting language like Python or Perl to deal with complicated pattern-matching. You can run a Python command with python -c <command>, for instance, just like you did with Vim, or you could write a small Python script that you call from your Bash script. It's a little more complicated than a sed one-liner, but it will probably save you a lot of debugging and make your script easier to maintain and modify.

Ben
  • I tried perl as `perl -pi -e "s/kafka.brokers:\(\n\s*-\s".*"\)*/kafka.brokers:/g" config.yaml` but that does not seem to work either. I hope python works for me. – jaywalker Dec 19 '16 at 03:19
  • @andlrc I am trying to replace in-place in the file `config.yaml` as my code in the comment above shows. Running your suggestion results `Can't open perl script "s/kafka....": No such file or directory` – jaywalker Dec 19 '16 at 03:28
  • @HaseebJaved `perl -p0e '...'` will slurp the whole file at one go. Use `-pi -0e` to do inplace edit. – Andreas Louv Dec 19 '16 at 04:38

Consider using yq instead of sed or awk. With the d (delete) command available in yq releases before version 4, deleting the key kafka.brokers becomes as simple as:

yq d $CONF_BENCHMARK/config.yaml '"kafka.brokers"'

The following snippet demonstrates the yq delete feature:

cat <<EOF | yq d - '"kafka.brokers"'
some:
  path: value
kafka.brokers:
  - "node003"
  - "node004"
EOF

... and results in the output

some:
  path: value
Christoffer Soop