0

I have a file; let's call it input.txt. It has many lines, but the only relevant line contains a model statement:

height ~ mu gender

It may also contain:

height !n ~ mu date_birth !r g

So the consistent factor to identify the line in regex would be ^height.*~.*$. At least that is what I have devised so far.

I would like to append !r g to the end of that line, but only if !r g isn't already present. I tried to mix answers from here, here and here, but I can't figure it out. I would prefer a single command. I have also been playing around with complicated awk and sed commands, but I feel this is simple enough that it shouldn't be difficult for someone with experience.

Desired result(s):

If height ~ mu gender then height ~ mu gender !r g.

If height !bin ~ mu date_birth !r g then nothing needs to happen.

If height !bin ~ mu gender then height !bin ~ mu gender !r g

EDIT:

So far I have tried:

sed '/^height.*~.*!r.*$/ ! s/$/!r g/' input.txt correctly skips the line if !r g is present, but appends it to every other line in input.txt.

sed '/^height.*$/s/$/!r g/' input.txt correctly appends only to the matching line, but does so even if !r g was already present.

tstev
  • you wrote : *If `y !n ~ d e f !r` then nothing need to happen.* which contradicts with your previous requirement *to append `!r g` to the end line only if `!r g` wasn't already present* – RomanPerekhrest Aug 25 '17 at 08:29
  • @RomanPerekhrest, thanks forgot the `g`. Edited it accordingly. – tstev Aug 25 '17 at 08:32
  • I edited post to make input look more like what my actual script looks like. – tstev Aug 25 '17 at 09:29

4 Answers

2

We can do this with sed. Firstly, we select lines that begin with height and contain a ~. With those lines, we can substitute the end of line with !r g if the line doesn't already end in that value:

#!/usr/bin/sed -f

/^height .*~/{
/ !r g$/!s/$/ !r g/
# Explanation:
# / !r g$/            : select lines that already end with the tag
#         !s          : on lines that don't match, substitute
#            $        : end of line
#               !r g  : the tag to add
}

Demonstration

$ ./45876917.sed <<END
height ~ mu gender
height !bin ~ mu date_birth !r g
height !bin ~ mu gender
END
height ~ mu gender !r g
height !bin ~ mu date_birth !r g
height !bin ~ mu gender !r g
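
Since the question asks for a single command, the same script can also be written on one line. This is only a sketch of that form: the wrapper name `append_tag` is hypothetical and exists just to make the command easy to reuse, and the sample lines (including the unrelated `weight` line) are made up for illustration.

```shell
# Hypothetical wrapper around the one-line form of the script above.
# Outer address: lines starting with "height " that contain a "~".
# Inner address: skip lines that already end in " !r g".
append_tag() {
  sed '/^height .*~/{/ !r g$/!s/$/ !r g/;}' "$@"
}

# Illustrative input: only the first line should change.
printf '%s\n' \
  'height ~ mu gender' \
  'height !bin ~ mu date_birth !r g' \
  'weight ~ mu gender' | append_tag
```

Called as `append_tag input.txt`, it prints the result to standard output and leaves the file itself untouched.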
Toby Speight
  • Thanks for the explanation with the answer! really helpful – tstev Aug 25 '17 at 09:58
  • Glad to help - I'm not a believer in spoon-feeding specific answers without showing how you could get to the same result yourself. You don't learn very much from code-only answers. – Toby Speight Aug 25 '17 at 10:00
1
sed '/^y.*~.*$/{/!r g/!{s/.*/& !r g/}}' input.txt

E.g.

$ cat input.txt
y !n ~ d e f !r g
y ~ a b c

$ sed '/^y.*~.*$/{/!r g/!{s/.*/& !r g/}}' input.txt
y !n ~ d e f !r g
y ~ a b c !r g

UPDATE

The sed command above considers every line matching ^y.*~.*$ and appends !r g to the end only if the line does not already contain !r g anywhere.

To change the lines filtered, simply update the starting regex ^y.*~.*$ into what you need.
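
Checking for `!r g` anywhere in the line is not the same as checking for it at the end. The sketch below contrasts the two on a made-up line that happens to contain `!r g` inside a longer word; the function names `unanchored` and `anchored` and the sample line are hypothetical.

```shell
# Hypothetical helpers: the same append, with two different "already present" tests.
unanchored() { sed '/^y.*~/{/!r g/!s/$/ !r g/;}' "$@"; }   # "!r g" anywhere in the line
anchored()   { sed '/^y.*~/{/ !r g$/!s/$/ !r g/;}' "$@"; } # " !r g" only at the end

# The sample line contains "!r g" as part of "!r group".
printf '%s\n' 'y ~ a !r group b' | unanchored   # line is left unchanged
printf '%s\n' 'y ~ a !r group b' | anchored     # tag is appended
```

Which test is right depends on whether the tag can legitimately appear in the middle of a model line.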

Anubis
  • This is not exactly what I want. It adds the `!r g` to the beginning of the line. I need it at the end. Maybe that wasn't clear. Let me edit my original post. – tstev Aug 25 '17 at 08:24
  • @tstev if you need a space before `!r g`, just change the replace string, `&!r g` into `& !r g` – Anubis Aug 25 '17 at 08:28
  • @tstev Could you simply show a non working input and the expected output. I've used the pattern suggested by you to filter out the required lines. Seems like the primary pattern needs to be improved. I can help if you can show the problem clearly. – Anubis Aug 25 '17 at 09:24
  • I updated the problem statement. I don't know how to make it more clear. – tstev Aug 25 '17 at 09:41
  • @tstev So you just have to change the starting regex with the new one you suggested. Try `sed '/^height.*~.*$/{/!r g/!{s/.*/& !r g/}}' input.txt`. Or do you want to allow any string in place of `height`? – Anubis Aug 25 '17 at 09:44
  • I know what went wrong, `\r\n` line-endings slipped in the input file. With `dos2unix` it works :) Sorry for that! – tstev Aug 25 '17 at 09:48
1

awk solution:

awk '/^y.*~.+/ && !/!r g/{ $0=$0" !r g" }1' input.txt
RomanPerekhrest
  • Doesn't work for me. It **replaces** at the **beginning** instead of *appending* at the *end* of the line. – tstev Aug 25 '17 at 09:08
  • @tstev, it can not be true. `$0=$0" !r g"` will append to the end – RomanPerekhrest Aug 25 '17 at 09:12
  • Ok, well `input.txt` contains two lines `height ~ mu gender !r g` and `height ~ mu gender`. When I run your command adapted for this example; `awk '/^height.*~.+/ && !/!r g/{ $0=$0" !r g" }1' input.txt`. I get `height ~ mu gender !r g` and `!r gt ~ mu gender`. So it correctly skipped the first line as it already contains `!r g` but second line is not correct. – tstev Aug 25 '17 at 09:18
  • I have very little knowledge of what's going on with the answer as I am new to unix commands. Perhaps you can try your example on your machine. – tstev Aug 25 '17 at 09:21
  • @tstev, look here https://ibb.co/hyBZSQ. It's working on your input lines – RomanPerekhrest Aug 25 '17 at 09:25
  • @James Brown identified what the problem was. Apparently some `\r\n` slipped in there :P. Now it works with `dos2unix` https://ibb.co/nhYju5. Thanks a lot! – tstev Aug 25 '17 at 09:47
1

Another in awk:

$ awk '{sub(/( !r g)?\r?$/," !r g")}1' file
y ~ a b c !r g
y !n ~ d e f !r g
y !n ~ d e f !r g

or with the changed data:

height ~ mu gender !r g
height !bin ~ mu date_birth !r g
height !bin ~ mu gender !r g

Notice the \r? in the regex, which is the first part of the Windows line ending \r\n. If it exists, it gets replaced.
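
The command above runs on every line of the file. If only the model line should be touched, the carriage-return handling can be combined with the line filter from the question. The following is a sketch under that assumption; the function name `append_tag_crlf` and the sample input are hypothetical.

```shell
# Hypothetical helper: strip a trailing carriage return from each line,
# then append the tag only to "height ... ~" lines that do not already end in it.
append_tag_crlf() {
  awk '{ sub(/\r$/, "") }
       /^height .*~/ && !/ !r g$/ { $0 = $0 " !r g" }
       1' "$@"
}

# Illustrative input with Windows (\r\n) line endings.
printf 'height ~ mu gender\r\nheight !bin ~ mu date_birth !r g\r\nweight ~ mu gender\r\n' | append_tag_crlf
```

Note that this writes every line back with Unix line endings, which has the same effect as running `dos2unix` first.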

James Brown