3

I have some text files $f resembling the following

function
%blah
%blah

%blah
code here

I want to insert the following text before the first empty line:

%
%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 
%3.0 Unported License. See notes at the end of this file for more information.

I tried the following:

top=$(cat ./PATH/text.txt) 
top="${top//$'\n'/\\n}" 
sed -i.bak 's@^$@'"$top"'\\n@' $f

where the second line (I think) preserves the new line in the text and the third line (I think) substitutes the first empty line with the text plus a new empty line.

Two problems:

1- My code appends the following text:

%n%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike n%3.0 Unported License. See notes at the end of this file for more information.\n

2- It appends it at the end of the file.

Can someone please help me understand the problems with my code?

shamalaia

4 Answers

5

If you are using GNU sed, the following would work.

Use ^$ to find the empty line and then use sed to replace/put the text that you want.

# Define your replacement text in a variable
a="%\n%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike\n%3.0 Unported License. See notes at the end of this file for more information."

Note, $a should include those \n that will be directly interpreted by sed as newlines.

$ sed "0,/^$/s//$a/" inputfile.txt

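In the above syntax, the address range 0,/^$/ (a GNU extension) runs from the start of the file to the first line matching ^$, even when that is line 1. The empty pattern in s// reuses that regular expression, so only the first empty line is replaced.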

Output:

function
%blah
%blah
%
%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike
%3.0 Unported License. See notes at the end of this file for more information.
%blah
code here

test
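
Note that in the output above the empty line itself is replaced, so no blank line follows the inserted text. A minimal sketch of one way to keep it, assuming GNU sed: append one more \n to the replacement. The helper name insert_keep_blank and the shortened notice text are illustrative only and not part of the original answer.

```shell
# Assumes GNU sed: the 0,/re/ address and \n in the replacement are GNU extensions.
# The text below is a shortened, illustrative stand-in for the real notice.
a='%\n%License notice, line 1\n%License notice, line 2'

# Hypothetical helper: filter stdin, inserting "$a" before the first empty line.
insert_keep_blank() {
  # The trailing \n re-creates the empty line that the substitution consumes.
  sed "0,/^$/s//$a\\n/"
}

printf 'function\n%%blah\n\n%%blah\ncode here\n' | insert_keep_blank
```

As with the original command, the text in $a must not contain / or &, because both are special inside the s command.
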
iamauser
  • The `\n` translating to a newline in the replacement string may work in some variants of sed, but it's far from universal behaviour. This solution doesn't work in FreeBSD or macOS. I didn't test further. – ghoti Jan 23 '18 at 03:54
  • MacOS uses FreeBSD `sed` by default. They are not two independent cases. Edited my answer to point out that you can use this in `GNU sed`. – iamauser Jan 23 '18 at 04:06
  • In fact, they are two independent cases, as they are different operating systems. FreeBSD's sed is indeed different from the one in macOS, despite having a common ancestor. Anyway, thanks for the clarification. If a portable answer can't be created, it's important to note an answer's limitations. – ghoti Jan 23 '18 at 04:16
3

You've included both sed and bash tags in your question. Since I can't seem to come up with a way of doing this in sed, here's a bash-only solution. It's likely to perform the worst of all working solutions you might find.

The following works with your sample input:

$ while IFS= read -r x; do [[ -z "$x" ]] && cat boilerplate; printf '%s\n' "$x"; done < src

This will however insert the boilerplate before EVERY blank line, which is probably not what you're after. Instead, we should probably make this more than a one-liner:

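#!/usr/bin/env bash

# True until the boilerplate has been emitted.
y=true

# IFS= keeps leading and trailing whitespace on each line intact.
while IFS= read -r x; do

  # Emit the boilerplate once, just before the first empty line.
  if [[ -z "$x" ]] && $y; then
    cat boilerplate
    y=false
  fi

  printf '%s\n' "$x"

done < src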

Note that unlike the code in your question, this doesn't store your boilerplate in a variable; it just cats it "at the right time".

Note that this sends the combined output to stdout. If your goal is to modify the original file, you'll need to wrap this in something that moves around temporary files. (Note that sed's -i option also doesn't really edit files in place, it only hides the moving-around-temp-files from you.)
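
A minimal sketch of such a wrapper, under stated assumptions: the function name insert_before_first_blank and its argument order are made up for illustration, and the temporary file is created next to the target so that mv stays on the same filesystem. It avoids bash-only syntax so it also runs under plain sh.

```shell
# Hypothetical helper: insert the contents of file $1 before the first
# empty line of file $2, replacing $2 via a temporary file.
# The ( ... ) body runs in a subshell, so its variables stay local.
insert_before_first_blank() (
  boilerplate=$1
  src=$2
  pending=true
  tmp=$(mktemp "$src.XXXXXX") || exit 1

  # "|| [ -n "$line" ]" also processes a last line that lacks a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if "$pending" && [ -z "$line" ]; then
      cat "$boilerplate"
      pending=false
    fi
    printf '%s\n' "$line"
  done < "$src" > "$tmp" || { rm -f "$tmp"; exit 1; }

  # Replace the original only after the loop has finished writing.
  mv "$tmp" "$src"
)

# Usage (file names are placeholders): insert_before_first_blank boilerplate src
```

The file created by mktemp has its own permissions, so the replaced file may not keep the mode of the original.
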

The following alternatives are probably a better idea.


A similar solution to the bash one might be achieved with better performance using awk:

awk 'NR==FNR{b=b $0 ORS;next} /^$/&&!y{printf "%s",b;y++} 1' boilerplate src

This awk solution obviously reads your boilerplate into a variable, though it's not a shell variable.

Notwithstanding non-standard platform-specific extensions, awk does not have any facility for editing files "in place" either. A portable solution using awk would still need to push temp files around.
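
As a rough sketch of what pushing temp files around could look like with awk: the variant below reads the boilerplate with getline at the moment it is needed rather than buffering it first, writes to a temporary file, and keeps a .bak copy. The function name awk_insert_inplace and the awk variable names bp and inserted are assumptions made for this example.

```shell
# Hypothetical helper: insert file $1 before the first empty line of file $2,
# keeping the original as "$2.bak". Variables stay local to the subshell body.
awk_insert_inplace() (
  boilerplate=$1
  src=$2
  tmp=$(mktemp "$src.XXXXXX") || exit 1

  awk -v bp="$boilerplate" '
    # On the first empty line, copy the boilerplate to the output.
    /^$/ && !inserted {
      while ((getline line < bp) > 0) print line
      close(bp)
      inserted = 1
    }
    # Print every input line, including the empty one.
    { print }
  ' "$src" > "$tmp" || { rm -f "$tmp"; exit 1; }

  # Keep a backup, then move the result over the original.
  cp -p "$src" "$src.bak" && mv "$tmp" "$src"
)

# Usage (file names are placeholders): awk_insert_inplace boilerplate src
```

Because awk processes backslash escapes in -v assignments, this sketch assumes the boilerplate path contains no backslashes.
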


And of course, the following old standard of ed is great to keep in your back pocket:

printf 'H\n/^$/\n-\n.r boilerplate\nw\nq\n' | ed src

In bash, of course, you could always use a here-string, which might be clearer:

$ ed src <<< $'H\n/^$/\n-\n.r boilerplate\nw\nq\n'

The ed command is the non-stream version of sed. Or rather, sed is the stream version of ed, which has been around since before the dinosaurs and is still going strong.

The commands we're using are separated by newlines and fed to ed's standard input. You can discard stdout if you feel the urge. The commands shown here are:

  • H - instruct ed to print more useful errors, if it gets any.
  • /^$/ - search for the first empty line.
  • - - GO BACK ONE LINE. Awesome, right?
  • .r boilerplate - Read your boilerplate at the current line,
  • w - and write the file.
  • q - Quit.

Note that this does not keep a .bak file. You'll need to do that yourself if you really want one.
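
One possible way to do that, sketched under the assumption that ed is installed: copy the file before editing it. The function name ed_insert_with_backup is hypothetical, and the -s option is only there to silence ed's byte counts.

```shell
# Hypothetical helper: back up file $2 as "$2.bak", then use ed to insert
# file $1 before the first empty line of $2.
ed_insert_with_backup() (
  boilerplate=$1
  file=$2

  # Take the backup first; stop if it cannot be made.
  cp -p "$file" "$file.bak" || exit 1

  # Same ed commands as above; the lines ed prints while searching are discarded.
  printf 'H\n/^$/\n-\n.r %s\nw\nq\n' "$boilerplate" | ed -s "$file" > /dev/null
)

# Usage (file names are placeholders): ed_insert_with_backup boilerplate src
```

Like the original command, this expects the first empty line not to be line 1, since the - command cannot step back from the first line.
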

And if, as you suggested in comments, the filename you're reading is to be constructed from a variable, note that variable expansion does not happen inside ANSI-C quoting ($' .. '). You can either switch quoting mechanisms mid-script:

ed "$file" <<< $'H\n/^$/\n-\n.r ./TATTOO_'"$currn"$'/top.txt\nw\nq\n'

Or you could put the ed script in a variable constructed by printf:

printf -v scr 'H\n/^$/\n-\n.r ./TATTOO_%s/top.txt\nw\nq\n' "$currn"

ed "$file" <<< "$scr"
ghoti
  • I tried to find a good reference for why to avoid `while read` to process a file. I couldn't quickly find anything really focused and authoritative, but try https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice – tripleee Jan 23 '18 at 05:31
  • @tripleee, thanks for the link. Yes, I agree, so I added explanation at the top of my question. And a solution using `ed` at the bottom. :D – ghoti Jan 23 '18 at 06:26
  • @tripleee I like the last solution but I cannot get it to work. The text that I want to append is in the directory `./TATTOO_$currn` where `$currn` is a variable. Therefore I used `ed $f <<< $'H\n/^$/\n-\n.r ./TATTOO_$currn/top.txt\nw\nq\n'` but I get a no such file or directory error. The reason being that `$currn` is not interpreted, I think – shamalaia Jan 23 '18 at 10:49
  • You can't use a variable inside single quotes. Why do you ping me anyway; this is @ghoti's answer? – tripleee Jan 23 '18 at 10:54
  • Also, as previously pointed out, you really want double quotes around `"$f"`. The link in my answer explains shell quoting in full fascinating but excruciating detail. – tripleee Jan 23 '18 at 10:56
  • @shamalaia - added a mechanism for that. See my edit, at the bottom of the answer. – ghoti Jan 23 '18 at 11:28
1

Adding the text to a variable so you can interpolate the variable is wasteful and an unnecessary complication. sed can easily read the contents of a file by itself.

sed -i.bak '1r./PATH/text.txt' "$f"

Unfortunately, this part of sed is poorly standardized, so you may have to experiment a little bit. Some dialects require a newline (perhaps, or perhaps not, preceded by a backslash) before the filename.

sed -i.bak '1r\
./PATH/text.txt' "$f"

(Notice also the double quotes around the file name. You generally always want double quotes around variables which contain file names. More here.)

Adapting the recipe from here we can extend this to apply to the first empty line instead of the first line.

sed -i.bak -e '/^$/!b' -e 'r./PATH/text.txt' -e :a -e '$!{' -e n -e ba -e } "$f"

This adds the boilerplate after the first empty line but perhaps that's acceptable. Refactoring it to replace it or add an empty line after should not be too challenging anyway. (Maybe use sed -n and instead explicitly print everything except the empty line.)

In brief terms, this skips to the end (simply prints) up until we find the first empty line. Then, we read and print the file, and go into a loop which prints the remainder of the file without returning to the beginning of the script.

tripleee
  • Great that this demonstrates how to read the file, but how would you cause sed to insert the file before the first blank line of the file? `'/^$/r file'` will insert it *after* the blank line. Oh, and also, does this use GNUisms? With the loop added, this isn't working for me at all in FreeBSD. – ghoti Jan 23 '18 at 05:30
  • Removing the empty line and printing it after should be a minor refactoring. To my knowledge, the loop should be portable, but I'll see if I can find a bug. Thanks for the feedback! – tripleee Jan 23 '18 at 05:33
  • Not sure what we're doing differently. I get the same behaviour in FreeBSD and macOS - `sed -e '/^$/r file2' file1` inserts file2 after the blank line, and adding the loop turns this into a very complex `cat`. :) While I can't explain the latter, the former makes sense to me. How are you thinking that your `/^$/` condition would only be matched once? – ghoti Jan 23 '18 at 05:40
  • I refactored it some more. I *think* I have it working now, with some caveats. Thanks again for prodding me. – tripleee Jan 23 '18 at 05:50
  • This was as far as I got: `sed -ne '1,/^$/{p;/^$/{r file2' -e '};};/^$/,$p' file1`. Looks like we're both stuck with the insert happening after the blank instead of before it in sed. :) And .. my pleasure, these are always fun exercises. – ghoti Jan 23 '18 at 05:52
  • @ghoti in the first range, the print statement is before the match for a blank line and corresponding file read. Can’t print be moved back a bit? And how come you both are breaking it up with -e bits? – Guy Jan 25 '18 at 01:37
  • The `-e` vs `;` is mostly because you can't tell, portably, where `r` will stop reading its argument (is the semicolon part of the argument?) so splitting on semicolons (or newlines) sometimes doesn't work. But `-e` also sometimes doesn't work -- pick your poison. Writing properly portable `sed` outside of the very core `s/foo/bar` is challenging. – tripleee Jan 25 '18 at 05:14
  • @Guy, the first range goes from the top of the file to the first blank line. If you can think of a way to express a range that ends at the line BEFORE a match, I would love to know about it. :) `sed` has no equivalent to ed's "- or ^" command (see the bottom of [my answer](https://stackoverflow.com/a/48394016/1072112)), because sed edits a *stream*, not a file. There is no going back in the stream. There is only what you saved while you were in it. And .. what tripleee said about `-e`. The `r` command, and anything with a label, wants to be at the end of a line. `-e` starts a new line. – ghoti Jan 25 '18 at 05:19
  • @ghoti the read command just outputs to stdout doesn't it? So move the p behind that block? `{ /^$/ { r file2' -e '} p; }` or if the file's been read into a variable use insert `i\`. I may be on completely the wrong track though. – Guy Jan 25 '18 at 05:28
1

Here is a sed script that I think works. It keeps the text to be inserted in a shell variable.

b='##\n## comment piece\n##'

sed --posix -ne '
   1,/^$/ {
     /^$/ {
       x;
       /^true$/ !{
         x
         s/^$/true/
         i\
'"$b"'
        };
        x;
        s/^.*$//
      }
   }
p
' file1

With the examples using ranges of 1,/^$/, an empty first line would result in the disclaimer being printed twice. To avoid this, I've set it up to put a flag in the hold space ( x; s/^$/true/ ) that I can swap to the pattern space to check whether it's the first blank. Once there's a match for a blank line, i\ inserts the comment ($b) in front of the pattern space.

Thanks to ghoti for the initial plan.

Guy