How to replace string with random line from text file multiple times

Question

I have a script that is meant to go through a file and replace the placeholder {{su}} with a random line from another file. The placeholder appears multiple times in the file, and I need each occurrence to get a freshly randomized line. Currently it replaces every placeholder with the same line.

#!/bin/bash

subject=$(shuf -n1 *.subjects)
    cat tmp.$file | sed -e "s/{{su}}/$subject/" > output.file

possible duplicate of [How to use sed to replace only the first occurrence in a file?](http://stackoverflow.com/questions/148451/how-to-use-sed-to-replace-only-the-first-occurrence-in-a-file) — fredtantini, Dec 18 '14 at 15:36
I have looked at this and although this is related to my question it covers a different problem and the solution is unrelated. — BigEarl, Dec 18 '14 at 15:41
It doesn't seem like you've posted all of your code...where does `$file` come from? — Micah Smith, Dec 18 '14 at 15:59
I apologize, the file is in the same directory and is named "*.subjects" — BigEarl, Dec 18 '14 at 16:13

gniourf_gniourf · Accepted Answer · 2014-12-18T18:03:32.877

The accepted answer has subtle flaws:

if {{su}} appears several times on the same line, the same replacement is performed for each pattern {{su}} of that line,
because read isn't used with IFS= and the -r switch, you'll get other nasty surprises: spaces are not going to be necessarily preserved, and you'll get backslash interpretations (but that's easy to fix),
if the replacement string contains slashes or other funny characters, sed will be confused.
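
The last two points can be reproduced in isolation. This is only a minimal sketch, not part of the original answer; the sample strings and variable names are made up for illustration, and the exact sed error message will depend on your sed.

```bash
#!/bin/bash

# Point 2: plain `read` trims surrounding whitespace and eats backslashes.
input='  a\tb  '
read line_plain <<< "$input"
IFS= read -r line_raw <<< "$input"
printf 'plain read:   [%s]\n' "$line_plain"
printf 'IFS= read -r: [%s]\n' "$line_raw"

# Point 3: a slash in the replacement ends the sed s/// command early.
subject='a/b'
line='x {{su}} y'
if ! sed_out=$(sed -e "s/{{su}}/$subject/" <<< "$line" 2>/dev/null); then
    sed_out='(sed failed)'
fi
# Bash parameter expansion treats the quoted replacement as literal text.
bash_out=${line/'{{su}}'/"$subject"}
printf 'sed:  %s\n' "$sed_out"
printf 'bash: %s\n' "$bash_out"
```

Plain read loses both the padding and the backslash, while the parameter expansion inserts a/b untouched.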

A method that works, but that involves reading the whole file in memory (it's only good for small files where you have a small number of {{su}}):

#!/bin/bash

file=$(< filename.txt )

while [[ $file = *'{{su}}'* ]]; do
    repl=$(shuf -n1 file.subjects)
    file=${file/'{{su}}'/"$repl"}
done
printf '%s\n' "$file"

For an approach similar to the accepted answer, i.e., reading line by line:

#!/bin/bash

while IFS= read -r line; do
    while [[ $line = *'{{su}}'* ]]; do
        repl=$(shuf -n1 file.subjects)
        line=${line/'{{su}}'/"$repl"}
    done
    printf '%s\n' "$line"
done < filename.txt

Now about the way to select a random line: while shuf is fine, it's an external process and since it's going to be invoked many times (in a subshell), you might consider implementing something similar in Bash. If you have a limited amount of lines, you may consider slurping all the lines into an array and selecting randomly an entry from this array:

#!/bin/bash

mapfile -t replacements < file.subjects
nb_repl=${#replacements[@]}

while IFS= read -r line; do
    while [[ $line = *'{{su}}'* ]]; do
        repl=${replacements[RANDOM%nb_repl]}
        line=${line/'{{su}}'/"$repl"}
    done
    printf '%s\n' "$line"
done < filename.txt

This only works if you have a “small” number of lines in file.subjects (by small, understand fewer than 32767, since RANDOM never exceeds 32767), and if you're not too worried about the slight bias introduced by the modulo. There are very simple workarounds to fix this, though.
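
One possible workaround, offered as an assumption rather than as the fix the author had in mind, is to glue two RANDOM draws together into a 30-bit number and reject the values that would cause modulo bias. The names pick_index and rand_idx are hypothetical. The result is stored in a global variable so that no subshell is needed.

```bash
#!/bin/bash

# pick_index N: store an integer in [0, N) in the global variable rand_idx.
# Requires 1 <= N <= 2^30. Uniform only insofar as RANDOM itself is uniform.
pick_index() {
    local n=$1 span=$(( 1 << 30 )) r limit
    limit=$(( span - span % n ))             # largest multiple of n that is <= 2^30
    while :; do
        r=$(( (RANDOM << 15) | RANDOM ))     # two 15-bit draws give 30 random bits
        (( r < limit )) && break             # reject the biased tail and redraw
    done
    rand_idx=$(( r % n ))
}

# Sample data standing in for the lines of file.subjects.
replacements=('first' 'second' 'third')
pick_index "${#replacements[@]}"
repl=${replacements[rand_idx]}
printf '%s\n' "$repl"
```

In the loop above, the line `repl=${replacements[RANDOM%nb_repl]}` would become `pick_index "$nb_repl"` followed by `repl=${replacements[rand_idx]}`.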

Note. You're using shuf -n1 *.subjects. It's an error to invoke shuf with several files (at least with my version of shuf). So if the glob *.subjects expands to more than one file, you'll get an error.
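
If you really do want to draw from several files at once, one way around this (a sketch, assuming a shuf that reads standard input when given no file) is to concatenate them first. The temporary directory and file names below are invented for the demonstration.

```bash
#!/bin/bash

# Throwaway sample data so that the example is self-contained.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' > "$dir/one.subjects"
printf '%s\n' 'gamma' > "$dir/two.subjects"

# cat merges every matching file into one stream; shuf then picks one line from it.
subject=$(cat "$dir"/*.subjects | shuf -n1)
printf '%s\n' "$subject"

rm -r "$dir"
```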

Note. If you don't want to run into an infinite loop, make sure that the replacements don't contain the {{su}} pattern!
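
A simple guard can be run before the loop starts. This is a sketch under the assumption that the replacements are already in an array, as in the mapfile version; the helper name contains_placeholder and the sample data are hypothetical.

```bash
#!/bin/bash

# contains_placeholder: succeed if any argument contains the literal {{su}}.
contains_placeholder() {
    local item
    for item in "$@"; do
        [[ $item = *'{{su}}'* ]] && return 0
    done
    return 1
}

# Sample data; the second entry would make the replacement loop spin forever.
replacements=('plain subject' 'subject with {{su}} inside')
if contains_placeholder "${replacements[@]}"; then
    echo 'a replacement contains {{su}}; the loop would never finish' >&2
fi
```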

Can I use the second approach to replace multiple strings, i.e. {{su}}, {{fr}}, and so forth? — BigEarl, Dec 21 '14 at 16:48
@BigEarl: Sure! you can stack lines of the form `line=${line/'{{su}}'/"$repl_su"}`, `line=${line/'{{fr}}'/"$repl_fr"}`, etc. where the replacements `repl_su` and `repl_fr` are defined appropriately. — gniourf_gniourf, Dec 21 '14 at 16:50
So would I use a wildcard, changing `while [[ $line = *'{{su}}'* ]]; do` to `while [[ $line = *'{{*}}'* ]]; do`? Or use some sort of regex **OR** statement? _Thank you for your assistance_ — BigEarl, Dec 21 '14 at 17:46
@BigEarl Oh sorry I didn't address this in my previous comment. Your solution is correct but might have some caveats (and infinite loops). Another safer one, to only focus on `{{su}}` and `{{fr}}` is to use extended globs: `while [[ $line = *'{{'@(su|fr)'}}'* ]]; do`. — gniourf_gniourf, Dec 21 '14 at 17:48
@BigEarl: I guess safety is more important than efficiency: so you have 2 options: one is to use extended globs as I showed, the other one would be to use regular expressions: `[[ $line =~ '{{'(fr|su)'}}' ]]`. I did a very quick benchmark and it _seems_ that the extglob `[[ $line = *'{{'@(su|fr)'}}'* ]]` solution is faster. — gniourf_gniourf, Dec 21 '14 at 17:58
Thank you for your time and assistance. If you do not mind, one more request. I was using a long chain of `| sed` statements, which allowed me to use replacements inside of replacements. For example, one placeholder is `{{img_map}}` and its replacement string contains `{{ml}}`; as long as I replace them in that order, the `{{ml}}` inside `{{img_map}}` gets replaced too. When using the IFS approach, it will only iterate over the line once, correct? — BigEarl, Dec 21 '14 at 18:22
@BigEarl: `IFS` has nothing to do with that. You can use nested replacements: `{{img_map}}`→`stuff {{ml}} more stuff`. The order is irrelevant, as long as, eventually, there are no patterns left in the string. The `IFS= read ...` reads the line from the file, then it's the inner loop `while [[ $line = ... ]]` that does all the replacements from that line. — gniourf_gniourf, Dec 21 '14 at 18:25
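
Putting the comments above together, a version handling both {{su}} and {{fr}} might look like the following. It is only a sketch: the arrays su_list and fr_list, their contents, and the function name fill_line are hypothetical stand-ins for whatever you load from your own files.

```bash
#!/bin/bash
shopt -s extglob    # enable @(a|b) patterns before they are parsed

# Sample replacement pools; in practice these could be filled with mapfile.
su_list=('subject A' 'subject B')
fr_list=('sender X' 'sender Y')

# fill_line LINE: print LINE with every {{su}} and {{fr}} replaced at random.
fill_line() {
    local line=$1 repl_su repl_fr
    while [[ $line = *'{{'@(su|fr)'}}'* ]]; do
        repl_su=${su_list[RANDOM % ${#su_list[@]}]}
        repl_fr=${fr_list[RANDOM % ${#fr_list[@]}]}
        # Each expansion replaces at most one occurrence and is a no-op
        # when its placeholder is absent.
        line=${line/'{{su}}'/"$repl_su"}
        line=${line/'{{fr}}'/"$repl_fr"}
    done
    printf '%s\n' "$line"
}

fill_line 'From {{fr}}: {{su}} and {{su}}'
```

The earlier warning still applies: if any entry in either pool contains {{su}} or {{fr}}, the loop may never terminate.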

score 1 · Answer 2 · edited Dec 18 '14 at 19:00

You need a loop. Start by using wc -l to count the number of lines in tmp.$file, then loop that many times, running your two lines of shell script on each pass, so that every iteration picks a new subject and runs a new sed. The trick is to use the address,address format of the sed command so that the replacement applies to a single line at a time, passing in the loop counter as the address.

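So something like this (a sketch; it still breaks if $subject contains a slash or an ampersand):

count=$(wc -l < "tmp.$file")
i=1
cp "tmp.$file" output.file
while (( i <= count )); do
    subject=$(shuf -n1 *.subjects)
    # Apply the substitution to line $i only. Write to a temporary file first,
    # because redirecting sed's output onto its own input would truncate it.
    sed -e "$i,${i}s/{{su}}/$subject/" output.file > output.tmp && mv output.tmp output.file
    i=$((i + 1))
done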

midori · Answer 3 · score 1 · 2014-12-18T17:50:16.527

In that case you need to iterate over your file line by line and generate a new random string on every iteration. The loop below checks each line for the {{su}} pattern and, where it finds it, substitutes a random line from another file:

while read line
do
    subject=$(shuf -n1 *.subjects)
    sed -e "s/{{su}}/$subject/g" <<< "$line"
done <1.txt

edited Dec 18 '14 at 17:50

answered Dec 18 '14 at 16:18

Thank you, I turned it to a one liner and worked great. – BigEarl Dec 18 '14 at 16:30
