3
#!/bin/bash

for file in ~/tdg/*.TXT
  do
    while read p; do
      randvalue=`shuf -i 1-99999 -n 1`
      sed -i -e "s/55555/${randvalue}/" $file
    done < $file
  done

This is my script. I'm attempting to replace 55555 with a different random number every time I find it. This currently works, but it replaces every instance of 55555 with the same random number. I have attempted to replace $file at the end of the sed command with $p but that just blows up.

Really though, even if I only get to the point where every instance on the same line gets the same random number, but a new random number is used for each line, I'll be happy.

EDIT

I should have specified this. I would like to actually save the results of the replace in the file, rather than just printing the results to the console.

EDIT

The final working version of my script after JNevill's fantastic help:

#!/bin/bash

for file in ~/tdg/*.TXT
do
  while read p; 
  do
    gawk '{$0=gensub(/55555/, int(rand()*99999), "g", $0)}1' $file > ${file}.new
  done < $file
  mv -f ${file}.new $file
done
Ed Morton
Simoney
  • That "final working version" is wrong in so many ways - 1) That awk script will not produce random numbers file by file since awk re-seeds the random number generator with the current time to seconds granularity on every invocation, 2) it will replace all 55555s on each line with the same number, not different numbers, 3) it's running awk on the entire file once per line of each file (so each 10 line file will be processed 10 times!), 4) GNU awk (which you're using for gensub()) has `-i inplace` so you don't need a loop or temp files at all, 5) it will produce numbers in 0-99998 instead of 1-99999 – Ed Morton Sep 09 '17 at 01:48
  • Obviously I'm not too familiar with awk, but many of your complaints simply aren't true of the output I'm getting. Every instance of 55555 in every file was replaced by its own unique number. – Simoney Oct 05 '17 at 15:15
  • No, they are all true; you apparently just haven't tested with input that will produce some of those failures yet and haven't looked closely enough at your output. 1) Check the output and you'll see that the "random" numbers for the 2nd file are identical to the ones for the first file. 2) Try it with `foo 55555 bar 55555` and you'll see that you end up with the same "random" number twice on that line. etc., etc.... – Ed Morton Oct 05 '17 at 16:03
  • Here's a simple test you can run to demonstrate the problem: Execute this command line that calls seq+awk twice: `seq 11 | gawk '{$0=gensub(/1/, int(rand()*99999), "g", $0)}1'; seq 11 | gawk '{$0=gensub(/1/, int(rand()*99999), "g", $0)}1'`. Notice a few things in the output: 1) unless you got [un]lucky and crossed a seconds since the epoch boundary mid-command the output of both commands is identical because the same "random" numbers are used repeatedly; 2) `11` is replaced by 2 repetitions of the same "random" number because `rand()` is called before `gensub()` is. Exactly what I said above. – Ed Morton Oct 05 '17 at 16:29

3 Answers

7

Since doing this in sed gets pretty awful pretty quickly, you may want to switch over to awk (GNU awk specifically, since gensub() is a gawk extension):

awk '{$0=gensub(/55555/, int(rand()*99999), "g", $0)}1' "$file"

Using this, you can remove the inner loop as this will run across the entire file line-by-line as awk does.

You could just swap out the entire script and feed the wildcard filename to awk directly too:

awk '{$0=gensub(/55555/, int(rand()*99999), "g", $0)}1' ~/tdg/*.TXT
JNevill
  • This is great, and almost gets me there, but how would I save the changes to the file directly, instead of just printing out the results of the command? I probably should have specified I wanted to do that. – Simoney Sep 08 '17 at 20:01
  • In the loop version: just shoot them over to a temp file and then `mv` the temp file over the top of the original file: `awk '{$0=gensub(/55555/, int(rand()*99999), "g", $0)}1' "$file" > "$file.tmp" && mv -f "$file.tmp" "$file"` – JNevill Sep 08 '17 at 20:03
  • Without the loop version you could shoot the output to a tmp file per input like: `awk '{$0=gensub(/55555/, int(rand()*99999), "g", $0); print $0 > (FILENAME ".tmp")}' ~/tdg/*.TXT` and then mv all the .tmp files over the top of the non-tmp files after the command's done. – JNevill Sep 08 '17 at 20:04
  • You'll have to futz with it a bit, but that should get you in the ballpark – JNevill Sep 08 '17 at 20:06
  • shooting the output to a temp, and then force moving after the loop worked like a charm. You the man! – Simoney Sep 08 '17 at 20:21
  • Since `int` truncates, `int(rand()*99999)` gives you a number in the range 0-99998, instead of 1-99999. To get the latter, you'd have to use `1 + int(rand()*99999)`. – Benjamin W. Sep 08 '17 at 20:24
  • Appreciate the input. For my purposes I'm not too worried about it, but it's good to have it on here for when someone wants this problem solved in the future. – Simoney Sep 08 '17 at 20:30
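
For anyone landing here later, the temp-file approach from the comments above could be sketched roughly like this. It is a minimal sketch under assumptions, not JNevill's exact code: `randomize_lines` is a made-up helper name, it uses plain `gsub()` with a value computed once per line (so it should also run on non-GNU awks), it seeds each file from `$RANDOM`, and it uses `1 + int(...)` for the 1-99999 range as Benjamin W. suggests.

```shell
#!/bin/bash
# Sketch: replace 55555 in each file with one random number per line,
# writing to a temp file and moving it over the original only on success.
# randomize_lines is a hypothetical helper name, not from the thread.
randomize_lines() {
  local file tmp seed
  for file in "$@"; do
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    seed=$RANDOM   # new seed per file so files do not share one sequence
    if awk -v seed="$seed" '
          BEGIN { srand(seed) }
          { r = 1 + int(rand() * 99999); gsub(/55555/, r) }  # r is in 1-99999
          1                                                  # print every line
        ' "$file" > "$tmp"; then
      mv -f -- "$tmp" "$file"
    else
      rm -f -- "$tmp"
      return 1
    fi
  done
}
```

Call it as `randomize_lines ~/tdg/*.TXT`. Every 55555 on a line gets the same number, as in the answer above. One thing to watch: `mktemp` typically creates the temp file with restrictive permissions, so the replaced file may end up with different permissions than the original.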
4

This is how to REALLY do what you're trying to do with GNU awk:

awk -i inplace '{ while(sub(/55555/,int(rand()*99999)+1)); print }' ~/tdg/*.TXT

No shell loops or temp files required and it WILL replace every 55555 with a different random number within and across all files.

With other awks it'd be:

seed="$RANDOM"
for file in ~/tdg/*.TXT; do
    seed=$(awk -v seed="$seed" '
            BEGIN { srand(seed) }
            { while(sub(/55555/,int(rand()*99999)+1)); print > "tmp" }
            END { print int(rand()*99999)+1 }
        ' "$file") &&
    mv tmp "$file"
done
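
To see why the `while(sub(...))` loop matters, here is a small side-by-side sketch that works on stdin and leaves your files alone. The helper names `per_line` and `per_occurrence` are hypothetical, and passing the seed as the first argument is just a choice made for this illustration.

```shell
# Both helper names are hypothetical; each reads stdin and takes a seed as $1.

# One random number per line: rand() runs once and gsub() reuses the value
# for every 55555 on that line.
per_line() {
  awk -v seed="$1" '
    BEGIN { srand(seed) }
    { r = int(rand() * 99999) + 1; gsub(/55555/, r); print }
  '
}

# One random number per occurrence: sub() replaces only the leftmost match,
# so the loop calls rand() again for each 55555 that is still left.
per_occurrence() {
  awk -v seed="$1" '
    BEGIN { srand(seed) }
    { while (sub(/55555/, int(rand() * 99999) + 1)) ; print }
  '
}

printf 'foo 55555 bar 55555\n' | per_line "$RANDOM"
printf 'foo 55555 bar 55555\n' | per_occurrence "$RANDOM"
```

The first command prints the same number twice on the line; the second should print two different numbers almost every time. The loop also re-rolls in the unlikely case that the random number itself comes out as 55555.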
Ed Morton
2

A variation on JNevill's solution that generates a different set of random numbers every time you run the script ...

A sample data file:

$ cat grand.dat
abc def 55555
xyz-55555-55555-__+
123-55555-55555-456
987-55555-55555-.2.
.+.-55555-55555-==*

And the script:

$ cat grand.awk
{ $0=gensub(/55555/,int(rand()*seed),"g",$0); print }
  • gensub(...) : works the same as JNevill's answer, except that we mix up the rand() multiplier by using our seed value [you can throw any numbers in here you wish to help determine the size of the resulting value]
  • Keep in mind that this will replace all occurrences of 55555 on a single line with the same random value

Script in action:

$ awk -f grand.awk seed=${RANDOM} grand.dat
abc def 6939
xyz-8494-8494-__+
123-24685-24685-456
987-4442-4442-.2.
.+.-17088-17088-==*

$ awk -f grand.awk seed=${RANDOM} grand.dat
abc def 4134
xyz-5060-5060-__+
123-14706-14706-456
987-2646-2646-.2.
.+.-10180-10180-==*

$ awk -f grand.awk seed=${RANDOM} grand.dat
abc def 4287
xyz-5248-5248-__+
123-15251-15251-456
987-2744-2744-.2.
.+.-10558-10558-==*
  • seed=$RANDOM : have the OS generate a random int for us and pass into the awk script as the seed variable
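
In the script above, `seed` is really the multiplier for rand(), so it changes the size of the numbers rather than the sequence (which is why the three sample outputs above are scaled copies of each other). If you'd rather keep a fixed 1-99999 range and vary the sequence itself, one option is to hand the value to `srand()`. A rough sketch, with assumptions: `seeded_replace` is a hypothetical name, and the value is passed with `-v` because a `seed=...` operand on the command line is only assigned after the BEGIN block has already run.

```shell
# seeded_replace SEED FILE: hypothetical helper, not from the answer above.
# Prints FILE with each 55555 replaced by one random number per line.
seeded_replace() {
  awk -v seed="$1" '
    BEGIN { srand(seed) }           # -v assignments are visible in BEGIN
    {
      r = 1 + int(rand() * 99999)   # stays in 1-99999 whatever the seed is
      gsub(/55555/, r)
      print
    }
  ' "$2"
}
```

Running `seeded_replace 7 grand.dat` twice gives the same output both times, which is handy for testing; `seeded_replace "$RANDOM" grand.dat` gives a new sequence on most runs.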
markp-fuso