How to replace a matrix entry in a text file with a number from another text file (OSX Sierra, bash)

Question

How to replace an entire line in a text file by line number

The question at the link above asks how to replace a line in a text file. nakeer (second answer from bottom) has provided an answer that works well for me as a Mac user:

sed -i '' -e 's/text-on-line-to-be-changed.*/text-to-replace-the=whole-line/' file-name

However, I cannot figure out how to modify it for my particular situation. If anyone can point me in the right direction, I would be very grateful.

(1) I have many files. Each file contains this matrix, exactly as it appears below:

12.345678    0.000000    0.000000    
 0.000000   12.345678    0.000000    
 0.000000    0.000000   12.345678

(2) I have one additional file that contains a column of numbers, like:
```
87.654321   
18.765432    
21.876543
...
```
(3) I want to take one number from each line of the column in (2) and use it to replace the non-zero values of one matrix in (1), preserving the zero values. So the first matrix in the first file should look like:
```
87.654321    0.000000    0.000000    
 0.000000   87.654321    0.000000    
 0.000000    0.000000   87.654321
```

The second matrix in the second file should use "18.765432" for its non-zero values.

I'm not experienced with bash scripting, but so far I have the following, where ic is the file from (1) that contains the original matrix. I copy it into a new directory, where I can change the matrix to (3):

#!/bin/bash

let timesteps=20000
for ((step=0; step <= timesteps ; step++))
do
  mkdir -p "$step/results"   # -p also creates $step itself if it does not exist yet
  cp ic "$step/ic"

  cat X >> X # <--Here I'd like to modify nakeer's expression. Any hints would be much appreciated.
done

Update:

I have managed to get Ed's very clear solution up and running. There is one problem, however. The files that contain the matrices (see (1) above) also contain other data. For example (before executing Ed's code):

    12.345678    0.000000    0.000000    
     0.000000   12.345678    0.000000    
     0.000000    0.000000   12.345678
   0.5   
   abc.xyx 
   90
   900
   0.125
   90
   6

Ed's code successfully changes 12.345678 in the matrix to a new value. However, 0.125 in the list of numbers below the matrix is also changed to that new value. I do not want 0.125 to be changed.

The regular expression in Ed's match() call seems to identify the numbers to change by their format, and 0.125 apparently fits that format. If anyone has any ideas about how to exclude 0.125 from the change, I'd be grateful to know!
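
One possible way to keep 0.125 untouched, sketched here under the assumption that the matrix always occupies exactly the first three lines of every file, is to add an `FNR<=3` guard in front of the `match()` test from the accepted answer. The sample file names and the scratch directory are made up for the demonstration:

```shell
#!/bin/bash
# Sketch: restrict the replacement to lines 1-3 (assumed to be the matrix rows).
workdir=$(mktemp -d)          # scratch directory, hypothetical
cd "$workdir" || exit 1

printf '87.654321\n' > numbers
cat > file1 <<'EOF'
12.345678    0.000000    0.000000
 0.000000   12.345678    0.000000
 0.000000    0.000000   12.345678
   0.5
   abc.xyx
   90
   0.125
EOF

awk '
NR==FNR { a[NR]=$1; next }        # first file: remember one value per line
FNR==1  { close(out); out = FILENAME ".new"; fileNr++ }
FNR<=3 && match($0,/[0-9.]+[1-9][0-9.]+/) {     # matrix rows only
    $0 = substr($0,1,RSTART-1) a[fileNr] substr($0,RSTART+RLENGTH)
}
{ print > out }
' numbers file1

cat file1.new
```

If the matrix can start on a different line in some files, the line-number guard would need to be replaced by some other test that identifies the matrix rows.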

How can I modify Ed's code in the case that each matrix file is in its own directory; e.g., 0/file0, 1/file1, 2/file2, etc?
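
For the one-directory-per-file layout, a rough sketch is to read the numbers file line by line in the shell and run awk once per directory. It assumes that line N+1 of the numbers file belongs to directory N, that the files are named as in the example (0/file0, 1/file1, ...), and that the matrix sits on the first three lines. The sample data is invented for the demonstration:

```shell
#!/bin/bash
# Sketch: one awk call per directory, pairing line N+1 of "numbers" with N/fileN.
workdir=$(mktemp -d)          # scratch directory, hypothetical
cd "$workdir" || exit 1

# Build sample input: three directories, each holding the same original matrix.
printf '87.654321   \n18.765432    \n21.876543\n' > numbers
for step in 0 1 2; do
  mkdir -p "$step"
  printf '%s\n' '12.345678    0.000000    0.000000' \
                ' 0.000000   12.345678    0.000000' \
                ' 0.000000    0.000000   12.345678' > "$step/file$step"
done

step=0
while read -r value; do       # read strips the trailing blanks around each number
  f="$step/file$step"
  awk -v val="$value" '
    FNR<=3 && match($0,/[0-9.]+[1-9][0-9.]+/) {
        $0 = substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
    }
    { print }
  ' "$f" > "$f.tmp" && mv "$f.tmp" "$f"    # replace the file only if awk succeeded
  step=$((step+1))
done < numbers

cat 1/file1
```

Counting in the shell rather than globbing `*/file*` avoids lexicographic ordering, which would sort directory 10 before directory 2.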

Are you trying to replace all non-`0.0` values? Or always the diagonal? Which pattern is correct, if there is one? — Zelnes, Jul 17 '19 at 12:12
@Zelnes, hi, thanks for asking. Yes, I'm always trying to replace the diagonal (all non-zero values). I need to preserve the zeroes. — Ant, Jul 17 '19 at 12:22
Can't you just create the matrix while reading file2, instead of replacing? — Zelnes, Jul 17 '19 at 12:30
@Zelnes, if that's possible, then it sounds like a solution. However, I create the new file (with the matrix values I want to change) by copying an original file. So I would have to either change the original matrix during the copy process or change it after copying the original file, as I have suggested in my question. I have to be very careful to change only the matrix number and preserve all other formatting, because the file will be an input file for an old code that has strict formatting requirements. — Ant, Jul 17 '19 at 12:37

score 3 · Accepted Answer · answered Jul 17 '19 at 13:41

$ ls
file1  file2  file3  numbers  tst.awk

.

$ cat tst.awk
NR==FNR { a[NR]=$1; next }

FNR==1 {
    close(out)
    out = FILENAME ".new"
    fileNr++
}

match($0,/[0-9.]+[1-9][0-9.]+/) {
    $0 = substr($0,1,RSTART-1) a[fileNr] substr($0,RSTART+RLENGTH)
}

{ print > out }

.

$ tail -n +1 numbers file*
==> numbers <==
87.654321
18.765432
21.876543

==> file1 <==
12.345678    0.000000    0.000000
 0.000000   12.345678    0.000000
 0.000000    0.000000   12.345678

==> file2 <==
12.345678    0.000000    0.000000
 0.000000   12.345678    0.000000
 0.000000    0.000000   12.345678

==> file3 <==
12.345678    0.000000    0.000000
 0.000000   12.345678    0.000000
 0.000000    0.000000   12.345678

.

$ awk -f tst.awk numbers file1 file2 file3

.

$ ls
file1  file1.new  file2  file2.new  file3  file3.new  numbers  tst.awk

.

$ tail -n +1 file*.new
==> file1.new <==
87.654321    0.000000    0.000000
 0.000000   87.654321    0.000000
 0.000000    0.000000   87.654321

==> file2.new <==
18.765432    0.000000    0.000000
 0.000000   18.765432    0.000000
 0.000000    0.000000   18.765432

==> file3.new <==
21.876543    0.000000    0.000000
 0.000000   21.876543    0.000000
 0.000000    0.000000   21.876543

If you install GNU awk then you can use -i inplace to do "inplace" editing instead of creating new output files if that's useful to you.
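
Without GNU awk, a rough equivalent is to let the script write the .new files as above and then move each one over its original. The sketch below uses stand-in file contents and a scratch directory, both invented for the demonstration:

```shell
#!/bin/bash
# Sketch: replace each original with its ".new" counterpart after awk has run.
workdir=$(mktemp -d)          # scratch directory, hypothetical
cd "$workdir" || exit 1
printf 'old matrix\n' > file1       # stand-in for an original file
printf 'new matrix\n' > file1.new   # stand-in for the awk output

for f in file*.new; do
  [ -e "$f" ] || continue     # skip the literal pattern when nothing matches
  mv "$f" "${f%.new}"         # strip the .new suffix to get the original name
done

cat file1
```

This overwrites the originals, so it may be worth keeping a backup until the output has been checked.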

Thank you for such a thorough explanation. I managed to get it working and will now figure out how to make it work with the fact that my files (file1, file2, and file3, etc) reside in distinct directories. One minor issue is that each file contains a list of other numbers below its matrix. I see that one of the numbers (a decimal/float--the others are all ints) is changed to the same value that the matrix entries are changed to. If anyone has any thoughts on how to prevent that, it would be great to hear them. Very grateful, Ed. — Ant, Jul 17 '19 at 15:27
You're welcome. It's very important when posting questions asking for help to manipulate data that you post sample input/output that's truly representative of that data. Edit your question to provide that better/more realistic example instead of what you have posted if you'd like help parsing it. — Ed Morton, Jul 17 '19 at 15:56

shellter · Answer 2 · 2019-07-17T15:09:18.457

Following up on @Zelnes's idea, here's a version that just generates a new file based on your file2 data:

#!/bin/bash

while read -r value ; do
    ((fileNum++))
    awk -v input="$value" '
      END{
        for (i=1;i<4;i++) {
          for (j=1;j<4;j++) {
            #dbg print "#dbg: i=" i "\tj=" j;
            if (j==i) {
              #dbg print "#dbg: matched"
              arr[i]=input
            } else {
              #dbg print "#dbg: setting arr["j"]="arr[j]
              arr[j]=0.0
            }
          }
          printf("%0.6f\t%0.6f\t%0.6f\n",arr[1],arr[2],arr[3])
        }
      }' /dev/null \
       > file"$fileNum".txt
done < newValues.txt

The printf can be changed to get the exact spacing you need for your format issues. Change the newValues.txt at the end to the name of your real input file. You can even make that /path/to/someplace/else/newValues.txt.
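
As an illustration of adjusting the printf, the sketch below uses field widths instead of tabs to reproduce the spacing of the matrix shown in the question. The widths (9 characters for the first column, 12 for the other two) are inferred from that sample and are an assumption. The value is also reformatted to six decimals, which could alter inputs that carry more digits:

```shell
#!/bin/bash
# Sketch: fixed-width columns that mimic the question's sample layout.
value=87.654321               # stand-in for one line of the numbers file
matrix=$(awk -v input="$value" 'BEGIN {
    for (i = 1; i <= 3; i++) {
        for (j = 1; j <= 3; j++)
            row[j] = (i == j) ? input : 0   # diagonal gets the value, rest zero
        printf "%9.6f%12.6f%12.6f\n", row[1], row[2], row[3]
    }
}')
printf '%s\n' "$matrix"
```

Values with more than two digits before the decimal point would overflow the first column, so the widths would need revisiting for such data.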

You'll need to

chmod +x myDataGenerator

before running it.

cd to the directory you want the files created in, then call the script with its full path i.e.

/full/path/to/myDataGenerator

If you want the tiniest program possible, remove the #dbg lines.

If you want to understand how the script works, uncomment one #dbg line at a time, understand what it is doing, and then uncomment the next #dbg.

IHTH

@shelter, thank you very much. I have nearly managed to get Ed's solution up and running, but yours sounds like it will help with another issue on my to-do list. I'm trying to make a wrapper for an old FORTRAN code. I'm fine with that, but this formatting (bash, Awk) stuff is way out of my comfort zone. Thank you all very much for your help. — Ant, Jul 17 '19 at 15:39
