I would like to subtract two pairs of columns in a tab-delimited text file and append the results as two new columns, using awk in bash.

  1. Subtract column 1 (h1) from column 3 (h3), i.e. h3 - h1, and name the new column "count1".
  2. Subtract column 2 (h2) from column 4 (h4), i.e. h4 - h2, and name the new column "count2".

I don't want to create a new text file; I want to edit the existing one.

My text file:

    h1 h2 h3 h4 h5    
    343 100 856 216 536
    283 96 858 220 539
    346 111 858 220 539
    283 89 860 220 540
    280 89 862 220 541
    76 32 860 220 540
    352 105 856 220 538
    57 16 860 220 540
    144 31 858 220 539
    222 63 860 220 540
    305 81 858 220 539

My command at the moment looks like this:

    awk '{$6 = $3 - $1}1' file.txt
    awk '{$6 = $4 - $2}1' file.txt

But I don't know how to name the newly added columns. Is there also a smarter way to run both calculations in a single awk command?

Luker354
  • `anycommand file.txt > tempfile ; mv tempfile file.txt` – KamilCuk Nov 29 '21 at 10:35
  • You can trivially combine the two with `awk '{$6 = $3-$1; $7 = $4-$2 }1'` – tripleee Nov 29 '21 at 10:37
  • This should not have been closed. There is more to the question than editing a file in place. – dan Nov 29 '21 at 10:40
  • @tripleee: I guess the lone `1` at the end is there to ensure that the modified fields are printed? Is there any reason not to use an explicit `print` after the assignments? Also, I don't see how your code deals with the header line. The OP skipped this problem too, but it seems to me that your code does the arithmetic on the header line as well. – user1934428 Nov 29 '21 at 12:18
  • @Luker354: Technically speaking, text files (or any file representing a _stream_) can't reasonably be edited in place. Even programs such as `sed`, which give you the illusion of an edit-in-place, create a temporary file under the hood and then replace the original with it. The closest you can come to an edit-in-place is to read the content of the file into memory, calculate the new file content in memory, reposition the file to the beginning and, after throwing away the old file content, write the new content from memory. – user1934428 Nov 29 '21 at 12:21
  • There is no particular reason to prefer the `1` but it's a common idiom and since you used it in your script, I kept that. If you like typing, `print` is certainly clearer. – tripleee Nov 29 '21 at 12:24
  • Header lines are generally a nuisance; but yes, if you want to keep it, you will need to add some processing, like in the answer you already received. The comments up here are not attempts to answer, just comments. – tripleee Nov 29 '21 at 12:25
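The points raised in these comments can be seen in a small sketch, assuming space-separated input like the sample above (the variable names `sample`, `with_idiom` and `with_print` are made up for illustration). The first command is the combined one-liner from the comments, where the trailing `1` is an always-true pattern that prints each record; the second uses explicit `print` statements and treats the header separately.

```shell
# Header plus the first data row from the question, without the stray whitespace.
sample='h1 h2 h3 h4 h5
343 100 856 216 536'

# Combined one-liner: the arithmetic also runs on the header line,
# where the non-numeric strings "h3" and "h1" both count as 0.
with_idiom=$(printf '%s\n' "$sample" | awk '{ $6 = $3 - $1; $7 = $4 - $2 } 1')

# Explicit print, with the header line (NR == 1) handled on its own.
with_print=$(printf '%s\n' "$sample" |
  awk 'NR == 1 { print $0, "count1", "count2"; next }
       { print $0, $3 - $1, $4 - $2 }')

printf '%s\n\n%s\n' "$with_idiom" "$with_print"
```

With the one-liner the header comes out as `h1 h2 h3 h4 h5 0 0`, which is why the header needs its own rule if it should carry the names "count1" and "count2".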

1 Answer

Pretty simple in awk. Use NR==1 to modify the first line.

    awk -F '\t' -v OFS='\t' '
    NR==1 {print $0,"count1","count2"}
    NR!=1 {print $0,$3-$1,$4-$2}' file.txt > tmp && mv tmp file.txt

dan
  • Thanks for your answer. When I apply your command, it only adds two columns filled with 0s below the header names "count1" and "count2". Do you know what the reason could be? – Luker354 Nov 29 '21 at 13:20
  • I believe it's because you don't actually have a tab delimited text file (each number separated by a single tab). Your example has leading whitespace, trailing whitespace on the first line, and appears to be delimited by spaces (not tabs). You can just remove this part: `-F '\t' -v OFS='\t'`. Alternatively, convert the file to real tab delimited, using `sed -E -e 's/^[[:space:]]+//' -e 's/[[:space:]]+$//' -e $'s/ +/\t/g' file.txt | awk #..etc > tmp && mv tmp summed.tsv`. Hope that's clear enough. – dan Nov 29 '21 at 15:45
  • @Luker354 Also, I did just run a test with your example data, converted to tab separated, and the output was as expected. – dan Nov 29 '21 at 15:49
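
Putting the answer and this follow-up together, one possible variant is sketched below. It assumes the input is separated by spaces with stray leading and trailing whitespace, as in the question's sample, and that tab-delimited output is wanted. The `$1 = $1` assignment is an addition that is not part of the answer: it makes awk rebuild each record with `OFS`, which takes the place of the `sed` conversion step. The scratch directory and the shortened sample data are only there to make the sketch self-contained; `file.txt` and `tmp` are the names used in the answer.

```shell
# Work in a scratch directory so no existing file.txt or tmp is overwritten (assumption).
cd "$(mktemp -d)" || exit 1

# Header plus the first two data rows from the question, including the stray whitespace.
printf '    h1 h2 h3 h4 h5    \n    343 100 856 216 536\n    283 96 858 220 539\n' > file.txt

# Default field splitting ignores leading/trailing whitespace and runs of spaces.
awk -v OFS='\t' '
{ $1 = $1 }                               # rebuild the record so fields are joined by tabs
NR == 1 { print $0, "count1", "count2" }  # header line: append the new column names
NR > 1  { print $0, $3 - $1, $4 - $2 }    # data lines: append h3-h1 and h4-h2
' file.txt > tmp && mv tmp file.txt

cat file.txt
```

Because of the `&&`, `file.txt` is only replaced when awk exits successfully, so a failed run leaves the original file untouched.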