3

I have a file in which a pattern of four lines repeats. I want to remove the last character of every fourth line. A description of the file is below.

@Header  
DNA Sequence 
+ 
Quality score!
<Pattern of four above lines repeats>

I am trying to remove the last character (an exclamation point) from every Quality score line, that is, from every fourth line of the file. The desired result is:

@Header  
DNA Sequence 
+ 
Quality score
<Pattern of four above lines repeats>

I am able to use awk to pull out every fourth line, but how do I remove the last character in place on every fourth line of the file?

This question operates only on a specific line. Currently my approach is to use awk to pull the Quality score and I can remove the last character with sed.

awk 'NR == 4 || NR % 4 == 0'
sed 's/.$//'

I am currently not sure how to overwrite the edited Quality scores into the original file. Any thoughts or more concise inplace sed / awk arguments would be appreciated.

Cody Glickman

5 Answers

6

GNU-sed has an extension that can operate on every n-th line:

sed '4~4s/.$//'

The address m~n matches line m and then every n-th line after it, so 4~4 selects lines 4, 8, 12, and so on, and the s command runs only on those lines.
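To also write the change back to the file, the same address can be combined with GNU sed's -i option. This is only a sketch: it assumes GNU sed is installed, and the file name reads.fastq and its contents are made up for illustration.

```shell
# Hypothetical sample file: two FASTQ-like records whose quality lines end in "!"
workdir=$(mktemp -d)
fastq="$workdir/reads.fastq"
printf '%s\n' '@read1' 'ACGT' '+' 'IIII!' '@read2' 'TTGA' '+' 'JJJJ!' > "$fastq"

# 4~4 selects lines 4, 8, 12, ...; -i (GNU sed) writes the result back to the file
sed -i '4~4s/.$//' "$fastq"

result=$(cat "$fastq")
printf '%s\n' "$result"
```

Use -i.bak instead of -i if you want sed to keep a copy of the original file.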

choroba
3

Given:

$ cat file
1!
2!
3!
4!
5!
6!
7!
8!
9!
10!
11!
12!

You can use awk:

$ awk 'NR%4==0{sub(/!$/,"")}1' file
1!
2!
3!
4
5!
6!
7!
8
9!
10!
11!
12

And if you have gawk (version 4.1.0 or later, which ships the inplace extension) you can change the file in place:

$ gawk -i inplace 'NR%4==0{sub(/!$/,"")}1' file
$ cat file
1!
2!
3!
4
5!
6!
7!
8
9!
10!
11!
12

If you only have POSIX awk, you can effectively get an inplace replacement by using a temp file:

$ awk 'NR%4==0{sub(/!$/,"")}1' file >tmp_file && mv tmp_file file

(Which is what GNU sed or GNU awk or perl or ruby is doing under the covers anyway with 'inplace' replacement...)
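A slightly more defensive version of the temp-file approach might look like the sketch below. The sample file and the variable names (workdir, file, tmp) are assumptions for the demo, not part of the question.

```shell
# Hypothetical sample input: 8 lines, each ending in "!"
workdir=$(mktemp -d)
file="$workdir/file"
printf '%s!\n' 1 2 3 4 5 6 7 8 > "$file"

# Write the edited copy next to the original, then replace the original
# only if awk exited successfully; otherwise discard the temp file.
tmp=$(mktemp "$workdir/file.XXXXXX")
if awk 'NR % 4 == 0 { sub(/!$/, "") } 1' "$file" > "$tmp"; then
    mv "$tmp" "$file"
else
    rm -f "$tmp"
fi

edited=$(cat "$file")
printf '%s\n' "$edited"
```

Creating the temp file in the same directory as the original keeps the mv on one filesystem. Note that the replaced file ends up with the temp file's permissions, not the original's.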

dawg
  • Thanks @dawg, this works in place as well with the sed flag. – Cody Glickman Oct 15 '18 at 14:57
  • 3
    sed is generating the sample input to demonstrate the solution, not solving your problem. `awk 'NR%4==0{sub(/!$/,"")}1'` is the solution to your problem. – Ed Morton Oct 15 '18 at 14:59
  • 1
    If you have `gawk` you can use `gawk -i inplace` to do inplace replacements. – dawg Oct 15 '18 at 15:00
  • Thanks @EdMorton, so I would still have to mv the file unless I have gawk to do inplace. – Cody Glickman Oct 15 '18 at 15:02
  • Right, you need GNU awk for inplace editing with awk just like you need GNU or BSD sed for inplace editing with sed. – Ed Morton Oct 15 '18 at 15:05
  • 1
    @CodyGlickman note that "inplace" in sed and awk is nothing more than a fancy word for "create a new file with a temporary name and then just move it, overwriting the old file". – kvantour Oct 15 '18 at 15:29
3

Perl to the rescue!

perl -lpe 'chop if 0 == $. % 4'
  • -p reads the input line by line and prints it after processing
  • -l removes a newline from the input line and adds it back to output
  • chop removes the last character
  • $. is a special perlvar that contains the input line number, % is the modulo operator
choroba
  • Hi @choroba, as someone who typically bemoans perl, this is very concise and readable. Also, it works great. – Cody Glickman Oct 15 '18 at 15:07
  • 1
    @CodyGlickman: Perl can be readable and concise. Unfortunately, many people can't use it that way. – choroba Oct 15 '18 at 15:14
  • 1
    Readable Perl and Ruby one line answers almost always get my +1 – dawg Oct 15 '18 at 15:15
  • 1
    @choroba, this answer has swayed my misconceptions about perl :) – Cody Glickman Oct 15 '18 at 15:15
  • And in ruby you can do: `ruby -lpe 'chop if $. % 4==0' file` Both Ruby and Perl support inplace file editing with the addition of `-i` (or `-i.ext` to keep a backup with that extension) to this one line command – dawg Oct 15 '18 at 15:30
  • 1
    Though not terrible, I'd argue that that code is still not clear so therefore it's brief but not concise. Show `perl -lpe 'chop if 0 == $. % 4'` to some random C programmer and ask them what it does. Now show them `awk 'NR%4==0{sub(/.$/,"")}1' file` which is the equivalent. I'm not saying the awk script is completely clear as it's relying on `1` as shorthand for `{print}` for brevity but I think the average programmer is far more likely to understand the awk script at a glance than they are the perl script. – Ed Morton Oct 15 '18 at 17:32
1

Could you please try the following.

awk 'FNR%4==0{print substr($0,1,length($0)-1);next} 1' Input_file > temp_file && mv temp_file Input_file

This will save the output into Input_file itself (it writes the output to a temporary file named temp_file and then renames/moves temp_file over your actual Input_file).
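If the interval should be adjustable, it can be passed in as an awk variable. The sketch below is only an illustration: the variable name n and the two sample files are made up. It also shows why FNR is used here, since FNR restarts at 1 for each input file while NR keeps counting across files.

```shell
# Hypothetical inputs: "one" has 5 lines, "two" has 4, each line ends in "!"
workdir=$(mktemp -d)
printf '%s!\n' a b c d e > "$workdir/one"
printf '%s!\n' f g h i > "$workdir/two"

# n is passed in with -v; FNR % n == 0 is true on every n-th line of each file
n=4
combined=$(awk -v n="$n" \
    'FNR % n == 0 { print substr($0, 1, length($0) - 1); next } 1' \
    "$workdir/one" "$workdir/two")
printf '%s\n' "$combined"
```

Here the fourth line of each file (d and i) loses its last character. With NR in place of FNR, the second match would be line 8 overall, which is h.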

RavinderSingh13
  • 1
    This works! I am guessing I could change the 4 after FNR% if I wanted to operate on different lines too. Thanks again! – Cody Glickman Oct 15 '18 at 14:56
  • 1
    @CodyGlickman, yes, `FNR%4==0` means: if the line number is evenly divisible by `4`, then do this action. And yes, you can change it as you wish; I will add a full explanation in a few minutes too. – RavinderSingh13 Oct 15 '18 at 14:58
0

This might work for you (GNU sed):

sed 'n;n;n;s/.$//' file

Or

sed 'N;N;N;s/.$//' file
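The two commands get to the same result by different routes. A small sketch on made-up input, assuming the number of lines is a multiple of four:

```shell
# Hypothetical sample input: 8 lines, each ending in "!"
input=$(printf '%s!\n' 1 2 3 4 5 6 7 8)

# n prints the current line and loads the next one, so after three n's the
# pattern space holds only the 4th line of the group and s/.$// trims it.
with_n=$(printf '%s\n' "$input" | sed 'n;n;n;s/.$//')

# N appends the next line to the pattern space, so after three N's it holds
# four lines joined by newlines; $ matches only at the very end, so just the
# last character of the 4th line is removed.
with_N=$(printf '%s\n' "$input" | sed 'N;N;N;s/.$//')

printf '%s\n' "$with_n"
```

If the line count is not a multiple of four, what happens to the trailing lines depends on how the sed implementation handles n or N at end of input, so that case is worth checking on your own system.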
potong