Remove last character of every nth line in place

Question

I have a file made up of repeating four-line records. I want to remove the last character of every fourth line. The layout of the file is below.

@Header  
DNA Sequence 
+ 
Quality score!
<Pattern of four above lines repeats>

I am trying to remove the last character (an exclamation point) from every fourth line, the Quality score line, so that the file looks like this:

@Header  
DNA Sequence 
+ 
Quality score
<Pattern of four above lines repeats>

I am able to use awk to pull out every fourth line, but how do I remove the last character in place on every fourth line of the file?

A related question operates only on one specific line. My current approach is to pull out the Quality score lines with awk and then remove the last character with sed:

awk 'NR == 4 || NR % 4 == 0'
sed 's/.$//'

I am not sure how to write the edited Quality scores back into the original file. Any thoughts, or a more concise in-place sed / awk command, would be appreciated.

Why do you want to remove only the last quality score while keeping the last base? – Poshi, Oct 15 '18 at 14:49
Hi @Poshi, great question. My quality score lines have one more Phred character (!) than my sequences. It took me a bit to figure that out. – Cody Glickman, Oct 15 '18 at 14:52

score 6 · Answer 1 · answered Oct 15 '18 at 14:50

GNU-sed has an extension that can operate on every n-th line:

sed '4~4s/.$//'

m~n means: starting at line m, run the following command on every n-th line, so 4~4 matches lines 4, 8, 12, and so on.
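
As a minimal sketch of how this could be applied in place, assuming GNU sed (both the `4~4` address and a bare `-i` are GNU conventions) and a hypothetical sample file named `reads.fq`:

```shell
# Build a small hypothetical FASTQ-like sample: two records, eight lines.
printf '%s\n' '@r1' 'ACGT' '+' 'IIII!' '@r2' 'TTGA' '+' 'JJJJ!' > reads.fq

# 4~4 selects lines 4, 8, 12, ...; s/.$// deletes the last character there.
# -i makes GNU sed write the result back to reads.fq.
sed -i '4~4s/.$//' reads.fq

cat reads.fq
```

Only the quality lines (4 and 8) lose their trailing character; the header, sequence, and `+` lines pass through unchanged.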

choroba

~ is an invalid command code on my Mac's sed. I appreciate the input though, this is beautiful and I will try this later on my linux desktop. – Cody Glickman Oct 15 '18 at 15:00

Also see https://stackoverflow.com/questions/30003570/how-to-use-gnu-sed-on-mac-os-x – jas Oct 15 '18 at 15:12
Thanks @jas, a little homebrew goes a long way – Cody Glickman Oct 15 '18 at 15:19

dawg · Accepted Answer · 2018-10-15T15:34:01.857

Given:

$ cat file
1!
2!
3!
4!
5!
6!
7!
8!
9!
10!
11!
12!

You can use awk:

$ awk 'NR%4==0{sub(/!$/,"")}1' file
1!
2!
3!
4
5!
6!
7!
8
9!
10!
11!
12

And if you have gawk 4.1 or later, you can edit the file in place:

$ gawk -i inplace 'NR%4==0{sub(/!$/,"")}1' file
$ cat file
1!
2!
3!
4
5!
6!
7!
8
9!
10!
11!
12

If you only have POSIX awk, you can effectively get an inplace replacement by using a temp file:

$ awk 'NR%4==0{sub(/!$/,"")}1' file >tmp_file && mv tmp_file file

(This is what GNU sed, GNU awk, perl, and ruby do under the covers anyway for "in-place" replacement.)
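
A slightly more defensive sketch of the temp-file approach, assuming `mktemp` is available (it is common but not required by POSIX); the variable name `tmp` is just illustrative:

```shell
# Hypothetical sample input: eight lines, each ending in "!".
printf '%s!\n' 1 2 3 4 5 6 7 8 > file

# Write the edited copy to a uniquely named temp file, then replace the
# original only if every earlier step succeeded.
tmp=$(mktemp) &&
awk 'NR%4==0{sub(/!$/,"")}1' file > "$tmp" &&
mv "$tmp" file

cat file
```

Because `mv` swaps in a new file, the result may not keep the original's permissions or hard links. If that matters, writing the contents back with `cat "$tmp" > file` and then removing the temp file is one alternative.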

edited Oct 15 '18 at 15:34

answered Oct 15 '18 at 14:53

dawg

Thanks @dawg, this works in place as well with the sed flag. – Cody Glickman Oct 15 '18 at 14:57

sed is generating the sample input to demonstrate the solution, not solving your problem. `awk 'NR%4==0{sub(/!$/,"")}1'` is the solution to your problem. – Ed Morton Oct 15 '18 at 14:59

If you have `gawk` you can use `gawk -i inplace` to do inplace replacements. – dawg Oct 15 '18 at 15:00
Thanks @EdMorton, so I would still have to mv the file unless I have gawk to do inplace. – Cody Glickman Oct 15 '18 at 15:02
Right, you need GNU awk for inplace editing with awk just like you need GNU or BSD sed for inplace editing with sed. – Ed Morton Oct 15 '18 at 15:05

@CodyGlickman note that "inplace" in sed and awk is nothing more than a fancy word for "create a new file with a temporary name and then move it, overwriting the old file". – kvantour Oct 15 '18 at 15:29

score 3 · Answer 3 · answered Oct 15 '18 at 15:04

Perl to the rescue!

perl -lpe 'chop if 0 == $. % 4'

-p reads the input line by line and prints it after processing
-l removes a newline from the input line and adds it back to output
chop removes the last character
$. is a special perlvar that contains the input line number, % is the modulo operator

choroba

Hi @choroba, as someone who typically bemoans perl, this is very concise and readable. Also, it works great. – Cody Glickman Oct 15 '18 at 15:07

@CodyGlickman: Perl can be readable and concise. Unfortunately, many people can't use it that way. – choroba Oct 15 '18 at 15:14

Readable Perl and Ruby one line answers almost always get my +1 – dawg Oct 15 '18 at 15:15

@choroba, this answer has swayed my misconceptions about perl :) – Cody Glickman Oct 15 '18 at 15:15
And in ruby you can do: `ruby -lpe 'chop if $. % 4==0' file`. Both Ruby and Perl support in-place file editing by adding `-i` (or `-i.ext` to keep a backup copy) to this one-line command – dawg Oct 15 '18 at 15:30

Though not terrible, I'd argue that the code is still not clear, so it's brief but not concise. Show `perl -lpe 'chop if 0 == $. % 4'` to some random C programmer and ask them what it does. Now show them `awk 'NR%4==0{sub(/.$/,"")}1' file`, which is the equivalent. I'm not saying the awk script is completely clear, as it relies on `1` as shorthand for `{print}` for brevity, but I think the average programmer is far more likely to understand the awk script at a glance than the perl script. – Ed Morton Oct 15 '18 at 17:32

score 1 · Answer 4 · answered Oct 15 '18 at 14:50

Could you please try the following:

awk 'FNR%4==0{print substr($0,1,length($0)-1);next} 1' Input_file > temp_file && mv temp_file Input_file

This saves the output back into Input_file itself: it writes the result to a temporary file named temp_file and then renames/moves temp_file over your actual Input_file.
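
If the interval needs to vary, one way to sketch it is to pass the interval in with `-v`. The variable name `n` and the sample data are hypothetical:

```shell
# Hypothetical sample input: six lines, each ending in "!".
printf '%s!\n' a b c d e f > Input_file

# n is the line interval; here every 3rd line loses its last character.
awk -v n=3 'FNR % n == 0 { print substr($0, 1, length($0) - 1); next } 1' Input_file > temp_file &&
mv temp_file Input_file

cat Input_file
```

Here lines 3 and 6 are trimmed, while the other lines are printed as they are by the trailing `1`.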

RavinderSingh13

This works! I am guessing I could change the 4 after FNR% if I wanted to operate on different lines too. Thanks again! – Cody Glickman Oct 15 '18 at 14:56

@CodyGlickman, yes, `FNR%4==0` means: if the line number is evenly divisible by `4`, perform this action. You can change it as you wish; I will add a full explanation in a few minutes. – RavinderSingh13 Oct 15 '18 at 14:58

score 0 · Answer 5 · answered Oct 16 '18 at 16:11

This might work for you (GNU sed):

sed 'n;n;n;s/.$//' file

Or

sed 'N;N;N;s/.$//' file
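
A brief sketch of why both forms give the same result on input whose length is a multiple of four. `n` prints the current line and reads the next one, while `N` appends the next line to the pattern space, where `$` matches only at the very end. The output file names `out_n` and `out_N` are hypothetical:

```shell
# Hypothetical sample input: eight lines, each ending in "!".
printf '%s!\n' 1 2 3 4 5 6 7 8 > file

# n;n;n steps to the 4th line of each group, then trims it.
sed 'n;n;n;s/.$//' file > out_n

# N;N;N joins four lines in the pattern space; .$ hits the end of the 4th.
sed 'N;N;N;s/.$//' file > out_N

# The two outputs should be identical.
cmp out_n out_N && cat out_n
```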

potong
