grep: invalid repetition count(s) when using while loop

Question

I'm using the macOS command line and have two files:

File A.txt

A
B
F

File B.txt

>A
abcde
>B
efghi
>C
jklmn
>D
opqrs
>E
tuvwx
>F
yz123

I want to loop over the lines of A.txt and print only the corresponding headers and their content from B.txt:

>A
abcde
>B
efghi
>F
yz123

This command works when I run it for each line of A.txt individually: `grep -n "\A\,\>\{x;p;}" B.txt`

But when I do this: `while read i; do grep -n "\$i\,\>\{x;p;}" B.txt >> newfile.txt; done < A.txt`

I get this error: grep: invalid repetition count(s)

What am I doing wrong?

@tripleee Let's start a project for convenient handling of FASTA files. This format seems to be widely used with no tools other than grep, sed, awk. That's actually scary (to me), given the purpose it is used for :) – hek2mgl, Mar 04 '19 at 12:31
Is that really `grep`? `grep -n "\A\,\>\{x;p;}" B.txt` looks like it should be `sed -n "\A\,\>\{x;p;}" B.txt` – William Pursell, Mar 04 '19 at 12:37
and @hek2mgl - it would be very helpful. Looking for answers is only as good as my Google search. Thank you both very much for your help. – cms72, Mar 04 '19 at 12:38
@hek2mgl The problem is not lack of tools, they have BioPerl, BioPython etc ... it's just that biotech people are often just learning basic U*x. – tripleee, Mar 04 '19 at 12:38
@cms72 Please accept the duplicate nomination so that this question no longer comes up as unresolved. – tripleee, Mar 04 '19 at 12:39
@William Pursell - for some reason, sed works fine on Linux, but not on macOS. Thank you for the suggestion though! – cms72, Mar 04 '19 at 12:41
@tripleee Accepted the duplicate! And yes, I use BioPerl and BioPython - very handy. But I do the simple data manipulation on the command line. I still google everything when I'm stuck on some code - it's just a matter of knowing what to google. It would be nice to have a database for handling FASTA files... so much easier to search. Thanks again! – cms72, Mar 04 '19 at 12:46
The `sed` dialect on macOS is slightly different. It can generally speaking do the same things as Linux `sed`, with a few rather exotic exceptions; but you have to know the differences if you want to translate from one dialect to another. – tripleee, Mar 04 '19 at 12:48

score 1 · Accepted Answer · answered Mar 04 '19 at 12:01

With grep you could use:

/bin/grep -A1 -Ff fileA fileB 
>A
abcde
>B
efghi
--               <--- produces separators
>F
yz123
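
One caveat with the grep approach: `-F -f fileA` matches the IDs anywhere on a line, so a sequence line that happens to contain an uppercase `A`, `B` or `F` would be selected too. A minimal sketch that matches whole header lines instead, assuming the file names `A.txt` and `B.txt` from the question; `patterns.txt` and the scratch directory are hypothetical names used only for illustration:

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample data from the question.
printf '%s\n' A B F > A.txt
printf '%s\n' '>A' abcde '>B' efghi '>C' jklmn '>D' opqrs '>E' tuvwx '>F' yz123 > B.txt

# Turn each ID into the full header line it should match (A becomes >A).
sed 's/^/>/' A.txt > patterns.txt

# -x: match whole lines only, -F: fixed strings, -f: read patterns from a file,
# -A1: also print the line following each match.
# The second grep drops the "--" group separators.
grep -x -F -f patterns.txt -A1 B.txt | grep -v '^--$' > newfile.txt

cat newfile.txt
```

On the sample data this should leave only the `>A`, `>B` and `>F` records in `newfile.txt`, without the `--` lines. Like `-A1` in the command above, it still assumes exactly one content line per header.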

Alternatively with awk:

awk 'NR==FNR{a[$0];next}{sub(/^>/,"")} $0 in a {print ">"$0;p=1;next} p{print;p=0}' fileA fileB 
>A
abcde
>B
efghi
>F
yz123

answered Mar 04 '19 at 12:01

hek2mgl

Thanks hek2mgl! Is there a way to modify it to handle content that spans more than one line? For example `>A`, then `abcdefghi`, then `jklmopqrs`, then `>B .....`, where `jklmopqrs` is on a new line. – cms72 Mar 04 '19 at 12:29
How many lines? – hek2mgl Mar 04 '19 at 12:32
It varies. Some would have a pattern of 10 lines, some just 5. – cms72 Mar 04 '19 at 12:34
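
For the follow-up about records whose content spans a varying number of lines, one possible approach is to let awk set a flag at every header line and print for as long as the flag is on. This is a sketch rather than the answerer's code: it assumes each header is exactly `>` followed by the ID (no description after it) and that `A.txt` is not empty, and the multi-line sample data below is made up for illustration.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample data with records of different lengths.
printf '%s\n' A B F > A.txt
printf '%s\n' '>A' abcdefghi jklmopqrs '>B' efghi '>C' jklmn '>F' yz123 456 > B.txt

awk '
  # First file: remember every wanted ID.
  NR == FNR { want[$0]; next }
  # Header line in the second file: keep the record only if its ID is wanted.
  /^>/ { keep = (substr($0, 2) in want) }
  # Print the header and all following lines while keep is set.
  keep
' A.txt B.txt > newfile.txt

cat newfile.txt
```

Because the flag only changes on lines starting with `>`, the number of content lines per record does not matter.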
