I have a file that contains a sequence already broken into lines, something like this:

CGCCCATGGGTCGTATACGTAATGGGAAAACAAAGCATGGTGTAACTATGGTAAGTGCTA
GACAATACAAGAAGGCTGATATTTGTAGAATAATTCATTTGAATTATTATGCTGTAAATA
GCTAGATTATTATGCATAATTACTTTGAGAGGTGATCAATCAATTCGACCCTTGCCAATT

I want to search for a specific pattern in this file, for example GCTGTAAATAGCTAGATTA. The problem is that the pattern may be split by a newline at an unpredictable place.

I can use:

grep -e "pattern" file 

but grep matches line by line, so it cannot see past the newline character and returns no result. How can I modify my command to ignore \n in my search?

Edit: I don't know whether my query exists in the file, and if it does, I don't know where it is.

The best solution that came to mind is

tr -d '\n' < file | grep -e "CTACCCCAGACAAACTGGTCAGATACCAACCATCAGCGAAACTAACCAAACAAA"

but I expect there are more efficient ways to do that.
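
For reference, the tr pipeline above can be wrapped in a small function. This is only a minimal sketch: `contains_wrapped` is a hypothetical name, and it assumes the pattern is a literal string (hence `grep -F`) and that the file may contain Windows line endings (hence stripping `\r` as well).

```shell
#!/bin/sh
# contains_wrapped PATTERN FILE  (hypothetical helper name)
# Succeeds (exit status 0) if PATTERN occurs in FILE once line breaks
# are ignored, and fails (non-zero status) otherwise.
contains_wrapped() {
    pattern=$1
    file=$2
    # tr -d '\n\r' removes every line break, so grep sees one long line.
    # -F: treat the pattern as a fixed string, not a regular expression.
    # -q: print nothing, only report the result through the exit status.
    tr -d '\n\r' < "$file" | grep -F -q -e "$pattern"
}

# Example call (assumes a file named "file" in the current directory):
# contains_wrapped GCTGTAAATAGCTAGATTA file && echo "found"
```

With GNU grep, replacing `-q` with `-o -b` should also print the byte offset of each match in the unwrapped sequence. Note that this approach feeds the whole sequence to grep as a single line.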

Romain Vincent
user2373198
  • This is different from that question. There, the asker knows the two words are on different lines, so he can put the escape character between them. In my case I don't know where to put \n in my query, so the answer below doesn't work either. – user2373198 Sep 23 '16 at 15:42
  • For example, look at the answer 'abc.*(\n|.)*efg' test.txt: he knows he should put the \n between abc and efg. – user2373198 Sep 23 '16 at 15:44

1 Answer

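pattern="GCTGTAAATA"$'\n'"GCTAGATTA"  # $'\n' is Bash's ANSI-C quoting for a newline
grep -e "$pattern" file

OR

pattern="GCTGTAAATA
GCTAGATTA"   # with an actual newline at the end of the first line
# Note: grep treats a newline inside a pattern as a separator between
# alternative patterns, so both forms match lines containing either half.
grep -e "$pattern" file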
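
When the position of the line break is not known in advance, a different technique from the two commands above is to scan the file line by line while carrying over the tail of the previous lines. The following is only a sketch under assumptions: `contains_wrapped_stream` is a hypothetical name, the pattern is taken as a literal string without backslashes, and awk is used in place of grep.

```shell
#!/bin/sh
# contains_wrapped_stream PATTERN FILE  (hypothetical helper name)
# Succeeds if PATTERN occurs in FILE when line breaks are ignored.
# Only the last (pattern length - 1) characters are kept between lines,
# so the whole sequence is never held in memory at once.
contains_wrapped_stream() {
    awk -v pat="$1" '
        BEGIN { keep = length(pat) - 1 }
        {
            sub(/\r$/, "")            # drop a trailing CR, if any
            buf = buf $0              # append the current line to the carried tail
            if (index(buf, pat)) {    # literal substring search
                found = 1
                exit                  # jumps to END
            }
            # Keep only the tail that could begin a match continuing on the next line.
            if (length(buf) > keep)
                buf = substr(buf, length(buf) - keep + 1)
        }
        END { exit !found }           # status 0 if found, 1 otherwise
    ' "$2"
}

# Example call (assumes a file named "file" in the current directory):
# contains_wrapped_stream GCTGTAAATAGCTAGATTA file && echo "found"
```

Keeping one character fewer than the pattern length is enough, because any match that crosses a line break must start within that tail.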
euphoria83