I have a text file with content like this:

d__Affenpinscher|c__Abyssinian|h__Kathiawari|
a__Gold|y__Slix|c__Kathiawari|c__Cact

I would like to extract all the occurrences that start with "c__" and end with "|", so that the final result is:

c__Abyssinian
c__Cact

I'm not that good with regular expressions, so thanks in advance for your help.

Edit: I'm looking for a bash command, so grep/sed/awk are available. I tried to start from a basic example like:

sed -n "/<PRE>/,/<\/PRE>/p" input.html

with <PRE> and </PRE> being the start and the end of the pattern, and adapted it to:

sed -n "/c__/,/|/p" breedList.txt > breedC.txt

But I didn't get the expected output.

Edit 2: I tried to adapt this answer from a similar thread, "How to use sed/grep to extract text between two words?", but I must be doing something wrong since my output is just empty.

Here is the command I tried:

echo "d__Affenpinscher|c__Abyssinian|h__Kathiawari|" | grep -o -P '(?<=c__).*?(?=|)'
Rei
  • What is the context of your problem? Are you using a particular programming language? Do you just want to manipulate text files in Bash? – jmrah Jun 12 '18 at 13:05
  • The `addr1,addr2` syntax in `sed` selects *lines* between the line selected by `addr1` and the line selected by `addr2`. – tripleee Jun 12 '18 at 14:08
  • Possible duplicate of [How to use sed/grep to extract text between two words?](https://stackoverflow.com/questions/13242469/how-to-use-sed-grep-to-extract-text-between-two-words) – tripleee Jun 12 '18 at 14:08
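
To illustrate tripleee's comment about `addr1,addr2`, here is a minimal sketch using the two sample lines from the question. The variable names `input` and `range_output` are only for illustration. Because the range selects whole lines, the command prints both lines unchanged instead of extracting the `c__` fields.

```shell
# The two sample lines from the question, held in a variable for the demo.
input=$(printf '%s\n' \
  'd__Affenpinscher|c__Abyssinian|h__Kathiawari|' \
  'a__Gold|y__Slix|c__Kathiawari|c__Cact')

# /c__/,/|/ is a line range: it starts at the first line containing "c__"
# and ends at the next line containing "|". Every line in the range is
# printed in full, so nothing is extracted from within a line.
range_output=$(printf '%s\n' "$input" | sed -n '/c__/,/|/p')
printf '%s\n' "$range_output"
```

With this input the range covers the entire file, which is why the attempt in the question does not isolate the wanted fields.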

1 Answer

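The answer from rkta did the trick, thanks :)

echo "d__Affenpinscher|c__Abyssinian|h__Kathiawari|" | grep -o -P '(?<=c__).*?(?=\|)'

The vertical bar | is a special character (it means alternation in a regular expression) and needs to be escaped. Unescaped, the lookahead (?=|) can match the empty string, so the lazy .*? matches nothing and grep -o prints no output.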

You say the matches start with "c__" and end with "|", but c__Cact doesn't end with |.
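
One possible way to cover a last field that has no trailing "|" is to match "c__" followed by any run of characters other than "|". This is a sketch of an alternative, not the command from rkta's answer; it uses only `grep -o` with a basic regular expression, and the variable names are illustrative.

```shell
# The two sample lines from the question.
input=$(printf '%s\n' \
  'd__Affenpinscher|c__Abyssinian|h__Kathiawari|' \
  'a__Gold|y__Slix|c__Kathiawari|c__Cact')

# -o prints only the matched text, one match per output line.
# c__[^|]* matches a literal "c__" and then everything up to, but not
# including, the next "|" or the end of the line.
matches=$(printf '%s\n' "$input" | grep -o 'c__[^|]*')
printf '%s\n' "$matches"
```

Two things to be aware of: this keeps the "c__" prefix, whereas the lookbehind version (?<=c__) leaves it out of the match, and it also prints c__Kathiawari from the second line, which the expected result in the question does not list.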
Rei