
I would like to extract text that falls between two | signs in a file with multiple lines. For instance, I want to extract P16 from sp|P16|SM2. I have found a possible answer here. However, I cannot apply the answer to my case. I am using the following:

sed -n '/|/,/|/ p' filename

or this by escaping the | sign:

sed -n '/\|/,/\|/ p' filename

But the result is every line of the file, unchanged, even though I am using -n to suppress automatic printing of the pattern space. Any ideas what I am missing?

[EDIT]:

I can get the desired result using the following. However, I would like an explanation of why the commands above do not work:

sed 's/^sp|//' filename | sed 's/|.*//'
Dataman
  • Possible duplicate of [How to use sed/grep to extract text between two words?](http://stackoverflow.com/questions/13242469/how-to-use-sed-grep-to-extract-text-between-two-words) – Benjamin W. Apr 04 '16 at 13:33
  • @BenjaminW. You can see that I had already included that exact link in my question, saying that the answer is there... – Dataman Apr 04 '16 at 13:35
  • Yes, but the way you tried it was using the wrong approach of the question itself rather than the correct top answer. – Benjamin W. Apr 04 '16 at 13:36
  • @BenjaminW. I see! Well, I was not able to use the top answer to solve my problem. Do you suggest that I should delete this question? – Dataman Apr 04 '16 at 13:46
  • No no, I'm not suggesting you should delete it. Arguably, the questions aren't exactly the same as the other question has the delimiters at the beginning and end of the line. – Benjamin W. Apr 04 '16 at 13:50

2 Answers

The tool for this task is cut:

$ echo "sp|P16|SM2" | cut -d'|' -f2
P16
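
Since the question is about a file with multiple lines, here is a minimal sketch of the same cut call applied to several lines at once. Only the first input line comes from the question; the other two are made-up sample data.

```shell
# Hypothetical sample input; only the first line appears in the question.
input='sp|P16|SM2
sp|Q42|AB1
tr|A0A1|XY9'

# -d sets the field delimiter, -f2 selects the second field of every line.
result=$(printf '%s\n' "$input" | cut -d'|' -f2)
printf '%s\n' "$result"
```

Note that cut passes lines without any delimiter through unchanged; adding -s suppresses them.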
karakfa

awk is a better choice for column-based data:

awk -F'|' '{print $2}' filename

This prints P16 for the line sp|P16|SM2.
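
As a rough sketch of how this behaves on mixed input, the variant below adds a guard on the number of fields so that lines without two | signs are skipped. The guard and the sample lines (other than sp|P16|SM2) are assumptions for illustration, not part of the question.

```shell
# Hypothetical sample input; the third line has no delimiter at all.
input='sp|P16|SM2
sp|Q42|AB1
no delimiters here'

# -F'|' splits each line on a literal |.
# NF >= 3 keeps only lines that have at least two delimiters.
result=$(printf '%s\n' "$input" | awk -F'|' 'NF >= 3 { print $2 }')
printf '%s\n' "$result"
```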

sed one-liner:

The following sed one-liner leaves only the second column:

kent$  echo "sp|P16|SM2"|sed 's/[^|]*|//;s/|[^|]*//' 
P16

Or using grouping:

kent$  echo "sp|P16|SM2"|sed 's/.*|\([^|]*\)|.*/\1/'     
P16
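
Both one-liners above assume exactly three fields per line. With a four-field line such as a|b|c|d, the first leaves b|d and the grouping version returns c, because .* is greedy. If the input may have more fields, an anchored variant along these lines might be safer; extract_second is a hypothetical helper name used only for this sketch.

```shell
# extract_second is a hypothetical helper, not part of the original answer.
extract_second() {
    # ^[^|]*|    : the first field and the delimiter after it
    # \([^|]*\)  : capture the second field
    # |.*        : the next delimiter and the rest of the line
    # -n with the p flag prints only lines where the substitution matched.
    sed -n 's/^[^|]*|\([^|]*\)|.*/\1/p'
}

three=$(echo 'sp|P16|SM2' | extract_second)
four=$(echo 'a|b|c|d' | extract_second)
printf '%s\n' "$three" "$four"
```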

A short explanation of why your two commands didn't work:

1) sed -n '/|/,/|/ p' filename

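This prints every line in a range that starts at a line containing | and ends at the next line containing |. Since every line in your file contains |, every line is printed.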

2) sed -n '/\|/,/\|/ p' filename

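sed uses basic regular expressions (BRE) by default, where an unescaped | is a literal character. Escaping it as \| gives it a special meaning in GNU sed: alternation (logical OR). /\|/ then means "empty pattern or empty pattern", which matches every line. Either way, the /pat1/,/pat2/ address is the wrong tool for your case: it selects a range of lines, not text within a line.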
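
To see that the range address works on whole lines, here is a small sketch with made-up input in which only some lines contain |. The range opens at the first line containing | and closes at the next one, and everything in between is printed, including a line without any |.

```shell
# Hypothetical sample input: only lines 2 and 4 contain |.
input='header
sp|P16|SM2
middle
tr|A0A1|XY9
footer'

# The range starts at line 2 and ends at line 4, so lines 2-4 are printed whole.
result=$(printf '%s\n' "$input" | sed -n '/|/,/|/p')
printf '%s\n' "$result"
```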

Kent