1

I'm trying to extract data/URLs (in this case, someurl) from a file in which they are enclosed by some tag, e.g.:

xyz>someurl>xyz

I don't mind using either awk or sed.

fedorqui
L P
  • Likely same as: http://stackoverflow.com/questions/13386080/extract-text-between-two-strings-repeatedly-awk-sed, although the example there is ugly. – Ciro Santilli OurBigBook.com Jul 07 '15 at 08:41
  • Possible duplicate of [How to use sed/grep to extract text between two words?](https://stackoverflow.com/questions/13242469/how-to-use-sed-grep-to-extract-text-between-two-words) – tripleee Aug 22 '18 at 09:48

3 Answers

9

I think the simplest way is with cut:

$ echo "xyz>someurl>xyz" | cut -d'>' -f2
someurl
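Since the question mentions a file, it may help to see how cut treats lines that contain no `>` at all. The following is a minimal sketch, not part of the original answer: the helper name `second_field` and the sample input are made up for illustration. By default cut passes undelimited lines through unchanged; the POSIX `-s` option suppresses them.

```shell
# Hypothetical helper: print the 2nd '>'-separated field of each input line.
# -s suppresses lines that contain no delimiter at all.
second_field() {
  cut -s -d'>' -f2
}

# Made-up sample input: one tagged line and one line without any '>'.
printf '%s\n' 'xyz>someurl>xyz' 'a line without the delimiter' | second_field
```

To run it against a real file, redirect the file into the function, e.g. `second_field < file`, where `file` stands for whatever input you have.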

With awk it can be done like this:

$ echo "xyz>someurl>xyz" | awk  'BEGIN { FS = ">" } ; { print $2 }'
someurl

With sed it is a little trickier:

$ echo "xyz>someurl>xyz" | sed 's/\(.*\)>\(.*\)>\(.*\)/\2/g'
someurl

The pattern captures three groups, something1>something2>something3, and prints the second one.
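One caveat with the sed pattern is that `.*` is greedy, so on a line with more than two `>` characters the second group ends up being the next-to-last field rather than the second one (for `a>b>c>d` it yields `c`). A possible variant, sketched here under the assumption that the wanted value is always the second field, uses `[^>]*` so that no group can cross a delimiter. The function name `second_field_sed` is hypothetical.

```shell
# Hypothetical helper: print the 2nd '>'-separated field using sed.
# [^>]* cannot cross a '>', so the capture is always the second field.
# -n plus the p flag prints only lines where the substitution matched.
second_field_sed() {
  sed -n 's/^[^>]*>\([^>]*\)>.*/\1/p'
}

# Made-up input with more than two delimiters; this prints "b".
printf '%s\n' 'a>b>c>d' | second_field_sed
```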

fedorqui
0

grep was born to extract things:

kent$  echo "xyz>someurl>xyz"|grep -Po '>\K[^>]*(?=>)'
someurl

You could kill a fly with a bomb, of course:

kent$  echo "xyz>someurl>xyz"|awk -F\> '$0=$2'
someurl
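A note on the awk shorthand: `'$0=$2'` uses the assignment itself as the pattern, so a line is printed only when the assigned value is "true". If the second field is empty or is the number 0, nothing is printed. A more explicit form, sketched below with a hypothetical function name and made-up input, prints the field regardless of its value.

```shell
# Hypothetical helper: always print the 2nd '>'-separated field,
# even when it is empty or zero.
print_second_field() {
  awk -F'>' '{ print $2 }'
}

# Made-up input whose second field is 0; this prints "0",
# whereas awk -F'>' '$0=$2' would print nothing for this line.
printf '%s\n' 'xyz>0>xyz' | print_second_field
```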
Kent
0

If your grep supports the -P option (Perl-compatible regular expressions), you can use lookbehind and lookahead assertions to identify the URL.

$ echo "xyz>someurl>xyz" | grep -oP '(?<=xyz>).*(?=>xyz)'
someurl

This is just a sample to get you started, not the final answer.
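As one possible next step, here is a sketch for lines that contain several tagged values. It assumes GNU grep built with PCRE support, and the function name `extract_between_xyz` and the sample line are invented for illustration. With the greedy `.*`, a line holding two tags would produce one long match spanning both; the lazy `.*?` stops at the first closing `>xyz`, so each value is printed on its own line.

```shell
# Hypothetical helper: print every value found between "xyz>" and ">xyz".
# Requires a grep that supports -P (assumed: GNU grep with PCRE).
# .*? is lazy, so each match ends at the nearest ">xyz".
extract_between_xyz() {
  grep -oP '(?<=xyz>).*?(?=>xyz)'
}

# Made-up input with two tagged values on a single line.
printf '%s\n' 'xyz>url1>xyz xyz>url2>xyz' | extract_between_xyz
```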

jaypal singh