I have a large log file that I need to sort, and I want to extract the text between parentheses. The format is something like this:

<@44541545451865156> (example#6144) has left the server!

How would I go about extracting "example#6144"?

mustaccio
Google
  • I disagree with the close vote. People ask questions like this here often enough. These questions are easy to answer and, more importantly, lend themselves to answers that lead OP to learn something he didn't know before. – thb Jan 31 '19 at 01:02
  • (When I commented earlier, the reason given to close was *off topic,* with which I disagreed. *Duplicate* is a better reason.) – thb Jan 31 '19 at 11:20

3 Answers

This sed should work here:

sed -E -n 's/.*\((.*)\).*$/\1/p' file_name
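
As a quick check, here is a minimal sketch that runs the command on the sample line from the question. The second input line, which has no parentheses, is made up for illustration, as are the variable names `input` and `result`.

```shell
# Sample input: the line from the question, plus a hypothetical line without parentheses.
input='<@44541545451865156> (example#6144) has left the server!
some line without parentheses'

# -n: do not print lines by default; the p flag prints a line only
# when the substitution actually matched.
result=$(printf '%s\n' "$input" | sed -E -n 's/.*\((.*)\).*$/\1/p')
printf '%s\n' "$result"
```

Only the captured text from the first line should be printed; the line without parentheses is dropped.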

User123

There are many ways to skin this cat.

Assuming you always have only one lexeme in parentheses, you can use bash parameter expansion:

while IFS= read -r t; do t=${t#*(}; echo "${t%)*}"; done <logfile

The first expansion, ${t#*(}, cuts off everything up to and including the left parenthesis, leaving you with example#6144) has left the server!; the second one, ${t%)*}, cuts off the right parenthesis and everything after it.
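
The two steps can be sketched one at a time on the sample line. The variable names `line`, `rest`, and `name` are illustrative, and the parentheses are backslash-escaped here only as a precaution (for example, if bash's `extglob` option is on); the patterns are otherwise the same as above.

```shell
line='<@44541545451865156> (example#6144) has left the server!'

# '#' removes the shortest matching prefix: everything up to and including the first "(".
rest=${line#*\(}
printf '%s\n' "$rest"

# '%' removes the shortest matching suffix: the last ")" and everything after it.
name=${rest%\)*}
printf '%s\n' "$name"
```

Quoting the expansions in `printf` keeps the shell from word-splitting or glob-expanding the extracted text.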

Alternatively, you can also use awk:

awk -F'[)(]' '{print $2}' logfile

-F'[)(]' tells awk to use either parenthesis as the field delimiter, so it splits the input string into three tokens: <@44541545451865156>, example#6144, and has left the server!; then {print $2} instructs it to print the second token.
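
To make the splitting visible, this small sketch prints every field in brackets before doing the extraction itself. The loop is only for illustration, and `line` and `result` are made-up names.

```shell
line='<@44541545451865156> (example#6144) has left the server!'

# Show each field in brackets so the surrounding spaces are visible too.
printf '%s\n' "$line" |
  awk -F'[)(]' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'

# The extraction itself: print only the second field.
result=$(printf '%s\n' "$line" | awk -F'[)(]' '{ print $2 }')
printf '%s\n' "$result"
```

Note that the first and third fields keep the spaces next to the parentheses; only the second field is free of them.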

cut would also do:

cut -d'(' -f 2 logfile | cut -d')' -f 1
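
The pipeline can be broken into its two stages, sketched here on the sample line, assuming a single pair of parentheses per line. `stage1` and `result` are illustrative names.

```shell
line='<@44541545451865156> (example#6144) has left the server!'

# Stage 1: split on "(" and keep field 2, the text after the first "(".
stage1=$(printf '%s\n' "$line" | cut -d'(' -f 2)
printf '%s\n' "$stage1"

# Stage 2: split on ")" and keep field 1, the text before the first ")".
result=$(printf '%s\n' "$stage1" | cut -d')' -f 1)
printf '%s\n' "$result"
```

One caveat: cut passes lines that do not contain the delimiter through unchanged unless you add the -s option.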
mustaccio
  • You need to quote the thing you `echo`, otherwise the shell will perform another round of whitespace tokenization and wildcard expansion on the value. See also https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable – tripleee Jan 31 '19 at 11:23

Try this:

sed -e 's/^.*(\([^()]*\)).*$/\1/' <logfile

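The /^.*(\([^()]*\)).*$/ part is a regular expression, or regex: [^()]* matches a run of characters that are not parentheses, and the surrounding \( \) captures that run so that \1 can refer to it in the replacement. Regexes are hard to read until you get used to them, but they are most useful for extracting text by pattern, as you are doing here.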
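
As written, the command prints lines without parentheses unchanged. If that is not wanted, one possible variant adds -n and the p flag, sketched here on the sample line plus a made-up line without parentheses (`input` and `result` are illustrative names):

```shell
# Sample input: the line from the question, plus a hypothetical line without parentheses.
input='<@44541545451865156> (example#6144) has left the server!
some line without parentheses'

# Same basic-regex pattern as above; -n suppresses the default output and
# the p flag prints only lines where the substitution succeeded.
result=$(printf '%s\n' "$input" | sed -n 's/^.*(\([^()]*\)).*$/\1/p')
printf '%s\n' "$result"
```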

thb
  • This will print in full any lines which do not contain the pattern. – tripleee Jan 31 '19 at 04:54
  • @tripleee: true enough. Well, it looks as though other answerers have this covered, so I'll just upvote theirs. Thanks. – thb Jan 31 '19 at 11:16