I have an email log file like this:

2013-09-11 12:02:08  INFO: ------------------------------
2013-09-11 12:02:08  INFO: Javamail session sending email
2013-09-11 12:02:08  INFO: Session properties: 
2013-09-11 12:02:08  INFO:    com.hof.email.starttime=20130911120208
2013-09-11 12:02:08  INFO:    mail.smtp.auth=true
2013-09-11 12:02:08  INFO:    mail.smtp.connectiontimeout=60000
2013-09-11 12:02:08  INFO:    mail.smtp.host=mailserver
2013-09-11 12:02:08  INFO:    mail.smtp.port=25
2013-09-11 12:02:08  INFO:    mail.smtp.timeout=60000
2013-09-11 12:02:08  INFO:    mail.transport.protocol=smtp
2013-09-11 12:02:08  INFO: From: Support
2013-09-11 12:02:08  INFO: To: Customer
2013-09-11 12:02:08  INFO: Subject: Your Report Data
2013-09-11 12:02:08  INFO: Message ID: <id>
2013-09-11 12:02:09  INFO: Email sent successfully
2013-09-11 12:02:09  INFO: Javamail session ended
2013-09-11 12:02:09  INFO: ------------------------------

What I need to do is print this entire record if the email subject matches a particular string.

That is, when the Subject is 'Your Report Data', I'd like to print all lines from the nearest '------------------------------' line before the Subject match through the first '------------------------------' line after it, including both separator lines.

Kevin Panko
Billy
  • If the part between the lines is always the same, you can use `grep` with `-A` and `-B`. – mnagel Sep 12 '13 at 19:07
  • Is subject always line 12 of the block? – user000001 Sep 12 '13 at 19:12
  • Yes, the structure of the log entry is always the same for every entry. – Billy Sep 12 '13 at 19:18
  • As @mnagel indicates, if the format is that fixed (the `----...--` lines are the same distance always from the `Subject:` line), then the "cheesy" way to do it would be: `grep -A 4 -B 12 "INFO:.*Subject: " log.txt` in this case. – lurker Sep 12 '13 at 19:36
  • Thank you, @mnagel, this works just fine for my purposes. – Billy Sep 12 '13 at 19:45

4 Answers

If the part between the separator lines always has the same number of lines, you can use grep with -A and -B.
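
As a sketch of that suggestion, assuming the layout of the sample log (the Subject line sits 12 lines below the opening separator and 4 lines above the closing one), the command could be wrapped like this. The function name `print_fixed_records` is made up for illustration, and `-F` is added so that the subject is matched as a fixed string rather than a regular expression.

```shell
# Hypothetical helper: print_fixed_records SUBJECT FILE
# Assumes every record has exactly 12 lines before the Subject line and
# 4 lines after it, separator lines included (as in the sample log).
print_fixed_records() {
  # -F: treat the pattern as a fixed string, not a regular expression
  # -B 12 / -A 4: also print 12 lines before and 4 lines after each match
  grep -F -B 12 -A 4 -e "INFO: Subject: $1" "$2"
}
```

Usage would be `print_fixed_records "Your Report Data" log.txt`. The match is a substring match, so a longer subject that starts with the same text is printed too, and when several records match, grep may print a `--` line between non-adjacent groups of context.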

mnagel

This works only with gawk, because it relies on a multi-character RS and on the RT variable:

awk '/Subject: Your Report Data/{printf "%s%s\n",$0,RT}' RS="------------------------------" file

Edit: a more complex version that prints the correct section:

cat file
2013-09-11 12:02:08  INFO: ------------------------------
2013-09-11 12:02:08  INFO: Javamail session sending email
2013-09-11 12:02:08  INFO: Session properties:
2013-09-11 12:02:08  INFO:    com.hof.email.starttime=20130911120208
2013-09-11 12:02:08  INFO:    mail.smtp.auth=true
2013-09-11 12:02:08  INFO:    mail.smtp.connectiontimeout=60000
2013-09-11 12:02:08  INFO:    mail.smtp.host=mailserver
2013-09-11 12:02:08  INFO:    mail.smtp.port=25
2013-09-11 12:02:08  INFO:    mail.smtp.timeout=60000
2013-09-11 12:02:08  INFO:    mail.transport.protocol=smtp
2013-09-11 12:02:08  INFO: From: Support
2013-09-11 12:02:08  INFO: To: Customer
2013-09-11 12:02:08  INFO: Subject: Your Report Data
2013-09-11 12:02:08  INFO: Message ID: <id>
2013-09-11 12:02:09  INFO: Email sent successfully
2013-09-11 12:02:09  INFO: Javamail session ended
2013-09-11 12:02:09  INFO: ------------------------------
2013-09-11 12:02:08  INFO: Javamail session sending email
2013-09-11 12:02:08  INFO: Session properties:
2013-09-11 12:02:08  INFO:    com.hof.email.starttime=20130911120208
2013-09-11 12:02:08  INFO:    mail.smtp.auth=true
2013-09-11 12:02:08  INFO:    mail.smtp.connectiontimeout=60000
2013-09-11 12:02:08  INFO:    mail.smtp.host=mailserver
2013-09-11 12:02:08  INFO:    mail.smtp.port=25
2013-09-11 12:02:08  INFO:    mail.smtp.timeout=60000
2013-09-11 12:02:08  INFO:    mail.transport.protocol=smtp
2013-09-11 12:02:08  INFO: From: Support
2013-09-11 12:02:08  INFO: To: Customer
2013-09-11 12:02:08  INFO: Subject: Error
2013-09-11 12:02:08  INFO: Message ID: <id>
2013-09-11 12:02:09  INFO: Email sent successfully
2013-09-11 12:02:09  INFO: Javamail session ended
2013-09-11 12:02:09  INFO: ------------------------------

awk '/---/ {if (p) {for (j=0;j<i;j++) print a[j]};i=0;p=0;delete a;a[i++]=$0} !/---/ {a[i++]=$0} /Your/ {p=1}' file
2013-09-11 12:02:08  INFO: ------------------------------
2013-09-11 12:02:08  INFO: Javamail session sending email
2013-09-11 12:02:08  INFO: Session properties:
2013-09-11 12:02:08  INFO:    com.hof.email.starttime=20130911120208
2013-09-11 12:02:08  INFO:    mail.smtp.auth=true
2013-09-11 12:02:08  INFO:    mail.smtp.connectiontimeout=60000
2013-09-11 12:02:08  INFO:    mail.smtp.host=mailserver
2013-09-11 12:02:08  INFO:    mail.smtp.port=25
2013-09-11 12:02:08  INFO:    mail.smtp.timeout=60000
2013-09-11 12:02:08  INFO:    mail.transport.protocol=smtp
2013-09-11 12:02:08  INFO: From: Support
2013-09-11 12:02:08  INFO: To: Customer
2013-09-11 12:02:08  INFO: Subject: Your Report Data
2013-09-11 12:02:08  INFO: Message ID: <id>
2013-09-11 12:02:09  INFO: Email sent successfully
2013-09-11 12:02:09  INFO: Javamail session ended
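
The output above stops before the closing separator line, and `/Your/` matches any line containing "Your". A possible variant of the same buffering idea, as a sketch only: it also prints the closing separator and compares the subject as a literal string. The function name `print_matching_records` and the environment variable `subj` are made up for illustration, and the sketch assumes that separator lines end in "INFO: " followed only by dashes.

```shell
# Hypothetical helper: print_matching_records SUBJECT FILE
# Prints every record (opening separator through closing separator) that has
# a line containing "INFO: Subject: SUBJECT" as a literal string.
# Assumes separator lines end in "INFO: " followed only by dashes.
print_matching_records() {
  # Pass the subject through the environment so awk does not interpret
  # backslash escapes in it (as it would with -v).
  subj="$1" awk '
    BEGIN { needle = "INFO: Subject: " ENVIRON["subj"] }
    /INFO: -+ *$/ {
      # A separator closes the buffered record (if any) and opens the next.
      if (n > 1 && hit) {
        for (j = 0; j < n; j++) print buf[j]
        print                      # the closing separator
      }
      n = 0; hit = 0
      buf[n++] = $0
      next
    }
    n > 0 {
      buf[n++] = $0
      if (index($0, needle) > 0) hit = 1
    }
  ' "$2"
}
```

Usage would be `print_matching_records "Your Report Data" file`. Because each separator both closes one record and opens the next, this handles logs where records share a separator as well as logs where every record has its own pair.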
iruvar
Jotne

For fun, here's a way to solve this with GNU grep's multiline search. For details on how this works, see this great answer.

grep -ozP '(?s)(?<=--\n).*?Subject: Your Report Data.*?(?=\n\N*?--)' file
iruvar

If the number of lines per record varies, you can use this Ruby code as well:

ruby -e 'exp = Regexp.new("^[^\n]+INFO: -{30}$.*?INFO: Subject: #{Regexp.escape(ARGV.shift)}$.*?-{30}$", Regexp::MULTILINE); File.read(ARGV.shift).scan(exp).each{|e| puts e}' "Your Report Data" file

The subject is passed through Regexp.escape, so its characters are matched literally rather than interpreted as regexp metacharacters.

konsolebox