
I'm trying to extract a specific value (e.g. userAgent in this case) from a bunch of gzip-compressed (.gz) log files. Each log statement in these files looks like this:

2013-06-20;02:00:02.503 [664492205@qtp-446095113-8883]-Activity [response@12293 appId=testApp userAgent=BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3) EXEC_TM=123  FLOW=response TOKN_TM=0 GW_TM=2314.529 http.status=200 id=029dde45-802c-462a-902b-138bc5490fba offeringId=iPad httpUrl= test.com AUD_TM=0 ipAddress=10.10.10.10 ]

2013-06-20;02:00:02.504 [664492205@qtp-446095113-8883]-Activity [response@12293 appId=testApp userAgent=FNetwork/609.1.4 Darwin/13.0.0 id=029dde45-802c-462a-902b-138bc5490fba EXEC_TM=123  FLOW=response TOKN_TM=0 GW_TM=2314.529 http.status=200  offeringId=iPad httpUrl= test.com AUD_TM=0 ipAddress=10.10.10.10 ]

In this case, I want to extract the userAgent field and display the result in one of the formats below:

userAgent=BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3)
userAgent=FNetwork/609.1.4 Darwin/13.0.0

and so on,

or print just the values, such as:

BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3)
FNetwork/609.1.4 Darwin/13.0.0

EDIT: Just to add more info, these space-separated fields (key1=value1 key2=value2) can appear in any order.

Appreciate the help. Thanks!

codehammer

2 Answers


Using zcat + sed:

zcat input.gz | sed -n 's/.*\(userAgent=[^=]*\) [^ =][^ =]*=.*/\1/p'

It can also be a little shorter with GNU sed, which supports `\+`:

zcat input.gz | sed -n 's/.*\(userAgent=[^=]*\) [^ =]\+=.*/\1/p'

and a grep, sed combo:

zcat input.gz | grep -o 'userAgent=[^=]*' | sed 's/ [^ ]*$//'

and the zcat and grep steps can be combined into a single zgrep (thanks lhf):

zgrep -o 'userAgent=[^=]*' input.gz | sed 's/ [^ ]*$//'
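To print just the values (the second format asked for in the question), the capture group can be moved past the `userAgent=` prefix. The following is only a sketch: the function name `user_agent_values` is made up for illustration, and it assumes that the value never contains `=` and that at least one more `key=` pair follows userAgent on the line.

```shell
# Hypothetical helper: reads log lines on stdin and prints only the
# userAgent value. The value ends at the last space before the next
# "key=" token, so it works whichever key follows userAgent.
# Assumption: some key=value pair follows userAgent on the same line;
# lines where that is not the case print nothing.
user_agent_values() {
  sed -n 's/.*userAgent=\([^=]*\) [^ =][^ =]*=.*/\1/p'
}

# Possible usage over several compressed logs (file names are placeholders):
#   gzip -dc ./*.gz | user_agent_values
```

`gzip -dc` is used in the usage comment because `zcat` on some systems only handles `.Z` files; where `zcat` reads gzip files, it can be used the same way.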
perreal
  • +1 though will break if `EXEC_TM` doesn't follow `userAgent` in the logs. Doesn't look like it will be a concern based on OP's sample data. – jaypal singh Jul 18 '13 at 02:42
  • Thanks perreal for the quick reply. Actually the fields can appear in any order so this will break if there is no *EXEC fields following userAgent field. I edited the example now to be more clear. Thanks anyways for the input. The solution proposed by JS웃 works perfect. – codehammer Jul 18 '13 at 03:11
  • You can use `zgrep` in the third version. – lhf Jul 18 '13 at 04:21

Since you mentioned that key=value pairs can appear in any order, here is one way of doing it with awk.

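zcat input.gz | awk -F= '
{
  # Splitting on "=" leaves each key at the end of one field
  # and its value at the start of the next field.
  for(i=1;i<NF;i++) {
    if($i~/userAgent$/) {
      # The value field ends with the name of the next key; strip it
      # together with the space(s) in front of it.
      sub(/ +[^ ]+$/,"",$(i+1))
      print "userAgent="$(i+1)
    }
  }
}'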
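A variant of the same idea that prints only the value, and also copes with userAgent being the last pair on the line, might look like the sketch below. The function name `user_agent_awk` is hypothetical, and the code assumes that values never contain `=` and that a line ends with ` ]` as in the sample logs.

```shell
# Hypothetical helper: reads log lines on stdin and prints the userAgent
# value only. Fields are split on "=", so a key sits at the end of field i
# and its value at the start of field i+1.
user_agent_awk() {
  awk -F= '{
    for (i = 1; i < NF; i++) {
      if ($i ~ /(^| )userAgent$/) {
        v = $(i + 1)
        if (i + 1 < NF)
          sub(/ +[^ ]+$/, "", v)   # drop the name of the next key
        else
          sub(/ *\]? *$/, "", v)   # last pair: drop the closing " ]"
        print v
      }
    }
  }'
}

# Possible usage (file name is a placeholder):
#   gzip -dc input.gz | user_agent_awk
```

Copying the field into `v` before editing it avoids modifying `$(i+1)`, which would make awk rebuild `$0`.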
jaypal singh