Using awk for if statement and split

Question

I have a system.log file that looks like this:

[2019-12-20 09:06:40] main.INFO: Update Product Attributes [] []
[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []
[2019-12-20 09:19:32] main.INFO: Update Product Attributes [] []
[2019-12-20 09:20:34] main.INFO: Customer Id: . Param: {"store":101,"search":"ea"} [] []
[2019-12-20 09:23:29] main.INFO: Customer Id: . Param: {"store":101,"search":"C2"} [] []
[2019-12-20 09:23:31] main.INFO: Update Product Attributes [] []
[2019-12-20 09:23:43] main.INFO: Customer Id: . Param: {"store":101,"search":"spaghetti"} [] []
[2019-12-20 09:24:06] main.INFO: Customer Id: . Param: {"store":101,"search":"Ea"} [] []

I want to extract the date and the value of search from each matching line, like this:

2019-12-20 "soap"
2019-12-20 "ea"
2019-12-20 "C2"
2019-12-20 "spaghetti"
2019-12-20 "Ea"

So far I've tried this:

awk -F '] main.INFO: Customer Id: . Param: {"store"' '{ if ( $2 ~ /search/ ) { print $1 $2} }' system.log

but it returns output like this and does not split the line any further:

[2019-12-20 10:08:04:101,"search":"ea"} [] []
[2019-12-20 10:08:35:101,"search":"ea"} [] []

RavinderSingh13 · Accepted Answer · 2020-07-02T07:12:13.550

Could you please try the following, written and tested with the shown samples in GNU awk.

awk -v s1="\"" '
/Customer Id/{
  match($0,/Param: {.*}/)
  val=substr($0,RSTART,RLENGTH)
  gsub(/.*:"|"}$/,"",val)
  sub(/\[/,"",$1)
  print $1,s1 val s1
  val=""
}'  Input_file

Explanation: a detailed explanation of the above.

awk -v s1="\"" '                     ##Start the awk program and set variable s1 to a double quote.
/Customer Id/{                       ##If the current line contains the string Customer Id, do the following.
  match($0,/Param: {.*}/)            ##Use match to find the text from Param: { through the last }.
  val=substr($0,RSTART,RLENGTH)      ##Set val to the matched text: RLENGTH characters starting at position RSTART.
  gsub(/.*:"|"}$/,"",val)            ##In val, remove everything up to the last :" and the trailing "}.
  sub(/\[/,"",$1)                    ##Remove the [ from the first field.
  print $1,s1 val s1                 ##Print the first field, then val wrapped in s1 quotes, as in the expected output.
  val=""                             ##Reset val.
}' Input_file                        ##Input_file is the name of the file to read.
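
To see what match() leaves behind in RSTART and RLENGTH, here is a minimal sketch run on one sample line from the question. The helper name show_match is made up for illustration, and the braces are written as [{] and [}] on the assumption that this is the safest spelling across different awk implementations.

```shell
# show_match is a hypothetical helper name, not part of the answer above.
show_match() {
    awk '
    /Customer Id/ {
        # match() returns 0 when nothing matches; otherwise it sets RSTART and RLENGTH.
        if (match($0, /Param: [{].*[}]/)) {
            print "RSTART=" RSTART, "RLENGTH=" RLENGTH
            # substr() takes a start position and a length, not an end position.
            print substr($0, RSTART, RLENGTH)
        }
    }'
}

printf '%s\n' '[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []' | show_match
```

The second line of output is the text that the gsub call then trims down to the bare search value.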

2nd solution: one more approach.

awk -v s1="\"" '
/Customer Id:/{
  match($0,/\[[0-9]{4}-[0-9]{2}-[0-9]{2}/)
  dat=substr($0,RSTART+1,RLENGTH-1)
  match($0,/Param: {.*}/)
  val=substr($0,RSTART,RLENGTH)
  gsub(/.*:"|"}$/,"",val)
  print dat,s1 val s1
  dat=val=""
}
'  Input_file

Explanation: a detailed explanation of the above.

awk -v s1="\"" '                                     ##Start the awk program and set s1 to a double quote.
/Customer Id:/{                                      ##Process only lines that contain the string Customer Id:.
  match($0,/\[[0-9]{4}-[0-9]{2}-[0-9]{2}/)           ##Use match to find the [ followed by a date of the form YYYY-MM-DD.
  dat=substr($0,RSTART+1,RLENGTH-1)                  ##Set dat to the matched text without its leading [.
  match($0,/Param: {.*}/)                            ##Use match to find the text from Param: { through the last }.
  val=substr($0,RSTART,RLENGTH)                      ##Set val to the text found by the previous match.
  gsub(/.*:"|"}$/,"",val)                            ##In val, remove everything up to the last :" and the trailing "}.
  print dat,s1 val s1                                ##Print dat, then val wrapped in s1 quotes.
  dat=val=""                                         ##Reset dat and val so values do not carry over to the next line.
}
' Input_file                                         ##Input_file is the name of the file to read.
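
The RSTART+1 and RLENGTH-1 offsets are what drop the leading [ from the date. A minimal sketch of just that step follows; get_date is a hypothetical helper name, and the digits are spelled out instead of using {4} and {2} on the assumption that some older awk builds do not support interval expressions.

```shell
# get_date is a hypothetical helper name used only for this sketch.
get_date() {
    awk '
    /Customer Id:/ {
        # The match covers "[" plus the 10-character date, so RLENGTH is 11.
        if (match($0, /\[[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/))
            # Start one character later and take one character fewer to drop "[".
            print substr($0, RSTART + 1, RLENGTH - 1)
    }'
}

printf '%s\n' '[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []' | get_date
```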

You do great work in explaining your commands. Helps a lot when someone is out of touch or new. — P...., Aug 27 '20 at 16:19

James Brown · Answer 2 · 2020-07-02T07:56:41.770

Keeping it simple:

$ awk '
match($(NF-2),/\"[^"]*\"\}/) {
    print substr($1,2),substr($(NF-2),RSTART,RLENGTH-1)
}' file

Output:

2019-12-20 "soap"
2019-12-20 "ea"
2019-12-20 "C2"
2019-12-20 "spaghetti"
2019-12-20 "Ea"

Explained:

If the antepenultimate space-separated string has a substring "..."}, print the first space-separated string starting from the second character (excluding the first character [) and the above-mentioned substring excluding the last character }.
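
As a quick check of which field $(NF-2) refers to, this sketch prints the field count and that field for each input line. The helper name show_field is hypothetical. With default whitespace splitting, a search term containing a space (a made-up example would be "olive oil") would spill into two fields, so it is worth checking that assumption against real data.

```shell
# show_field is a hypothetical helper name used only for this sketch.
show_field() {
    # NF is the number of whitespace-separated fields; $(NF-2) is the third from last.
    awk '{ print NF ": " $(NF-2) }'
}

printf '%s\n' \
    '[2019-12-20 09:06:40] main.INFO: Update Product Attributes [] []' \
    '[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []' |
    show_field
```

Only the second line has a "..."} substring in that field, which is why the match() condition skips the first.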

anubhava · Answer 3 · 2020-07-02T08:02:24.693

You may use this GNU awk solution with FPAT:

awk -v FPAT='\\[[^]]+]|{[^}]+}' '
/main\.INFO: / && $2 ~ /"search":/ {
    gsub(/^\[| .*$/, "", $1)
    gsub(/^.*:|}$/, "", $2)
    print $1, $2 
}' file

2019-12-20 "soap"
2019-12-20 "ea"
2019-12-20 "C2"
2019-12-20 "spaghetti"
2019-12-20 "Ea"
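
FPAT describes what a field looks like rather than what separates fields, and it is specific to GNU awk. For other awks, the same idea can be approximated with a match() loop. The following is only a sketch under that assumption: fields_by_pattern is a hypothetical helper name, and it does not reproduce the main.INFO check from the command above.

```shell
# fields_by_pattern is a hypothetical helper name; the loop imitates FPAT with match().
fields_by_pattern() {
    awk '
    {
        rest = $0; n = 0
        # Collect every non-empty [...] or {...} chunk, left to right.
        while (match(rest, /\[[^]]+\]|[{][^}]+[}]/)) {
            f[++n] = substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
        }
        if (n >= 2 && f[2] ~ /"search":/) {
            date = substr(f[1], 2, 10)   # skip "[" and keep YYYY-MM-DD
            val = f[2]
            sub(/^.*:/, "", val)         # drop everything up to the last colon
            sub(/[}]$/, "", val)         # drop the closing brace
            print date, val
        }
    }'
}

printf '%s\n' \
    '[2019-12-20 09:06:40] main.INFO: Update Product Attributes [] []' \
    '[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []' |
    fields_by_pattern
```

The empty [] pairs at the end of each line never become fields because the pattern requires at least one character between the brackets.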

edited Jul 02 '20 at 08:02

answered Jul 02 '20 at 06:49

anubhava

score 1 · Answer 4 · answered Jul 02 '20 at 07:09

Just use perl like in https://stackoverflow.com/a/2957781/1921546.

perl -n -e '/^\[([^ ]*).*search":"((?:[^"]|\\.)*)"/ && print "$1 $2\n"'

Explanation of Regular Expression used at https://regexr.com/57nhk

answered Jul 02 '20 at 07:09

pii_ke

score 0 · Answer 5 · answered Jul 02 '20 at 07:02

With sed

$ sed -nE 's/^\[([^ ]+).*"search":("[^"]+").*/\1 \2/p' ip.txt
2019-12-20 "soap"
2019-12-20 "ea"
2019-12-20 "C2"
2019-12-20 "spaghetti"
2019-12-20 "Ea"

-n turn off auto print
-E enable ERE
^\[ match the starting [
([^ ]+) capture the date
.*"search": match till "search":
("[^"]+") capture the value of search
.* rest of the line
\1 \2 text matched by the capture groups separated by space
p print only if substitution succeeds
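
The interplay of -n and p is easiest to see by running the same substitution with and without them. This is a small sketch using two sample lines from the question; the variable name sample is only for illustration.

```shell
# Two sample lines from the question: one without and one with a search term.
sample='[2019-12-20 09:06:40] main.INFO: Update Product Attributes [] []
[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []'

# Without -n and p, sed prints every line, whether or not it was changed.
printf '%s\n' "$sample" | sed -E 's/^\[([^ ]+).*"search":("[^"]+").*/\1 \2/'

# With -n and p, only lines where the substitution succeeded are printed.
printf '%s\n' "$sample" | sed -nE 's/^\[([^ ]+).*"search":("[^"]+").*/\1 \2/p'
```

The first command passes the Update Product Attributes line through unchanged, while the second drops it.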

score 0 · Answer 6 · answered Feb 03 '21 at 17:21

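I think it can be simplified to the following, which should work in gawk, mawk, or mawk2:

awk 'BEGIN { FS = "([}]|search\"[:])"; OFS = " " }
# index($1, OFS) is the position of the first space, so the date is index-2 characters long from position 2.
NF > 1 { print substr($1, 2, index($1, OFS) - 2), $2 }' system.log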
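
To see how this field separator carves up a line, the sketch below prints each field between angle brackets. The helper name show_fs_fields is hypothetical; the FS value is the one from the command above.

```shell
# show_fs_fields is a hypothetical helper name used only for this sketch.
show_fs_fields() {
    awk 'BEGIN { FS = "([}]|search\"[:])" }
    # Lines without a closing brace or a search key stay as a single field and are skipped.
    NF > 1 { for (i = 1; i <= NF; i++) print i ": <" $i ">" }'
}

printf '%s\n' '[2019-12-20 09:18:56] main.INFO: Customer Id: . Param: {"store":101,"search":"soap"} [] []' | show_fs_fields
```

The quoted search value ends up alone in the second field, and the first field still begins with [ and the date.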

answered Feb 03 '21 at 17:21

RARE Kpop Manifesto
