11

I am analyzing logs that contain lines like the following:

y1e","email":"","money":"100","coi

I want to extract the value of money. I tried awk like this:

grep pay action.log | awk '/"money":"([0-9]+)"/'

How can I then get the value matched by the sub-expression ([0-9]+)?

RoyHu
  • To clarify, you want the numeric value after the `:`? – Levon Jun 06 '12 at 11:49
  • A sed version would be: `sed -r 's|^.*money":"([0-9]*)".*|\1|'` or if you don't want to print lines that do not contain `money`: `sed -n -r 's|^.*money":"([0-9]*)".*$|\1|p'` – Op De Cirkel Jun 06 '12 at 11:57
  • @Op De Cirkel Thank you! It seems sed is more powerful. Why does awk have no such feature? – RoyHu Jun 08 '12 at 12:17

5 Answers

5

If you have GNU AWK (gawk):

awk '/pay/ {match($0, /"money":"([0-9]+)"/, a); print substr($0, a[1, "start"], a[1, "length"])}' action.log

If not:

awk '/pay/ {match($0, /"money":"([0-9]+)"/); split(substr($0, RSTART, RLENGTH), a, /[":]/); print a[5]}' action.log

The result of either is 100. And there's no need for grep.
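
If gawk is not available and you would rather not depend on which piece of a split holds the number, one portable variation is to match without a capture group and trim the fixed-width prefix and suffix with substr(). This is only a minimal sketch, not part of the commands above: the function name extract_money and the sample line are made up for illustration.

```shell
# Hypothetical helper: print the money value from "pay" lines read on stdin.
extract_money() {
    awk '/pay/ && match($0, /"money":"[0-9]+"/) {
        # The matched text is "money":"<digits>". Skip the 9-character
        # prefix "money":" and drop the closing quote (10 characters total).
        print substr($0, RSTART + 9, RLENGTH - 10)
    }'
}

# Made-up sample line in the format shown in the question.
printf '%s\n' 'pay y1e","email":"","money":"100","coin' | extract_money
```

Because match() is part of the pattern, lines that contain pay but no money field print nothing.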

Dennis Williamson
  • Thanks. That is pretty close to what I expected, but is there a cleverer way? – RoyHu Jun 07 '12 at 04:25
  • @RoyHu: The 1 in the array index refers to the capture group. I don't know of any other way to do that in awk or gawk. Gawk has a function `gensub()` that can be used for *replacing* the contents of a capture group. You could use it, but the expressions would be more complex for the use in your question. – Dennis Williamson Jun 07 '12 at 10:39
  • Thanks. I got one working with gensub: `grep pay action.log | awk -F "\n" 'm=gensub(/.*money":"([0-9]+)".*/, "\\1", "g", $1) {print m}'` – RoyHu Jun 07 '12 at 12:15
  • If you have `gawk` installed, in the first example, the print clause can be simplified to `print a[1];` – Mahn May 31 '15 at 23:42
2

Offered as an alternative, assuming the data format stays the same once the lines are grep'ed, this will extract the money field, not using a regular expression:

awk -v FS=\" '{print $9}' data.txt

assuming data.txt contains

y1e","email":"","money":"100","coin.log

yielding:

100

That is, the field separator is set to " and field 9 is printed.
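
To see why the value lands in field 9, it can help to print every field with its index. The following is a small sketch under the same assumption about the line format; the helper name show_fields is made up for illustration.

```shell
# Hypothetical helper: list every "-delimited field together with its index.
show_fields() {
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Same sample line as in data.txt above.
printf '%s\n' 'y1e","email":"","money":"100","coin.log' | show_fields
```

In the listing, field 7 is the key money and field 9 is its value, which is why $9 works for this particular layout.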

Levon
  • Thanks, but the position of the field that contains the "money" info may not be fixed! – RoyHu Jun 07 '12 at 04:23
  • I thought of one more way: `grep pay action.log | awk -F "\n" 'm=gensub(/.*money":"([0-9]+)".*/, "\\1", "g", $1) {print m}'` – RoyHu Jun 07 '12 at 04:27
0

You need to reference group 1 of the regex.

I'm not fluent in awk, but here are some other relevant questions:

awk extract multiple groups from each line

GNU awk: accessing captured groups in replacement text

Hope this helps

Community
buckley
  • Thank you! Inspired by gensub, I got: `grep pay user_action.log | awk -F "\n" 'm=gensub(/.*money":"([0-9]+)".*/, "\\1", "g", $1) {print m}'` – RoyHu Jun 07 '12 at 04:18
0

If money can appear at different positions in the line, it may not be a good idea to hard-code the field number.

You can try something like this:

$ awk -v FS='[,:"]' '{ for (i = 1; i <= NF; i++) if ($i ~ /money/) print $(i+3) }' inputfile
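
A variation on the same idea, offered only as a sketch: split on " alone and compare the key exactly, so the value is always two fields after the key and a field that merely contains the word money does not match. The helper name money_by_key and both sample lines are made up for illustration.

```shell
# Hypothetical helper: find the money key wherever it sits and print its value.
money_by_key() {
    awk -F'"' '{
        # With " as the separator, a key is followed by : and then its value,
        # so the value is two fields after the key.
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "money") { print $(i + 2); break }
    }'
}

# Made-up lines with the money key at different positions.
printf '%s\n' 'y1e","email":"","money":"100","coin.log' | money_by_key
printf '%s\n' '"money":"250","email":"","name":"y1e"' | money_by_key
```
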
jaypal singh
0
grep pay action.log | awk -F "\n" 'm=gensub(/.*money":"([0-9]+)".*/, "\\1", "g", $1) {print m}'
RoyHu
  • You should refactor out the `grep`. Remember that `grep 'foo' file | awk '{ bar }'` is basically always better written as `awk '/foo/ { bar }' file`. – tripleee Aug 24 '15 at 14:32
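
Following that advice, the pipeline can be folded into a single awk call. The sketch below swaps gensub() for two plain sub() calls so that it does not require gawk; that substitution, the function name money_without_grep, and the sample lines are assumptions made for illustration, not something from the answers above.

```shell
# Hypothetical helper: the /pay/ pattern replaces the separate grep step.
# Reads the files given as arguments, or stdin when there are none.
money_without_grep() {
    awk '/pay/ && /"money":"[0-9]+"/ {
        sub(/.*"money":"/, "")   # delete everything through the opening quote of the value
        sub(/".*/, "")           # delete the closing quote and the rest of the line
        print
    }' "$@"
}

# Made-up sample input; only the first line contains both pay and a money field.
printf '%s\n' 'pay y1e","email":"","money":"100","coin' \
              'login y1e","email":"","money":"7","coin' | money_without_grep
```

Requiring the money pattern in the condition also avoids printing whole lines that have no money field, which the gensub() version would do because gensub() returns its input unchanged when nothing matches.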