I'm trying to parse a log file that will have lines like this:

aaa bbb ccc: [DDD] efg oi    
aaa bbb ccc: lll [DDD] efg oo    
aaa bbb ccc: [DDD]

where [DDD] can appear anywhere in the line.

Each line contains only one [ ... ] pair.

Using awk with space as the delimiter, how can I print the 1st and 3rd fields plus the whole string between [ and ]?

Expected output: aaa ccc: DDD

Tadija Bagarić

3 Answers

gawk (GNU awk) approach:

Let's say we have a file with the following line:

aaa bbb ccc: ddd [fff] ggg hhh

The command:

awk '{match($0,/\[([^]]+)\]/, a); print $1,$3,a[1]}' file

The output:

aaa ccc: fff

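match(string, regexp [, array])
Search string for the longest, leftmost substring matched by the regular expression regexp and return the character position (index) at which that substring begins (one, if it starts at the beginning of string). If no match is found, return zero. The optional array argument is a gawk extension: when it is given, a[1] holds the text matched by the first parenthesized group, which here is the content between [ and ].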
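For an awk without the three-argument form of match(), a minimal portable sketch can use the two-argument match() together with RSTART and RLENGTH, which it sets on a successful match. The wrapper name extract_bracketed is only illustrative, and the input lines are the ones from the question:

```shell
# Hypothetical wrapper: prints field 1, field 3 and the text inside [ ... ].
extract_bracketed() {
    awk '{
        inner = ""
        # match() returns 0 when the line has no [ ... ] pair
        if (match($0, /\[[^]]+\]/))
            # skip the opening "[" and drop the closing "]"
            inner = substr($0, RSTART + 1, RLENGTH - 2)
        print $1, $3, inner
    }' "$@"
}

printf '%s\n' \
    'aaa bbb ccc: [DDD] efg oi' \
    'aaa bbb ccc: lll [DDD] efg oo' \
    'aaa bbb ccc: [DDD]' | extract_bracketed
```

Each of the three sample lines should print aaa ccc: DDD. A line without brackets still prints fields 1 and 3, followed by an empty third value.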

RomanPerekhrest

Given:

$ cat file
aaa bbb ccc: [DDD] efg oi    
aaa bbb [ccc:] lll DDD efg oo    
aaa [bbb] ccc: DDD

(Note: this input is changed from the OP's example.)

In POSIX awk:

awk 'BEGIN { fields[1]; fields[3] }
     {
         s = ""
         for (i = 1; i <= NF; i++)
             if ($i ~ /^\[/ || i in fields)
                 s = i > 1 ? s OFS $i : $i
         gsub(/\[|\]/, "", s)
         print s
     }' file

Prints:

aaa ccc: DDD
aaa ccc:
aaa bbb ccc:

This does not print a field twice if it is both enclosed in [] and listed in the selected fields array (e.g., [aaa] bbb ccc: does not print aaa twice). It also prints the fields in their original order if you have aaa [bbb] ccc ...
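A quick way to check both claims is to run the same logic on two extra lines. This is only a sketch: pick_fields is an illustrative wrapper name, the two test lines are made up for the check, and the join tests for an empty s instead of i>1, which is a small variation on the command above:

```shell
# Hypothetical wrapper around the POSIX awk approach above.
pick_fields() {
    awk 'BEGIN { fields[1]; fields[3] }
    {
        s = ""
        for (i = 1; i <= NF; i++)
            # keep a field if it starts with "[" or is field 1 or 3
            if ($i ~ /^\[/ || (i in fields))
                s = (s == "" ? $i : s OFS $i)
        gsub(/\[|\]/, "", s)   # strip the brackets from the result
        print s
    }' "$@"
}

printf '%s\n' '[aaa] bbb ccc:' 'aaa [bbb] ccc: ddd' | pick_fields
```

The first line should give aaa ccc: (aaa appears once even though it is bracketed and selected), and the second should give aaa bbb ccc: with bbb in its original position.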

dawg
If [DDD] is always the fifth field:

awk '$5=="[DDD]"{gsub("[\\[\\]]","");print $1,$3,$5}' file

or

awk '$5=="[DDD]"{print $1,$3, substr($5,2,3)}' file

aaa ccc: DDD
Mad Physicist
Claes Wikner