0

I have a JSON file which has 3 items listed in it, like so:

{
Item 1
lots of stuff
more stuff 1545
even more
},
{
Item 2
lots of stuff
more stuff 542
},
{
Item 3
lots of stuff
more stuff 675
even more
more words
more text
}

I want to be able to grep for a string, say 675, and if it is found, return the entire 'block' of text, from the opening brace to the closing brace.

Dany Bee
Tony
  • You can parse correctly formatted JSON files with a `JSON parser`, but your example is **not** correctly formatted. – captcha Jun 19 '13 at 17:09
  • Take a look at some command-line json parsers [here](http://stackoverflow.com/questions/3858671/unix-command-line-json-parser). – cabad Jun 19 '13 at 18:17

3 Answers

3

A GNU sed parser for your irregular file format (put your search pattern in place of PATTERN):

sed -nr 'H;/PATTERN/,/\}/{s/(\})/\1/;T;x;p};/\{/{x;s/.*\n.*//;x;H}' file

And some examples:

$ sed -nr 'H;/1545/,/\}/{s/(\})/\1/;T;x;p};/\{/{x;s/.*\n.*//;x;H}' file

{
Item 1
lots of stuff
more stuff 1545
even more
},

$ sed -nr 'H;/542/,/\}/{s/(\})/\1/;T;x;p};/\{/{x;s/.*\n.*//;x;H}' file

{
Item 2
lots of stuff
more stuff 542
},

$ sed -nr 'H;/more text/,/\}/{s/(\})/\1/;T;x;p};/\{/{x;s/.*\n.*//;x;H}' file

{
Item 3
lots of stuff
more stuff 675
even more
more words
more text
}
captcha
3

If by "grep for a string" you really mean search for a regular expression, like you normally would with grep, then:

awk -v t="675" -v ORS= '{r=r $0 RS} /^}/{if (r~t) print r; r=""}' file

but if you truly mean search for a literal string, like you would with fgrep, then:

awk -v t="675" -v ORS= '{r=r $0 RS} /^}/{if (index(r,t)) print r; r=""}' file
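
To make the difference between the two commands concrete, here is a minimal sketch that runs both against a small sample file. The file contents and the search term `5.2` are made up for illustration; the awk programs are the two shown above, unchanged.

```shell
#!/bin/sh
# Hypothetical sample file in the question's layout (not valid JSON).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
Item 1
more stuff 1545
},
{
Item 2
more stuff 542
}
EOF

# Regex match: in "5.2" the dot matches any character, so "542" matches.
regex_hits=$(awk -v t="5.2" -v ORS= '{r=r $0 RS} /^}/{if (r~t) print r; r=""}' "$sample")

# Literal match: index() looks for the exact characters "5.2", which never occur.
string_hits=$(awk -v t="5.2" -v ORS= '{r=r $0 RS} /^}/{if (index(r,t)) print r; r=""}' "$sample")

printf 'regex match:\n%s\n' "$regex_hits"
printf 'string match:\n%s\n' "$string_hits"
rm -f "$sample"
```

With this input, the regex version should return the Item 2 block, while the string version returns nothing.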
Ed Morton
2

This can't be done with grep and shouldn't be done with bash, but it's quite simple if you have GNU awk: just define RS as },?\n:

# find a record containing 1545
$ awk '/1545/' RS='},?\n' ORS='}\n' file
{
Item 1
lots of stuff
more stuff 1545
even more
}

This method won't separate multiple records with a comma, as JSON should, but you could define ORS as }, and remove the last comma if you need valid JSON as the result.
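
As a rough sketch of that suggestion, the pipeline below sets ORS to `},` followed by a newline and then strips the comma from the final output line with sed. The sample file and the pattern `more stuff` are assumptions made for illustration, and the regex RS still relies on an awk that supports it, such as GNU awk.

```shell
#!/bin/sh
# Hypothetical sample file in the question's layout (not valid JSON).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
Item 1
more stuff 1545
},
{
Item 2
more stuff 542
}
EOF

# Each matching record is printed followed by "}," and a newline;
# sed then removes the trailing comma from the last output line only.
joined=$(awk '/more stuff/' RS='},?\n' ORS='},\n' "$sample" | sed '$s/,$//')

printf '%s\n' "$joined"
rm -f "$sample"
```

Both records match here, so the output keeps the comma between them and drops it after the last one.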

Alternatively, you could use RT (a GNU awk variable holding the text that matched RS) instead of ORS to display the separator that matched the RS regexp:

$ awk '/1545/{printf "%s",$0RT}' RS='},?\n' file
{
Item 1
lots of stuff
more stuff 1545
even more
},

But depending on whether the last record matched the given pattern, you might still need to remove the trailing comma. A simple sed command would do the trick: sed '$s/,$//'.

I'd probably just use a proper JSON parser, however.

Chris Seymour
  • +1 for a good solution. I don't know "JSON" but if `}` can occur at the end of a line elsewhere in the block of text then you need to put a `\n` before it in the RS. Of the 2 solutions you proposed, I'd definitely use the RT one since you're using GNU awk anyway to get multi-char RS. – Ed Morton Jun 19 '13 at 19:05