2

I have a file:

{
   "test1": [
        "test_a",
        "test_b",
        "test_c"
   ]
}

I am trying to extract the text between "test1": [ and ]. I'm trying this command:

cat test | grep -o -P '(?<=test": [).*(?=])'

But it doesn't work. Any ideas?

Thanks!

peak
Crazy

4 Answers

3

Simply use the jq tool:

jq -r '.test1[]' testfile

The output:

test_a
test_b
test_c
RomanPerekhrest
2

grep is not the best tool for this particular job, but if you must use it, this works:

cat test | grep -Pzo '(?s)(?<=test1\": \[)[^\]]*(?=\])'

With the input you specified above, the output of this command is:

    "test_a",
    "test_b",
    "test_c"

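The -z option makes grep treat the whole input as a single record (records are separated by NUL bytes instead of newlines), which allows the pattern to match across multiple lines. The (?s) flag only changes how . matches, so it is not actually needed here: the negated class [^\]] matches newline characters either way.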
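One practical consequence of -z is that grep also terminates its output with a NUL byte rather than a newline. The following is a minimal sketch of one way to clean that up, assuming GNU grep built with PCRE support (-P). It recreates the sample file in a temporary location; the variable names sample and items are only illustrative, and (?s) is left out because the negated class does not need it.

```shell
#!/bin/sh
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
   "test1": [
        "test_a",
        "test_b",
        "test_c"
   ]
}
EOF

# -z: the whole file is one NUL-terminated record, so the match may span lines.
# The printed match is NUL-terminated as well, so tr deletes that byte.
# sed then trims indentation and trailing commas, and drops the empty lines
# left over at both ends of the match.
items=$(grep -Pzo '(?<=test1": \[)[^\]]*(?=\])' "$sample" \
  | tr -d '\0' \
  | sed -e 's/^[[:space:]]*//' -e 's/,$//' -e '/^$/d')

printf '%s\n' "$items"

rm -f "$sample"
```

This still leaves the double quotes around each value, which is one more reason a JSON-aware tool is the safer choice.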

The jq utility is designed for what you're trying to do:

cat test | jq '.["test1"]'
  • very nice solution. I did not know the `-z` option. To improve the post, could you update the formatting and maybe show the output. – kvantour Jan 23 '18 at 18:18
1

Update: unexpectedly, grep is able to match across multiple lines; see the other answers. And jq really is the right tool for the job.

Nonetheless, here is an awk solution:

$ awk '/]/{p=0}p{print}/test1/{p=1}' test 
    "test_a",
    "test_b",
    "test_c"

Or, a bit more generic:

$ awk 'BEGIN{RS="\"test1\": \\[\n|\n[[:blank:]]*\\]"}(RT~/]/){print}' test
    "test_a",
    "test_b",
    "test_c"

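The first solution sets a print flag (p=1) when it finds test1 and resets it (p=0) when it finds a ]. Because the rules run in order for each line, neither the test1 line nor the ] line is printed.

The second solution sets the record separator RS to a regular expression that matches either \"test1\": \\[\n or \n[[:blank:]]*\\]. The text that actually terminated each record is stored in RT (a GNU awk extension), and the record is printed only when RT contains a ], that is, when the record is the text just before the closing bracket.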
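To make the effect of the rule order in the first command easier to see, here is a minimal, self-contained sketch. It recreates the sample file in a temporary location (the variable names sample and between are only illustrative) and sticks to the flag-based form, which should not need gawk.

```shell
#!/bin/sh
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
   "test1": [
        "test_a",
        "test_b",
        "test_c"
   ]
}
EOF

# The three rules run in this order for every input line:
#   1. a line containing ] switches printing off, so the closing line is skipped
#   2. the line is printed while the flag p is set
#   3. a line containing test1 switches printing on, but only after rule 2
#      has already run, so the opening line itself is not printed
between=$(awk '/]/{p=0} p{print} /test1/{p=1}' "$sample")

printf '%s\n' "$between"

rm -f "$sample"
```

Swapping rules 2 and 3 would print the "test1": [ line as well, and moving rule 1 after rule 2 would print the closing ] line.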

kvantour
0

sed -n '/"test1": \[/,/\]/{//!p}' test

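  • sed -n suppresses automatic printing, so a line from the pattern buffer (modified input stream) is printed only when the p command is used.
  • /"test1": \[/,/\]/{ ... } applies the commands in braces to every line from the one matching /"test1": \[/ through the next one matching /\]/, using the /START/,/END/{ ... } syntax.
  • //!p prints the line unless it matches the empty regex //, which reuses the most recently used regex (START or END here), so both boundary lines are skipped.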

The generic form is sed -n '/START/,/END/{//!p}' input-file to omit START and END lines. Or simply sed -n '/START/,/END/p' input-file if you want them.
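The two generic forms can be compared side by side with a short sketch like the one below. It recreates the sample file in a temporary location; the variable names sample, inclusive and exclusive are only illustrative. A ; is placed before the closing brace, which GNU sed does not need but which some other sed implementations may require.

```shell
#!/bin/sh
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
   "test1": [
        "test_a",
        "test_b",
        "test_c"
   ]
}
EOF

# Inclusive form: the START and END lines are printed along with the body.
inclusive=$(sed -n '/"test1": \[/,/\]/p' "$sample")

# Exclusive form: inside the range, the empty regex // stands for whichever
# regex was used last (START on the first line, END afterwards), so !p
# prints only the lines that match neither boundary.
exclusive=$(sed -n '/"test1": \[/,/\]/{ //!p; }' "$sample")

printf 'inclusive:\n%s\n' "$inclusive"
printf 'exclusive:\n%s\n' "$exclusive"

rm -f "$sample"
```

The inclusive form prints five lines for this input and the exclusive form prints the three array entries.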

stevesliva