
I am trying to extract only 62f0fac3-8b19-49de-866b-5f5cf23f2f9f and bd23d38d-8833-4fc4-b6c0-3906df0ed161 (the "id" values that are immediately followed by "createdAt") from the example file below, using pattern matching in a bash shell. The file is a single line of text that contains many such occurrences.

I tried back-references with grep and sed, but none of my attempts work. Any help is greatly appreciated.

"type": "Primary", "id": "418bf692-4f20-4597-b624-5a7242b82379", "expireIn": {"count": 0, "unit": "hours"}}], "copyModel": "ARCHIVE", "id": "6f2d6bc6-f67c-41b8-b11a-b5a59d7a6ac3"}], "id": "62f0fac3-8b19-49de-866b-5f5cf23f2f9f", "createdAt": "2020-08-11T17:33:45.754863Z", "name": "Susanta Copy Policy"}, "locked": false, "type": "Primary", "id": "ca85d285-8b73-42ec-aab1-c4d13572db94", "expireIn": {"count": 61, "unit": "days"}}], "advancedOptions": {"targetConnectivity": "Auto"}, "id": "f61e67a4-eea6-4922-9cc1-491a5429b199"}], "id": "bd23d38d-8833-4fc4-b6c0-3906df0ed161", "createdAt": "2020-07-14T19:01:33.202434Z", "name": "App Gold Policy"}, "locked": false
#
#
# cat test | grep -E '\"id\": \"(.*)\", \"createdAt\": \"'
"type": "Primary", "id": "418bf692-4f20-4597-b624-5a7242b82379", "expireIn": {"count": 0, "unit": "hours"}}], "copyModel": "ARCHIVE", "id": "6f2d6bc6-f67c-41b8-b11a-b5a59d7a6ac3"}], "id": "62f0fac3-8b19-49de-866b-5f5cf23f2f9f", "createdAt": "2020-08-11T17:33:45.754863Z", "name": "Susanta Copy Policy"}, "locked": false, "type": "Primary", "id": "ca85d285-8b73-42ec-aab1-c4d13572db94", "expireIn": {"count": 61, "unit": "days"}}], "advancedOptions": {"targetConnectivity": "Auto"}, "id": "f61e67a4-eea6-4922-9cc1-491a5429b199"}], "id": "bd23d38d-8833-4fc4-b6c0-3906df0ed161", "createdAt": "2020-07-14T19:01:33.202434Z", "name": "App Gold Policy"}, "locked": false
#
# cat test | grep -E '\"id\": \"(.*)\", \"createdAt\": \"\1'
#
# cat test | sed -E 's/\"id\"(.*)\"\, \"createdAt\": \"/\1/'
"type": "Primary", : "418bf692-4f20-4597-b624-5a7242b82379", "expireIn": {"count": 0, "unit": "hours"}}], "copyModel": "ARCHIVE", "id": "6f2d6bc6-f67c-41b8-b11a-b5a59d7a6ac3"}], "id": "62f0fac3-8b19-49de-866b-5f5cf23f2f9f", "createdAt": "2020-08-11T17:33:45.754863Z", "name": "Susanta Copy Policy"}, "locked": false, "type": "Primary", "id": "ca85d285-8b73-42ec-aab1-c4d13572db94", "expireIn": {"count": 61, "unit": "days"}}], "advancedOptions": {"targetConnectivity": "Auto"}, "id": "f61e67a4-eea6-4922-9cc1-491a5429b199"}], "id": "bd23d38d-8833-4fc4-b6c0-3906df0ed1612020-07-14T19:01:33.202434Z", "name": "App Gold Policy"}, "locked": false
#
anubhava
Susanta Dutta

2 Answers


For this to work with grep, you need the -P option, which enables Perl-compatible regular expressions (a GNU grep feature). The -o option prints only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

You can then combine a negative lookahead with non-greedy matching (already well explained here: Shortest match in regex from end). To display only the ID part of the expression, wrap it in look-around assertions, which keep the surrounding text out of the grep output (explained here: https://unix.stackexchange.com/questions/13466/can-grep-output-only-specified-groupings-that-match).
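To see why non-greedy matching alone is not enough, here is a small sketch on a made-up sample line. The values aaa and bbb and the variable names are illustrative only and do not come from the question's file, and the sketch assumes a GNU grep built with -P support. A regex match always starts at the leftmost possible position, so a lazy `.*?` still begins at the first "id"; the negative lookahead is what rules that start position out.

```shell
# Hypothetical one-line sample: two "id" keys, only the second is followed by "createdAt".
sample='"id": "aaa", "unit": "days", "id": "bbb", "createdAt": "2020"'

# Lazy quantifier alone: the match still begins right after the leftmost "id": "
# and stretches across the second "id" until it reaches ", "createdAt.
lazy=$(printf '%s\n' "$sample" | grep -Po '(?<="id": ").*?(?=", "createdAt)')

# Tempered version: a character is consumed only if the text "id" (with quotes)
# does not start at that position, so the match cannot span a second "id" key.
tempered=$(printf '%s\n' "$sample" | grep -Po '(?<="id": ")(?:(?!"id").)*?(?=", "createdAt)')

printf 'lazy:     %s\n' "$lazy"
printf 'tempered: %s\n' "$tempered"
```

On this sample the lazy pattern returns everything from aaa through bbb, while the tempered pattern returns just bbb.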

Putting it all together, the following regex matches anything from "id": " to ", "createdAt" and asserts that the desired "inner" match does not contain the pattern "id" itself (which is what I understand you want). It returns only the "inner" matches, each on a separate line:

grep -Po '(?<="id": ")(?:(?!"id").)*?(?=", "createdAt)' test

will return

62f0fac3-8b19-49de-866b-5f5cf23f2f9f
bd23d38d-8833-4fc4-b6c0-3906df0ed161
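
If your grep lacks -P, a variation without look-arounds may also work. This is only a sketch: it assumes every wanted id is a 36-character UUID made of hex digits and hyphens, as in the sample, and the function name extract_ids is made up for illustration. It matches the whole `"id": "...", "createdAt"` fragment with an extended regex and then cuts out the fourth double-quote-delimited field, which is the id value.

```shell
# Hypothetical helper; assumes ids are 36-character UUIDs (hex digits and hyphens).
# Reads the named files, or standard input when no file is given.
extract_ids() {
  # -o prints each matched fragment on its own line; cut then keeps the text
  # between the third and fourth double quote of that fragment, i.e. the id.
  grep -oE '"id": "[0-9a-fA-F-]{36}", "createdAt"' "$@" | cut -d '"' -f 4
}
```

Called as `extract_ids test`, it should print the same ids as the -P version, provided the UUID assumption holds for your data.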
buddemat

Thank you, buddemat. I found an alternative solution as well, but yours is better.


# awk -F'", "createdAt": ' '{for (i = 1; i < NF; i++) print $i}' test2 | while read -r id; do echo "${id: -36}"; done
62f0fac3-8b19-49de-866b-5f5cf23f2f9f
bd23d38d-8833-4fc4-b6c0-3906df0ed161
cf4464cc-3ef7-4852-a4a0-2771538f6866
fbe16f4e-8f1c-4ffd-9d5e-86b2d2d419fb
36beae3e-208a-489e-a8ea-03f9c3e2de1b
386a08c3-35e6-4469-8b3a-7b49fd09da21
# grep -Po '(?<="id": ")(?:(?!"id").)*?(?=", "createdAt)' test2
62f0fac3-8b19-49de-866b-5f5cf23f2f9f
bd23d38d-8833-4fc4-b6c0-3906df0ed161
cf4464cc-3ef7-4852-a4a0-2771538f6866
fbe16f4e-8f1c-4ffd-9d5e-86b2d2d419fb
36beae3e-208a-489e-a8ea-03f9c3e2de1b
386a08c3-35e6-4469-8b3a-7b49fd09da21
#
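
The awk approach above can probably be done without the shell loop. As a sketch, with the function name last36 made up for illustration and the same assumption that each id is exactly 36 characters long: awk splits the line on the `", "createdAt": ` separator, and every field except the last one then ends with the wanted id, so substr can take the final 36 characters directly.

```shell
# Hypothetical helper; assumes each wanted id is exactly 36 characters long.
# Reads the named files, or standard input when no file is given.
last36() {
  # Every field but the last ends right before a createdAt key, so its
  # final 36 characters are the id that precedes that key.
  awk -F'", "createdAt": ' '{ for (i = 1; i < NF; i++) print substr($i, length($i) - 35) }' "$@"
}
```

Usage would be `last36 test2`. Unlike the grep patterns, this never checks for the "id" key, so it relies entirely on the fixed length.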
Susanta Dutta