
I have some data which includes a bunch of bracketed codes, sometimes multiple on one line.

Lorem ipsum  (ABC123) dolor sit amet
consectetur adipiscing elit (BCD234)
sed do  (345CDE) eiusmod tempor (8675309) incididunt

All my attempts to pull out the bracketed strings (grep -P -i -o "(?<=\().*(?=\))" and grep -E -i -o "\(CAS.*\)") have resulted in an output like:

ABC123
BCD234
345CDE) eiusmod tempor (8675309

whereas what I need is:

ABC123
BCD234
345CDE
8675309

How should I go about this? I'm using GNU grep.

A bonus would be a solution that is not broken by unmatched brackets and pulls out ABC123 from the line "ut labore (et dolore (ABC123) magna aliqua", but that's not too important.

Some_Guy
  • Just use lazy matching: `"(?<=\().*?(?=\))"` – Wiktor Stribiżew Nov 25 '16 at 12:51
  • `grep -oP '\(\K[^()]+(?=\))' file` would work for the bonus case as well.. – Sundeep Nov 25 '16 at 13:00
  • if you add `ut labore (et dolore (ABC123) magna aliqua` as part of sample input, I think this question would not be a duplicate of https://stackoverflow.com/questions/22444/my-regex-is-matching-too-much-how-do-i-make-it-stop? – Sundeep Nov 25 '16 at 13:05

1 Answer

grep -oP '\(\K.*?(?=\))' inputfile

ABC123
BCD234
345CDE
8675309
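
The lazy `.*?` stops at the first closing bracket, which is why the two codes on the third line now come out separately. For the bonus case with an unmatched opening bracket, a variant along the lines of Sundeep's comment on the question swaps `.*?` for a negated character class. The following is only a sketch, assuming GNU grep built with PCRE support (`-P`); the file name `sample.txt` is made up for illustration:

```shell
# Write the sample input, plus the unmatched-bracket line, to a scratch file.
# (sample.txt is a hypothetical name used only for this example.)
cat > sample.txt <<'EOF'
Lorem ipsum  (ABC123) dolor sit amet
consectetur adipiscing elit (BCD234)
sed do  (345CDE) eiusmod tempor (8675309) incididunt
ut labore (et dolore (ABC123) magna aliqua
EOF

# Lazy match: stops at the first ')', but may start at an unmatched '('.
lazy=$(grep -oP '\(\K.*?(?=\))' sample.txt)

# Negated class: the match may not contain '(' or ')', so only the
# innermost bracketed string can match.
strict=$(grep -oP '\(\K[^()]+(?=\))' sample.txt)

printf '%s\n' "$strict"
```

On this input the lazy pattern returns `et dolore (ABC123` for the last line, while the negated-class pattern returns just `ABC123`. The first three lines give the same results with either pattern.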

jrbedard
P....