
I have the following string

ISA*01*ABCD BA1*Y*1**FR*DTST103L000P72323232342**NO*NY**DEVS**ALB VESSEL EXAM1

All I want is to extract DTST103L000P72323232342.

I am using the following regex. My question is: how do I get only the second group, (\w+)?

(BA1\*\w\*\w{0,2}\*\w{0,2}\*\w{0,3}\*)(\w+)

When I tried to use a lookbehind instead, with the following

(?<=BA1\*\w\*\w{0,2}\*\w{0,2}\*\w{0,3}\*)(\w+)

I get the following errors from https://regex101.com/:

{0,2} A quantifier inside a lookbehind makes it non-fixed width

{0,2} A quantifier inside a lookbehind makes it non-fixed width

{0,3} A quantifier inside a lookbehind makes it non-fixed width

Thanks in advance.

muratp
  • What programming language or regex flavor are you using? – 41686d6564 stands w. Palestine Jul 28 '20 at 17:22
  • What is your regex language? Also, what do you mean when you say you want only the 2nd group? Why don't you simply stop capturing the first group? – mjrezaee Jul 28 '20 at 17:35
  • I am using Ruby, and I tried to capture only the second group, but I do not have a good regex for it since the length varies as well, so DTST103L000P72323232342 could be any length and any combination of alphanumeric characters. – muratp Jul 28 '20 at 17:55
  • 1) You are testing the regex in a regex tester with PCRE setting, which is wrong since you are using JS. 2) JS regex now supports infinite width patterns in lookbehinds, Chrome, Node.js, Firefox, Opera, Samsung Internet, really, the majority now supports them. Your problem does not exist in the first place. 3) There are ALWAYS ways to get the captured substrings in all languages. Just grab Group 2 value. – Wiktor Stribiżew Jul 28 '20 at 19:15
  • I don't understand why the string you wish to extract is the second word. Is it not the eighth, after `'ISA'`, `'01'`, `'ABCD'`, `'BA1'`, `'Y'`, `'1'` and `'FR'`? If so, you can use the regular expression `\A\W*(?:\w+\W+){7}\K\w+` to match the desired string. [ref](https://regex101.com/r/hyXfAx/1/). Move your cursor across the regex to obtain explanations of each of its tokens. – Cary Swoveland Jul 28 '20 at 19:50
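
For reference, a minimal Ruby sketch of the `\K` expression from the last comment, applied to the sample string from the question (written here without the Markdown escaping). The names `line`, `EIGHTH_WORD` and `eighth` are illustrative only. It assumes the wanted value is always the eighth word of the input.

```ruby
# Sample string from the question.
line = 'ISA*01*ABCD BA1*Y*1**FR*DTST103L000P72323232342**NO*NY**DEVS**ALB VESSEL EXAM1'

# Skip seven words and their separators; \K then discards everything
# matched so far, so the reported match is only the eighth word.
EIGHTH_WORD = /\A\W*(?:\w+\W+){7}\K\w+/

# String#[] with a regex returns the matched text, or nil if nothing matches.
eighth = line[EIGHTH_WORD]
puts eighth
```

Because `\K` resets the start of the match, no capture group or lookbehind is needed here.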

1 Answer


Simply don't capture the first part:

BA1\*\w\*\w{0,2}\*\w{0,2}\*\w{0,3}\*(\w+)
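
A minimal sketch of reading that capture in Ruby (the language the asker mentions in the comments), assuming the sample string from the question. The variable names are illustrative. The point is the difference between the whole match (index 0) and the capture group (index 1).

```ruby
# Sample string from the question.
line = 'ISA*01*ABCD BA1*Y*1**FR*DTST103L000P72323232342**NO*NY**DEVS**ALB VESSEL EXAM1'

pattern = /BA1\*\w\*\w{0,2}\*\w{0,2}\*\w{0,3}\*(\w+)/

m = pattern.match(line)
whole = m && m[0]   # the entire match, starting at "BA1*"
field = m && m[1]   # only the text captured by (\w+)

# Shortcut: String#[] takes the group index as a second argument.
shortcut = line[pattern, 1]

# Alternative without a capture group: \K drops the prefix from the match,
# which sidesteps the fixed-width lookbehind restriction.
via_k = line[/BA1\*\w\*\w{0,2}\*\w{0,2}\*\w{0,3}\*\K\w+/]

puts field
```

Printing `whole` instead of `field` shows the prefix as well, which may be what "returning the entire string" refers to in the comments.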
Blindy
  • Unfortunately, the above expression is also returning the entire string instead of just the second part. I verified this using both my ruby program and regex101.com. Thank you for your reply. Greatly appreciated. – muratp Jul 28 '20 at 18:41
  • I'm confused, how are you seeing that it returns everything? Are you looking at the string representation of the entire match or something? I gave you one capture, read the one capture. – Blindy Jul 28 '20 at 18:45
  • And what do you mean you're using Ruby? You tagged your question as JavaScript. – Blindy Jul 28 '20 at 18:47
  • Sorry, I did not realize that it got marked as JavaScript. I only selected regex, but maybe I had quick fingers :) Just like most OO languages, Ruby has a way of returning a portion of a string if a pattern is supplied in /.../ format to its methods. When I use the approach you recommended, it returns everything. Also, if you try the solution above in regex101.com, it returns most of the BA1 line. – muratp Jul 28 '20 at 19:34
  • muratp, forget what it is returning. It is the content of capture group 1 that you want. Blindy, for this to work for an arbitrary string you don't want to hardwire `'BA1'` or the length of each word. – Cary Swoveland Jul 28 '20 at 19:56
  • Thank you Cary and Blindy...I was hoping to get regex to do the heavy lifting, but I will probably write up a small method to parse the returned results...unfortunately, my regex knowledge is mediocre at best. – muratp Jul 28 '20 at 21:01
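
A small parsing method of the kind mentioned in the last comment could look like the sketch below. It is only an illustration: the method name `ba1_field` and the default index are hypothetical, and it assumes the wanted value is always the sixth `*`-separated field counting from `BA1`.

```ruby
# Return the field at position `index` (0-based, counting from "BA1")
# of the BA1 segment, or nil if the text contains no "BA1*".
def ba1_field(text, index = 5)
  start = text.index('BA1*')
  return nil if start.nil?

  # A limit of -1 keeps empty fields, so positions stay stable
  # even when some values are missing ("**").
  text[start..-1].split('*', -1)[index]
end

line = 'ISA*01*ABCD BA1*Y*1**FR*DTST103L000P72323232342**NO*NY**DEVS**ALB VESSEL EXAM1'
puts ba1_field(line)
```

Splitting on the delimiter avoids hardwiring the length of each field, at the cost of relying on a fixed field position.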