
I am trying to come up with a RegEx (POSIX-like) for a vendor application that returns data like the sample below. The application presents a single line of data at a time, so I do not need to account for multiple rows; I only need to match each row individually.

Each row can contain one or more values in the string result.

The application doesn't let me simply use "\d+\.\d+" to capture the numeric component out of the string. Unfortunately, I need to map every component of a row to a variable, even the ones I am going to discard; otherwise it returns a negative match result.

My data looks like the following with the weird underscore padding.

USER |   ___________   3.58625 |   ___________   7.02235 |
USER |   ___________  10.02625 |   ___________  15.23625 |

The syntax it supports is Matches REGEX "(Var1 Regex), (Var2 Regex), (Var3 Regex), (Var 4 regex), (Var 5 regex)". The entire string must match the aggregation of the RegEx components; if it is a single character off, you get nothing.

The "|" characters are field separators for the data.

So in the above, what I need is a RegEx that captures everything up to the beginning of the first numeric value into Var1, the numeric value with its decimal point into Var2, everything up to the next numeric value into Var3, that numeric value into Var4, and the trailing space and closing field separator "|" into Var5. Only Var2 and Var4 will be useful, but I have to capture the entire string.

I have mainly tried capturing between the bars "|" using ^.*\|(.*).\|*$ from this question.

I have also tried the multiple variable ([0-9]+k?[.,]?[0-9]+)\s*-\s*.*?([0-9]+k?[.,]?[0-9]+) mentioned in this question.

When I try these in RegExr, I never get more than one part of the string: I get either just the number or the equivalent of the entire string in a single variable, neither of which works in this context. I feel like I am missing something pretty simple.
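To illustrate the difference, here is a minimal Python sketch. It assumes the vendor's MATCHES operator behaves like a whole-string match, modeled here with `re.fullmatch`; that is an assumption about the application, not something its documentation confirms.

```python
import re

# First sample row from the question, underscore padding included.
row = "USER |   ___________   3.58625 |   ___________   7.02235 |"

# Searching anywhere in the row finds both numbers.
numbers = re.findall(r"\d+\.\d+", row)

# Assumption: MATCHES requires the pattern to cover the whole row,
# which re.fullmatch models. The same pattern then fails.
whole = re.fullmatch(r"\d+\.\d+", row)

print(numbers)  # ['3.58625', '7.02235']
print(whole)    # None
```

This is why a pattern that looks correct in RegExr can still return nothing in the application: RegExr searches, while the application appears to require the pattern to account for every character.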

The only example the documentation provides is the following, which parses something like a SysLog entry. I have consolidated it here as "Fault with Resource Name: Disk Specific Problem: Offline".

WHERE value matches regex "(.)Resource Name: (.), Specific Problem: ([^,]),(.)" SET _Rrsc = var02 SET _Prob = var03

I've spun my wheels on this for several hours so would appreciate any guidance / help to get me over this hump.

CRSouser
  • Using the string you provided, and the regex you provided `\d+\.\d+`, I'm able to capture all numbers. Are you by any chance sending the quotes also to regexr? – Sander Saelmans Feb 22 '18 at 18:51
  • @SanderSaelmans I can capture all the numbers as well.. but the catch is my MATCHES expression has to account and match exactly the entire string.. if it doesn't do so I get nothing returned. – CRSouser Feb 22 '18 at 18:53
  • Does it mean you need to have 5 capturing groups in the pattern? Try [`^(.*) \| (.*) (\d+\.\d+) (\|) .* (\d+\.\d+) .*$`](https://regex101.com/r/vyyOJU/1/). Group 3 and 5 contain the numbers. You may make these groups have IDs 1 and 2, too. – Wiktor Stribiżew Feb 22 '18 at 18:58
  • @WiktorStribiżew Yes, everything must be contained within a capture group or it won't work, so if I take yours it doesn't seem to match. (^(.*) ),(\| (.*) ),(\d+\.\d+) ,(\| (.*) ),(\d+\.\d+) , (.*$) – CRSouser Feb 22 '18 at 19:17
  • You have 9 capturing groups in *your* pattern, it is not mine. – Wiktor Stribiżew Feb 22 '18 at 19:18
  • @WiktorStribiżew I think it is that I am required both to encapsulate each group in () and to separate the groups with commas. I'll try it again. – CRSouser Feb 22 '18 at 19:23

1 Answer


Something like this should work:

(\D+)([\d.]+)(\D+)([\d.]+)(.*)

In plain words: capture everything up to the first digit, capture a run of digits and dots (the decimal number), capture everything up to the next digit, capture the second decimal number, then capture whatever is left.

Using the second sample row, USER |   ___________  10.02625 |   ___________  15.23625 |, the groups are:

  • $1 = USER | ___________  
  • $2 = 10.02625
  • $3 =  | ___________  
  • $4 = 15.23625
  • $5 =  |
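The grouping above can be checked with a short Python sketch. It is only an illustration: the `split_row` helper is a hypothetical name, and `re.fullmatch` stands in for the application's whole-string matching, which is an assumption about how the vendor tool behaves.

```python
import re

# The five-group pattern from this answer.
PATTERN = re.compile(r"(\D+)([\d.]+)(\D+)([\d.]+)(.*)")

# Both sample rows from the question.
rows = [
    "USER |   ___________   3.58625 |   ___________   7.02235 |",
    "USER |   ___________  10.02625 |   ___________  15.23625 |",
]

def split_row(row):
    """Return the five captured groups, or None if the row does not match.

    Hypothetical helper; fullmatch models the assumed requirement that
    the pattern must account for the entire string.
    """
    m = PATTERN.fullmatch(row)
    if m is None:
        return None
    return m.groups()

for row in rows:
    groups = split_row(row)
    # Groups 2 and 4 (indexes 1 and 3) hold the numeric values.
    print(groups[1], groups[3])
```

Because the five groups sit back to back with nothing between them, joining them reproduces the original row exactly, which is what a whole-string match needs. Note that `[\d.]+` would also accept a stray run of dots, so `\d+\.\d+` inside the second and fourth groups is a stricter alternative if the data could contain them.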
Sander Saelmans