
I'm trying to create a regex for the below text

[2021-11-15 23:43:41.867] INFO    [Mule-util]nz.co.ha.mule.common.logging.CustomMessageLogger [[MuleRuntime].uber.02: [sample-logging-app].sample-logging-appFlow.CPU_LITE @4ffd3e60]: event:dd3aa370-466d-11ec-83a1-0ab55c0b0cf4 ||transactionID=null|txnState=start|apiDomain=system|apiLayer=system|customMessage=Im%20here%20at%202021-11-15%2023:43:41.866|direction=incoming|messageName=sample-loggin-app|messageType=sample-loggin-app|name=main|payloadIn=false||

I currently get 5 groups out of this line, but the fourth group should be split in two, so I should end up with 6 groups in total, as below:

Group-1: [2021-11-15 23:43:41.867]

Group-2: INFO

Group-3: [Mule-util]nz.co.ha.mule.common.logging.CustomMessageLogger

Group-4: [[MuleRuntime].uber.02: [sample-logging-app].sample-logging-appFlow.CPU_LITE @4ffd3e60]: event:

Group-5: dd3aa370-466d-11ec-83a1-0ab55c0b0cf4

Group-6: ||transactionID=null|txnState=start|apiDomain=system|apiLayer=system|customMessage=Im%20here%20at%202021-11-15%2023:43:41.866|direction=incoming|messageName=sample-loggin-app|messageType=sample-loggin-app|name=main|payloadIn=false||

NOTE: the groups above are the expected division. Currently Group-4 holds the Group-5 value as well. Can anyone tell me how to split that group in two? The regex link is https://regex101.com/r/IjAk3a/1/

The reason I want this is that I'm trying to use the Sematext Logagent, which assigns the value of each group to a field. I can't modify the existing functionality to read part of the payload from the above group, as it is currently used by other systems. So the only option I have is a regex that divides the line into groups, which are then assigned to the fields provided.

My code:

var myString = "[2021-11-15 23:43:41.867] INFO    [Mule-util]nz.co.ha.mule.common.logging.CustomMessageLogger [[MuleRuntime].uber.02: [sample-logging-app].sample-logging-appFlow.CPU_LITE @4ffd3e60]: event:dd3aa370-466d-11ec-83a1-0ab55c0b0cf4 ||transactionID=null|txnState=start|apiDomain=system|apiLayer=system|customMessage=Im%20here%20at%202021-11-15%2023:43:41.866|direction=incoming|messageName=sample-loggin-app|messageType=sample-loggin-app|name=main|payloadIn=false||";
var myRegexpStr = /^(\[[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}[.][0-9]{3}\])\s([A-Z]*)\s*([a-zA-z\[\]\.0-9:\-]*)\s([0-9a-zA-Z@\[\].\s-_:]*)?([\s|\S]+)/g;
var myRegexp = new RegExp(myRegexpStr);
var match = myRegexp.exec(myString);
console.log(match[1]); // [2021-11-15 23:43:41.867]
console.log(match[2]); // INFO
console.log(match[3]); // [Mule-util]nz.co.ha.mule.common.logging.CustomMessageLogger
console.log(match[4]); // [[MuleRuntime].uber.02: [sample-logging-app].sample-logging-appFlow.CPU_LITE @4ffd3e60]: event:dd3aa370-466d-11ec-83a1-0ab55c0b0cf4
console.log(match[5]); // ||transactionID=null|txnState=start|apiDomain=system|apiLayer=system|customMessage=Im%20here%20at%202021-11-15%2023:43:41.866|direction=incoming|messageName=sample-loggin-app|messageType=sample-loggin-app|name=main|payloadIn=false||
Mike 'Pomax' Kamermans
Pathfinder
  • updated the original question to include note and the reason below that – Pathfinder Nov 16 '21 at 01:33
  • Where's the regex you came up with so far, and how does that fall short? (remember: even if you have things in a regex101 share, put those details in your post, too. Links are fine, but [only in _addition_ to](/help/how-to-ask) having all the details in your post) – Mike 'Pomax' Kamermans Nov 16 '21 at 01:36
  • @Mike'Pomax'Kamermans Regex link is available in my original post, you can see the link in Notes. But to make it easy i'm placing it here https://regex101.com/r/IjAk3a/1/ – Pathfinder Nov 16 '21 at 01:56
  • Again: links are [in addition to](/help/how-to-ask) having that information in your post. So put your actual regex in your post, too, _even if_ you have a nice, off-site runnable example. Posts on SO need to still make sense if your links die. – Mike 'Pomax' Kamermans Nov 16 '21 at 02:09
  • @Mike'Pomax'Kamermans I have added the code-snippet as suggested. – Pathfinder Nov 16 '21 at 02:24
  • also, please don't put "update" or "edit" in your post: it's not an answer that needs explicit edits because different times lead to different answers. Your question should always, even if you update it, just be your question. Self contained, and complete. Any edits you need to make, make them part of the question, not tacked on. So: one way to at least make your regex less huge is to replace `[0-9]` with `\d`, making things easier to read and talk about. similarly, use `\w` instead of `[a-zA-Z]` and things get much easier to work with, and crucially, help you with. – Mike 'Pomax' Kamermans Nov 16 '21 at 15:46
  • finally, if your regex already does the right thing for "most of the string", omit the parts that already work. You're basically trying to present a [mcve] so that people don't have to bother with the parts that you already solved: if the "group 4" data is the problem, just show _that_ data, show the regex you currently have for just that part, and talk about how you've already tried to split it up, which regexes you tried, and how those didn't (quite, or at all) work as desired. – Mike 'Pomax' Kamermans Nov 16 '21 at 15:49

1 Answer

There are many ways to achieve that. One possible solution looks like this:

^(\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}])\s+([A-Z]*)\s+([a-zA-Z0-9[\].:-]*)\s+([\w@[\].\s:-]*:)([a-fA-F0-9]{4}(?:[a-fA-F0-9]{4}-){4}[a-fA-F0-9]{12})\s+([\s\S]*)$

See the regex demo.

The main part is ([\w@[\].\s:-]*:)([a-fA-F0-9]{4}(?:[a-fA-F0-9]{4}-){4}[a-fA-F0-9]{12}):

  • ([\w@[\].\s:-]*:) - Group 4: zero or more word, @, [, ], ., :, - and whitespace chars and then a : char
  • ([a-fA-F0-9]{4}(?:[a-fA-F0-9]{4}-){4}[a-fA-F0-9]{12}) - Group 5: a UUID pattern.
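As a quick check, the pattern above can be run against the sample line from the question. This is only a minimal sketch: the names `line`, `pattern` and `groups` are made up for the illustration, and the `g` flag from the question's code is dropped on the assumption that a single `exec` call per log line is all that is needed.

```javascript
// Sample log line copied from the question.
const line = "[2021-11-15 23:43:41.867] INFO    [Mule-util]nz.co.ha.mule.common.logging.CustomMessageLogger [[MuleRuntime].uber.02: [sample-logging-app].sample-logging-appFlow.CPU_LITE @4ffd3e60]: event:dd3aa370-466d-11ec-83a1-0ab55c0b0cf4 ||transactionID=null|txnState=start|apiDomain=system|apiLayer=system|customMessage=Im%20here%20at%202021-11-15%2023:43:41.866|direction=incoming|messageName=sample-loggin-app|messageType=sample-loggin-app|name=main|payloadIn=false||";

// The regex from this answer, written as a literal (no flags needed here).
const pattern = /^(\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}])\s+([A-Z]*)\s+([a-zA-Z0-9[\].:-]*)\s+([\w@[\].\s:-]*:)([a-fA-F0-9]{4}(?:[a-fA-F0-9]{4}-){4}[a-fA-F0-9]{12})\s+([\s\S]*)$/;

const match = pattern.exec(line);
if (match === null) {
  throw new Error("The line does not match the expected log format");
}

// match[0] is the whole match; the six capturing groups follow it.
const groups = match.slice(1);
groups.forEach((value, i) => console.log(`Group-${i + 1}: ${value}`));
```

Group 4 is greedy and its character class also covers the UUID characters, so the engine first overshoots and then backtracks to the last `:` that is directly followed by a UUID. That is what leaves `event:` at the end of Group 4 and the bare UUID in Group 5.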
Wiktor Stribiżew