I need to match certain parts of a string using regex and am having a terrible time trying to figure it out. The string in question will always look like this:

CN=Last.First.M.1234567890, OU=OrganizationalUnit, O=Organization, C=CountryName

The resulting string should look like CN=1234567890, so I need just the first part of the string, up to and including the comma, with the Last.First.M. part stripped out. Can this be done?

Note: I am passing this regex into a function which I cannot touch, so I cannot use easier methods such as splitting the string or getting just the digits and adding the CN= to it.

Thanks.

ninesided
James
  • What language or tool are you using? – squiguy Feb 06 '13 at 21:15
  • What parameters does this function take exactly, and what does it return? – Tim Pietzcker Feb 06 '13 at 21:16
  • @squiguy I am using java – James Feb 06 '13 at 21:17
  • @TimPietzcker The function takes a `String` value and returns a `Pattern` from the `java.util.regex.Pattern` class. – James Feb 06 '13 at 21:19
  • What do you mean by that it will always look like that? I'm guessing that you mean that it will always be of the form 'CN=***, OU=***, O=***, C=***', but that the *** can be any other text? – steinar Feb 06 '13 at 21:21
  • Oh, so it just compiles the regex you pass to it into a `Pattern` object? I don't get it - what do you need this function for? What can you do with that object? – Tim Pietzcker Feb 06 '13 at 21:21
  • Creating the regex for this is simple, but I agree that we need details on how that `Pattern` object that's returned will be used. Do you have control over the code that calls this particular method? – steinar Feb 06 '13 at 21:24
  • @steinar For the most part, yes. The CN form will always be Last.First.M.10digitcode, the rest is irrelevant. The `Pattern` object will be passed to an extractor class. @TimPietzcker I need it for an extractor class, more specifically an x509 principal extractor – James Feb 06 '13 at 21:28
  • Possible duplicate: [Regular expression to skip character in capture group](http://stackoverflow.com/q/277547/299327) – Ryan Gates Feb 07 '13 at 17:24

3 Answers

1

Here's something I do when I'm too lazy to play around with regex.

String[] myStrings = "CN=Last.First.M.1234567890, OU=OrganizationalUnit, O=Organization, C=CountryName"
    .split(",");
// myStrings[0] now contains CN=Last.First.M.1234567890

myStrings[0] = myStrings[0].replace("Last.First.M.", "");
// The part we didn't want is now replaced with an empty string, so
// myStrings[0] is CN=1234567890. This is a lot more readable but probably
// performs worse. For even more readable code, assign to variables rather than modifying myStrings[0].
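The snippet above hard-codes the literal name "Last.First.M.". A minimal sketch of the same split-then-strip idea that works for any dotted name prefix might look like the following. The class and method names (`CnPrefixStrip`, `firstRdnWithoutName`) are hypothetical, and it assumes the CN always has the form name parts separated by dots followed by the digits. Like the snippet above, it post-processes the string, so it does not help if only a regex can be passed in.

```java
final class CnPrefixStrip {
    // Hypothetical helper: takes the first comma-separated component and
    // removes every "word." segment that directly follows "CN=".
    static String firstRdnWithoutName(String dn) {
        String first = dn.split(",")[0];
        // (?:\w+\.)+ matches one or more dot-terminated segments such as "Last.First.M."
        return first.replaceFirst("^CN=(?:\\w+\\.)+", "CN=");
    }

    public static void main(String[] args) {
        String dn = "CN=Last.First.M.1234567890, OU=OrganizationalUnit, O=Organization, C=CountryName";
        System.out.println(firstRdnWithoutName(dn));
    }
}
```

If the CN contains no dotted prefix, the pattern does not match and the component is returned unchanged.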
Karl Kildén
1

I believe I found the answer to what I am looking for here after doing more digging.

[Regular expression to skip character in capture group](http://stackoverflow.com/q/277547/299327)

All answers here were great, just didn't apply to what I was working on. Now I will work on a different method to solve this. Thanks for all the assistance.

Community
James
0

With the regexp `([A-Z]{2})=(?:\w+\.)+(\d+)`, you can capture the parts that you want.
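As a rough illustration of how the two capture groups could be read out in Java, here is a minimal sketch. The class and method names (`CnGroupDemo`, `extract`) are hypothetical, and it assumes you control the code that consumes the `Matcher`, which the asker says is not the case for the extractor in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CnGroupDemo {
    // Group 1 is the two-letter attribute name, group 2 the trailing digits.
    // The dotted name segments sit in a non-capturing group and are skipped.
    private static final Pattern CN_PATTERN =
            Pattern.compile("([A-Z]{2})=(?:\\w+\\.)+(\\d+)");

    // Hypothetical helper: rebuilds "CN=<digits>" from the two groups,
    // or returns null when the input contains no match.
    static String extract(String dn) {
        Matcher m = CN_PATTERN.matcher(dn);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + "=" + m.group(2);
    }

    public static void main(String[] args) {
        String dn = "CN=Last.First.M.1234567890, OU=OrganizationalUnit, O=Organization, C=CountryName";
        System.out.println(extract(dn));
    }
}
```

The overall match (`m.group()`) is still the contiguous text `CN=Last.First.M.1234567890`; only by concatenating the groups do you get the shortened form.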

Cyrille Ka
  • This matches the whole `CN=Last.First.M.1234567890,` rather than just selecting `CN=1234567890,` when I tried it using http://regexr.com?33mcn – Ryan Gates Feb 06 '13 at 21:45
  • It captures two groups: `CN` and `1234567890`. Then one should just use a variation of @aleroot's solution, like `m.group(1) + "=" + m.group(2)` – Cyrille Ka Feb 06 '13 at 21:59
  • The OP stated that `I am passing this regex into a function which I cannot touch, so I cannot use easier methods such as splitting the string or getting just the digits and adding the CN= to it.` So I believe concatenating the groups won't work for this. – Ryan Gates Feb 07 '13 at 17:07