0
" john smith (idjs) <js@email.com>"

How do I break the preceding into 3 parts?

1: john smith
2: (idjs)
3: <js@email.com>

I'm having trouble just trying to get any of the parts:

I tried this:

var fullname = Regex.Match(item, $"(?=^).*(?=()").Value;
Jonathan
Rod

3 Answers

2

You can use named matched groups for this:

var item = " john smith (idjs) <js@email.com>";
String[] patternArr =
{
    "(?:\\s*)", 
    "(?<fullname>[a-zA-Z\\s]*?[a-zA-Z])", // captures the full name part
    "(?:\\s*)",
    "(?<idjs>\\([a-zA-Z]*\\))", // captures the idjs part
    "(?:.*)",
    "(?<email>(?:<).*@.*(?:>))" // captures the email part
};

var pattern = String.Join("", patternArr);
var m = Regex.Match(item, pattern);

if (m.Success)
{
    Console.WriteLine("fullname: {0}", m.Groups["fullname"]);
    Console.WriteLine("idjs: {0}", m.Groups["idjs"]);
    Console.WriteLine("email: {0}", m.Groups["email"]);
}

Output:

fullname: john smith
idjs: (idjs)
email: <js@email.com>

Demo: https://dotnetfiddle.net/y6U5j4

Kenan Güler
  • 1
    You beat me to the punch. Now I need to throw away my nearly identical code. I was, however going to suggest two things to the OP: 1) consider removing the parentheses around the `idjs` part and the angle bracket around the email during the parsing, and 2) there are much better email regexes floating around - steal one of those – Flydog57 Sep 17 '21 at 22:57
  • How would you remove the parentheses, if desired? – Rod Sep 18 '21 at 03:15
  • 1
    @Rod To get rid of the parentheses around the `idjs`, you can use `((?:\\()(?<idjs>[a-zA-Z]*)(?:\\)))`, and to get rid of the angle brackets around the `email` you can use `((?:<)(?<email>.*@.*)(?:>))` instead (by replacing the respective line in the `patternArr`). – Kenan Güler Sep 18 '21 at 10:54
  • 1
    By the way, I assumed that the email entries are reasonably trustworthy - hence my "naive" email regex. You can of course choose/steal a regex that fits your needs. E.g. regarding the email validation this would be a much better choice `((?:<)(?<email>\\S+@\\S+)(?:>))`. The `MailAddress` class also provides great help for easy validation of emails. For further reading: [1](https://stackoverflow.com/questions/5342375/regex-email-validation) and [2](https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s01.html) – Kenan Güler Sep 18 '21 at 11:02
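The suggestions in the comments above (keep the delimiters outside the named groups, use `\S+@\S+` for the email, and optionally check the result with `MailAddress`) could be combined roughly as follows. This is only a sketch: the `ContactParser` class and `TryParse` method are hypothetical names that do not appear in the answer, and the single-string pattern is an assumption about how the pieces fit together.

```csharp
using System;
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class ContactParser
{
    // Hypothetical pattern: "(", ")", "<" and ">" are matched outside the
    // named groups, so the captured values do not include them.
    private static readonly Regex Pattern = new Regex(
        @"^\s*(?<fullname>[a-zA-Z\s]*?[a-zA-Z])\s*" +
        @"\((?<idjs>[a-zA-Z]*)\)\s*" +
        @"<(?<email>\S+@\S+)>\s*$");

    public static bool TryParse(string item, out string fullname,
                                out string idjs, out string email)
    {
        fullname = idjs = email = string.Empty;

        Match m = Pattern.Match(item);
        if (!m.Success)
            return false;

        fullname = m.Groups["fullname"].Value;
        idjs = m.Groups["idjs"].Value;
        email = m.Groups["email"].Value;

        // Optional second check: the MailAddress constructor throws
        // FormatException when the address is not in a recognized format.
        try
        {
            _ = new MailAddress(email);
        }
        catch (FormatException)
        {
            return false;
        }
        return true;
    }
}
```

Calling `ContactParser.TryParse(" john smith (idjs) <js@email.com>", out var name, out var id, out var mail)` should then leave the three variables holding the name, the id without parentheses, and the address without angle brackets.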
1
string pattern = 
    @"\s*" +       // zero or more whitespace characters
    @"(.*)" +      // zero or more of any character (the name)
    @"\s+" +       // one or more whitespace characters
    @"(\(.*\))" +  // zero or more characters inside parens
    @"\s" +        // a single whitespace
    @"(<.*>)"      // zero or more characters inside brackets
    ;

Note that Regex.Match().Value will not give you the parts - only the whole string if it matches. What you want is Regex.Match().Groups which will return a GroupCollection that you can iterate over to get the parts.

var groups = Regex.Match(item, pattern).Groups;
// groups[0] is the whole match; groups[1] to groups[3] are the captured parts
foreach (Group group in groups)
    Console.WriteLine(group.Value);
D Stanley
0

Although this does not use Regex, I would use Split for this:

    var input=" john smith (idjs) <js@email.com>";

    var first=input.Split('(');
    var second=first[1].Split(')');

    var name=first[0].Trim();
    var mid=second[0].Trim();
    var email=second[1].Trim();


/*
result:
john smith
idjs
<js@email.com>
*/
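Note that the Split version above removes the parentheses but leaves the angle brackets on the email, and it throws if the input contains no `(`. A slightly more defensive variant might look like the sketch below. The `ContactSplitter` class and `SplitContact` method are hypothetical names, and returning an empty array for malformed input is an assumption about the desired behavior, not something stated in the answer.

```csharp
using System;

public static class ContactSplitter
{
    // Hypothetical helper: splits on both parentheses in one call and
    // returns { name, id, email }, or an empty array when the input does
    // not contain exactly one "(" and one ")".
    public static string[] SplitContact(string input)
    {
        string[] parts = input.Split(new[] { '(', ')' });
        if (parts.Length != 3)
            return Array.Empty<string>();

        string name = parts[0].Trim();
        string id = parts[1].Trim();
        // Trim whitespace first, then the surrounding angle brackets.
        string email = parts[2].Trim().Trim('<', '>');

        return new[] { name, id, email };
    }
}
```

Here `Trim('<', '>')` only strips the brackets at the ends of the string, so an address that happened to contain those characters in the middle would be left untouched.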

Mehrdad Dowlatabadi