
I tried to create a regular expression which pulls everything that matches:

[aA-zZ]{2}[0-9]{5}

The problem is that I want to exclude matches inside longer strings, e.g. ABCD12345678.

Can anyone help me resolve this?

EDIT1: I am looking for two letters followed by five digits in the string, but I want no match at all for a string like ABCD12345678; with the regular expression above, it returns CD12345.
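The behaviour described in EDIT1 can be reproduced with a short sketch. Python's `re` module is used here purely for illustration (an assumption; the question uses a SQL function, fnRegExMatch, whose regex flavour may differ), and `first_match` is a hypothetical helper name:

```python
import re

# Pattern from the question, with the character class written as [a-zA-Z].
PATTERN = re.compile(r"[a-zA-Z]{2}[0-9]{5}")

def first_match(text):
    """Return the first substring matching PATTERN, or None if there is none."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

print(first_match("AB12345"))       # AB12345
print(first_match("ABCD12345678"))  # CD12345 (the unwanted partial match)
```

Because the pattern is unanchored, the engine is free to start matching in the middle of a longer run of letters, which is where CD12345 comes from.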

EDIT2: I haven't checked everything, but I think I found an answer:

WHEN field IS NULL THEN field
WHEN fnRegExMatch(field, '[a-zA-Z]{2}[0-9]{5}') = 'N/A' THEN field
WHEN field LIKE '%[^a-z][a-z][a-z][0-9][0-9][0-9][0-9][0-9][^0-9]%'
  OR field LIKE '[a-z][a-z][0-9][0-9][0-9][0-9][0-9][^0-9]%' THEN fnRegExMatch(field, '[a-zA-Z]{2}[0-9]{5}')
ELSE field

ironcurtain
  • Maybe `\b[aA-zZ]{2}[0-9]{5}\b` ([word boundary](http://www.regular-expressions.info/wordboundaries.html)). – Uwe Keim Apr 16 '14 at 14:31
  • Please explain better what you want to do with that regex. – ʞᴉɯ Apr 16 '14 at 14:32
  • possible duplicate of [Reference - What does this regex mean?](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) – Andy Apr 16 '14 at 14:33
  • Is the string a single word, or is it part of a longer text? – ʞᴉɯ Apr 16 '14 at 14:46
  • I would recommend http://regex101 which will explain all the parts of your regex for you. It will show you that `[aA-zZ]` will match either `a`, anything between `A` and `z`, or just `Z`... which, as others have stated, is the main issue with your expression – freefaller Apr 16 '14 at 14:56
  • Apologies - the above link should be http://regex101.com – freefaller Apr 16 '14 at 15:22

2 Answers


First, [aA-zZ] doesn't make sense: it matches a, Z and the whole range A-z, which also includes the characters [ \ ] ^ _ and `. Second, use word boundaries:

\b[a-zA-Z]{2}[0-9]{5}\b

You could also use the case-insensitive modifier:

(?i)\b[a-z]{2}[0-9]{5}\b
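Both points can be checked with a small sketch. It uses Python's `re` module as an illustration only (an assumption; other regex flavours may treat `\b` or inline modifiers slightly differently), and the variable names and sample text are made up for the example:

```python
import re

broad = re.compile(r"[aA-zZ]")    # class as written in the question
strict = re.compile(r"[a-zA-Z]")  # intended class: ASCII letters only

# The range A-z spans ASCII 65-122, so it also covers [ \ ] ^ _ and `
punctuation = "[\\]^_`"
extras = [c for c in punctuation if broad.fullmatch(c)]
print(extras)                                             # all six characters
print([c for c in punctuation if strict.fullmatch(c)])   # []

# Word boundaries reject a code embedded in a longer run of letters or digits.
bounded = re.compile(r"(?i)\b[a-z]{2}[0-9]{5}\b")
print(bounded.findall("AB12345 ABCD12345678 xy67890"))   # ['AB12345', 'xy67890']
```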

According to your comment, it seems you may have an underscore after the five digits. In this case a word boundary doesn't work, because the underscore counts as a word character; you have to use this instead:

(?i)(?<![a-z])([a-z]{2}[0-9]{5})(?![0-9])

(?<![a-z]) is a negative lookbehind that asserts there is no letter immediately before the two mandatory letters
(?![0-9]) is a negative lookahead that asserts there is no digit immediately after the five mandatory digits
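As a rough check, the lookaround pattern can be run against the sample strings from the asker's follow-up comment, side by side with the word-boundary version. This is a sketch in Python's `re` (an assumption; the asker's environment is a SQL regex function), and `CODE`, `BOUNDED` and `samples` are names chosen for the example:

```python
import re

# Lookaround version from the answer; group 1 holds the code itself.
CODE = re.compile(r"(?i)(?<![a-z])([a-z]{2}[0-9]{5})(?![0-9])")
# Word-boundary version, for comparison.
BOUNDED = re.compile(r"(?i)\b[a-z]{2}[0-9]{5}\b")

samples = [
    ".AB12345_1",
    "@@AB12345/WS123456789",
    "@@AB12345_11",
    "_AB12345_11",
    "ABCD12345678",
]

for s in samples:
    # findall returns the contents of group 1 for CODE, whole matches for BOUNDED
    print(s, CODE.findall(s), BOUNDED.findall(s))
```

The lookaround version finds AB12345 in the first four strings and nothing in ABCD12345678, while the word-boundary version only succeeds on the string where the code is followed by a slash, since `_` is a word character.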

Toto
  • This is almost what I am looking for. The problem is that a very small number of records contain strings like: .AB12345_1, @@AB12345/WS123456789, @@AB12345_11, _AB12345_11. From those strings I want to pull AB12345. What should I add to extract what I want (AB12345)? NOTE: AB12345 is just an example. In general there are two letters and five digits. – ironcurtain Apr 17 '14 at 07:38

This would be the code, along with usage samples.

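using System.Text.RegularExpressions;

// Two ASCII letters followed by five digits, delimited by word boundaries.
public static Regex regex = new Regex(
    "\\b[a-zA-Z]{2}\\d{5}\\b",
    RegexOptions.CultureInvariant
    | RegexOptions.Compiled
    );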



//// Replace the matched text in the InputText using the replacement pattern
// string result = regex.Replace(InputText,regexReplace);

//// Split the InputText wherever the regex matches
// string[] results = regex.Split(InputText);

//// Capture the first Match, if any, in the InputText
// Match m = regex.Match(InputText);

//// Capture all Matches in the InputText
// MatchCollection ms = regex.Matches(InputText);

//// Test to see if there is a match in the InputText
// bool IsMatch = regex.IsMatch(InputText);

//// Get the names of all the named and numbered capture groups
// string[] GroupNames = regex.GetGroupNames();

//// Get the numbers of all the named and numbered capture groups
// int[] GroupNumbers = regex.GetGroupNumbers();
ʞᴉɯ