Objective: Find an exact match, within a sentence, for a word that is exactly 7 characters long: 3 letters followed by 4 digits.

Valid Inputs: ABC1234, MOW0912, qbc1239

I have tried setting the length using

var result = Regex.Match(myString, "[A-Za-z]{3}[0-9]{4}\\d{9}");

I have referred to a couple of similar posts, but they didn't help me resolve the issue:

C# Regular Expression Help Needed

Code I have so far:

using System;
using System.Text.RegularExpressions;

public class Test
{
   public static void Main()
   {
      string[] data = { "This, is a test ,* P.O ABC12 and ABCDE5134 this is some random words UserID ABC1234 that the users may have, regards, and the end of it" };



      foreach (string myString in data)
      {
         if (Regex.IsMatch(myString, "[A-Za-z]{3}[0-9]{4}"))
         {
            var result = Regex.Match(myString, "[A-Za-z]{3}[0-9]{4}");
            Console.WriteLine("{0} matches", myString);
            Console.WriteLine("result is" + result);
         }
         else
         {
            Console.WriteLine("does not match");
         }
      }
   }
}


abatishchev
Clint

4 Answers

As the question currently stands, you can't, unless you have constraints that let you restrict the search to exactly what you need to find.

ABCDE5134 will always produce a 3-letter/4-digit match (CDE5134), because the string is scanned from left to right and the pattern is satisfied partway through that word.

If you expect ABC1234 as the result, you will have to use the spaces around it as part of the matching condition (so " ABC1234 " must be matched).

This will be the regex then:

\s[A-Za-z]{3}[0-9]{4}\s

This should work as long as every word is enclosed by spaces.

Watch out if the words you need could be followed by dots or other punctuation.

In that case you have to account for those too (or remove the final \s, or replace it with a .).

So the "punctuation tolerant" version of the regex would be:

\s[A-Za-z]{3}[0-9]{4}
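
To make the difference concrete, here is a minimal sketch that runs the unanchored pattern and the whitespace-delimited pattern against the sample sentence from the question. The class name and the FirstMatch helper are made up for illustration, and the sketch assumes the target word has whitespace on both sides.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDelimitedDemo
{
    // Sample sentence taken from the question.
    public const string Sentence =
        "This, is a test ,* P.O ABC12 and ABCDE5134 this is some random words UserID ABC1234 that the users may have, regards, and the end of it";

    // Hypothetical helper: first match of the pattern, with any matched
    // surrounding whitespace trimmed off. Empty string if nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value.Trim();
    }

    public static void Main()
    {
        // Unanchored: the first hit is the tail of ABCDE5134.
        Console.WriteLine(FirstMatch(Sentence, "[A-Za-z]{3}[0-9]{4}"));      // CDE5134

        // Whitespace required on both sides: only the standalone word qualifies.
        Console.WriteLine(FirstMatch(Sentence, @"\s[A-Za-z]{3}[0-9]{4}\s")); // ABC1234
    }
}
```

Note that the verbatim string (@"...") keeps \s as a regex escape rather than a C# string escape.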

Liquid Core

One way to do it without regex would be to split the string into "words", then return the first one that begins with 3 letters and ends with 4 numbers:

private static string GetCode(string input)
{
    return input?.Split().FirstOrDefault(word => 
        word.Length == 7 &&
        word.Take(3).All(char.IsLetter) && 
        word.Skip(3).Take(4).All(char.IsNumber));
}

If you want to collect more than one code in a string, you can modify the above to return them all:

private static List<string> GetCodes(string input)
{
    return input?.Split().Where(word => 
        word.Length == 7 &&
        word.Take(3).All(char.IsLetter) && 
        word.Skip(3).Take(4).All(char.IsNumber))
        .ToList();
}
Rufus L

Reading your code, it seems you wanted to split your data string, and it seems you are working with .NET. Split the string into multiple strings by replacing the initialization of string[] data with:

string[] data = "This, is a test ,* P.....".Split(' ');

An explanation with a simple example is here: https://stackoverflow.com/a/7559140/9749761

Now you have two ways to match each string. You could add a length check to your if condition, like this:

if (myString.Length == 7 && Regex.IsMatch(myString, "[A-Za-z]{3}[0-9]{4}"))

Source: https://msdn.microsoft.com/en-us/library/system.string.length(v=vs.110).aspx

Or you could change the matching pattern in two places: add a leading ^ and a trailing $ so that the pattern must match the entire string. Source: https://stackoverflow.com/a/5066560/9749761

I hope this helps you. I have not programmed in C# for a year, but I know the ropes of regex and C#. Also note that neither approach will match AAA1111 when a special character is appended or prefixed to it. This might be a case to consider.
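
As a rough sketch of the second option, the following splits a sentence on spaces and keeps only the tokens that the anchored pattern matches in full. The class name, the FindCodes helper, and the shortened sample sentence are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AnchoredTokenDemo
{
    // Hypothetical helper: returns the space-separated tokens that consist,
    // in full, of 3 letters followed by 4 digits.
    public static string[] FindCodes(string sentence)
    {
        return sentence
            .Split(' ')
            .Where(token => Regex.IsMatch(token, "^[A-Za-z]{3}[0-9]{4}$"))
            .ToArray();
    }

    public static void Main()
    {
        // Made-up sample: MOW0912 is followed by a comma, so its token is "MOW0912,".
        string sentence = "P.O ABC12 and ABCDE5134 UserID ABC1234 then MOW0912, end";

        Console.WriteLine(string.Join(", ", FindCodes(sentence))); // ABC1234
    }
}
```

The token "MOW0912," is rejected because of the trailing comma, which is the special-character case mentioned above.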

Minh Ngo

If I understand you correctly, what you want is a word boundary before and after "[A-Za-z]{3}[0-9]{4}". Wrapping it in space characters (\s) won't work if there are matches at the beginning or end of the string or if there's punctuation before or after it (you don't want to end with a . either because it matches strings like abc12345).

Try this:

@"\b[A-Za-z]{3}[0-9]{4}\b"

(The @ saves you from having to escape every \ you use in your expression)
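
For illustration, here is a small sketch that collects every word-boundary match from a sentence. The class name, the FindCodes helper, and the sample sentence are hypothetical; only the pattern comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: returns every standalone 3-letter/4-digit word.
    public static List<string> FindCodes(string input)
    {
        var codes = new List<string>();
        foreach (Match match in Regex.Matches(input, @"\b[A-Za-z]{3}[0-9]{4}\b"))
        {
            codes.Add(match.Value);
        }
        return codes;
    }

    public static void Main()
    {
        // Made-up sample covering start-of-string, punctuation, and over-long words.
        string sentence = "ABC1234, then ABCDE5134 and ABC12345; finally (qbc1239).";

        Console.WriteLine(string.Join(", ", FindCodes(sentence))); // ABC1234, qbc1239
    }
}
```

One caveat: \b treats the underscore as a word character, so a token such as _ABC1234 would not match.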

Zachary