I have a string in which I want to check for the existence of 2 different types of item. These two types are not mutually exclusive, so I want to also avoid overlap. They can also occur in any order. There are also words within the string that should be ignored, even though they fit the regex pattern.
- There must be one alpha-only item, one or more characters: [a-zA-Z]+
- Another item needs to be alphanumeric, also one or more characters: [a-zA-Z0-9]+
- A single item cannot count toward both criteria: the item that satisfies the alphanumeric check must be a different item from the one that satisfies the alpha-only check.
- The items in an exclusion list should be ignored.
I tried following the post "Regex: I want this AND that AND that... in any order", but I still can't figure out how to exclude the words I need, nor how to adapt that answer so that one word can't satisfy both the alphanumeric and the alpha-only criteria.
This is what I'm currently doing. It seems to work, but it is not very concise. If possible, I'd like to learn how to fold this into a single regex check. Apart from conciseness, I feel that a regex will be safer down the road in case I end up needing to add more conditions.
using System;
using System.Linq;
using System.Text.RegularExpressions;

bool bHasAlpha = false;
bool bHasAlphaNum = false;
string Test = "123 ABC SomeWord A12"; // The string to check against.
string[] RemoveWords = { "ABC", "DEF" }; // I don't want these matches to count, if found.

// Split my string into "tokens" and check each individually, ignoring the RemoveWords (case-insensitively).
string[] TestTokens = Test.Split(' ')
    .Where(w => !RemoveWords.Contains(w, StringComparer.OrdinalIgnoreCase))
    .ToArray();

foreach (string s in TestTokens)
{
    // Is this item alpha-only? (Checking this before alphanumeric.)
    if (!bHasAlpha && Regex.IsMatch(s, @"^[a-zA-Z]+$"))
        bHasAlpha = true;
    // Is this item alphanumeric? Only reached if the token was not used as the alpha-only item.
    else if (!bHasAlphaNum && Regex.IsMatch(s, @"^[a-zA-Z0-9]+$"))
        bHasAlphaNum = true;
}

if (bHasAlpha && bHasAlphaNum)
    Console.WriteLine("String Passes!");
In the test code above, the string would pass because "123" is caught by the alphanumeric check and "SomeWord" is caught by the alpha-only check. "ABC" is not counted because I purposely ignore it.
Examples of strings that should fail:
- "123 abc 456" (abc ignored, no valid alpha-only item)
- "X" (X can satisfy either alpha or alphanumeric, not both)
- "ABC DEF 123 456" (ABC and DEF ignored, no valid alpha-only item)
The following should pass:
- "ABCDEF 123" (ABCDEF as a whole word is not considered the same as ABC and DEF separately)
- "X X" (2 "words", neither of which is in the excluded list. One satisfies the alphanumeric criterion, one the alpha-only.)
- "ABC XYZ ABC DEF A1B2 ABC" (XYZ is alpha, A1B2 is alphanumeric)
- "123 XYZ" (order of the 2 items does not matter. Alpha-only can be 2nd)
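
To make the cases above easy to re-run against any candidate regex, here is a rough harness around my current token loop. It is only a sketch: the helper name PassesCheck is made up for this example, the excluded words are the same two from my snippet, and it assumes C# 9 or later so that top-level statements compile.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Same exclusion list as in the snippet above.
string[] RemoveWords = { "ABC", "DEF" };

// Hypothetical helper (the name is an assumption for this sketch):
// wraps the existing token loop so each example string can be checked.
bool PassesCheck(string test)
{
    bool bHasAlpha = false;
    bool bHasAlphaNum = false;

    // Split on spaces and drop excluded words, ignoring case.
    var tokens = test.Split(' ')
        .Where(w => !RemoveWords.Contains(w, StringComparer.OrdinalIgnoreCase));

    foreach (string s in tokens)
    {
        // The first alpha-only token fills the alpha-only slot...
        if (!bHasAlpha && Regex.IsMatch(s, @"^[a-zA-Z]+$"))
            bHasAlpha = true;
        // ...and any other valid token may fill the alphanumeric slot.
        else if (!bHasAlphaNum && Regex.IsMatch(s, @"^[a-zA-Z0-9]+$"))
            bHasAlphaNum = true;
    }
    return bHasAlpha && bHasAlphaNum;
}

string[] shouldFail = { "123 abc 456", "X", "ABC DEF 123 456" };
string[] shouldPass = { "ABCDEF 123", "X X", "ABC XYZ ABC DEF A1B2 ABC", "123 XYZ" };

foreach (string s in shouldFail)
    Console.WriteLine($"\"{s}\" -> {PassesCheck(s)} (expected False)");
foreach (string s in shouldPass)
    Console.WriteLine($"\"{s}\" -> {PassesCheck(s)} (expected True)");
```

Swapping the body of PassesCheck for a single Regex.IsMatch call would let the same seven strings confirm whether a one-regex version behaves the same way.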