How do I remove all non-alphanumeric words from a list of strings (List<string>)?
I found this regex: !word.match(/^[[:alpha:]]+$/)
But in C#, how can I obtain a new list that contains only the strings that are purely alphanumeric?
You can use LINQ for this. Assuming you have a theList (or array or whatever) with your strings:
var theNewList = theList.Where(item => item.All(ch => char.IsLetterOrDigit(ch)));
Add a .ToList() or .ToArray() at the end if desired. This works because the String class implements IEnumerable<char>.
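For reference, here is a minimal self-contained sketch of the LINQ approach above. The class name AlphanumericFilter and the method name KeepAlphanumeric are made up for illustration, and the sample strings are borrowed from elsewhere in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlphanumericFilter
{
    // Hypothetical helper: returns a new list with only the strings whose
    // characters are all letters or digits. The input is left unchanged.
    public static List<string> KeepAlphanumeric(IEnumerable<string> items)
    {
        return items
            .Where(item => item.All(ch => char.IsLetterOrDigit(ch)))
            .ToList();
    }

    public static void Main()
    {
        var theList = new List<string> { "aa", "a", "kzozd__", "4edz45", "5546", "4545asas" };
        var theNewList = KeepAlphanumeric(theList);

        // prints: aa, a, 4edz45, 5546, 4545asas
        Console.WriteLine(string.Join(", ", theNewList));
    }
}
```

Two things to keep in mind: char.IsLetterOrDigit is Unicode-aware, so letters outside a-z also pass, and All returns true for an empty sequence, so an empty string is kept as well.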
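// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
Regex rgx = new Regex("^[a-zA-Z0-9]*$"); // ASCII letters and digits only; * also accepts ""
List<string> list = new List<string>() { "aa", "a", "kzozd__", "4edz45", "5546", "4545asas" };
List<string> list1 = new List<string>();
foreach (var item in list)
{
    // Keep only the items made up entirely of alphanumeric characters.
    if (rgx.IsMatch(item))
        list1.Add(item);
}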
With LINQ + regex, you can use this:
list = list.Where(s => Regex.IsMatch(s, "^[\\p{L}0-9]*$")).ToList();
The pattern ^[\\p{L}0-9]*$ matches letters from any Unicode script plus the ASCII digits 0-9 (add \\p{Nd} to the character class if Unicode decimal digits should count too). If you want to use ASCII only, ^[a-zA-Z0-9]*$ will work just as well.
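To make the difference between the two patterns concrete, here is a small sketch that runs both against a word containing an accented letter. The class and constant names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Letters from any Unicode script, plus ASCII digits.
    public const string UnicodeLetters = @"^[\p{L}0-9]*$";

    // ASCII letters and digits only.
    public const string AsciiOnly = "^[a-zA-Z0-9]*$";

    public static void Main()
    {
        string word = "caf\u00e9"; // "café", written with an escape for the accented e

        Console.WriteLine(Regex.IsMatch(word, UnicodeLetters)); // True
        Console.WriteLine(Regex.IsMatch(word, AsciiOnly));      // False
    }
}
```

Which one is appropriate depends on whether words such as "café" should count as alphanumeric in your data.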
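Here's a static helper function that returns a new list containing only the alphanumeric strings from the input list:
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
public static List<string> RemoveAllNonAlphanumeric(List<string> input)
{
    var result = new List<string>();
    foreach (var currentString in input)
    {
        // + requires at least one character, so empty strings are dropped too.
        if (Regex.IsMatch(currentString, "^[a-zA-Z0-9]+$"))
        {
            result.Add(currentString);
        }
    }
    return result;
}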
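Note that this helper uses + where the earlier regex answers use *. The two differ only in how they treat empty strings, which the following sketch illustrates. The Filter method and class name are hypothetical and exist only for this comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmptyStringCheck
{
    // Hypothetical helper: keeps the strings that match the given pattern.
    public static List<string> Filter(IEnumerable<string> input, string pattern)
    {
        return input.Where(s => Regex.IsMatch(s, pattern)).ToList();
    }

    public static void Main()
    {
        var input = new List<string> { "abc", "", "a_b" };

        // * allows zero characters, so the empty string is kept.
        Console.WriteLine(Filter(input, "^[a-zA-Z0-9]*$").Count); // 2

        // + needs at least one character, so the empty string is dropped.
        Console.WriteLine(Filter(input, "^[a-zA-Z0-9]+$").Count); // 1
    }
}
```

Pick the quantifier based on whether an empty string should count as "purely alphanumeric" for your purposes.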