I need to parse a comma-separated list of numbers and number ranges. The strings are entered into a UI by the user and will look something like one of these (six different sample inputs):
1-3, 5, 7-10
1
21.1
1.2-3,5.1,7-10.1
1-3, 5.1, 7-10, 21
1.1-3.1,5.1,7.1-10.1
My end goal is to have a collection of numbers and number ranges that I can process later downstream. For example, after parsing the first sample string above, my end result would be a collection that contains 3 elements: 1-3, 5 and 7-10.
Using C# and a .NET Regex, this pattern nicely fills the Matches collection with just the items I need (note the use of non-capturing groups):
(\d+(?:\.\d+)?-\d+(?:\.\d+)?)|(\d+(?:\.\d+)?)
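To make the behaviour concrete, here is a minimal sketch of running that pattern over the first sample. The class name `PatternDemo` and the helper `Tokens` are made up for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // The pattern from the question: try a range first, otherwise a single number.
    public const string Pattern = @"(\d+(?:\.\d+)?-\d+(?:\.\d+)?)|(\d+(?:\.\d+)?)";

    // Returns the text of every match, in order of appearance.
    public static string[] Tokens(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Prints: 1-3 | 5 | 7-10
        Console.WriteLine(string.Join(" | ", Tokens("1-3, 5, 7-10")));
    }
}
```

The commas and spaces are simply skipped because nothing in the pattern matches them, which is also why stray letters are skipped rather than rejected.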
I have two questions though:
Do I need all of that in my pattern, or is a shorter pattern possible?
Is there something I can add to the pattern to return 0 matches when the string contains invalid characters? For example, if I include an alpha character anywhere in the string, I would want no matches to occur. Right now I do this in two passes: one to validate that the string only has valid characters [\d,.\- ], and another to get the matches, assuming it validated in the first pass.
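One possible way to approach both questions at once, sketched here under some assumptions: anchor a single pattern to the whole string and rely on .NET keeping every capture of a repeated group. The names `SinglePassDemo`, `Items` and the group name `item` are hypothetical, and this sketch only tolerates whitespace around commas and at the ends, which is stricter than stripping every space.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SinglePassDemo
{
    // A number with an optional fractional part.
    private const string Number = @"\d+(?:\.\d+)?";

    // A number optionally followed by "-number": a shorter form of the
    // range-or-number alternation.
    private const string Item = Number + "(?:-" + Number + ")?";

    // The whole string must be item (, item)* or the match fails outright.
    private static readonly Regex WholeList = new Regex(
        @"^\s*(?<item>" + Item + @")(?:\s*,\s*(?<item>" + Item + @"))*\s*$");

    // Returns every item, or an empty array if anything in the input is invalid.
    public static string[] Items(string input)
    {
        Match m = WholeList.Match(input);
        if (!m.Success)
            return new string[0];

        // Both groups share the name "item", so their captures accumulate in order.
        return m.Groups["item"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToArray();
    }

    public static void Main()
    {
        // Prints: 1-3 | 5.1 | 7-10 | 21
        Console.WriteLine(string.Join(" | ", Items("1-3, 5.1, 7-10, 21")));

        // Prints: 0 (the stray 'a' invalidates the whole string)
        Console.WriteLine(Items("1-3, 5a, 7-10").Length);
    }
}
```

The trade-off is that the pattern becomes longer and harder to read than a separate validation pass, so two passes may still be the more maintainable choice.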
Thanks in advance for your ideas.
Update:
Here's the solution I ended up going with (see @Xiaoy312's answer):
    using System.Collections.Generic;
    using System.Globalization;
    using System.Linq;
    using System.Text.RegularExpressions;

    public static IEnumerable<DataRange> ParseInput(string input)
    {
        // Spaces carry no meaning here, so strip them once up front.
        string text = input.Replace(" ", string.Empty);

        // Pass 1: reject the whole string if it contains anything other than digits, '.', ',' or '-'.
        if (!Regex.IsMatch(text, @"^[\d\.,\-]+$"))
            return Enumerable.Empty<DataRange>();

        // Pass 2: A is the number (or range start); B is the optional range end.
        return Regex.Matches(text, @"(?<A>\d+(?:\.\d+)?)(?:-(?<B>\d+(?:\.\d+)?))?")
            .Cast<Match>()
            .Select(m => new DataRange
            {
                A = double.Parse(m.Groups["A"].Value, CultureInfo.InvariantCulture),
                B = m.Groups["B"].Success
                    ? double.Parse(m.Groups["B"].Value, CultureInfo.InvariantCulture)
                    : (double?)null
            });
    }

    public class DataRange
    {
        public double A;
        public double? B;
    }
Here's sample usage:
    static void Main(string[] args)
    {
        Console.WriteLine("A\tB");
        var items = ParseInput("1");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("21.1");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("1-3,5,7-10");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("1.2-3,5.1,7-10.1");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("1-3, 5.1, 7-10,21");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("1.1-3.1,5.1,7.1-10.1");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
        items = ParseInput("1.1-3.1,5.1,7.1-10.1a");
        Array.ForEach(items.ToArray(), i => Console.WriteLine("{0}\t{1}", i.A, i.B));
    }
Sample output:
    A       B
    1
    21.1
    1       3
    5
    7       10
    1.2     3
    5.1
    7       10.1
    1       3
    5.1
    7       10
    21
    1.1     3.1
    5.1
    7.1     10.1
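One caveat worth checking, as far as I can tell: the first pass only whitelists characters, so a malformed string such as "1-2-3" still gets through and is split into a range and a lone number. The sketch below reuses the two patterns from ParseInput; the class and method names are invented for the demonstration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WhitelistCaveatDemo
{
    // The same two patterns used by ParseInput.
    private const string AllowedChars = @"^[\d\.,\-]+$";
    private const string ItemPattern = @"(?<A>\d+(?:\.\d+)?)(?:-(?<B>\d+(?:\.\d+)?))?";

    // True when the string (ignoring spaces) contains only whitelisted characters.
    public static bool PassesWhitelist(string input)
    {
        return Regex.IsMatch(input.Replace(" ", string.Empty), AllowedChars);
    }

    // The raw text of each item the second pass would extract.
    public static string[] Tokens(string input)
    {
        return Regex.Matches(input.Replace(" ", string.Empty), ItemPattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(PassesWhitelist("1-2-3"));            // True
        Console.WriteLine(string.Join(" | ", Tokens("1-2-3"))); // 1-2 | 3
    }
}
```

If inputs like that need to be rejected, the validation pass would have to check the structure of the list rather than just its characters.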