I'm working on a C# project and I need to parse and extract some dates from some strings. These are my strings:

dalle ore 19.30 del 04.02.2016 alle ore 19.30 del 06.02.2016
dalle ore 19.30 del 06.02.2016 alle ore 19.30 del 08.02.2016
...

For each one I'd like to extract the two dates (e.g. 04.02.2016 and 06.02.2016) and save them in two variables. Next I'll parse them to create two DateTime objects. Right now I'm using this code:

 public static string isdate(string input)
 {
  Regex rgx = new Regex(@"\d{2}.\d{2}.\d{4}");
  Match mat = rgx.Match(input);
  if(mat.Success)
  return mat.ToString();
  else return null;
 }

With this code I can extract the first date but not the second one. How can I improve my regular expression? Thanks!

Try the code below:

        // Requires: using System; using System.Globalization; using System.Text.RegularExpressions;
        static void Main(string[] args)
        {
            string[] inputs = {
                 "dalle ore 19.30 del 04.02.2016 alle ore 19.30 del 06.02.2016", 
                 "dalle ore 19.30 del 06.02.2016 alle ore 19.30 del 08.02.2016"
                             };

            // The dots are escaped so that they match only a literal '.'.
            string pattern = @"(?'hour'\d\d)\.(?'minute'\d\d)\sdel\s(?'day'\d\d\.\d\d\.\d\d\d\d)";

            foreach (string input in inputs)
            {
                // Matches returns every occurrence, not just the first one.
                MatchCollection matches = Regex.Matches(input, pattern);
                foreach (Match match in matches)
                {
                    TimeSpan time = new TimeSpan(int.Parse(match.Groups["hour"].Value), int.Parse(match.Groups["minute"].Value), 0);
                    // The input dates are day-first (dd.MM.yyyy), e.g. 04.02.2016 is 4 February 2016.
                    DateTime date = DateTime.ParseExact(match.Groups["day"].Value, "dd.MM.yyyy", CultureInfo.InvariantCulture);

                    Console.WriteLine("Time : {0}", date.Add(time));
                }
            }
            Console.ReadLine();
        }

DaX
  • If the date format is **really** fixed (DD.MM.YYYY), you can scan for it using `Regex rgx = new Regex(@"\d{1,2}\.\d{1,2}\.\d{4}");`. Note that you have to escape the dot as `\.`; otherwise it is a plain `.`, which matches any character except line breaks (http://regexr.com/3d3fv). – Maximilian Gerhardt Mar 26 '16 at 13:03
  • Be careful: if it's locale-dependent, then that's the regex thrown out the window! It could be MM/DD/YYYY, YYYY-MM-DD, or even MM.DD.YYYY. You're setting yourself up for headaches. Do not rely on it unless you are 100% sure and confident that it will always be that fixed date format. – t0mm13b Mar 26 '16 at 13:07
  • Do you mean you need a code to get multiple matches? Your regex [matches all the dates](https://regex101.com/r/qB6sT9/1). – Wiktor Stribiżew Mar 26 '16 at 13:10
  • I edited the posting because somebody incorrectly marked this posting as a duplicate without testing the solution. This issue is that simple because the parsing for TimeSpan and DateTime doesn't like the format "hh.mm". It thinks the period is the minute/second separator. – jdweng Mar 26 '16 at 14:03
  • OK, the solution by jdweng is good, but the problem is that there can be several spaces and other characters between HH.mm and the date. Most of the time the text has the form HH:mm del dd.MM.YYYY, but sometimes it has the form dd.MM.YYYY del     dd.MM.YYYY. Do you think it is still possible to parse all the data with one regex, or do I have to tokenize the string? Thank you so much! – DaX Mar 26 '16 at 19:20
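
Regarding the last comment, a single pattern can probably cover both forms. The following is only a minimal sketch, assuming the date is always dd.MM.yyyy, the time (when present) is H.mm or H:mm, and no other digits appear between a time and its date. The names `FlexibleDateParser` and `ParseAll` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FlexibleDateParser
{
    // Optional time (hour, '.' or ':', minute), then any run of non-digits,
    // then a mandatory dd.MM.yyyy date. The lookahead (?![.\d]) stops the
    // first two fields of a date from being mistaken for a time.
    private static readonly Regex Pattern = new Regex(
        @"(?:(?<hour>\d{1,2})[.:](?<minute>\d{2})(?![.\d])\D*?)?(?<date>\d{2}\.\d{2}\.\d{4})");

    // Returns one DateTime per date found; the time is midnight when none was given.
    public static List<DateTime> ParseAll(string input)
    {
        var result = new List<DateTime>();
        foreach (Match m in Pattern.Matches(input))
        {
            DateTime date;
            // Skip digit groups that look like a date but are not one (e.g. 99.99.2016).
            if (!DateTime.TryParseExact(m.Groups["date"].Value, "dd.MM.yyyy",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
            {
                continue;
            }
            if (m.Groups["hour"].Success)
            {
                int hour = int.Parse(m.Groups["hour"].Value, CultureInfo.InvariantCulture);
                int minute = int.Parse(m.Groups["minute"].Value, CultureInfo.InvariantCulture);
                date = date.Add(new TimeSpan(hour, minute, 0));
            }
            result.Add(date);
        }
        return result;
    }

    public static void Main()
    {
        string[] inputs =
        {
            "dalle ore 19.30 del 04.02.2016 alle ore 19.30 del 06.02.2016",
            "04.02.2016 del     06.02.2016"
        };
        foreach (string input in inputs)
        {
            foreach (DateTime dt in ParseAll(input))
            {
                Console.WriteLine(dt.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture));
            }
        }
    }
}
```

The sketch does not range-check the hour and minute, so a value such as 25.99 would simply be added as a time span. If the inputs can contain that, a check before `date.Add` would be needed.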

1 Answer

Your regular expression is fine, but you only retrieve the first match. To get all matches, use Matches instead of Match:

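// Dots are escaped so that they match only a literal '.'.
private static readonly Regex dateRegex = new Regex(@"\d{2}\.\d{2}\.\d{4}");

// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
public static IEnumerable<string> ExtractDates(string input)
{
     return from m in dateRegex.Matches(input).Cast<Match>()
            select m.Value;
}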

Notes:

  • Since Regex objects are thread-safe and immutable, you don't need to rebuild it every time. You can store it safely in a static variable.

  • Since the Matches method predates .NET generics, we need the Cast<Match>() call to cast the resulting collection to an IEnumerable<Match>, so that we can use LINQ.
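
To get from the matched strings to the two DateTime variables the question asks for, something along these lines might work. It is a sketch that assumes every input contains exactly two day-first dates (dd.MM.yyyy); `DateRangeParser` and `ParseDates` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateRangeParser
{
    // Built once and reused; the dots are escaped so they match only a literal '.'.
    private static readonly Regex DateRegex = new Regex(@"\d{2}\.\d{2}\.\d{4}");

    // Returns every valid dd.MM.yyyy date found in the input, in order of appearance.
    public static List<DateTime> ParseDates(string input)
    {
        var result = new List<DateTime>();
        foreach (Match m in DateRegex.Matches(input))
        {
            DateTime parsed;
            // TryParseExact rejects digit groups that are not real dates.
            if (DateTime.TryParseExact(m.Value, "dd.MM.yyyy", CultureInfo.InvariantCulture,
                                       DateTimeStyles.None, out parsed))
            {
                result.Add(parsed);
            }
        }
        return result;
    }

    public static void Main()
    {
        string input = "dalle ore 19.30 del 04.02.2016 alle ore 19.30 del 06.02.2016";
        List<DateTime> dates = ParseDates(input);

        // Assumption: the string holds exactly two dates.
        DateTime startDate = dates[0];
        DateTime endDate = dates[1];

        Console.WriteLine(startDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2016-02-04
        Console.WriteLine(endDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));   // 2016-02-06
    }
}
```

The times (19.30) are ignored here because they do not match the three-part date pattern.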

Heinzi