
First of all, excuse me for not being very familiar with regex. What I would like is a regex that will extract a MySQL-style date (yyyy-mm-dd) from any type of string. Until now I was using this: ^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$

However, now I want to extract date patterns from other strings and from datetime strings. Based on some online regex testers, I tried altering the regex to ^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1]). but it fails. Also, sometimes it gave me a result with a 3-digit day.

In other words, the string starts with yyyy-mm-dd and is followed by spaces, characters, numbers, or anything else. How do I extract the date?

UPDATE

I'm testing the regex with preg_match here: http://www.pagecolumn.com/tool/pregtest.htm

So far, the only thing that seems to work is:

[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])
0x_Anakin

5 Answers


If your string contains more than one date and you want to capture all of them, use the preg_match_all function; if it contains only one, preg_match is enough. Also, using ^ and $ means the input string must be nothing but a date, so drop them.

<?php
$input_string = "yes today is 2013-10-24";
if(preg_match("/\d{4}-\d{2}-\d{2}/", $input_string, $match))
{
    print_r($match);
}
else
    echo "not matched";
////////////////////////
/* Output:
Array
(
    [0] => 2013-10-24
)
*/
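For the case with several dates in one string, a minimal sketch using preg_match_all might look like this. The input string and variable names are made up for illustration; the pattern is the same loose one used above, so it checks only the shape of the date, not its validity.

```php
<?php
// Hypothetical input containing two dates (illustrative only).
$input_string = "created 2013-10-24, updated 2013-11-02 08:15:00";

// preg_match_all() returns the number of full matches
// and stores the matched strings in $matches[0].
$count = preg_match_all('/\d{4}-\d{2}-\d{2}/', $input_string, $matches);

$dates = $matches[0];
print_r($dates);
```

Here $dates should hold both date strings in the order they appear, and $count tells you how many were found.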


revo

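Try this; you can use preg_match() or preg_match_all():

   // preg_match() returns 1 on a match and 0 otherwise; the matched text goes into $match
   $found = preg_match("/(\d{4}-\d{2}-\d{2})/", $str, $match);

Then, if a match was found, use strtotime and date to normalise it:

$date = date('Y-m-d',strtotime($match[0]));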
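Note that \d{2} accepts any two digits, so a string such as 2013-13-01 also matches the pattern. If that matters, one possible approach (an assumption on my part, not something the question asked for) is to check each candidate with PHP's checkdate(). The helper name extract_valid_date is hypothetical.

```php
<?php
// Hypothetical helper: returns the first calendar-valid yyyy-mm-dd
// date found in $str, or null if there is none.
function extract_valid_date(string $str): ?string
{
    if (preg_match_all('/(\d{4})-(\d{2})-(\d{2})/', $str, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // checkdate() takes month, day, year, in that order.
            if (checkdate((int) $m[2], (int) $m[3], (int) $m[1])) {
                return $m[0];
            }
        }
    }
    return null;
}

var_dump(extract_valid_date("log 2013-02-30 then 2013-10-24 12:00:00"));
var_dump(extract_valid_date("no date here 2013-13-01"));
```

The first call skips 2013-02-30 because February has no 30th day and returns the next candidate; the second finds no valid date at all.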
Sunil Kumar

To match dates wherever they appear, remove the $ and ^ anchors from your original regex.

To match dates at the start of any input remove the $ at the end (leave the ^).

You can also put the remaining pattern inside parentheses for convenience, so that the match is also captured as a whole.

Your suggested improvement has a spurious dot at the end which will match any character; that was the reason for returning matches with three-digit days.
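As a rough illustration of these points, the sketch below runs the question's pattern in three forms: anchored at the start only, unanchored, and with the stray trailing dot. The test strings and variable names are made up for the example.

```php
<?php
// The date pattern from the question, without anchors or delimiters.
$date_pattern = '[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])';

// Anchored at the start only: the string must begin with a date.
preg_match('/^(' . $date_pattern . ')/', "2013-10-24 12:30:00", $m1);

// No anchors: the date may appear anywhere in the string.
preg_match('/(' . $date_pattern . ')/', "yes today is 2013-10-24", $m2);

// Stray trailing dot: one extra character is swallowed after the day.
preg_match('/^' . $date_pattern . './', "2013-10-245", $m3);

echo $m1[1], "\n"; // 2013-10-24
echo $m2[1], "\n"; // 2013-10-24
echo $m3[0], "\n"; // 2013-10-245
```

Because the whole pattern is wrapped in parentheses in the first two calls, index 1 of the match array holds the complete date, while the month and day land in indexes 2 and 3.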

Jon

Just replace ^ with \b (a word boundary).

\b(\d{4}-\d{2}-\d{2})
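To see what the word boundary buys you, here is a small sketch comparing the pattern with and without \b. The trailing \b is my own addition (an assumption, not part of the pattern above); it is there to reject a date that runs straight into further digits. The sample strings are invented.

```php
<?php
$loose   = '/\d{4}-\d{2}-\d{2}/';
// Leading \b as suggested above; trailing \b added for this example.
$bounded = '/\b(\d{4}-\d{2}-\d{2})\b/';

// A date-like run embedded in other word characters.
$loose_hit   = preg_match($loose, "ref12013-10-245", $a);
$bounded_hit = preg_match($bounded, "ref12013-10-245", $b);

// A date followed by a time, separated by a space.
$clean_hit = preg_match($bounded, "due 2013-10-24 12:00", $c);

var_dump($loose_hit, $bounded_hit, $clean_hit);
```

The loose pattern picks 2013-10-24 out of the middle of the embedded run, the bounded one rejects that string entirely, and the date followed by a space still matches.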
edi_allen

The problem is the dot at the end of your regexp (it matches any character except line breaks). Try removing it:

^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])
Mizax