-3

I have an external HTML file that I read into a PHP variable. The file contains several dates and times. What would be the easiest way to read them into a variable? The date-time is always in the following format: `<td align="left" valign="middle" class="contentValueFont">01-Mar-14 19:24:45 GMT</td>`
Of course, the date and time values change.
Elsewhere I searched for the class name and read everything that followed into a variable until the closing `</td>`, but this time the same class name is also used in other places that are not associated with my date/time.
I am not good at regular expressions.

Thanks for help. T.

ykstom
  • 1
    Not good at regular expressions? Well, you should use them, so better make that effort. Nothing is easy at programming. –  Mar 03 '14 at 15:44

4 Answers

3

A regular expression is not the right tool for the job. Use an HTML parser to extract the date, and the DateTime class to process it:

$html = <<<HTML
<td align="left" valign="middle" class="contentValueFont">01-Mar-14 19:24:45 GMT</td>
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html);
$date = $dom->getElementsByTagName('td')->item(0)->nodeValue;

$dateObj = new DateTime($date);
echo $dateObj->format('Y-m-d H:i:s');

Output:

2014-03-01 19:24:45
Amal Murali
  • Isn't a regex a better option (faster and more memory-efficient)? Especially since the dates follow a common format. – Itay Grudev Mar 03 '14 at 15:47
  • 2
    @ItayGrudev: Sure, the dates *can* be extracted using a regex, but it's just better to use an HTML parser instead, so you can be sure that it works *every* time. – Amal Murali Mar 03 '14 at 15:49
  • @ItayGrudev: See [Can you provide some examples of why it is hard to parse XML and HTML with a regex?](http://stackoverflow.com/q/701166/1438393). – Amal Murali Mar 03 '14 at 15:50
  • They stated that the classname is not unique, so how do you propose finding the correct `td` for the date? – AbraCadaver Mar 03 '14 at 16:54
  • @AmalMurali I totally agree. Just the statement "you can be sure that it works every time" is enough. – Itay Grudev Mar 03 '14 at 19:03
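
One possible way to address the non-unique class name raised in the comments is to select every `td` with that class and keep only the cells whose text looks like a date. This is a minimal sketch, not part of the original answer: the sample HTML, the `$pattern` variable, and the assumption that every date cell contains nothing but the date are all hypothetical.

```php
<?php
// Hypothetical sample: the same class appears on a date cell and a non-date cell.
$html = <<<HTML
<table><tr>
<td align="left" valign="middle" class="contentValueFont">Some other value</td>
<td align="left" valign="middle" class="contentValueFont">01-Mar-14 19:24:45 GMT</td>
</tr></table>
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Assumed layout, taken from the question: dd-Mon-yy hh:mm:ss TZ
$pattern = '/^\d{2}-[A-Za-z]{3}-\d{2} \d{2}:\d{2}:\d{2} [A-Z]{3}$/';

$dates = [];
foreach ($xpath->query('//td[@class="contentValueFont"]') as $td) {
    $text = trim($td->nodeValue);
    // Keep only cells whose whole text matches the date layout.
    if (preg_match($pattern, $text)) {
        $dates[] = $text;
    }
}

print_r($dates);
```

The parser handles the markup, and the regex is only applied to the already-extracted cell text, so it does not need to know anything about HTML.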
0

This might work:

preg_match('/\d\d-[A-Z]{3}-\d\d \d\d:\d\d:\d\d [A-Z]{3}/i', $html, $match);
print_r($match);

Based on your comment, this will get all of the dates of that format:

preg_match_all('/\d\d-[A-Z]{3}-\d\d \d\d:\d\d:\d\d [A-Z]{3}/i', $html, $matches);
print_r($matches[0]); 
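
If the matched strings then need to become date objects, something like the following might work. It is a sketch under assumptions: the sample `$html` is made up, and the format string `'d-M-y H:i:s T'` is inferred from the single example in the question.

```php
<?php
// Hypothetical input containing two dates in the format from the question.
$html = '<td class="contentValueFont">01-Mar-14 19:24:45 GMT</td>'
      . '<td class="contentValueFont">n/a</td>'
      . '<td class="contentValueFont">02-Mar-14 08:05:00 GMT</td>';

preg_match_all('/\d\d-[A-Z]{3}-\d\d \d\d:\d\d:\d\d [A-Z]{3}/i', $html, $matches);

$parsed = [];
foreach ($matches[0] as $raw) {
    // 'd-M-y H:i:s T' mirrors the layout shown in the question (assumption).
    $dt = DateTime::createFromFormat('d-M-y H:i:s T', $raw);
    if ($dt !== false) {
        // Skip anything that matched the regex but is not a valid date.
        $parsed[] = $dt->format('Y-m-d H:i:s');
    }
}

print_r($parsed);
```

Using `DateTime::createFromFormat` rather than the plain constructor makes the expected layout explicit, so a string that does not fit it is rejected instead of being guessed at.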
AbraCadaver
-1

You will need a small amount of regular expressions, but PHP provides this function for exactly your needs.

codehitman
  • 2
    The question is about finding the date string within a HTML document string, not about converting that string into a timestamp. – Patrick Q Mar 03 '14 at 15:50
-2
<?php

$unix = strtotime('01-Mar-14 19:24:45 GMT'); // parse the string into a Unix timestamp
echo date('d-M-y H:i:s', $unix); // format the timestamp; date() uses the default timezone
// output when the default timezone is UTC - 01-Mar-14 19:24:45

Reference for all the characters that can be used in date() for formatting: http://in1.php.net/manual/en/function.date.php

Akshay Kalose