I have written a simple PHP script that parses an HTML document and returns its meta tags using getElementsByTagName and getAttribute. It works perfectly apart from one thing: if the HTML tag is not in lower case, the content of the tag is not returned. For example:
<title>My Title</title>
will return "My Title", but
<Title>My Title</Title>
or
<TITLE>My Title</TITLE>
will return nothing. Is there an easy way to match the tag regardless of case? I'm guessing that it might involve regex.
Sample of code below:
// $doc is a DOMDocument that has already been loaded with the page.
$nodes = $doc->getElementsByTagName('title');
$heading = $doc->getElementsByTagName('h1');
$title = $nodes->item(0)->nodeValue;
$h1 = $heading->item(0)->nodeValue;

// Pick out the meta tags of interest by their name attribute.
$metas = $doc->getElementsByTagName('meta');
for ($i = 0; $i < $metas->length; $i++) {
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description') {
        $description = $meta->getAttribute('content');
    }
    if ($meta->getAttribute('name') == 'keywords') {
        $keywords = $meta->getAttribute('content');
    }
    if ($meta->getAttribute('name') == 'robots') {
        $robots = $meta->getAttribute('content');
    }
}
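
For comparison, here is a minimal sketch of a case-insensitive lookup that avoids regex by using DOMXPath with the XPath 1.0 translate() function. It assumes the document was parsed as XML (for example with loadXML), where element names keep their original case; as far as I know, loadHTML already lowercases element names, so this problem may not appear there. The helper name firstTextByTagNameCI and the sample markup are hypothetical and not part of the script above.

```php
<?php
// Hypothetical helper (not from the original script): returns the text of the
// first element whose name matches $name, ignoring ASCII case, or null if none.
function firstTextByTagNameCI(DOMDocument $doc, string $name): ?string
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $xpath = new DOMXPath($doc);

    // XPath 1.0 has no lower-case(), so translate() folds the element name.
    // $name is expected to be a plain tag name (no quotes or special characters).
    $query = sprintf(
        "//*[translate(local-name(), '%s', '%s') = '%s']",
        $upper,
        $lower,
        strtolower($name)
    );

    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

// Assumed sample input, parsed as XML so the original casing is preserved.
$doc = new DOMDocument();
$doc->loadXML('<HTML><HEAD><TITLE>My Title</TITLE></HEAD><BODY><H1>Heading</H1></BODY></HTML>');

echo firstTextByTagNameCI($doc, 'title'), "\n"; // prints "My Title"
echo firstTextByTagNameCI($doc, 'h1'), "\n";    // prints "Heading"
```

The same translate() trick could be applied to attribute values such as name="Description" if the meta tags vary in case as well.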