I have to check whether a page has a robots noindex meta tag in its source code, and I want to catch as many different HTML syntax variants as possible.
First I tried the get_meta_tags() function, but it has some limitations, so I decided to stick with preg_match().
I tried this regular expression:
"/<meta\s+name\s*=\s*[\"'](.*?)[\"']\s*content\s*=\s*[\"'].*?noindex.*?[\"']\s*\/?>/i"
However, it fails when the noindex meta tag has the content attribute first, like this:
<meta content="follow, index" name="robots" />
Can anyone share a more appropriate regular expression to achieve my goal?
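For illustration, one order-independent approach might look like the sketch below. It is only a rough idea, not a definitive solution: the function name has_robots_noindex() is hypothetical, and it assumes that no attribute value contains a ">" character and that only name="robots" matters (not bot-specific names). It first collects every meta tag, then tests the name and content attributes separately, so their order inside the tag no longer matters.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Returns true if any <meta name="robots"> tag in $html has "noindex"
// in its content attribute, regardless of attribute order.
// Assumption: attribute values do not contain a literal ">".
function has_robots_noindex(string $html): bool
{
    // Step 1: collect every <meta ...> tag as a whole.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return false;
    }

    foreach ($tags[0] as $tag) {
        // Step 2: is this the robots tag? Accepts double, single or no quotes.
        if (preg_match('/\sname\s*=\s*(["\']?)robots\1(?=[\s\/>])/i', $tag) !== 1) {
            continue;
        }

        // Step 3: extract the content value (quoted or unquoted).
        if (preg_match('/\scontent\s*=\s*(?:(["\'])(.*?)\1|([^\s"\'>]+))/is', $tag, $m) !== 1) {
            continue;
        }
        $value = ($m[2] ?? '') !== '' ? $m[2] : ($m[3] ?? '');

        // Step 4: look for "noindex" as a whole word, so "index" alone does not match.
        if (preg_match('/\bnoindex\b/i', $value) === 1) {
            return true;
        }
    }

    return false;
}
```

Called as has_robots_noindex($html), it should return true for both name-first and content-first variants and false for a tag whose content is "follow, index". Splitting the check into separate steps keeps each pattern short; a real HTML parser such as DOMDocument would likely be more robust against unusual markup.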