My intention is to fetch the link with PHP and parse the content, perhaps with PHP Simple HTML DOM Parser (or something similar), looking for H1-H6 tags. Before that, though, I need to find out whether the page is allowed to be indexed at all.
Other than parsing the content and searching for <meta name="robots" content="noindex">
or similar, is there a way to check whether the page is also excluded via robots.txt?
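
To make the two checks concrete, here is a rough sketch of what I have in mind. The function names hasNoindexMeta() and isDisallowedByRobots() are hypothetical helpers of my own, not part of any library, and both work on strings that have already been fetched. The robots.txt part is deliberately simplified: it only does prefix matching on Disallow lines, treats a "*" group and a named group alike, and ignores Allow rules and wildcards. As far as I understand, robots.txt governs crawling rather than indexing, so a Disallow match is only an approximation of "noindex" (and a noindex directive can also arrive in an X-Robots-Tag response header, which this sketch does not look at).

```php
<?php
// Hypothetical helpers for illustration only; not taken from any library.

/** True if the HTML has a <meta name="robots"> whose content includes "noindex". */
function hasNoindexMeta(string $html): bool
{
    if (trim($html) === '') {
        return false; // DOMDocument::loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    // Collect warnings from malformed real-world markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        $content = strtolower($meta->getAttribute('content'));
        if ($name === 'robots' && strpos($content, 'noindex') !== false) {
            return true;
        }
    }
    return false;
}

/**
 * Simplified check: true if $path starts with a Disallow value in a group
 * that names $userAgent or "*". Ignores Allow, wildcards and precedence rules.
 */
function isDisallowedByRobots(string $robotsTxt, string $path, string $userAgent = '*'): bool
{
    $applies = false;    // does the current group apply to our user agent?
    $inAgentRun = false; // are we inside a run of consecutive User-agent lines?

    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            if (!$inAgentRun) {
                $applies = false; // a new group starts here
            }
            $inAgentRun = true;
            if ($value === '*' || strcasecmp($value, $userAgent) === 0) {
                $applies = true;
            }
        } else {
            $inAgentRun = false;
            if ($field === 'disallow' && $applies && $value !== ''
                && strpos($path, $value) === 0) {
                return true;
            }
        }
    }
    return false;
}
```

The idea would be to fetch /robots.txt from the site root, pass its body and the path of the page to isDisallowedByRobots(), and only go on to parse the headings if both checks come back false.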