I'm currently working on a way to validate whether a given URL corresponds to a properly formatted podcast feed.
Right now I have a two-phase approach that seems to catch most cases. The first phase just uses cURL to check for a response; the second uses DOMDocument's validateOnParse to check the formatting, i.e.
$dom = new DOMDocument();
$dom->validateOnParse = true;
if ($dom->load($url, LIBXML_NOERROR)) {
    // the document parsed, so $url is treated as a valid feed
}
This seems to be a bit oversensitive: it occasionally rejects poorly structured podcast feeds. It also passes regular, non-podcast RSS feeds.
Note: I'm certain the poorly structured podcast feeds are still acceptable as I've tested them by subscribing to them through a podcast app.
Obviously validateOnParse isn't designed specifically to check for podcasts, but is there another method or library that is? It seems like there is very little conformity to any sort of standards on the part of podcast makers.
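For what it's worth, one heuristic for separating podcast feeds from plain RSS feeds is to look for at least one item carrying an enclosure with an audio or video MIME type. The sketch below is only that, a heuristic under my own assumptions rather than an established validator: `looksLikePodcast()` is a hypothetical helper name, it only handles RSS (not Atom), and it assumes feeds declare a `type` attribute on their enclosures, which sloppy feeds may not.

```php
<?php
// Heuristic sketch: treat a feed as a podcast if it is well-formed RSS and at
// least one <item> has an <enclosure> with an audio/* or video/* MIME type.
// looksLikePodcast() is a hypothetical helper, not part of any library.
function looksLikePodcast(string $xml): bool
{
    if (trim($xml) === '') {
        return false; // nothing to parse
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // LIBXML_NONET stops the parser from fetching external DTDs or entities.
    $loaded = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return false; // not well-formed XML at all
    }

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('/rss/channel/item/enclosure') as $enclosure) {
        $type = strtolower($enclosure->getAttribute('type'));
        if (strpos($type, 'audio/') === 0 || strpos($type, 'video/') === 0) {
            return true;
        }
    }

    return false;
}
```

Because this only checks well-formedness plus the enclosure, it should be more forgiving than validateOnParse while still rejecting ordinary blog-style RSS feeds.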
-- UPDATE --
Anyone who has searched and found this question will most likely find the duplicate to be an appropriate solution. However, in my case it turned out that the errors were not caused by poor formatting: some of the requests for $url were being blocked based on the User-Agent my server was sending.
In simple terms the solution to this was to fake the User Agent, something like this:
$options = array('http' => array('user_agent' => 'some user agent string'));
$context = stream_context_create($options);
$file = file_get_contents($url,false,$context);
This seemed to solve all the false negatives, and the duplicate seems to fix the false positives.
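A slightly more defensive version of the fetch might look like the sketch below. `fetchFeed()` is a hypothetical helper name, the user agent string is still a placeholder, and the 10 second timeout is an assumed value; the only real additions are handling the `false` that file_get_contents returns on failure and setting a timeout.

```php
<?php
// fetchFeed() is a hypothetical helper; the default user agent is a placeholder.
// Returns the response body as a string, or null if the request failed.
function fetchFeed(string $url, string $userAgent = 'some user agent string'): ?string
{
    $context = stream_context_create([
        'http' => [
            'user_agent'      => $userAgent,
            'timeout'         => 10, // seconds; an assumed value
            'follow_location' => 1,  // follow redirects (this is the default)
        ],
    ]);

    // @ suppresses the warning on failure; the false return is handled below.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}
```

A null return then means the request itself failed (blocked, timed out, or unreachable), which keeps that case separate from a feed that was fetched but is badly formatted.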