I'm working on a way to validate whether a given URL points to a properly formatted podcast feed.

Right now I have a two-phase approach that seems to catch most cases. The first phase just uses cURL to check for a response; the second uses DOMDocument's validateOnParse to check the formatting, i.e.

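$dom = new DOMDocument();
$dom->validateOnParse = true;
// LIBXML_NOERROR suppresses libxml error reports; load() returns false on failure
if ($dom->load($url, LIBXML_NOERROR)) {
    // the document loaded, so treat $url as a candidate feed
}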

This seems to be a bit oversensitive: it occasionally rejects poorly structured podcast feeds. It also passes regular, non-podcast RSS feeds.

Note: I'm certain the poorly structured podcast feeds are still acceptable as I've tested them by subscribing to them through a podcast app.

Obviously validateOnParse isn't designed specifically to check for podcasts, but is there another method or library that is? It seems like there is very little conformity to any sort of standards on the part of podcast makers.

-- UPDATE --

Anyone who has searched for and found this question will most likely find the duplicate an appropriate solution. However, in my case it turned out that the errors were not caused by poor formatting: some of the requests for $url were being blocked based on whatever my server was sending as its User Agent.

In simple terms the solution to this was to fake the User Agent, something like this:

$options  = array('http' => array('user_agent' => 'some user agent string'));
$context  = stream_context_create($options);

$file = file_get_contents($url,false,$context);

This seemed to solve all the false negatives, and the duplicate seems to fix the false positives.
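
For the non-podcast RSS feeds that still pass, here is a minimal sketch of one possible heuristic. It is not necessarily what the duplicate suggests. It assumes a podcast feed is an RSS document in which at least one item carries an enclosure element with a url attribute. The function name looksLikePodcastFeed is hypothetical, and it takes the XML as a string, such as the $file fetched above.

```php
<?php
/**
 * Hypothetical helper (not from any library): returns true when $xml is
 * well-formed RSS and at least one <item> has an <enclosure url="...">.
 * The "has an enclosure" rule is an assumption about what makes a feed
 * a podcast, not an official definition.
 */
function looksLikePodcastFeed(string $xml): bool
{
    if ($xml === '') {
        return false; // loadXML() does not accept empty input
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return false; // not well-formed XML
    }

    // Look for enclosures on items of an RSS channel.
    $xpath = new DOMXPath($dom);
    $enclosures = $xpath->query('/rss/channel/item/enclosure[@url]');

    return $enclosures !== false && $enclosures->length > 0;
}
```

Calling looksLikePodcastFeed($file) after the file_get_contents() call keeps the fetch (with the faked User Agent) separate from the format check. Because it only tests well-formedness and the presence of enclosures, it should be more forgiving of sloppy feeds than full validation, though it will miss Atom-based feeds.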

  • If the suggested duplicate wasn't helpful to you, please leave a notice. Also please provide two or three sample podcasts: one non-podcast, one invalid but working and one valid one for example. – hakre Mar 11 '15 at 17:12

1 Answer

Use the '@' sign like this:

@$dom->validateOnParse = true;

Because you won't always be dealing with valid documents, the '@' sign will suppress any warnings that might occur.

yassine2020
  • `$dom->validateOnParse = true;` will never emit a warning (if `$dom` is properly initialized) so placing the error control operator in front of it is misleading at best. – hakre Mar 10 '15 at 19:14
  • Yeah you're right, sorry my bad ^^ – yassine2020 Mar 11 '15 at 16:52