
I have searched the site and found many resources on loading an XML file to parse. However, after following some of the examples, I still cannot get this to work. What am I missing?

thanks

<?php
$url = 'http://api.l5srv.net/job_search/api/web/find_jobs.srv?CID=2239&SID=u9xcvY234AA09&format=XML&q=Sales&l=95054&r=25&s=relevance&a=2014-09-30&start=1&limit=8&highlight=off&userip=25.158.22.121&useragent=Mozilla%2F5.0';

$xml = simplexml_load_file($url) or die("feed not loading");

var_dump($xml);
?>
DEM

2 Answers


You're most likely confusing what you see in the browser under that URL with an XML document.

What you have at the URL

http://api.l5srv.net/job_search/api/web/find_jobs.srv?CID=2239&SID=u9xcvY234AA09&format=XML&q=Sales&l=95054&r=25&s=relevance&a=2014-09-30&start=1&limit=8&highlight=off&userip=25.158.22.121&useragent=Mozilla%2F5.0

is not an XML document. When you request that URL and look at the response headers, you can see that it is served as an HTML document:

HTTP/1.1 200 OK
Content-Type: text/html;charset=ISO-8859-1
P3P: CP="IDC CON TEL CUR DEV SAM IND"
Date: Mon, 10 Aug 2015 15:50:46 GMT
Content-Language: en-US
Connection: Keep-Alive
Set-Cookie: X-Mapping-gjinjpae=F462690912B62A0C5476B15FCDB01A81; path=/
Set-Cookie: JSESSIONID=BC6666D2A357FB969A60E05E67B0888C; Path=/
Set-Cookie: JSESSIONID=700D01B2F92909260A05215F86AB8EE5; Path=/
Content-Length: 7147

You can also verify this in your browser, either with the view-source feature or by noticing that the XML is not displayed "pretty".

However, simplexml_load_file expects a well-formed XML document. In your case, the main problem is the missing error handling. Because you are interacting with a remote system, error handling is crucial for stable use, so make it part of your script.
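As a rough illustration of what such error handling could look like, the sketch below parses a string and collects the libxml errors instead of just dying with a generic message. The helper name parse_xml_with_errors and the sample input are made up for this example; in a real script you would pass in the body fetched from the URL and also check that the fetch itself succeeded.

```php
<?php
// Hypothetical helper (not part of the original answer): parse a string as XML
// and return both the result and any libxml error messages.
function parse_xml_with_errors(string $data): array
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $xml = simplexml_load_string($data);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Leave the libxml error state as we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$xml === false ? null : $xml, $errors];
}

// Made-up input that is not well-formed (the root element is never closed).
[$xml, $errors] = parse_xml_with_errors('<jobs><job>Sales</job>');
if ($xml === null) {
    echo "feed is not well-formed XML:\n", implode("\n", $errors), "\n";
}
```

With the messages in hand it becomes obvious whether the remote side sent broken XML or something that is not XML at all.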

Beyond that, because it is an HTML document and not an XML document, you need to parse it with an HTML parser, not an XML parser. So don't start with an XML parser at that stage; use an HTML parser first.
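A minimal sketch of that two-step approach, using DOMDocument as the HTML parser: load the body as HTML, take the text content (which undoes the entity encoding), and only then hand the text to the XML parser. The helper name extract_xml_from_html and the sample body are invented for illustration, and the sketch assumes the HTML body contains nothing but the encoded XML, which may not hold for the real response.

```php
<?php
// Hypothetical helper: treat the response as HTML first, then parse the
// decoded text content as XML. Returns null if either step fails.
function extract_xml_from_html(string $html): ?SimpleXMLElement
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }

    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $text = '';
    if ($doc->loadHTML($html) && $doc->documentElement !== null) {
        // textContent returns the text with entities such as &lt; decoded.
        $text = trim($doc->documentElement->textContent);
    }

    $xml = $text !== '' ? simplexml_load_string($text) : false;

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Made-up sample standing in for the real response body.
$sample = '<html><body>&lt;response&gt;&lt;query&gt;Sales&lt;/query&gt;&lt;/response&gt;</body></html>';
$xml = extract_xml_from_html($sample);
if ($xml !== null) {
    echo $xml->query, "\n";
}
```

Letting the HTML parser do the decoding avoids hand-editing the string and leaves character-encoding handling to the parser.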

Edit:

The problem with that service only occurs when you request the XML format. If you change the format parameter to JSON (&format=JSON), you can parse the data straight away, despite the wrong Content-Type in the response:

$url = 'http://api.l5srv.net/job_search/api/web/find_jobs.srv?CID=2239&SID=u9xcvY234AA09&format=JSON&q=Sales&l=95054&r=25&s=relevance&a=2014-09-30&start=1&limit=8&highlight=off&userip=25.158.22.121&useragent=Mozilla%2F5.0';

$result = json_decode(file_get_contents($url));

print_r($result);

Gives:

Array
(
    [0] => stdClass Object
        (
            [response] => stdClass Object
                (
                    [query] => Sales
                    [location] => 95054
                    [highlight] => off
                    [totalresults] => 25406
                    [start] => 1
                    [end] => 9
                    [radius] => 25
                    [pageNumber] => 0
                    [results] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [jobtitle] => Sales
                                    [zip] => 95101
                                    [company] => Commercial Janitorial Company
                                    [city] => San Jose
                                    [state] => CA
                                    [country] => US
                                    [date] => 2015-07-14
                                    [url] => http://api.l5srv.net/job_search/api/web/get_job.srv?token=3aeyzv6V2SA%2BKYF5lzqWmxyivVwoE3LFernO291sVVpLbWCG9bBAbVO%2BCGXuN1V%2F9QMmDY3KeK5iYg2phrtjypXtQ82Jngf1q8zQIzix14EuBlSL96sqjffsuHozTZ4SJ6Mf%2B%2BVwRrC65gRtKxH6wg0F50WEZtnD9Xv0%2Bxc2GMhFMszKNEOyrfCNg5YTn%2Flj
                                    [snippet] => Company Description:

We are a Christian owned janitorial company doing business here in the Bay Area for nearly 40 years. You do not have to be Christian to work for us.
We operate in a fast paced, f
                                    [onmousedown] => l5_trk(this)
                                )

                            [1] => stdClass Object
                                (
                                    [jobtitle] => Sales
[...]
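
Going by the structure in the output above (an array whose first element holds a response object with a results list), a sketch for pulling out the job entries with basic error checks might look like this. The helper name extract_job_results is made up, and the sample JSON is a hand-written stand-in that reuses a few values from the output rather than the real response.

```php
<?php
// Hypothetical helper: decode the JSON body and return the list of job
// results, throwing if the body is not valid JSON or has another shape.
function extract_job_results(string $json): array
{
    $decoded = json_decode($json);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('invalid JSON: ' . json_last_error_msg());
    }

    // Expected shape, per the print_r() output: [0] -> response -> results.
    if (!is_array($decoded)
        || !isset($decoded[0]->response->results)
        || !is_array($decoded[0]->response->results)) {
        throw new RuntimeException('unexpected response structure');
    }

    return $decoded[0]->response->results;
}

// Hand-written stand-in for the service response.
$sample = '[{"response":{"query":"Sales","results":[{"jobtitle":"Sales",'
    . '"company":"Commercial Janitorial Company","city":"San Jose","state":"CA"}]}}]';

foreach (extract_job_results($sample) as $job) {
    printf("%s at %s (%s, %s)\n", $job->jobtitle, $job->company, $job->city, $job->state);
}
```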
hakre
  • There might be even a request header missing, looking into it. – hakre Aug 10 '15 at 16:02
  • Okay, no header problem, just most likely a very disturbed server. – hakre Aug 10 '15 at 16:04
  • @DavidMaldonado: Yup, that service is a total mess. It can't deliver XML properly, but it does deliver the JSON. I've edited the answer and given an example. – hakre Aug 10 '15 at 16:09
  • Yes, I see what you mean. That's a great tip, thanks for the help! – DEM Aug 10 '15 at 16:30

The code coming from this server is not valid XML. Try this:

<?php
    $url = 'http://api.l5srv.net/job_search/api/web/find_jobs.srv?CID=2239&SID=u9xcvY234AA09&format=XML&q=Sales&l=95054&r=25&s=relevance&a=2014-09-30&start=1&limit=8&highlight=off&userip=25.158.22.121&useragent=Mozilla%2F5.0';

    $data = file_get_contents($url);
    $data = '<' . '?xml version="1.0" encoding="UTF-8"?' . '>' . str_replace(array("&lt;", "&gt;"), array("<", ">"), $data);

    $xml = simplexml_load_string($data) or die("feed not loading");

    var_dump($xml);
  • Hi, I just put this in my script and it's still a no-go. Any thoughts? Thanks for the help. – DEM Aug 10 '15 at 15:48
  • The script above is working for me in a sandbox - can you update your post with your new code? –  Aug 10 '15 at 15:52
  • Hi, it works, I had made a typo. Thanks for the help. So the reason this happened was that the person who provided me the code does not have valid XML? – DEM Aug 10 '15 at 15:53
  • Yes, the code that comes off of the server has been HTML encoded, so the '<' and '>' characters have been encoded to '&lt;' and '&gt;' (HTML codes). They have also missed out `<?xml version="1.0" encoding="UTF-8"?>` at the beginning of the file. –  Aug 10 '15 at 15:55
  • This answer is wrong. Even if it works, the character encoding of the HTML is different. Use an HTML parser (e.g. **DOMDocument**), load the response as HTML, and extract the XML. No need to manually punch holes into that string. – hakre Aug 10 '15 at 16:00