
I want to get data from this URL: http://livingsocial.com/cities.atom. Each time I hit this URL the browser gets stuck. I tried hitting it directly, through curl, and with file_get_contents(), but the result is the same.

This URL returns a huge XML document from which I have to collect the desired information and save it in a database.

Please help me accomplish this task, or at least tell me how to get this XML.

hakre

3 Answers


I once faced the same problem. To get the structure of this URL, open it in Chrome and stop loading after a second or two; it will show the structure of the XML. Complete the last one or two tags and enjoy. I am pasting the structure here:

<?xml version="1.0"?>
  <feed xmlns:ls="http://livingsocial.com/ns/1.0" xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss" xml:lang="en-US">
  <title>LivingSocial Deals</title>
  <updated>2013-03-12T00:49:21-04:00</updated>
  <id>tag:livingsocial.com,2005:/cities.atom</id>
  <link rel="alternate" type="text/html" href="http://www.livingsocial.com/"/>
  <link rel="self" type="application/atom+xml" href="http://www.livingsocial.com/cities.atom"/>
    <entry>
      <id></id>
      <published></published>
      <updated></updated>
      <link type="text/html" href="http://www.livingsocial.com/cities/1759-sacramento-citywide/deals/620554-set-of-two-organic-yoga-leggings" rel="alternate"/>
      <title></title>
      <long_title></long_title>
      <deal_type></deal_type>
      <merchandise_type></merchandise_type>
      <market_id></market_id>
      <market_name></market_name>
      <georss:point></georss:point>
      <georss:featureTypeTag>city</georss:featureTypeTag>
      <country_code>US</country_code>
      <subtitle></subtitle>
      <offer_ends_at></offer_ends_at>
      <price></price>
      <value></value>
      <savings></savings>
      <orders_count></orders_count>
      <merchant_name></merchant_name>
      <image_url></image_url>
      <categories></categories>
      <sold_out></sold_out>
      <national></national>
      <description></description>
      <details></details>
      <content type="html"></content>
      <ls:merchant></ls:merchant>
      <author>
        <name></name>
      </author>
    </entry>
  </feed>
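Note that the feed mixes three namespaces: Atom is the default, and `georss:`/`ls:` elements are prefixed. With SimpleXML the prefixed elements are not reachable through plain property access; you need `children()` with the namespace URI. A minimal sketch (not the asker's code, and using a small inline sample rather than the real 90 MB feed):

```php
<?php
// Sketch: reading default-namespace and prefixed elements from the feed
// structure above. The sample data here is made up for illustration.

$xml = <<<'ATOM'
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ls="http://livingsocial.com/ns/1.0"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <title>Sample Deal</title>
    <georss:point>38.58 -121.49</georss:point>
    <georss:featureTypeTag>city</georss:featureTypeTag>
  </entry>
</feed>
ATOM;

$feed = simplexml_load_string($xml);
foreach ($feed->entry as $entry) {
    // <title> is in the same (default Atom) namespace as <entry>,
    // so plain property access works:
    $title = (string) $entry->title;

    // <georss:point> is in another namespace; fetch it via children():
    $georss = $entry->children('http://www.georss.org/georss');
    $point  = (string) $georss->point;

    echo "$title @ $point\n";
}
```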
Muhammad Tahir

I can't manage to even load the file in my browser, so my guess is that it is excessively large and you should try to limit the amount you have to load somehow (are there parameters which let you specify only one city?). But if that is not an option, the first example here has a class which should do roughly what you're looking for. Just be sure to pass a URL instead of the contents of the cURL request.
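The linked class isn't reproduced here, but the general idea is a pull parser that never holds the whole 90 MB document in memory. A minimal sketch with PHP's built-in XMLReader (demonstrated on a small inline sample written to a temp file, so it runs without network access; `XMLReader::open()` accepts a URL just as well as a path):

```php
<?php
// Sketch: pull-parse <entry> elements one at a time with XMLReader
// instead of loading the entire feed at once.

$sample = <<<'ATOM'
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Deal A</title></entry>
  <entry><title>Deal B</title></entry>
</feed>
ATOM;

// Stand-in for the real feed URL, so the sketch is self-contained.
$tmp = tempnam(sys_get_temp_dir(), 'atom');
file_put_contents($tmp, $sample);

$reader = new XMLReader();
$reader->open($tmp);

$titles = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT
        && $reader->localName === 'entry') {
        // Expand only this <entry> subtree into SimpleXML for easy access.
        $entry    = simplexml_import_dom($reader->expand());
        $titles[] = (string) $entry->title;
    }
}
$reader->close();
unlink($tmp);

print_r($titles);
```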

cwallenpoole
  • No, this solution does not work at all; rather it flags the following error message: Error: Cannot open hhttp://livingsocial.com/cities.atom – Ahsan Habib Mar 11 '13 at 19:46

The URL http://www.livingsocial.com/cities.atom is just large (94,354,882 bytes, roughly 90 MB) and takes its time to load (33 seconds here).

As this is a remote resource, you cannot change that.

However, if you store that feed to disk (cache it), you can reduce the time to load the file into SimpleXML or DOMDocument to ca. 1.5 seconds.

// Store the URL to disk (takes ca. 33 seconds)
$url = 'http://www.livingsocial.com/cities.atom';
$out = 'cities.atom.xml';
$fh  = fopen($url, 'r');               // open a read stream on the remote feed
$r   = file_put_contents($out, $fh);   // stream it straight into the local file
fclose($fh);

If that is still too slow, you need to cache not only the remote file but also the result of parsing it.
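One way to cache the parsing step is to extract the fields you need once and serialize them to disk; later runs skip XML parsing entirely. A sketch under those assumptions (file names, field choice, and the `load_deals` helper are all illustrative, not part of the original answer):

```php
<?php
// Sketch: cache the parsed result alongside the cached feed file.
// First run parses the XML and stores the extracted rows; subsequent
// runs unserialize the rows as long as the cache is newer than the feed.

function load_deals(string $xmlFile, string $cacheFile): array {
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($xmlFile)) {
        // Cache hit: no XML parsing at all.
        return unserialize(file_get_contents($cacheFile));
    }

    $feed  = simplexml_load_file($xmlFile);
    $deals = [];
    foreach ($feed->entry as $entry) {
        $deals[] = [
            // Field names follow the feed structure shown in the other answer.
            'title' => (string) $entry->title,
            'price' => (string) $entry->price,
        ];
    }

    file_put_contents($cacheFile, serialize($deals));
    return $deals;
}
```

From here, inserting `$deals` into the database row by row is straightforward, and the expensive 90 MB parse happens only when the cached feed actually changes.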

hakre