
I'm using SimpleXML to parse an XML file. I want to iterate over its elements with a foreach loop and store each one in a MySQL database using a Yii model.

I have a sample XML file containing 1083 entries, each with three child elements (name, value, type). With PHP 5.3.1 on Windows, the import creates 1208 rows in the database table instead of 1083. After hours of testing without finding a problem in my code, I switched to PHP 5.4.7 on Windows, which creates the correct 1083 rows. I still want to know exactly what the problem is, because my production environment is a Linux server running PHP 5.4.15-1, which also creates the wrong number of rows.

Here is some example XML:

<listentries>
  <listentry>
    <name>downs</name>
    <value>-1.5</value>
    <type>STATIC</type>
  </listentry>
  <listentry>
    <name>upcoming</name>
    <value>1.0</value>
    <type>STATIC</type>
  </listentry>
  ...
</listentries>

So my problem is that the foreach loop runs more iterations than there are elements.

$list = simplexml_load_file($filename);
foreach($list->listentries->listentry as $entry){
     $model = new Entry;
     $model->name = $entry->name;
     $model->value = $entry->value;
     $model->type = $entry->type;
     $model->save();
}

When I echo the number of list elements I get 1083. When I count the loop iterations with a counter I get 1208, so something very weird is going on. Among the 1208 rows inserted into the database, every entry appears at least once, but some appear twice.

Does anybody have an idea where exactly this misbehavior is coming from?

I found that the loop runs the correct number of iterations when the XML file contains fewer than 970 entries.

My configurations and the number of rows inserted after importing the XML file with 1083 entries:
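The comparison described above (reported element count versus actual loop iterations) can be reduced to a self-contained test case without the database or Yii. The following is only a sketch: the data is generated in memory rather than taken from the original file, the helper `buildSampleXml()` and the wrapping `<root>` element are assumptions made so that `$list->listentries->listentry` resolves, and on an unaffected setup the two numbers should simply match.

```php
<?php
// Hypothetical helper: builds a synthetic document with $n entries.
// The <root> wrapper is an assumption; the original file's root is not shown.
function buildSampleXml($n)
{
    $xml = '<root><listentries>';
    for ($i = 0; $i < $n; $i++) {
        $xml .= '<listentry><name>entry' . $i . '</name>'
              . '<value>1.0</value><type>STATIC</type></listentry>';
    }
    return $xml . '</listentries></root>';
}

$list = simplexml_load_string(buildSampleXml(1083));

// Number of elements as reported by SimpleXML itself.
$reported = count($list->listentries->listentry);

// Number of times the loop body actually runs, plus a tally per name
// so that entries visited more than once can be identified.
$iterations = 0;
$seen = array();
foreach ($list->listentries->listentry as $entry) {
    $iterations++;
    $name = (string) $entry->name;
    $seen[$name] = isset($seen[$name]) ? $seen[$name] + 1 : 1;
}

// Names that were visited more than once.
$duplicates = array_filter($seen, function ($c) {
    return $c > 1;
});

printf(
    "count(): %d, iterations: %d, names seen twice or more: %d\n",
    $reported,
    $iterations,
    count($duplicates)
);
```

If the mismatch only appears once the model's `save()` call is added back into the loop body, that narrows down which part of the loop triggers it.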

  • Windows, PHP 5.3.1, libxml 2.7.6: 1208 entries
  • Windows, PHP 5.4.7, libxml 2.7.8: 1083 entries
  • Linux, PHP 5.4.15-1, libxml 2.7.6: 1208 entries

So it looks like updating libxml could help!?
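The values in the list above can be collected the same way on each machine with a short script. This is a minimal sketch; `PHP_OS`, `PHP_VERSION` and `LIBXML_DOTTED_VERSION` are standard PHP constants, while the `$report` array is just an illustrative name.

```php
<?php
// Collect the facts needed to compare environments.
$report = array(
    'os'        => PHP_OS,
    'php'       => PHP_VERSION,
    'libxml'    => LIBXML_DOTTED_VERSION,
    'simplexml' => extension_loaded('simplexml') ? 'loaded' : 'missing',
);

foreach ($report as $key => $value) {
    printf("%-10s %s\n", $key . ':', $value);
}
```

Running it through the same SAPI that performs the import (web server or CLI) matters, because the two can be built against different libxml versions.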

sandro1111
  • 3
    Hello @sandro1111 and welcome to StackOverflow! Could you please give us an example of the xml file containing a couple of records? – tftd Jun 11 '13 at 12:07
  • try `return $list;` to see what exactly the number of records are there. – Sudhanshu Saxena Jun 11 '13 at 12:31
  • @SudhanshuSaxena the user never stated this was a function and he also said the number of records in the bottom of his question. – tftd Jun 11 '13 at 12:45
  • 1
    What you experience is most likely a flaw within the PHP simplexml extension. I write this because the different behavior with the different PHP versions suggests that. *however* please be careful: Please extract a self-containing test-case which reproduces your issue (with as little code/data as possible). This is important if you really want to research further on this issue and if you aim for creating a bug-report that can be fixed effectively. – hakre Jun 11 '13 at 12:46
  • Also please deactivate the foreach-loop and just tell what `var_dump(count(iterator_to_array($list->listentries->listentry, FALSE)), count($list->listentries->listentry));` returns with the different PHP versions. – hakre Jun 11 '13 at 12:50
  • @hakre I don't know if that's true. I've been using `simplexml` for quite some time now and I have never seen this effect (I guess I'm lucky). I'm rather thinking the xml file is broken somewhere. – tftd Jun 11 '13 at 13:04
  • @tftd: A flaw is something you can experience under extraordinary conditions. Because so far I can only base my assumptions on what you wrote, and you didn't provide any factual data or concrete code to reproduce your issue, this is only a guess. So far, the most likely explanation is that *you* made a mistake and *you* describe the issue wrong. However, just writing that is not that helpful, so first of all I made some suggestions for how you can troubleshoot and invited you to provide more information. – hakre Jun 11 '13 at 13:08
  • Also capture the libxml versions on all those different PHP configurations: [`LIBXML_DOTTED_VERSION`](http://php.net/libxml.constants#LIBXML_DOTTED_VERSION) - This is the only flaw I myself ran over, just linking to show you that flaws can exist in software and in specific within the simplexml component [XPath query result order](http://stackoverflow.com/q/8195733/367456) - and I have been using it for years without any problem like you. – hakre Jun 11 '13 at 13:16
  • Is it possible to post the XML file? Maybe a link to download it somewhere? – chrislondon Jun 11 '13 at 13:31
  • @hakre I'm not sure if you haven't confused @sandro1111 with me in your statements. Anyway, I haven't been able to confirm this flaw exists. I've been parsing long xml files (5,000+ records) with `SimpleXML` and I haven't had any problems parsing, validating and inserting my data into a db. So I suspect it's either `broken` xml records or some very specific windows/linux bug in the libraries as @hakre suggested. – tftd Jun 11 '13 at 13:32
  • 2
    Upsi :) I might have confused, yes. Sorry. In any case, as the OP didn't provide the actual data, this is likely not to be reproduced. At least I won't care unless OP provides a shortened test-case that can reproduce. – hakre Jun 11 '13 at 13:36
  • First of all thank you for helping me. I've added some xml code and I'm sure the xml file is not broken. @hakre: I ran the var_dump function as suggested from you and the result was int(1083) int(1083). I'll try libxml now and will inform you about the result. – sandro1111 Jun 12 '13 at 06:18
  • I checked the libxml version on all of my three configurations. PHP version 5.3.1 on Windows uses libxml version 2.7.6, PHP version 5.4.7 on Windows uses libxml version 2.7.8 and PHP version 5.4.15-1 on Linux uses libxml version 2.7.6. So this could probably mean changing the libxml version could help? Or what do you guys think? – sandro1111 Jun 12 '13 at 06:45
  • Can you create a list of those systems in your question (maybe at the end), listing those systems each with Operating System, PHP version, libxml version and the count values? From the libxml versions I'd say you would like to use libxml version 2.7.8, the 2.7.6 version probably has a flaw you're affected by. If you can gain 1083 with the var_dump on *all* systems there might be some possibility of a work-around, however you should first find out if you can upgrade libxml and secondly which bug in libxml this is. That requires scanning the libxml changelog of 2.7.7/.8. – hakre Jun 12 '13 at 07:17
  • I've just added a list with this information. I could not identify which fixed bug could cause this failure in the [release notes](http://www.xmlsoft.org/news.html). – sandro1111 Jun 12 '13 at 07:34
  • Sorry for not responding sooner. The problem was an old version of libxml. I've updated and tested different configurations with different libxml and PHP versions, and it turned out that it really was the buggy version of libxml that caused the problem! Thanks! – sandro1111 Jul 03 '13 at 06:20
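Where upgrading libxml is not possible, the work-around hinted at in the comments could look like the sketch below: copy the nodes into a plain array with `iterator_to_array()` first (the call that reported 1083 in the comments) and loop over that snapshot instead of the live SimpleXML iterator. This is an assumption, not a confirmed fix; the only fix confirmed in this thread is the libxml upgrade. The inline XML and the `<root>` wrapper are illustrative, and the `$rows` array stands in for populating and saving the Yii model.

```php
<?php
// Illustrative input; the <root> wrapper is an assumption so that
// $list->listentries->listentry resolves as in the question's code.
$xml = '<root><listentries>'
     . '<listentry><name>downs</name><value>-1.5</value><type>STATIC</type></listentry>'
     . '<listentry><name>upcoming</name><value>1.0</value><type>STATIC</type></listentry>'
     . '</listentries></root>';

$list = simplexml_load_string($xml);

// Take a snapshot of the nodes; FALSE discards the (identical) element
// names as keys so that later entries do not overwrite earlier ones.
$entries = iterator_to_array($list->listentries->listentry, false);

$rows = array();
foreach ($entries as $entry) {
    // Cast to string so plain values, not SimpleXMLElement objects,
    // are handed on to the model or database layer.
    $rows[] = array(
        'name'  => (string) $entry->name,
        'value' => (string) $entry->value,
        'type'  => (string) $entry->type,
    );
}

print_r($rows);
```

The string casts are a separate precaution: assigning `$entry->name` directly stores a `SimpleXMLElement` object on the model rather than its text.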
