I'm trying to read a large XML file and INSERT all the data into the database. I am trying out three different scenarios (and I need help with the third):

  1. After each XML element I run a single query: $this->mysqli->query($sql);
  2. I append all the INSERTs to a $sql variable and, after the loop, use one multi query $this->mysqli->multi_query($sql); to INSERT everything at once.

I ran tests and neither scenario is to my liking:

  1. Inserts all the data into the db, but runs for 113 seconds. Not a big problem, but I want to do better!
  2. This is the option I use for my smaller XML files and it works wonders. For one of my functions it reduced execution time from 6.5s to 0.17s, with results just as accurate. The problem with this file is that the $sql packet becomes too large, the script becomes unstable and the output is wrong.

So now I came up with a third, hybrid solution, but it's not working as expected. The idea behind it is to make a $this->mysqli->multi_query($sql); call after every 200 elements and then restart the process. I think I'm missing the right understanding of the mysqli->multi_query function, because to me the code below makes perfect sense. But maybe I need to free up / point / next() something?

Anyway: after running the script, the database is not completely filled.

As stated, my code is below:

if ($xmlObj) {
    $sql = "";
    $i = 1;
    foreach ($xmlObj->Response->FeaturesList->Feature as $feature) {
        $sid = $feature["ID"];
        $sql .= "INSERT INTO ...;";
        // Hybrid solution: flush the batch every 200 elements
        if (($i % 200) == 0) {
            $this->mysqli->multi_query($sql);
            $sql = "";
        }
        $i++;
    }
    // Hybrid: insert the leftovers
    $this->mysqli->multi_query($sql);
}
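
Maybe the next() I'm guessing at is mysqli->next_result(): as far as I can tell, all result sets from one multi_query() call have to be consumed before the connection will accept the next batch. A rough sketch of what that draining might look like inside my loop (just my guess, not verified on this dataset):

if (($i % 200) == 0) {
    if ($this->mysqli->multi_query($sql)) {
        // Consume the result of every statement in the batch,
        // otherwise the next multi_query() call may fail silently
        do {
            if ($result = $this->mysqli->store_result()) {
                $result->free();
            }
        } while ($this->mysqli->more_results() && $this->mysqli->next_result());
    }
    if ($this->mysqli->errno) {
        // Surface any error from the batch instead of losing it
        error_log($this->mysqli->error);
    }
    $sql = "";
}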

UPDATE: Option #4, as suggested by Mihai, works fine. I built an INSERT..VALUES(a),(b),(c) string and ran it with a single mysqli->query(). It runs in 10 seconds and inserts all elements. Thanks!
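
Roughly what that looks like (simplified; the table and column names here are just placeholders for my real schema):

$values = array();
foreach ($xmlObj->Response->FeaturesList->Feature as $feature) {
    // Escape each value before building the bulk INSERT string
    $sid = $this->mysqli->real_escape_string((string) $feature["ID"]);
    $values[] = "('" . $sid . "')";
}
if (!empty($values)) {
    // One statement, one round trip to the database
    $sql = "INSERT INTO features (sid) VALUES " . implode(",", $values);
    $this->mysqli->query($sql);
}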

The question about the multi_query mystery remains though!

  • Another option which might be faster: build your query as INSERT..VALUES(a),(b),(c) so you hit your table only once – Mihai Aug 03 '15 at 16:09
  • Thanks for the option. The code is simplified though. For every feature I also add INSERT queries to a language table. So it's not that easy, but I can obviously create a second $sql2 variable for those. Won't I run into the same packet problem then? – Hans Wassink Aug 03 '15 at 16:16
  • I'm not sure what exactly the packet error is, but you can probably tweak some PHP or MySQL variables to get over it. Wild guess though. What is certain is that a single call to the db is faster than 200. – Mihai Aug 03 '15 at 16:17
  • I agree that it's faster, that's why I'm trying to do so :D The packet error basically states that the SQL string I feed to it is too big. I can tweak it only so much. I have tried that already, but I kinda feel that if it's taking too much memory or size you need to rethink your code, not upgrade the limits. I'm gonna try your option out as scenario 4 and let you know how it goes! Question remains though :D – Hans Wassink Aug 03 '15 at 16:21
  • Or maybe you can use this option https://dev.mysql.com/doc/refman/5.5/en/load-xml.html – Mihai Aug 03 '15 at 16:23
  • If I can map that it might be faster. The XML is not built the same way as the example, lots of different field names etc... I'll try it out, thanks – Hans Wassink Aug 03 '15 at 16:25
  • As for the mystery: you likely hit the `max_allowed_packet` limit. – Marek Aug 04 '15 at 08:39
  • That is not really the mystery :D I pointed that out in my post. But I just want to know why it is not possible to use multi_query more often. – Hans Wassink Aug 04 '15 at 11:57
