
I am trying to get a lot of words into a database as individual entries.

The file is a 17MB text file of comma-separated words.

The PHP is:

    $file = file_get_contents(plugins_url( '/cipher.txt' , __FILE__ ));
    $words = explode(',', $file);

    foreach($words as $word){
        $wpdb->query("INSERT IGNORE INTO cipher_words (word) VALUES (\"" . $word . "\")");
    }

I keep running into a memory error similar to:

[20-Feb-2016 15:26:26 UTC] PHP Fatal error:  Out of memory (allocated 247726080) (tried to allocate 16777216 bytes) in C:\xampp\htdocs\testsite\wp-content\plugins\joshscipher\cipher.php on line 26

[20-Feb-2016 15:26:29 UTC] PHP Fatal error:  Out of memory (allocated 139460608) (tried to allocate 8388608 bytes) in C:\xampp\htdocs\testsite\wp-content\plugins\joshscipher\cipher.php on line 26

[20-Feb-2016 15:26:29 UTC] PHP Fatal error:  Out of memory (allocated 247726080) (tried to allocate 16777216 bytes) in C:\xampp\htdocs\testsite\wp-content\plugins\joshscipher\cipher.php on line 26

Is there a better way to handle such a large array? Something asynchronous, maybe?

Would a CSV work better, or would it just hit the same limit?

I have tried increasing PHP limits and WP limits to no avail.

Edit: The question marked as a duplicate does not involve a memory error, or any error at all.

JDSlimz
  • Possible duplicate of [Reading very large files in PHP](http://stackoverflow.com/questions/162176/reading-very-large-files-in-php) – Aziz Saleh Feb 20 '16 at 15:56
  • He says in his situation "PHP doesn't seem to throw an error, it just returns false." My PHP is erroring. – JDSlimz Feb 20 '16 at 16:00
  • @JDSlimz did you even bother reading the questions/answers? `file_get_contents` reads the entire data, which will eventually break for growing files. Use `fopen` or `fgets`. – Aziz Saleh Feb 20 '16 at 17:58
  • @AzizSaleh I will admit that I did not read every answer. The question does not appear to match because he had no errors. Nor was he even using file_get_contents(). Two comments mention it, but it is not the same problem. Though the solutions may be similar, the question is not a duplicate. – JDSlimz Feb 20 '16 at 18:17
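
A minimal sketch of the streaming approach suggested in the comments above, assuming cipher.txt can be read from the plugin directory (via plugin_dir_path() rather than plugins_url()) and that the cipher_words table already exists. Instead of pulling the whole 17MB into memory with file_get_contents(), it reads one comma-delimited word at a time:

    // Sketch only: stream the file word by word instead of loading it all at once.
    // plugin_dir_path() and the cipher_words table are assumed from the question's setup.
    global $wpdb;

    $path   = plugin_dir_path( __FILE__ ) . 'cipher.txt'; // local path rather than a URL
    $handle = fopen( $path, 'r' );

    if ( $handle ) {
        while ( ! feof( $handle ) ) {
            // Read up to the next comma (or 4096 bytes at most), not the whole file.
            $word = stream_get_line( $handle, 4096, ',' );
            if ( $word === false ) {
                break;
            }
            $word = trim( $word );
            if ( $word !== '' ) {
                // $wpdb->prepare() escapes the value, avoiding the manual quoting in the original loop.
                $wpdb->query( $wpdb->prepare(
                    'INSERT IGNORE INTO cipher_words (word) VALUES (%s)',
                    $word
                ) );
            }
        }
        fclose( $handle );
    }

This keeps memory usage flat regardless of file size; the 2.2 million individual INSERTs are still slow, but that is a speed problem rather than a memory one.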

2 Answers


Here is a quick Python script if you want to give that a try; it usually handles larger amounts of data better than PHP.

import mysql.connector

# Database connection
cnx = mysql.connector.connect(user='admin', password='password', database='python')
cursor = cnx.cursor()

# Get the file contents
with open('cipher.txt', 'r') as content_file:
    content = content_file.read()

# 'Explode' the string
content_array = content.split(',')

# For each item, insert into db (parameterized, so quotes in words don't break the query)
for x in content_array:
    cursor.execute("INSERT IGNORE INTO cipher_words (word) VALUES (%s)", (x,))

# Make sure data is committed to the database
cnx.commit()

# Close database connections
cursor.close()
cnx.close()
Mark
  • I don't use Windows, but I think you can just save this as a .py file and double-click it. – Mark Feb 20 '16 at 16:49
  • How can I debug it to see why it isn't working right? It opens for a split second and closes but nothing is input. – JDSlimz Feb 20 '16 at 17:22
  • Maybe from the command prompt, if you try running something like this: `C:\python23\python YOURSCRIPTNAME.py` – Mark Feb 20 '16 at 17:26
  • ImportError: No module named mysql.connector. – JDSlimz Feb 20 '16 at 17:33
  • Ok, I got it running without errors about 30 minutes ago. It is still running, which I expect due to 2.2 million words. I will keep you updated. – JDSlimz Feb 20 '16 at 18:19
  • Got impatient with nothing seeming to happen, so I added some output to the command line. It looks like it assembles ALL of the queries and then submits. By design? Wait, no: it executes, but nothing is appearing in the db – JDSlimz Feb 20 '16 at 19:54
  • Yeah, I think it makes sense that it would do them all at once, since `cnx.commit()` submits everything. I never imagined it would take so long. – Mark Feb 20 '16 at 20:46
  • over 2 million queries – JDSlimz Feb 20 '16 at 20:47
  • I moved the commit into the loop to test the queries. It is working beautifully. It's going to beat up my SQL server, but for this one time that's OK. – JDSlimz Feb 20 '16 at 20:53
  • Just FYI that script is STILL running. If you would like I can post the edits. I am at 106,000 words at the moment. There are a lot of duplicates. – JDSlimz Feb 22 '16 at 00:08
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/104192/discussion-between-jdslimz-and-mark). – JDSlimz Feb 22 '16 at 16:17

Are you open to increasing your PHP memory limit just for the import?

You can try putting this at the top of your PHP file.

ini_set('memory_limit', '128M');

Or you can change it globally in php.ini (e.g. \xampp\php\php.ini).

memory_limit = 128M

You'll probably need to restart Apache after changing the global memory limit.
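
Since the question also mentions WP limits: assuming a standard WordPress install, WordPress applies its own memory cap on top of PHP's, which is raised in wp-config.php:

define( 'WP_MEMORY_LIMIT', '512M' );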

Mark
  • I have already set the PHP memory limit to 512M and restarted Apache in order to attempt to bypass the issue. It did not work. – JDSlimz Feb 20 '16 at 16:18
  • Wow. Must be a massive file!! Are you open to using Python to do this import instead of PHP? – Mark Feb 20 '16 at 16:19
  • It is only 17MB, so I have no clue why it is saying it ran out of memory. Python is something I would have to learn. But to be perfectly honest, it only ever has to happen once... so I may be able to use a CSV and import right into MySQL. – JDSlimz Feb 20 '16 at 16:21