
I've Googled this topic a lot, but cannot find a solution that fits my needs :(

I have a MySQL DB with a table containing e-mail addresses (10,000+).

I would like to run a batch job on them every 5 minutes.

So I guess Python is a good choice for retrieving the result set from MySQL and then calling a command-line tool with the e-mail addresses as arguments.

What is the best way to do this? I'm thinking of fetching the entire result set from MySQL and then having a bunch of workers call the command line with the arguments until there are no more e-mail addresses left. Can this be done in a simple, yet stable, way?

user649542

2 Answers


You could use the multiprocessing module like this:

from multiprocessing import Pool
p = Pool()    # let Python choose the optimal number of processes (= number of CPU cores)

def treat_email(email_address):
    # do the stuff you need with the email address
    pass

email_addresses = grab_the_list_from_mysql()  # something like "SELECT mail FROM my_user_table"

p.map(treat_email, email_addresses)    # processes all the e-mails across the pool's workers
Cédric Julien
  • I've tried to modify your example to a real one:

        from multiprocessing import Pool
        import MySQLdb
        p = Pool()
        email_addresses = []
        def treat_email(email_adress):
            print "%s" % (email_adress)
        conn = MySQLdb.connect(host="localhost", user="YYY", passwd="XXX", db="ZZZ")
        cursor = conn.cursor()
        cursor.execute("SELECT email FROM data GROUP BY email")
        rows = cursor.fetchall()
        for row in rows:
            email_addresses.append(row[0])
        cursor.close()
        conn.close()
        p.map(treat_email, email_addresses)

    But it fails with AttributeError: 'module' object has no attribute 'treat_email' – user649542 May 02 '11 at 09:21
  • It works with: if __name__ == '__main__': Thank you very much :) – user649542 May 02 '11 at 09:28
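Putting the answer and the two comments together, a minimal self-contained sketch might look like the following. The `sendtool` command, the table/column names, and the connection details are placeholders (not from the original post), and the `if __name__ == '__main__':` guard is the fix mentioned in the comment above:

```python
import subprocess
from multiprocessing import Pool

def build_command(email_address):
    # Hypothetical command-line tool; substitute the real one here.
    return ["sendtool", "--to", email_address]

def treat_email(email_address):
    # Each worker invokes the external command once per address.
    subprocess.call(build_command(email_address))

def grab_the_list_from_mysql():
    # Placeholder query, following the MySQLdb snippet in the comment above.
    import MySQLdb
    conn = MySQLdb.connect(host="localhost", user="YYY", passwd="XXX", db="ZZZ")
    cursor = conn.cursor()
    cursor.execute("SELECT email FROM data GROUP BY email")
    addresses = [row[0] for row in cursor.fetchall()]
    cursor.close()
    conn.close()
    return addresses

if __name__ == '__main__':
    # The guard matters: with the spawn start method (e.g. on Windows),
    # each worker re-imports this module, and module-level Pool code
    # would otherwise run again in every worker.
    p = Pool()
    p.map(treat_email, grab_the_list_from_mysql())
```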

As an alternative to using an ORM module, you could dump the e-mails to a CSV file:

SELECT name, address
FROM email
INTO OUTFILE '/tmp/emails.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'

From: Dump a mysql database to a plaintext (CSV) backup from the command line

And post-process the CSV file in Python:

import csv
data = csv.reader(open('/tmp/emails.csv', 'rb'), delimiter=',')
for row in data:
    name, address = row
    print '%s <%s>' % (name, address)

CSV File Reading and Writing: http://docs.python.org/library/csv.html

When you're dealing with very large files, you might want to iterate over the file object directly rather than calling file.readlines(), which reads the whole file into memory:

with open('/tmp/emails.csv', 'rb') as f:
    for line in f:  # the file object yields one line at a time
        # note: this naive split ignores the '"' quoting; csv.reader
        # above handles it and also reads the file lazily
        name, address = line.rstrip('\n').split(',')
        print '%s <%s>' % (name, address)
Lars Wiegman