0

I have the following definitions:

def sql_processes(db1, infile_name, z):
    cursor = db1.cursor()
    print "Processing " + infile_name
    PrintLog("Adding " + infile_name + "to MySQL...")
    vcf_reader = vcf.Reader(open(infile_name, 'r'))
    for record in vcf_reader:
        snp_position='_'.join([record.CHROM, str(record.POS)])
        ref_F = float(record.INFO['DP4'][0])
        ref_R = float(record.INFO['DP4'][1])
        alt_F = float(record.INFO['DP4'][2])
        alt_R = float(record.INFO['DP4'][3])
        AF = (alt_F+alt_R)/(alt_F+alt_R+ref_F+ref_R)
        sql_test_query = "SELECT * from snps where snp_pos='" + snp_position + "'"
        try:
            sql_insert_table = "INSERT INTO snps (snp_pos, " + str(z) + "g) VALUES ('" + snp_position + "', " + str(AF) + ")"
            cursor.execute(sql_insert_table)
        except db1.IntegrityError, e:
            sql_insert_table = "UPDATE snps SET " + str(z) + "g=" + str(AF) + " WHERE snp_pos='" + snp_position + "'";
            cursor.execute(sql_insert_table)
        db1.commit()
    cursor.close()
    print "Processing of " + infile_name + "done!"
    PrintLog("Added " + infile_name + "to MySQL!")

def extractAF(files_vcf):
    z=6
    snp_dict=[]
    db1 = MS.connect(host="localhost",user="root",passwd="sequentia2",db="SUPER_SNP_calling")
    threads = []
    for infile_name in sorted(files_vcf):
        t = Thread(target = sql_processes, args = (db1, infile_name, z))
        threads.append(t)
        z+=1
    count_t = 1
    my_threads = []
    for t in threads:
        t.start()
        my_threads.append(t)
        if count_t == 8:
            for x in my_threads:
                x.join()
            my_threads = []
            count_t = 0
        count_t+=1

The aim here is to read multiple files at the same time and update a MySQL database. However, this throws the following error:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "./SUPER_mysql4.py", line 457, in sql_processes
    cursor.execute(sql_insert_table)
  File "/usr/local/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 202, in execute
    self.errorhandler(self, exc, value)
  File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
InterfaceError: (0, '')

*** Error in `Segmentation fault (core dumped)

Why is it happening?

user2979409
  • 773
  • 1
  • 12
  • 23

2 Answers2

0

I think this error is based on mysql lock

MySQL uses table-level locking for MyISAM, MEMORY and MERGE tables, page-level locking for BDB tables, and row-level locking for InnoDB table.Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not “starved” even if there is heavy SELECT activity for the table. However, if you have many updates for a table, SELECT statements wait until there are no more update.

Since inserting into same table with different threads may lock the table for insert of 1st thread started.So other threads and SELECT has to wait for finish INSERT in all threads.

May this help you

for t in threads:
    t.start()
    t.join()

Thread.join

Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs

For more http://dev.mysql.com/doc/refman/5.0/en/internal-locking.html

itzMEonTV
  • 19,851
  • 4
  • 39
  • 49
0

MySQLdb does not support multiple threads writing to the same connection. You must start a new connection for each thread or in some other way make sure that every connection is only used by one thread at the same time.
See this answer for an excerpt from the MySQLdb user guide describing the problem. Or directly the parts that affect you:

The MySQL protocol can not handle multiple threads using the same connection at once. (...) The general upshot of this is: Don't share connections between threads. (...) If you let two threads use a connection simultaneously, the MySQL client library will probably upchuck and die.

Community
  • 1
  • 1
aleju
  • 2,376
  • 1
  • 17
  • 10
  • I've put `db1 = MS.connect(host="localhost",user="root",passwd="sequentia2",db="SUPER_SNP_calling")` inside `sql_processes`... This should open a new connection each time, but it throws `Deadlock error`. – user2979409 Apr 11 '15 at 13:48