[Using Python 3.x] The basic idea is that I have to run a first query to pull a long list of IDs (text, about a million of them) and use those IDs in an IN() clause of a WHERE statement in another query. I'm using Python string formatting to build the statement, and it works well when the number of IDs is small - say 100k - but it gives me an error (pyodbc.Error: ('08S01', '[08S01] [MySQL][ODBC 5.2(a) Driver][mysqld-5.5.31-MariaDB-log]MySQL server has gone away (2006) (SQLExecDirectW)')) when the set really is about a million IDs long.

I tried to read up on it a bit and think it might have something to do with the default(?) limits set by SQLite. I'm also wondering whether I'm approaching this the right way at all.

Here's my code:

Step 1: Getting the IDs

import sqlite3 as lite

def get_device_ids(con_str, query, tb_name):

    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()

    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT PRIMARY KEY, \
        lang TEXT, first_date DATETIME);".format(tb_name))

    data = create_external_con(con_str, query)

    device_id_set = set()

    with local_con:
        for row in data:
            device_id_set.update([row[0]])
            local_cur.execute("INSERT INTO srv(id, lang, \
                first_date) VALUES (?,?,?);", (row))
        lid = local_cur.lastrowid
        print("Number of rows inserted into SRV: {}".format(lid))

    return device_id_set

Step 2: Generating the query with 'dynamic' IN() clause

def gen_queries(ids):
    ids_list = ', '.join("'" + id_ + "'" for id_ in ids)

    query = """
    SELECT      e.id,
                e.field2,
                e.field3
    FROM        table e
    WHERE       e.id IN ({})
    """.format(ids_list)

    return query
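
For a rough sense of scale (assuming, purely for illustration, IDs of about 36 characters, UUID-like; the real length may differ), the generated statement ends up being tens of megabytes of SQL text, well over the default max_allowed_packet on many MySQL/MariaDB setups:

# Back-of-the-envelope estimate with a hypothetical 36-character ID length;
# each ID adds its own length plus 4 bytes for the quotes, comma and space.
n_ids = 1000000
id_len = 36
approx_mb = n_ids * (id_len + 4) / (1024 * 1024)
print("~{:.0f} MB of SQL text".format(approx_mb))   # roughly 38 MB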

Step 3: Using that query in another INSERT query

This is where things go wrong

def get_data(con_str, query, tb_name):

    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()

    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT, field1 INTEGER, \
        field2 TEXT, field3 TEXT, field4 INTEGER, \
        PRIMARY KEY(id, field1));".format(tb_name))

    data = create_external_con(con_str, query) # <== THIS IS WHERE THAT QUERY IS INSERTED

    device_id_set = set()

    with local_con:
        for row in data:
            device_id_set.update([row[1]])
            local_cur.execute("INSERT INTO table2(id, field1, field2, field3, \
                field4) VALUES (?,?,?,?,?);", (row))
        lid = local_cur.lastrowid
        print("Number of rows inserted into table2: {}".format(lid))

Any help is very much appreciated!

Edit

This is probably the right solution to my problem; however, when I try to use "SET SESSION max_allowed_packet=104857600" I get the error: SESSION variable 'max_allowed_packet' is read-only. Use SET GLOBAL to assign the value (1621). Then, when I change SESSION to GLOBAL, I get an access denied message.
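
Since I apparently can't raise max_allowed_packet, one workaround I'm looking at (an untested sketch, reusing the query template from gen_queries above) is to split the ID set into chunks and run the SELECT once per chunk, so that no single statement exceeds the packet limit:

def gen_queries_chunked(ids, chunk_size=10000):
    # Yield one query per chunk of IDs so that no single statement
    # grows past max_allowed_packet.
    ids = list(ids)
    for i in range(0, len(ids), chunk_size):
        ids_list = ', '.join("'" + id_ + "'" for id_ in ids[i:i + chunk_size])
        yield """
        SELECT      e.id,
                    e.field2,
                    e.field3
        FROM        table e
        WHERE       e.id IN ({})
        """.format(ids_list)

# usage: run each partial query through the existing helper and
# insert the results chunk by chunk
# for partial_query in gen_queries_chunked(device_id_set):
#     data = create_external_con(con_str, partial_query)
#     ...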

  • Why would SQLite output a MySQL message? Which database are you actually using? – CL. Sep 05 '13 at 13:47
  • It seems the ids are coming from a query to a DB server (`data = create_external_con(con_str, query)`). Is that the same DB you're using for the second query? Then the trick would be to not transfer the IDs at all, but use a `JOIN` or a [nested SQL query](http://stackoverflow.com/questions/12467354/nesting-queries-in-sql) – cfi Sep 05 '13 at 13:51
  • Yes, that would work technically; however, the tables/schemas are in different databases, hence the reason I need to programmatically connect everything together. The external connection goes to a MySQL database – Matthijs Sep 05 '13 at 14:11

1 Answer


Insert the IDs into a (temporary) table in the same database, and then use:

... WHERE e.ID IN (SELECT ID FROM TempTable)
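
A rough sketch of that idea with pyodbc (untested; the temporary table name, the column type, and the connection handling are placeholders to adapt to your setup):

import pyodbc

con = pyodbc.connect(con_str)   # con_str as used in the question
cur = con.cursor()

# stage the IDs on the server side
cur.execute("CREATE TEMPORARY TABLE TempTable (ID VARCHAR(64))")
cur.executemany("INSERT INTO TempTable (ID) VALUES (?)",
                [(id_,) for id_ in device_id_set])

# the main query then no longer needs a huge IN() list
cur.execute("""
    SELECT      e.id,
                e.field2,
                e.field3
    FROM        table e
    WHERE       e.id IN (SELECT ID FROM TempTable)
""")
rows = cur.fetchall()
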
  • This is not possible; I cannot create a table in the same database as the one that I'm trying to query. – Matthijs Sep 05 '13 at 13:57