I am fairly new to Python and MySQL. I am writing code that queries 60 different tables, each containing records for every second in a five-minute period. The code executes every five minutes. A few of the queries can reach 1/2 MB of data, but most are in the 50 KB range.

I am running on a 64-bit Windows 7 workstation using MySQL Connector/Python. I am testing the code from PowerShell windows, but it will eventually run as a scheduled task. The workstation has plenty of RAM (8 GB); other processes are running, but according to Task Manager only half of memory is in use.

Mostly everything performs as expected, but sometimes processing hangs. I have inserted print statements in the code (I've also used debugger tracing) to determine where the hang occurs. It occurs on a call to fetchall. Below are the germane parts of the code. All-caps names are (pseudo)constants.

import mysql.connector

mncdb = mysql.connector.connect(
    option_files=ENV_MCG_MYSQL_OPTION_FILE,
    option_groups=ENV_MCG_MYSQL_OPTION_GROUP,
    host=ut_get_workstation_hostname(),
    database=ENV_MNC_DATABASE_NAME
    )
for generic_table_id in DBR_TABLE_INDEX:
    site_table_id = DBR_SITE_TABLE_NAMES[site_id][generic_table_id]
    db_cursor = mncdb.cursor() 
    db_command = (
                  "SELECT *"
                  +" FROM "
                  +site_table_id
                  +" WHERE "
                  +DBR_DATETIME_FIELD
                  +" >= '"
                  +query_start_time+"'"
                  +" AND "
                  +DBR_DATETIME_FIELD
                  +" < '"
                  +query_end_time+"'"
                 )
    try:
        db_cursor.execute(db_command)
        print "selected data for table "+site_table_id
        try:
            table_info = db_cursor.fetchall()
            print "extracted data for table "+site_table_id
        except:
            print "DB exception "+formatExceptionInfo()
            print "FETCH failed to return any rows..."
            table_info = []
            raise
    except:
        print "uncaught DB error "+formatExceptionInfo()
        raise

    . . . other processing that uses the data . . .
    db_cursor.close()

mncdb.close()

No exceptions are being raised. In a separate PowerShell window I can access the data being processed by the code. For my testing, all data in the database is loaded before the code is executed. No processes are updating the database while the code is being tested. The hang can occur on the first execution of the code or only after several hours of execution.

My question is what could be causing the code to hang on the fetchall statement?

halr9000

2 Answers


You can alleviate this by setting the fetch size:

mncdb = mysql.connector.connect(
    option_files=ENV_MCG_MYSQL_OPTION_FILE,
    option_groups=ENV_MCG_MYSQL_OPTION_GROUP,
    host=ut_get_workstation_hostname(),
    database=ENV_MNC_DATABASE_NAME,
    cursorclass=MySQLdb.cursors.SSCursor
    )
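
Note that cursorclass / SSCursor comes from the MySQLdb driver; with MySQL Connector/Python, which the question uses, a comparable approach is to keep the default unbuffered cursor and pull rows in chunks with fetchmany() instead of a single fetchall(). A rough sketch reusing the question's mncdb connection and db_command string (the chunk size of 1000 is arbitrary):

# Sketch: stream the result in chunks instead of one big fetchall().
# Assumes mncdb and db_command come from the question's code.
db_cursor = mncdb.cursor()            # default cursor is unbuffered
db_cursor.execute(db_command)

table_info = []
while True:
    rows = db_cursor.fetchmany(1000)  # at most 1000 rows per round trip
    if not rows:
        break
    table_info.extend(rows)

db_cursor.close()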

But before you do this, you should also use MySQL's parameterized statements instead of string concatenation when building your query.
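
For the two datetime values, that means passing them as query parameters with %s placeholders; the table and column names still have to be spliced in, since identifiers cannot be parameterized. A minimal sketch using the question's variables:

db_command = (
    "SELECT * FROM " + site_table_id
    + " WHERE " + DBR_DATETIME_FIELD + " >= %s"
    + " AND " + DBR_DATETIME_FIELD + " < %s"
)
# The driver quotes and escapes the two datetime values itself.
db_cursor.execute(db_command, (query_start_time, query_end_time))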

hd1

Hanging could involve the MySQL tables themselves and not specifically the Python code. Do they contain many records? Are they very wide tables? Are they indexed on the datetime_field?

Consider various strategies:

  1. Select only the columns you need instead of the asterisk, which pulls every column.

  2. Add an index on the DBR_DATETIME_FIELD column used in the WHERE clause.

  3. Diagnose further with printed timers, e.g. print(datetime.datetime.now()), to see which tables are the bottleneck. In doing so, be sure to import the datetime module (see the sketch after this list).
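
A rough sketch of points 1 and 3 combined, assuming the question's loop variables; the columns after SELECT are hypothetical placeholders for whichever columns the later processing actually uses:

import datetime

query_start = datetime.datetime.now()
# Placeholder column list -- replace with the columns actually needed.
db_cursor.execute(
    "SELECT " + DBR_DATETIME_FIELD + ", value_col_1, value_col_2"
    + " FROM " + site_table_id
    + " WHERE " + DBR_DATETIME_FIELD + " >= %s"
    + " AND " + DBR_DATETIME_FIELD + " < %s",
    (query_start_time, query_end_time)
)
table_info = db_cursor.fetchall()
elapsed = datetime.datetime.now() - query_start
print("table " + site_table_id + ": "
      + str(len(table_info)) + " rows in " + str(elapsed))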

Parfait