2

I'm writing a script to access data in an established database and unfortunately, I'm breaking the DB. I'm able to recreate the issue from the command line:

    [user@box tmp]# python
    Python 2.7.2 (default, Sep 19 2011, 15:02:41) 
    [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pgdb
    >>> db = pgdb.connect('localhost:my_db:postgres')
    >>> cur = db.cursor()
    >>> cur.execute("SELECT * FROM mytable LIMIT 10")
    >>> cur.close()
    >>> 

At this point any activity to mytable is greatly degraded and "select * from pg_stat_activity" shows my connection as "IDLE in transaction". If I call db.close() everything is fine, but my script loops infinitely and I didn't think I'd need to open and close the db connection with each loop. I don't think it has anything to do with the fact that I'm not using the data above as in my real script I am calling fetchone() (in a loop) to process the data. I'm not much of a DB guy so I'm not sure what other info would be useful. My postgres version is 9.1.0 and python is 2.7.2 as shown above.

fredsnertz
  • 453
  • 1
  • 4
  • 6

2 Answers2

2

Try calling db.rollback() before you close the cursor (or if you're doing a write operation, db.commit()).

Amber
  • 507,862
  • 82
  • 626
  • 550
2

I suggest using psycopg2 instead of pgdb. pgdb uses the following semantics:

connect() -> open database connection, begin transaction
commit() -> commit, begin transaction
rollback() -> rollback, begin transaction
execute() -> execute statement

psycopg2, on the other hand, uses the following semantics:

connect() -> open database connection
commit() -> commit
rollback() -> rollback
execute() -> begin transaction unless already in transaction, execute statement

so, as Amber mentioned, you can do a rollback or commit after your select statement and terminate the transaction. Unfortunately, with pgdb, you will immediately start a new transaction after you rollback or commit (even if you haven't performed any work).

For many database systems, pgdb's behavior is fine, but because of the way PostgreSQL handles transactions, it can cause trouble for you if you've got lots of connections accessing the same tables (trouble specifically with vacuum).

Why does pgdb start a transaction right away? The Python DB-API (2.0) spec calls for it to do so. Seems kind of silly to me, but that's the way the spec is written.

jlp
  • 849
  • 4
  • 11