5

So if I do

 import MySQLdb
 conn = MySQLdb.connect(...)
 cur = conn.cursor()
 cur.execute("SELECT * FROM HUGE_TABLE")
 print "hello?"
 print cur.fetchone() 

It looks to me that MySQLdb gets the entire huge table before it gets to the "print". I previously assumed it did some sort of "cursor/state" lazy retrieval in the background, but it doesn't look like it to me. Is this right? If so is it because it has to be this way or is this due to a limitation of the MySQL wire protocol? Does this mean that java/hibernate behave the same way?

I guess I need to use the "limit 1" MySQL clauses and relatives if I want to walk through a large table without pulling in the whole thing at once? Or no? Thanks in advance.

Aaron Watters
  • 2,784
  • 3
  • 23
  • 37

3 Answers3

4

In the _mysql module, use the following call:

conn.use_result()

That tells the connection you want to fetch rows one by one, leaving the remainder on the server (but leaving the cursor open).

The alternative (and the default) is:

conn.store_result()

This tells the connection to fetch the entire result set after executing the query, and subsequent fetches will just iterate through the result set, which is now in memory in your Python app. If your result set is very large, you should consider using LIMIT to restrict it to something you can handle.

Note that MySQL does not allow another query to be run until you have fetched all the rows from the one you have left open.

In the MySQLdb module, the equivalent is to use one of these two different cursor objects from MySQLdb.cusrors:

  • CursorUseResultMixIn
  • CursorStoreResultMixIn
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • thanks, but... the use_result() method is present, but I don't see it actually changing the timing behavior in my test script. Does it actually work? – Aaron Watters Jan 14 '10 at 21:15
  • All the documentation and examples at multiple sites says so, but I haven't measured it myself with Python to be sure. Could it be that the delay is due to executing the query, before any results are fetched? – Bill Karwin Jan 14 '10 at 21:26
  • If it caches the whole "select * from table" on the server before sending any results to the client then that's not much help. I just tried a "select * from table" with no restrictions on a table with 13M rows or so and it just sat there... "select * from table limit 1" by contrast was fast... – Aaron Watters Jan 14 '10 at 21:54
0

This is correct in every other language I've used. The fetchone is just going to only retrieve the first row of the resultset which in this case is the entire database. It's more of a convenience method than anything, it's designed to be easier to use if you KNOW that there's only one result coming down or you only care about the first.

Chuck Vose
  • 4,560
  • 24
  • 31
0

oursql is an alternative MySQL DB-API interface that exposes a few more of the lower-level details, and also provides other facilities for dealing with large datasets.

Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358