
I am trying to read a 100GB+ table in Python using the pymysql package. The query I am firing is

select * from table

But I want to process the records in chunks instead of pulling all 100 GB from the database at once. Below is my code:

    with self.connection.cursor() as cursor:
        logging.info("Executing Read query")
        logging.info(cursor.mogrify(query))
        cursor.execute(query)
        schema = cursor.description  # column metadata for the result set
        size = cursor.rowcount       # total number of rows returned
        for i in range((size // batch) + 1):
            records = cursor.fetchmany(size=batch)
            yield records, schema

But when the query gets executed at cursor.execute(query), it tries to fetch all 100 GB of records into memory and ends up killing the process.

Is there a better way to read data in chunks from MySQL using Python?
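For illustration, here is a minimal sketch of the kind of chunked streaming I am after, assuming pymysql's unbuffered `SSCursor` keeps the result set on the server so `fetchmany` pulls rows incrementally; the connection settings and table name below are placeholders:

    import pymysql
    import pymysql.cursors

    # Placeholder connection settings -- adjust to the real environment.
    connection = pymysql.connect(
        host="localhost",
        user="user",
        password="password",
        database="mydb",
        cursorclass=pymysql.cursors.SSCursor,  # unbuffered cursor: rows are streamed, not preloaded
    )

    batch = 10_000  # rows per chunk

    def read_in_chunks(query):
        with connection.cursor() as cursor:
            cursor.execute(query)
            schema = cursor.description
            while True:
                records = cursor.fetchmany(size=batch)
                if not records:
                    break
                yield records, schema

    for records, schema in read_in_chunks("SELECT * FROM my_table"):
        print(len(records))  # process each chunk here

Note that with an unbuffered cursor `rowcount` is not reliable before all rows are fetched, so the loop runs until `fetchmany` returns an empty result instead of counting batches up front.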

  • https://stackoverflow.com/questions/6195988/about-mysql-cursor-and-iterator – waynetech Jul 24 '19 at 18:59
  • 2
    Any reason you can't paginate through the data using MySQL's [`LIMIT`](https://dev.mysql.com/doc/refman/8.0/en/select.html#idm139812212181472) functionality? – esqew Jul 24 '19 at 19:05
  • What I want to understand is how fetchmany executes: does it ask MySQL for that many rows, or are the results already there and it just does the slicing? – Deepak Verma Jul 24 '19 at 19:28
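A minimal sketch of the `LIMIT` pagination esqew suggests above, assuming the table has an indexed `id` column to order by (connection settings and names are placeholders; note that large offsets get progressively slower because MySQL still has to skip over the earlier rows):

    import pymysql

    # Placeholder connection settings -- adjust to the real environment.
    connection = pymysql.connect(host="localhost", user="user",
                                 password="password", database="mydb")
    batch = 10_000  # rows per page

    with connection.cursor() as cursor:
        offset = 0
        while True:
            cursor.execute(
                "SELECT * FROM my_table ORDER BY id LIMIT %s OFFSET %s",
                (batch, offset),
            )
            records = cursor.fetchall()
            if not records:
                break
            # process the current page of rows here
            offset += batch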

0 Answers