0

I want to fetch all rows from a table, using the following code:

table_row_count = 1000000
batch_size = 10000

sql = """SELECT t.*
FROM (
    SELECT ROWNUM AS row_num,
           sub_t.*
    FROM (
        SELECT name_employer
        FROM my_table
        WHERE section = 'OTHER'
    ) sub_t
) t
WHERE t.row_num BETWEEN :LOWER_BOUND AND :UPPER_BOUND"""

data = []
for lower_bound in range(0, table_row_count, batch_size):
    cursor.execute(sql, {'LOWER_BOUND': lower_bound, 
                         'UPPER_BOUND': lower_bound + batch_size - 1})

    for row in cursor.fetchall():
       data.append(row)

The original source of the code: cx_Oracle: fetchall() stops working with big SELECT statements

However, it is taking forever. My data has 5 mil. of rows. Is there any other way to do this?

  • Do you need to have all of the rows in memory? If so, why are you doing it in batches? If not, you can process some of the rows (or a single row at a time) and use a simple(r) query. – Anthony Tuininga Oct 25 '22 at 18:58

1 Answers1

1

For big result sets, increase arraysize. Try something like cursor.arraysize = 10000 and then tune the size to suit your data and performance requirements.

Refer to the Tuning cx_Oracle manual.

You may also want to look at the best practices in https://github.com/cjbj/cx-oracle-notebooks

Also see the latest cx_Oracle release announcement - it's time to upgrade to python-oracledb.

Christopher Jones
  • 9,449
  • 3
  • 24
  • 48