I have millions of records in a database, and I want to read them with Python and store them in a pandas DataFrame. The problem is that the SELECT query takes a very long time to process. To reduce the processing time I tried multi-threading: I created 3 threads and gave each thread its own query, like this:

select * from (select t.*, row_number() over (order by col1) as rn from table t) where rn % 3 = 0

select * from (select t.*, row_number() over (order by col1) as rn from table t) where rn % 3 = 1

select * from (select t.*, row_number() over (order by col1) as rn from table t) where rn % 3 = 2

Then I run each query in its own thread, using Python's threading package.
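The setup described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's actual code: the standard-library `sqlite3` module stands in for the JDBC/ODBC connection, the `demo` table and the helper names (`make_demo_db`, `fetch_part`, `fetch_all_parts`) are hypothetical, and the slices are taken with a modulo on an integer key rather than on a `row_number()` column to keep the demo small. Each worker opens its own connection, as the asker describes doing.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

N_PARTS = 3  # number of slices, one worker thread per slice


def make_demo_db(path, n_rows=30):
    """Create a small demo table (hypothetical stand-in for the real table)."""
    conn = sqlite3.connect(path)
    with conn:  # commits the inserts when the block exits
        conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, col1 TEXT)")
        conn.executemany(
            "INSERT INTO demo (id, col1) VALUES (?, ?)",
            [(i, f"v{i}") for i in range(n_rows)],
        )
    conn.close()


def fetch_part(path, part):
    """Fetch one slice of the table over a connection owned by this call."""
    conn = sqlite3.connect(path)
    try:
        cur = conn.execute(
            "SELECT id, col1 FROM demo WHERE id % ? = ?", (N_PARTS, part)
        )
        return cur.fetchall()
    finally:
        conn.close()


def fetch_all_parts(path):
    """Run the slice queries in parallel threads and merge the results."""
    with ThreadPoolExecutor(max_workers=N_PARTS) as pool:
        parts = list(pool.map(lambda p: fetch_part(path, p), range(N_PARTS)))
    # Flatten the per-slice lists into one list of rows.
    return [row for part in parts for row in part]


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
make_demo_db(db_path)
rows = fetch_all_parts(db_path)
```

The merged rows can then be passed to `pandas.DataFrame(rows, columns=["id", "col1"])`. Whether this is faster than a single query depends on where the time is actually spent (query execution on the server, network transfer, or row conversion in the driver), so it is worth timing each stage separately.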

But this does not reduce the time by much either.

Is there any other approach I can take to reduce the time needed to read the query results? Note: I have tried both JDBC and ODBC connections.

tryingToLearn
  • This might help - https://stackoverflow.com/questions/49658348/reading-large-tables-into-pandas-is-there-a-intermediate-step – Underoos May 31 '19 at 05:14
  • Thank you for looking into it. I am querying a virtualization tool named DENODO. One strange thing I noticed is that when I run the simple query select * from table, it takes the same time as my threaded queries (mentioned in the question). I am still not sure why this happens. I have created a different JDBC connection for each thread. – rishi kumar agarwal May 31 '19 at 17:08

2 Answers

The link below helped me: "Multiprocessing with JDBC connection and pooling". With it I can get around a 25% gain on my local machine.

You can use multi-threading only if the underlying database engine supports it, so you should check that first. For your question, I think the attached link will help you: see this. If the answer helps you, then help the community by selecting it as the best answer.

  • Thank you for looking into it. I am querying a virtualization tool named DENODO. One strange thing I noticed is that when I run the simple query select * from table, it takes the same time as my threaded queries (mentioned in the question). I am still not sure why this happens. I have created a different JDBC connection for each thread. – rishi kumar agarwal May 31 '19 at 17:04