
I am using Ubuntu. I have some management commands which, when run, do lots of database manipulation, so they take nearly 15 minutes.

My system monitor shows that my system has 4 CPUs and 6 GB of RAM. But this process is not utilising all the CPUs: I think it is using only one of them, and very little RAM as well. I think that if I can make it use all the CPUs and most of the RAM, the process will complete in much less time.

I tried `renice`, setting the priority to -18 (meaning very high), but it is still slow.

Details:

It's a Python script with a loop count of nearly 10,000, and nearly ten such loops. In every loop, it saves to a Postgres database.

user2139745
  • "lots of database manipulations" sounds like the bottleneck is I/O and not CPU. You could try using `ionice` to increase the I/O priority of your process, but without more information on what those database manipulations actually are it's only a rough guess. – robertklep Mar 28 '13 at 09:28
  • It's a Python script with a loop count of nearly 10,000, and nearly ten such loops. In every loop, it saves to a Postgres database. – user2139745 Mar 28 '13 at 09:33
  • It sounds like you could benefit from a refactoring of your code. – robertklep Mar 28 '13 at 09:36
  • Refactoring is a different story, but what I mean to say is: it is running slowly on my system, but faster on the server (which usually has a higher-spec configuration). But my laptop has 4 CPUs and 6 GB RAM and it is using only part of that. There must be a definite solution for this. – user2139745 Mar 28 '13 at 10:05
  • Did you try `ionice`? Also, MattWritesCode's answer contains useful suggestions. – robertklep Mar 28 '13 at 10:07
  • Do you have `DEBUG = True` in your settings? – Burhan Khalid Dec 02 '13 at 08:44

2 Answers


If you are looking to make this application run across multiple CPUs, then there are a number of things you can try depending on your setup.

The most obvious thing that comes to mind is making the application use threads and multiple processes. This will allow the application to "do more" at once. The issue you might have here is concurrent database access, so you might need to use transactions (at which point you might lose the advantage of using multiple processes in the first place).
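As a minimal sketch of the multiprocess approach (the function names here are hypothetical stand-ins for your real per-record work, and each worker should open its own database connection rather than share one):

```python
import multiprocessing as mp

def process_record(record_id):
    # Stand-in for the real per-record work; in the actual management
    # command this would be the computation done before saving to Postgres.
    # (Do NOT share one database connection across workers: each worker
    # process should open its own connection.)
    return record_id * 2

def run_in_parallel(record_ids, workers=4):
    # Spread the loop across several worker processes so all CPUs are used.
    # "fork" start method is fine on Linux/Ubuntu.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(process_record, record_ids)
```

With one worker per CPU, a CPU-bound loop can approach a 4x speed-up on a 4-core machine, though if the real bottleneck is database I/O the gain will be smaller.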

Secondly, make sure you are not opening and closing lots of database connections; ensure your application holds the connection open for as long as it needs it.
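To illustrate the connection-reuse point (sqlite3 stands in for Postgres here so the sketch runs anywhere; the table and values are made up):

```python
import sqlite3

def save_all_reusing_connection(values):
    # Open the connection ONCE and reuse it for every write, instead of
    # connecting and disconnecting inside the loop.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, val TEXT)")
    for v in values:
        conn.execute("INSERT INTO log (val) VALUES (?)", (v,))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    conn.close()
    return count
```

Connection setup is much more expensive with a networked database like Postgres than with in-process sqlite, so the saving there is larger than this sketch suggests.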

Thirdly, ensure the database is correctly indexed. If you are doing searches on large strings then things are going to be slow.

Fourthly, do everything you can in SQL, leaving as little manipulation as possible to Python; SQL is extremely quick at data manipulation if you let it do the work. As soon as you start pulling data out of the database and into application code, things slow down considerably.
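For example, a single `UPDATE` statement replaces a whole fetch-modify-save loop (sqlite3 stands in for Postgres so the sketch is runnable; the table and the 10% increase are invented for illustration):

```python
import sqlite3

def apply_price_increase():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO prices (amount) VALUES (?)",
                     [(10.0,), (20.0,), (30.0,)])
    # One SQL statement manipulates every row at once, instead of
    # fetching each row into Python, changing it, and saving it back.
    conn.execute("UPDATE prices SET amount = amount * 1.1")
    total = conn.execute("SELECT SUM(amount) FROM prices").fetchone()[0]
    conn.close()
    return total
```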

Fifthly, make use of stored procedures, which can be cached and optimised internally by the database. These can be a lot quicker than application-built queries, which cannot be optimised as easily.

Sixthly, don't save on each iteration of the loop. Produce a batch-style job in which you alter a number of records and then save all of them in one batch. This reduces the amount of I/O per iteration and speeds up the process massively.

Django does support bulk operations (e.g. `bulk_create`), and there was also a question on Stack Overflow a while back about saving multiple Django objects at once.
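A runnable sketch of the batching idea (sqlite3's `executemany` stands in here; in Django the equivalent would be collecting unsaved model instances and passing them to `Model.objects.bulk_create()` instead of calling `save()` in the loop):

```python
import sqlite3

def save_in_batches(records, batch_size=1000):
    # One multi-row insert per batch instead of one write per record.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, val TEXT)")
    with conn:  # commit once per job, not once per row
        for start in range(0, len(records), batch_size):
            batch = records[start:start + batch_size]
            conn.executemany("INSERT INTO results (val) VALUES (?)", batch)
    count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    conn.close()
    return count
```

Cutting ~100,000 per-row commits down to a handful of batch commits is usually the single biggest win for a job like the one described in the question.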

Matt Seymour

Just in case: did you run `renice -20 -p {pid}` instead of `renice --20 -p {pid}`? In the first case, the process will be given the lowest priority, not the highest.

kta