11

Recently I developed a billing application for my company with Python/Django. For a few months everything was fine, but now I am observing that performance is dropping as more and more users are using the application. The application has become very critical for the finance team, and they are after me to sort out the performance issue. I have no option but to find a way to increase the performance of the billing application.

So do you know of any performance optimization techniques in Python that will really help me with this scalability issue?

We are using a MySQL database, and the application is hosted on an Apache web server on a Linux box. Also, what I have noticed is that the overall application is slow, not just the database transactions. For example, once the application is loaded it works fine, but navigating to another link within the application takes a whole lot of time.

And yes, we are using HTML, CSS and JavaScript.

fear_matrix
  • Before giving you any technique, we must know what the bottleneck of your application is. – Bite code Mar 30 '10 at 14:10
  • What is the rest of your architecture? mod_wsgi? Python? MySQL? SQLite? – S.Lott Mar 30 '10 at 15:22
  • What is the rest of your architecture? mod_wsgi? mod_python? How is your static content served? How many python processes are attached to Apache? – S.Lott Mar 30 '10 at 18:48

9 Answers

11

As I said in the comments, you must start by finding out what part of your code is slow.

Nobody can help you without this information.

You can profile your code with the Python profilers, then come back to us with the results.
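
For example, here is a minimal cProfile sketch; slow_function is a hypothetical stand-in for whatever code path you suspect (in a Django app you would point it at a view or helper function):

    import cProfile
    import pstats

    def slow_function():
        # placeholder for the code you suspect is slow
        return sum(i * i for i in range(10 ** 6))

    cProfile.run('slow_function()', 'profile.out')
    stats = pstats.Stats('profile.out')
    stats.sort_stats('cumulative').print_stats(10)  # top 10 calls by cumulative time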

If it's a web app, the first suspect is generally the database. If it's a calculation-intensive GUI app, then look at the calculation algorithms first.

But remember that performance issues can be highly unintuitive and therefore an objective assessment is the only way to go.
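
If the database does turn out to be the suspect, Django can show you every query a request triggered via django.db.connection.queries (it is only populated when DEBUG = True, so use it in development). A rough sketch:

    from django.db import connection

    def dump_queries():
        # connection.queries is only populated when settings.DEBUG is True
        for q in connection.queries:
            print('%s  %s' % (q['time'], q['sql']))
        print('total queries: %d' % len(connection.queries))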

Bite code
  • True, with one caveat. You said "finding out what part of your code is slow". There's nothing to indicate, yet, that *his code* is slow. We only know that the *system* is slow. If you were, say, running a Python web app on a computer with 32MB of RAM and 100K users, you could "optimize" your Python code until you're blue in the face, but the system will still be slow. – Ken Mar 30 '10 at 16:54
  • Lol. So true. It reminds me of the story of the guy who spent six months so his program would consume only 256 MB of RAM. He proudly introduced the improvement to management, which responded: "Why didn't you just buy 1 GB of RAM?" – Bite code Mar 30 '10 at 18:56
  • I walked into a meeting of 5 DBAs debating ways to save 100 GB in a data warehouse project. I said we could all drive down to the Apple Store and buy 1 TB of disk for less than what this meeting was costing the company. Intense effort spent in the wrong place is time wasted. – S.Lott Mar 30 '10 at 19:58
6

OK, this is not entirely to the point, but before you go and start fixing things, make sure everyone understands the situation. It seems to me that they're putting some pressure on you to fix the "problem".

Well, first of all: when you wrote the application, did they specify the performance requirements? Did they tell you that they need operation X to take less than Y seconds to complete? Did they specify how many concurrent users must be supported without a penalty to performance? If not, then tell them to back off, and that this was iteration (phase, stage, whatever) one of the deployment, where the main goal was functionality and testing. Phase two is performance improvements. Let them (with your help, obviously) come up with some non-functional requirements for the performance of your system.

By doing all this: a) you'll remove the pressure applied by the finance team (and I know they can be a real pain in the bum), b) both you and your clients will have a clear idea of what you mean by "performance", c) you'll have a baseline against which you can measure your progress, and most importantly d) you'll have some agreed time to implement/fix the performance issues.

PS: That aside, look at the indexing... :)
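
On the indexing point: in Django that can be as simple as flagging the columns you filter and order by. A hypothetical sketch (Invoice and its fields are made-up names, not from the question):

    from django.db import models

    class Invoice(models.Model):
        # db_index=True makes Django create an index on this column in MySQL,
        # which speeds up filters such as Invoice.objects.filter(customer_id=...)
        customer_id = models.IntegerField(db_index=True)
        created_on = models.DateField(db_index=True)
        amount = models.DecimalField(max_digits=10, decimal_places=2)

Note that adding an index to a table that already exists also means changing the schema in MySQL itself (typically a manual ALTER TABLE, since Django will not alter existing tables for you).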

rytis
  • +1 You should ask your users what they want you to improve. Plus, you must make them understand that performance is a feature and must be treated as such. – Bite code Mar 30 '10 at 14:38
4

A surprising feature of Python is that pythonic code is quite efficient... So, a few general hints:

  • Use built-ins and standard functions whenever possible, they're already quite well optimized.
  • Try to use lazy generators instead of one-off temporary lists (see the sketch below).
  • Use numpy for vector arithmetic.
  • Use psyco if running on x86 32bit.
  • Write performance critical loops in a lower level language (C, Pyrex, Cython, etc.).
  • When calling the same method on a collection of objects, get a reference to the class function and use it; it will save lookups in the objects' dictionaries (this one is a micro-optimization, and I'm not sure it's worth it).

And of course, if scalability is what matters:

  • Use O(n) (or better) algorithms! Otherwise your system cannot be linearly scalable.
  • Write multiprocessor aware code. At some point you'll need to throw more computing power at it, and your software must be ready to use it!
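
A toy sketch of the generator point above (my example, not from the original list):

    # Builds the whole list in memory before summing:
    total = sum([i * i for i in range(1000000)])

    # Generator expression: items are produced lazily, one at a time:
    total = sum(i * i for i in range(1000000))

    # The same idea as a generator function:
    def squares(n):
        for i in range(n):
            yield i * i

    total = sum(squares(1000000))
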
fortran
  • The last one is a good point in the general case, but unfortunately you cannot write multiprocessor-aware code in Python because of its internal structure. Also, this is what prevents Python from being my preferred language :( – Dacav Mar 30 '10 at 16:55
  • @Dacav - You can certainly write multiprocessor-aware code in Python, and in pretty much any language. It's a lot more awkward in Python than it needs to be due to the GIL, but if you stop thinking of threads and shared memory and instead think of processes and message passing, your problems are significantly diminished (see the sketch after these comments). – Kylotan Mar 31 '10 at 10:56
  • That's it: a single Python instance cannot run threads with real parallelism, but you can have different processes cooperating. There are modules to ease this job, such as PP (http://www.parallelpython.com/), plus you have bindings for MPI and CORBA too. – fortran Mar 31 '10 at 11:18
  • @Kylotan, @fortran Thanks guys, I'll have a look at it. – Dacav Mar 31 '10 at 12:02
  • WSGI apps work in Apache MPM-worker mode: it runs n separate worker processes. Each worker can serve requests via threads. A worker can have several threads alive at a time. PHP runs in prefork: a process can be in two states: idle and serving (one request). The threads help in IO/DB intensive apps, the processes help in case of computing-intensive work (the GIL causes problems with these only, AFAIK). – Jesvin Jose Dec 24 '11 at 05:08
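
To illustrate the "processes and message passing" point from the comments above, here is a minimal sketch using the standard multiprocessing module (available since Python 2.6); crunch is just a made-up stand-in for a CPU-bound task:

    from multiprocessing import Pool

    def crunch(n):
        # stand-in for a CPU-bound task; the GIL is not a problem here
        # because each call runs in a separate worker process
        return sum(i * i for i in range(n))

    if __name__ == '__main__':
        pool = Pool(processes=4)                   # four worker processes
        results = pool.map(crunch, [10 ** 5] * 8)  # work is distributed across them
        pool.close()
        pool.join()
        print('results: %s' % results)
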
2

Before you can "fix" something you need to know what is "broken". In software development that means profiling, profiling, profiling. Did I mention profiling? Without profiling you don't know where CPU cycles and wall-clock time are going. Like others have said, to get more useful information you need to post the details of your entire stack: Python version, what you are using to store the data (MySQL, PostgreSQL, flat files, etc.), which web server interface (CGI, FastCGI, WSGI, Passenger, etc.), and how you are generating the HTML, CSS and, I assume, JavaScript. Then you can get more specific answers for those tiers.
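
If you want a very quick first pass before reaching for a full profiler, even a crude wall-clock timer around suspect functions will tell you something. A minimal sketch (timed and generate_invoice_page are illustrative names):

    import time
    from functools import wraps

    def timed(func):
        # prints how long each call to func takes, in wall-clock seconds
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                print('%s took %.3f s' % (func.__name__, time.time() - start))
        return wrapper

    @timed
    def generate_invoice_page():
        time.sleep(0.2)  # stand-in for a slow view or query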

1

You may be interested in this document I found some time ago. As personal advice, be as pythonic as you can: lazy evaluation is the keyword, so learn to use iterators and generators.

Dacav
0

For the type of application you are describing (a web application probably backed by a database) your performance problems are unlikely to be language specific. They are far more likely to stem from design or architecture issues, though they could be simple coding problems too.

To sort this out you need to figure out where the bottlenecks are in your application and for that you need some sort of profiler.

Once you have found your bottlenecks you will be in a much better position. You can then evaluate the problem areas for common issues.

The specifics of any solution are going to depend on the specifics of the bottlenecks you find.

Tendayi Mawushe
0

http://wiki.python.org/moin/PythonSpeed/PerformanceTips

I optimized some Python code a while back, and the most surprising thing to me was how much each function call costs. If you minimize function calls or replace loops with built-ins, you'll be running much faster.
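
A small illustration of the "replace loops with built-ins" point (a toy example of mine, not from the linked page):

    values = range(1000000)

    # Explicit loop: every iteration is executed as interpreter bytecode
    total = 0
    for v in values:
        total += v

    # Built-in: the loop runs in C, avoiding the per-iteration overhead
    total = sum(values)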

Dan Olson
0

There are some great suggestions here… So let me suggest an implementation detail: I have found the runprofileserver command from django-command-extensions very convenient for profiling my Django code.
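
Basic usage is running it in place of the normal development server (the options for where the profile data gets written vary by version, so check the extension's own help):

    python manage.py runprofileserver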

David Wolever
0

I am not sure if this would solve the problem, but you should have a look at psyco.
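
If you do try it (note that it only works on 32-bit x86 and Python 2.x), the basic usage is a couple of lines at program start-up. A minimal sketch:

    try:
        import psyco
        psyco.full()   # JIT-compile as much of the program as possible
    except ImportError:
        pass           # psyco not installed; run unmodified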

anijhaw