2

I have a web server program written in Java, and my boss wants it to run faster.

I've always been happy if it ran without errors, so efficiency is new to me.

I tried a profiler, but it crashed my computer and turned out to be a dead open-source project.

I have no idea what I am doing beyond reading a few questions on here. I see that refactoring code is the best option, but I'm not sure how to go about that, and I gather that I need a profiler to see which code to refactor.

So does anyone know of a free profiler that I can use? I'm using Java and Eclipse. If possible, some instructions or a link to easy instructions would be great.

But what I really want, if anyone can give it, is a basic introduction to the subject, so I can understand enough to go and do in-depth research on it and get the best results.

I am a complete beginner when it comes to optimising code, and the subject seems very complex from what I have seen so far; any help with how to get started would be greatly appreciated.

I'm new to Java as well, so saying things like "check garbage collection" would mean nothing to me; I'd need a more detailed explanation.

EDIT: the program uses Tomcat for the networking. It connects to an SQL database. The main function is a polling loop which checks all attached devices on the network, reads events from them, writes the events to the database, and then performs the event functions.

I am trying to improve the polling loop. The program is heavily multithreaded and uses a lot of interfaces and proxies, so it is hard to see where the code goes the farther you get from the polling loop.

I hope this information helps you offer solutions. Also, I did not build it; I inherited the code.

Skeith
  • 2,512
  • 5
  • 35
  • 57
  • 6
    Rules of Optimization: Rule 1 - Don’t do it. Rule 2 (for experts only) - Don’t do it yet – Tom Squires Aug 31 '11 at 12:25
  • Start by telling people your *exact* environment. For example: what application server do you use? Tomcat? Websphere? WebLogic? Then, how is your application deployed to the server. Can you run it from within Eclipse? Can you generate a realistic load in a development environment? – parsifal Aug 31 '11 at 12:26
  • 1
    And, by the way, which profiler have you used? As far as I know, in the Java world VisualVM is the de facto standard. – om-nom-nom Aug 31 '11 at 12:27
  • 1
    It's axiomatic that you always optimize the wrong thing. The code itself is rarely the bottleneck -- it's usually I/O of some sort. – Hot Licks Aug 31 '11 at 12:27
  • When you say web server program do you mean that it is an actual web server, a service running in a web server, etc. Could you give a little more information about what this program does? – John Kane Aug 31 '11 at 12:29
  • And, the first (OK, maybe third) rule of optimization is simply: Don't do stupid things. A classic is repeatedly recreating a string as you add and subtract characters, vs using a StringBuffer. – Hot Licks Aug 31 '11 at 12:29

9 Answers

2

First of all, detect the bottlenecks. There is no point in optimizing a method from 500 ms to 400 ms when there is a method running for 5 seconds that should run for 100 ms.

You can try using VisualVM as a profiler; it is bundled with the JDK.

Petar Minchev
  • 46,889
  • 11
  • 103
  • 119
1

If you want a free profiler, use VisualVM, which comes with Java. It is likely to be enough.

You should ask your boss exactly what he would like to go faster. There is no point optimising random pieces of code he/she might not care about. (It's easily done.)

You can also log key points in your task/request to determine what it spends the most time doing.
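As a minimal sketch of that kind of key-point logging (the stage names and the sleeps standing in for real work are invented for illustration):

```java
// Minimal sketch: time each major stage of a request with
// System.currentTimeMillis and print the result. Stage names are placeholders.
public class StageTimer {
    public static long timeStage(String name, Runnable work) {
        long start = System.currentTimeMillis();
        work.run();
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(name + " took " + elapsed + " ms");
        return elapsed;
    }

    public static void main(String[] args) {
        timeStage("read device events", () -> {
            try { Thread.sleep(30); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        timeStage("write events to database", () -> {
            try { Thread.sleep(70); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
    }
}
```

Whichever stage consistently dominates the total is the one worth pointing a profiler at.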

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • I figured that if I got the whole program running more efficiently, then every action would benefit. Am I wrong? – Skeith Aug 31 '11 at 12:35
  • That is true, but it could mean 10-20x times the work compared with only optimising the key piece of software which matters. You might have to only re-write one method, instead of re-writing your whole application. ;) – Peter Lawrey Aug 31 '11 at 12:38
  • @Skeith: Ever heard of the 80-20 rule? When optimizing software, there's an offspin of that rule, namely "80% of the time is spent in 20% of the code" and the basic idea rarely completely wrong (and if it is, you can only know it by proving it with a profiler). While you can try optimizing some random piece of code, the chances that speeding it up by 100% has any significant effect on the overall performance are rather bad. Speeding up some of those 20% has a much greater effect, and as it's also less code to improve, it's likely much more rewarding. –  Aug 31 '11 at 12:39
  • @Skeith, I once improved the performance of a major web site by 10% by changing one argument. It was performing a search and sorting by Date. The Date was initialised using `new Date()` and `setTime()`. By just changing that line to `new Date(0)` (which doesn't look up the current time only to have it ignored later), the program ran 10% faster. I would never have found this without a profiler. – Peter Lawrey Aug 31 '11 at 12:48
0

If you're new to Java, then optimization sounds like a bad idea. It's very easy to get wrong, and it's not trivial to rewrite code and keep all the outputs the same while changing the inner workings.

Possibly have a look at your stored procedures and replace any IN statements with an INNER JOIN. That's a fairly low-risk, high-reward way of speeding things up.

Tom Squires
  • 8,848
  • 12
  • 46
  • 72
0
    Start by identifying the time taken by the various steps in your application (use logging to measure this). Notice if there is anything unusual.
    Step into each of these steps to see if there are any bottlenecks. Identify whether something can be cached to save a DB call. Identify whether there is scope for parallelism by breaking your tasks down into independent units.

Hopefully you have some unit/integration tests to ensure you don't accidentally break anything.
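The caching idea can be sketched with a plain in-memory map. The class name and the loadFromDatabase stand-in below are invented for illustration; they are not part of the original application:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of read-through caching: keep rarely-changing lookup data
// (e.g. the types of sports in a sports application) in memory instead of
// querying the database on every request.
public class SportTypeCache {
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    private int databaseCalls = 0;

    // Placeholder standing in for a real SQL query.
    private String loadFromDatabase(int id) {
        databaseCalls++;
        return "sport-" + id;
    }

    public String get(int id) {
        // Only computes (i.e. hits the "database") on the first miss.
        return cache.computeIfAbsent(id, this::loadFromDatabase);
    }

    public int getDatabaseCalls() {
        return databaseCalls;
    }

    public static void main(String[] args) {
        SportTypeCache cache = new SportTypeCache();
        cache.get(1);
        cache.get(1);
        System.out.println("database calls: " + cache.getDatabaseCalls()); // prints "database calls: 1"
    }
}
```

The first lookup pays the database cost; repeats are served from memory. This only suits data that is small and changes rarely.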

Scorpion
  • 3,938
  • 24
  • 37
  • 1. How can logging tell the time taken? 2. I don't know how to "step into". 3. What is "can be cached"? – Skeith Aug 31 '11 at 12:33
  • This depends on your application's logging. For any decent server application, the log should show the time taken for important steps. By "step into", I meant look into the code of the modules/functionality taking more time than the others, i.e. the suspect code. "Cached" means keeping in memory: if there is an entity which is not huge in number (say, the types of sports in a sports application), you can keep the data in memory rather than making DB calls. Hope it helps. – Scorpion Aug 31 '11 at 12:37
  • 1
    I think by logging he means that you calculate the time yourself, then print it either to the console or a log file. Step into means look over the steps which took a long time according to the logger. Caching would be to save intermediate results for reuse later on. –  Aug 31 '11 at 12:39
0
  1. Measure (with a profiler - as others suggested, VisualVM is good) and locate the spots where your program spends most of its time.
  2. Analyze the hot spots and try to improve their performance.
  3. Measure again to verify that your changes had the expected effect.
  4. If needed, repeat from step 1.
Péter Török
  • 114,404
  • 31
  • 268
  • 329
0

Start very simple.

  • Make a list of what's slow from a user perspective.
  • Try to do some high-level profiling yourself. Maybe an interceptor that prints the run time for your actions.
  • Then profile only those actions, with start time = System.currentTime...
  • This easy approach could be a starting point for more advanced profiling, and if you're lucky it may fix your problems.
  • How would I do the interceptor thing? That sounds promising. – Skeith Aug 31 '11 at 12:43
  • It's a whole topic unfortunately, and something you will have to Google, but the principle is code that automatically runs before and after you process a request, so the time from the start interceptor to the end interceptor will be your profiling time. If you are familiar with aspects, it's something like that. –  Aug 31 '11 at 12:47
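The before-and-after principle described in the comments can be sketched without any framework as a plain wrapper method (the action name and body below are placeholders):

```java
import java.util.function.Supplier;

// Sketch of the interceptor idea: code that runs before and after an action,
// here used to record how long the action took.
public class TimingInterceptor {
    public static <T> T intercept(String action, Supplier<T> body) {
        long start = System.currentTimeMillis();   // runs before the action
        try {
            return body.get();
        } finally {                                // runs after the action
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(action + ": " + elapsed + " ms");
        }
    }

    public static void main(String[] args) {
        // Placeholder action standing in for a real request handler.
        String result = intercept("poll devices", () -> "42 events");
        System.out.println(result);
    }
}
```

A framework interceptor (or an aspect) does the same thing automatically for every request instead of requiring an explicit wrapper call.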
0

Before you start optimizing, you have to understand your workload, and you have to be able to recreate that workload. One easy way to do that is to log all requests, in production, with enough detail that you can recreate the requests in a development environment.

At the same time that you log your load, you can also log the performance of those requests: the time from the start of the request to the end. One way to do that (and, incidentally, to capture the data needed to log the request) is to add a servlet filter into your stack.
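A real Tomcat deployment would implement this as a servlet filter; to stay self-contained, the sketch below illustrates the same per-request timing idea with the JDK's built-in HttpServer (the handler logic and response body are invented):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

// Logs the wall-clock time of each request as it is served, the same way a
// servlet filter would wrap the rest of the processing chain.
public class RequestTimingDemo {
    public static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            long begin = System.currentTimeMillis();           // before the request
            byte[] body = "ok".getBytes();                     // placeholder response
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
            long elapsed = System.currentTimeMillis() - begin; // after the request
            System.out.println(exchange.getRequestURI() + " took " + elapsed + " ms");
        });
        server.start();
        return server;
    }

    // Starts the server, issues one local request, and returns the status code.
    public static int demo() {
        try {
            HttpServer server = start();
            int port = server.getAddress().getPort();
            HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:" + port + "/").openConnection();
            int status = conn.getResponseCode();
            server.stop(0);
            return status;
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("status: " + demo());
    }
}
```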

Then you can start to think about optimization.

  1. Establish performance goals. Simply saying "make it faster" is pointless. Instead, you need to establish goals such as "all pages should respond within 1.5 seconds, as long as there are fewer than 100 concurrent users."
  2. Identify the requests that fail your performance goals. Focus on the biggest failure first.
  3. Identify why the request takes so long.

To do #3, you need to be able to recreate load in a development environment. Then you can either use a profiler or simply add trace-level logging to your application to find out how long each step of the process takes.


There is also a whole field of holistic optimization, of which garbage collection tuning is probably the most important. But again, you need to establish and replicate your workload, otherwise you'll be flailing.

parsifal
  • 1,507
  • 8
  • 7
0

EDIT: the program uses Tomcat for the networking. It connects to an SQL database. The main function is a polling loop which checks all attached devices on the network, reads events from them, writes the events to the database, and then performs the event functions.

I am trying to improve the polling loop. The program is heavily multithreaded and uses a lot of interfaces and proxies, so it is hard to see where the code goes the farther you get from the polling loop.

This sounds like a heavily I/O-bound application. There really isn't much that you can do about that, because I/O-bound applications aren't using the CPU inefficiently; they're stuck waiting for I/O operations on other devices to complete.

FWIW, this scenario is actually why a lot of big companies are contemplating moving toward cheap, ARM-based solutions. They're wasting a lot of power and resources on powerful x86 CPUs that get underutilized while their code sits there waiting for a remote MySQL or Oracle server to finish doing its thing. With such an application, why throw more CPU than you need?

Mike Thomsen
  • 36,828
  • 10
  • 60
  • 83
0

When starting to optimize an application, the main risk is trying to optimize every step, which often does not improve the program's efficiency as expected and results in unmaintainable code.

It is likely that 80% of the execution time of your program is caused by a single step, which is itself only 20% of the code base.

The first thing to do is to identify this bottleneck. For example, you can log timestamps (using System.nanoTime and/or System.currentTimeMillis and your favorite logging framework) to do this.
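For example (the loop below is just a placeholder for the suspect step; System.nanoTime is monotonic, which makes it the safer choice for measuring elapsed time):

```java
import java.util.concurrent.TimeUnit;

// Sketch of timestamp logging around a suspected bottleneck.
public class BottleneckTiming {
    public static long runStepAndReportMillis() {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {  // placeholder for the real step
            sum += i;
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("step took " + elapsedMs + " ms (sum=" + sum + ")");
        return elapsedMs;
    }

    public static void main(String[] args) {
        runStepAndReportMillis();
    }
}
```

Comparing these per-step timestamps across a full run shows which step deserves the profiler's attention.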

Once the step has been identified, try to write a test class which runs this step, and run it with a profiler. I have had good experience with both HPROF (http://java.sun.com/developer/technicalArticles/Programming/HPROF.html), although it might require some time to get familiar with, and the Eclipse Test and Performance Tools Platform (http://www.eclipse.org/tptp/). If you have never used a profiler, I recommend you start with Eclipse TPTP.

The execution profile will help you find out in which methods your program spends its time. Once you know them, look at the source code and try to understand why it is slow. It might be because (this list is not exhaustive):

  • unnecessary costly operations are performed,
  • a sub-optimal algorithm is used,
  • the algorithm generates lots of objects, thus giving a lot of work to the garbage collector (especially true for objects which have a medium to long life expectancy).

If there is no visible defect in the code, then you might consider :

  • making the algorithm more parallel in order to leverage all your CPUs
  • buying faster hardware.

Regarding JVM options, the two most important ones for performance are :

  • -server, in order to use the server VM (enabled by default depending on the hardware) which provides better performance at the price of a slower startup (http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client),
  • -Xms and -Xmx, which define the heap size available on startup and the maximum amount of memory that the JVM can use. If the JVM is not given enough memory, garbage collection will use a lot of your CPU resources, slowing down your program; however, if the JVM already has enough memory, increasing the heap size will not improve performance and might even cause longer GC pauses. (http://stackoverflow.com/questions/1043817/speed-tradeoff-of-javas-xms-and-xmx-options)

Other parameters usually have a lower impact; you can consult them at http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html.
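For illustration only (the heap sizes and jar name here are made up; appropriate values depend entirely on your hardware and workload):

```shell
# Hypothetical launch: server VM, 512 MB initial heap, 2 GB maximum heap.
java -server -Xms512m -Xmx2g -jar myserver.jar

# Under Tomcat, the same flags would typically be set via CATALINA_OPTS instead:
# export CATALINA_OPTS="-server -Xms512m -Xmx2g"
```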

jpountz
  • 9,904
  • 1
  • 31
  • 39
  • Could you explain your last paragraph more, please? I don't understand what it is talking about. Thank you for such a detailed answer. – Skeith Aug 31 '11 at 13:20