3

I have a multithreaded program whose thread count is configurable. The program generates HTTP requests, sends them to a web service, receives the responses, and parses each response to extract a result value.

Since each request takes almost 1 second to get a response, multiple threads are started so that the program can get as many responses as possible.

The following code is used to start the threads:

    ...
    for (int i = 0; i < threadNum; i++) {
        threadArray[i] = new Thread(requestGeneratorArray[i]);
        threadArray[i].start();
    }

    for (int i = 0; i < threadNum; i++) {
        try {
            threadArray[i].join();
        } catch (InterruptedException e) {
            logger.error(e);
        }
    }

...

With 50 threads and a total of 10K requests generated and sent, the program works well. With 50 threads and a total of 100K requests, the program hung after 95K+ requests had been sent. No exception was thrown; the program just hung.

I then added a few JVM arguments, like this: java -Xmx1024M -Xms512M -XX:MaxPermSize=256m ... With these arguments, 50 threads / 100K requests worked. However, 50 threads / 1M requests hung again. When I set the thread count to 20 with 1M requests, it worked again.

I want to keep the thread count at 50, since my tests with a smaller number of requests (10K) showed that 50 threads gives the best throughput. The number of requests could be much larger: 10M, 100M, even 1B. In those cases, it would not be a good idea to keep increasing -Xmx, -Xms, or MaxPermSize. What should I do? What is the root cause of the hang?

===========================================================

I switched to an ExecutorService instead of using threads directly, but the problem still occurred. I rechecked the code and found a mistake: I was instantiating an HttpClient object for each request. I changed the code to instantiate one HttpClient instance per thread, and the program doesn't hang anymore.

I think the root cause of the hang is that the program set up too many HTTP connections to the web service and used up all of the web service's threads, so the web service could not answer any newly arrived requests. Is that correct?

JuliaLi
  • 327
  • 1
  • 5
  • 16
  • What do you mean by "pending"? Do you mean hanging? There are a number of things that could affect that, not least that the webserver you are connecting to may have a maximum number of concurrent connections, meaning most of the threads in your program are waiting for the webserver to start talking to them. – John Farrelly Jan 01 '13 at 08:41
  • You could make your life easier by using a built-in thread pool, such as one of those provided by the [Executors factory](http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html) instead of managing one yourself (but it would probably not solve your issue). – assylias Jan 01 '13 at 08:50
  • How do you send the requests? How many threads are sending and how many threads are parsing the requests? – Olaf Dietsche Jan 01 '13 at 08:50
  • If you want your application to scale to an arbitrary number of requests, perhaps you might wish to consider using a [thread pool](http://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html), instead of spawning a new thread for each request. IMHO... – paulsm4 Jan 01 '13 at 08:54
  • @JohnFarrelly: Yes, it ought to be "hanging" instead of "pending". I have checked with the web service developer and confirmed that there is no limit on the number of connections. – JuliaLi Jan 01 '13 at 09:59
  • @OlafDietsche: I sent the requests with HttpClient. And each thread provides the same function -- it sends requests and receives response. – JuliaLi Jan 01 '13 at 10:01
  • @assylias Thanks for your suggestion. I would try the thread pool. Thanks! – JuliaLi Jan 01 '13 at 10:02
  • @paulsm4 Thanks. I would try the thread pool. – JuliaLi Jan 01 '13 at 10:02

3 Answers

1

It's hard to tell from this information alone, but since your heap setting affects the result, my bet is on poor scheduling between the generation of content and the parsing (storing) of content.

One frequent scenario in this kind of application is that the threads that generate content do so at a faster rate than the threads that take that content and store it away. This gradually increases the amount of heap memory used to hold the content in memory, and at some point throughput starts to plummet.

The first thing to do is to confirm this hypothesis by attaching a heap viewer like VisualVM. If your heap usage gradually increases and then stays pegged at a high level while your throughput decreases, this is likely the culprit. (You could also confirm that the objects filling your memory are indeed the generated content.)
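If attaching VisualVM is inconvenient, heap usage can also be sampled from inside the program. The following is only a minimal sketch using the standard java.lang.management API; the class name HeapSampler and the one-second sampling interval are assumptions made for illustration, not part of the original program.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapSampler {
    /**
     * Returns the fraction of the maximum heap currently in use,
     * or -1.0 if the JVM does not report a maximum heap size.
     */
    public static double heapUsedFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; a real program might do this from a daemon thread.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heap used: %.1f%%%n", heapUsedFraction() * 100);
            Thread.sleep(1000);
        }
    }
}
```

A value that climbs steadily and then stays close to 100% while throughput drops would be consistent with the hypothesis above.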

Usually, the bottleneck is the I/O of the persistence layer used to store the content. You can have a CPU bottleneck in the parsing code (or elsewhere) depending on what your code is doing, but this is generally rare.

The most common remedy for this situation is to use a bounded queue to make the generation process wait for the parsing (storing) process to catch up. Have a look at this SO answer: How to make ThreadPoolExecutor's submit() method block if it is saturated?. You will have to learn about thread pools, but they really are a vast improvement over raw threads, and they are the cleanest way to deal with this kind of problem.
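One way to get this blocking behaviour is to guard a fixed thread pool with a semaphore, so that the producer waits whenever too many tasks are queued or running. This is a rough sketch of that idea and not necessarily the approach taken in the linked answer; the class BoundedSubmitter, the runDemo driver, and the pool and limit sizes are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubmitter {
    private final ExecutorService pool;
    private final Semaphore permits;

    public BoundedSubmitter(int threads, int maxInFlight) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the caller while maxInFlight tasks are already queued or running. */
    public void submit(Runnable task) throws InterruptedException {
        permits.acquire();
        try {
            pool.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free a slot once the task has finished
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // the task was rejected and will never release its permit
            throw e;
        }
    }

    /** Stops accepting tasks and waits for the submitted ones to finish. */
    public boolean shutdownAndWait(long timeout, TimeUnit unit) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeout, unit);
    }

    /** Hypothetical driver: runs `total` trivial tasks and returns how many completed. */
    public static int runDemo(int total) {
        BoundedSubmitter submitter = new BoundedSubmitter(4, 8);
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < total; i++) {
                submitter.submit(done::incrementAndGet);
            }
            submitter.shutdownAndWait(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while submitting", e);
        }
        return done.get();
    }
}
```

With this shape, the number of pending tasks held in memory stays at or below maxInFlight no matter how many requests are generated in total.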

Enno Shioji
  • 26,542
  • 13
  • 70
  • 109
  • If the heap was filled to capacity, wouldn't this trigger an OutOfMemoryError? – John Dvorak Jan 01 '13 at 09:41
  • @Jan Dvorak: I'm not sure how it works but I've seen apps surviving for quite a long time before they die from primarily "GC overhead reached" OOME. Perhaps the JVM GC has some strategy to use when heap becomes scarce and in some situation it works for a long time? But I'm just guessing. – Enno Shioji Jan 01 '13 at 10:08
  • The "grinding to a halt" behavior is not unknown to me. However, from what I understand of the description, the application halts abruptly, not gradually. This seems to indicate lost tasks, not 99.9% of the time spent on GC. – John Dvorak Jan 01 '13 at 10:12
  • 1
    @JanDvorak You are right. The program actually halts abruptly. But no exception indicates that any task was lost. What should I do to confirm whether any task was lost? – JuliaLi Jan 02 '13 at 09:37
  • @JuliaLi: First thing I would do is to take a thread dump to see what's going on. – Enno Shioji Jan 02 '13 at 10:13
1

+1 to all the suggestions to use ExecutorService.

You've also mentioned that you're using HttpClient. There are a few configuration parameters that will make HttpClient faster for concurrent usage.

Regarding the program hanging: it could be caused either by deadlocks or by long garbage collector runs. I believe tools like jconsole and VisualVM could help you debug both of these scenarios.
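Both conditions can also be checked programmatically through the standard management beans, which is roughly the information jconsole displays. The sketch below is only an illustration; the class name HangDiagnostics is made up, and findDeadlockedThreads() may throw UnsupportedOperationException on a JVM that does not support monitoring of ownable synchronizers.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class HangDiagnostics {
    /** Returns the number of threads currently involved in a deadlock (0 if none). */
    public static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    /** Returns the approximate accumulated GC time in milliseconds since JVM start. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

If the deadlock count stays at zero and the GC time is not growing while the program is stuck, the threads are more likely blocked on something external, such as network I/O.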

Petro Semeniuk
  • 6,970
  • 10
  • 42
  • 65
1

Since we don't know where the threads hang: did you consider some kind of thread monitoring, getting periodic stack traces of your threads, either using a profiler or using ThreadMXBean? As some of the other posters have mentioned, with any scalability issue you should keep an eye on gc.log, too, and watch your memory footprint. But this may not be the problem here, since even a full GC should finish eventually, whereas your program does not.
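As a rough illustration of the ThreadMXBean route, the sketch below collects the name, state, and stack trace of every live thread into a string. The class name ThreadDumper is hypothetical; in practice you would call dump() periodically, or when progress stalls, and write the result to your log.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {
    /** Builds a text dump of every live thread's name, state, and stack trace. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

If most worker threads show the same top frames across several dumps, for example a socket read, that is where they are stuck.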

Ralf H
  • 1,392
  • 1
  • 9
  • 17
  • Yeah, it's a good idea. However, is it too complicated for such a simple program? – JuliaLi Jan 05 '13 at 07:37
  • For this one problem, actually it is :) but you should be able to reuse that code easily. VisualVM should get you thread stack traces without coding though. – Ralf H Jan 06 '13 at 01:18