
I am making use of a third-party FooClientImpl Java class, which implements the FooClient interface:

interface FooClient {
    public FooOutput processFoo(FooInput inputData);
}

which processes the data passed to it by communicating over a TCP socket, on a particular port on localhost, with a third-party, non-Java fooserver process. Each inputData instance can be processed standalone (i.e. it's embarrassingly parallel). fooserver is rather slow but runs on a single core, so I'd like to be able to manually fire up multiple instances of it on separate ports, and have multiple FooClientImpl instances access them -- one for each server instance. So far this is easy.

However, I'd also like to create a ConcurrentFooClientImpl class which also implements the FooClient interface but, under the hood, keeps track of a pool of FooClientImpl instances (or subclass instances -- maybe 8), and, when .processFoo() is called, processes the incoming data with the first unused instance (waiting until one is available if necessary). That way, other code could instantiate ConcurrentFooClientImpl and it would look the same to the calling code, only faster.
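
A minimal sketch of what that wrapper could look like, assuming placeholder FooInput/FooOutput types and a stub in place of the real socket-backed FooClientImpl (the port numbers are made up): the pool is just a BlockingQueue of clients, and processFoo() blocks in take() until one is free.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class FooInput { final String data; FooInput(String data) { this.data = data; } }
class FooOutput { final String result; FooOutput(String result) { this.result = result; } }

interface FooClient {
    FooOutput processFoo(FooInput inputData);
}

// Stub standing in for the real socket-backed FooClientImpl
// (assumption: one instance per fooserver port).
class StubFooClient implements FooClient {
    private final int port;
    StubFooClient(int port) { this.port = port; }
    public FooOutput processFoo(FooInput in) {
        return new FooOutput(in.data + ":" + port);
    }
}

// Looks like a single FooClient, but hands each call to the first
// free pooled client, waiting until one is available if necessary.
class ConcurrentFooClientImpl implements FooClient {
    private final BlockingQueue<FooClient> pool;

    ConcurrentFooClientImpl(List<FooClient> clients) {
        pool = new ArrayBlockingQueue<>(clients.size(), false, clients);
    }

    public FooOutput processFoo(FooInput inputData) {
        FooClient client = null;
        try {
            client = pool.take();                 // block until a client is free
            return client.processFoo(inputData);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            if (client != null) pool.add(client); // return it to the pool
        }
    }
}
```

Because take()/add() do the synchronisation, calling code can invoke processFoo() from several threads at once and each concurrent call gets a distinct client.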

I'm very new to threading, and I've walked through the Java Concurrency tutorial, but this use case seems somewhat different from the examples there.

I think I need to create a Thread subclass, FooClientImplThread, which wraps a FooClientImpl instance. I also think I should use Executors.newFixedThreadPool to manage a fixed-size pool of clients, perhaps with a ThreadFactory which produces instances of FooClientImplThread. Then, in ConcurrentFooClientImpl.processFoo(), I could wrap inputData in an implementation of Callable and submit it to the pool. But these Callable instances would need to get hold of a particular FooClient instance so they could in turn call FooClientImplThread.processFoo() and actually talk to the fooserver running on the appropriate port. I guess they could call Thread.currentThread() and cast the result to FooClientImplThread. But intuitively that feels hacky to me (although I don't know if I trust my intuition here), and I'm not sure whether it would work or whether it is sensible.
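
For comparison, the per-worker-thread binding described above can be had without a Thread subclass or a Thread.currentThread() cast: a ThreadLocal lazily creates one client the first time each pool thread runs a task. A sketch under the same assumptions (stub client, made-up ports, and FooInput/FooOutput simplified to String):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadClientDemo {
    interface FooClient { String processFoo(String in); }

    // Stub for the real socket-backed client; the base port is an assumption.
    static class StubFooClient implements FooClient {
        final int port;
        StubFooClient(int port) { this.port = port; }
        public String processFoo(String in) { return in + ":" + port; }
    }

    static final AtomicInteger NEXT_PORT = new AtomicInteger(9000);

    // One client per worker thread, created on first use --
    // no Thread subclass or cast needed.
    static final ThreadLocal<FooClient> CLIENT =
        ThreadLocal.withInitial(() -> new StubFooClient(NEXT_PORT.getAndIncrement()));

    // Submit nTasks inputs to a pool of nThreads workers and collect results.
    static List<String> run(int nTasks, int nThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < nTasks; i++) {
            final String in = "task" + i;
            futures.add(pool.submit(() -> CLIENT.get().processFoo(in)));
        }
        List<String> out = new ArrayList<>();
        for (Future<String> f : futures) out.add(f.get());
        pool.shutdown();
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String s : run(4, 2)) System.out.println(s);
    }
}
```

The trade-off is that each client's lifetime is tied to its pool thread, which makes cleanup (closing sockets) a bit more awkward than with an explicit queue of clients.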

Am I on the right track with my possible solution here, or is there a more sane approach I should take?

Update: I ended up implementing a solution like the one described. Surprisingly, it mostly worked as expected, except that of course I realised I hadn't thought through what would be needed for background processing while maintaining the outward-facing API. As far as I can tell, this is not possible with the one-at-a-time, caller-triggered API I outlined above. It would need to be replaced by a producer-consumer pattern or something similar, which is what I ended up doing, albeit in a different way.
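
For the record, the producer-consumer shape mentioned above can be quite small: a queue of inputs drained by one worker thread per server instance. A sketch with the client again stubbed out (FooInput/FooOutput simplified to String, and results collected in a map -- both assumptions, not the actual solution I used):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class FooPipeline {
    // Stub for a socket-backed client bound to one fooserver port.
    interface FooClient { String processFoo(String in); }

    // Process a fixed batch of inputs: one consumer thread per client,
    // all draining a shared work queue.
    public static Map<String, String> processAll(List<String> inputs,
                                                 List<FooClient> clients)
            throws InterruptedException {
        BlockingQueue<String> work = new LinkedBlockingQueue<>(inputs);
        Map<String, String> results = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (FooClient client : clients) {
            Thread t = new Thread(() -> {
                String in;
                // drain the queue; poll() returns null once it is empty
                while ((in = work.poll()) != null) {
                    results.put(in, client.processFoo(in));
                }
            });
            t.start();
            workers.add(t);
        }
        for (Thread t : workers) t.join();
        return results;
    }
}
```

Replacing poll() with take() plus a sentinel value would turn this batch run into an open-ended background pipeline.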

Andy MacKinlay

1 Answer


You are wildly overcooking this. All you need is to multithread the TCP part of the server, i.e. start a new thread per accepted socket. Then either those threads can all share the same FooImpl object, or each can have one of its own, whichever makes sense given the way FooImpl is implemented. You don't need a new thread class; you don't need a pool of anything. NB it's not correct to say that any Java object only runs on one core unless you know that it is single-threaded.
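
A minimal sketch of the thread-per-accepted-socket shape described here, with an upper-casing line echo standing in for the real Foo processing:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerClientServer {

    // Bind to any free port and accept clients on a background thread,
    // starting one handler thread per accepted socket.
    public static ServerSocket start() throws IOException {
        ServerSocket server = new ServerSocket(0);  // 0 = any free port
        new Thread(() -> {
            try {
                while (true) {
                    Socket socket = server.accept();
                    new Thread(() -> handle(socket)).start();
                }
            } catch (IOException e) {
                // server socket closed: stop accepting
            }
        }).start();
        return server;
    }

    // Placeholder handler: echoes each line back upper-cased.
    static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line.toUpperCase());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = start();
        System.out.println("listening on port " + server.getLocalPort());
    }
}
```

In the real case, handle() would read a request, run the Foo processing (shared or per-thread, as the answer says), and write the result back.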

user207421
  • Sorry, I should have been clearer (I've edited the question now). The `fooserver` process is *not* in Java -- it's a Prolog-based server in fact (although that's not particularly relevant), which I don't have control over. And as far as I can determine, it uses a single core to do the heavy processing for a single TCP listener port. I did consider writing a Java (or other language) socket server *wrapper* to listen on a single port and dispatch to one of the available `fooserver` ports, but I don't think that would be any easier. – Andy MacKinlay Jan 08 '13 at 22:26
  • @strangefeatures Are you sure about this? Have you tried connecting two clients to it at the same time? It would be very unusual to write TCP server that could only handle one client at a time, in any language. – user207421 Jan 09 '13 at 02:49
  • Good point - I hadn't actually tried the simple obvious thing. However, it seems that the server process is not using worker threads. Using a third-party downstream library, when I create 3 client instances on my dual-core laptop, I only get a 13% reduction in total processing time. I know I shouldn't expect 50%, but that seems to suggest I'm not getting useful parallelisation. The CPU usage of the `fooserver` process also doesn't top 100%. (I can think of one language where a naive TCP server might not support concurrency - Python with its [GIL](http://stackoverflow.com/questions/1912557/a-question-on-python-gil).) – Andy MacKinlay Jan 09 '13 at 05:25
  • I can now confirm that it is indeed the case that I get a substantial (~45%) speedup when there are multiple client threads *and* there are multiple server instances listening on multiple ports (one per client thread), so it looks like this is an instance of an unusually written server. So the kind of convolutions I described in the question would be useful (except I've ended up implementing them in a different way using some third-party code). – Andy MacKinlay Jan 10 '13 at 01:54
  • @strangefeatures As you are not getting a 3x slowdown, obviously the server *is* multithreaded, although perhaps not very well. You must be getting some concurrency or it would all take three times as long for three clients. So I still think you're barking up the wrong tree. – user207421 Jan 10 '13 at 02:07
  • Er, why? My understanding of threading is not great - can you explain why I should be seeing a 3x slowdown (pointers to web resources are most welcome)? If the server blocks while it's processing another client request, I would have thought the remaining clients would simply wait until the server is happy to respond to their request? There's going to be some pointless memory overhead to be sure, but I can't work out why it should be substantially slower. – Andy MacKinlay Jan 10 '13 at 23:57
  • (Also, since the multiple port/multiple server process solution *does* scale across multiple CPUs, while the single process solution doesn't, I don't think it's entirely accurate to say that I'm barking up the wrong tree, although I'm more than willing to be convinced that I'm doing it for the wrong reason) – Andy MacKinlay Jan 11 '13 at 00:00
  • @strangefeatures If the server wasn't multithreaded, it would handle clients one at a time, i.e. sequentialize them, so two concurrent clients would take 2x the time of one client, three clients would take 3x, etc. You aren't seeing that so, there *must* be some concurrency at the server. – user207421 Jan 11 '13 at 22:58
  • Ah, I think I wasn't being clear. I have a fixed number of data instances which I need processed (around 45), and I either split them across 3 clients, or concentrate them into a single client. Your answer makes sense in the scenario where I load all 45 instances into all clients, but that's not what I'm doing here -- I presume that's how you were interpreting my description? – Andy MacKinlay Jan 13 '13 at 01:11
  • @strangefeatures Yes. So now it isn't clear that you are comparing apples with apples. – user207421 Jan 20 '13 at 23:25
  • Right, but I'm not particularly concerned with the most rigorous scientific comparison. If this solution enables me to decrease the processing time for a fixed number of instances by a factor of N when I have N processing cores available (which is what I want to do, and what I was testing), surely that's the most important test to run? What would a more "apples to apples" comparison (i.e. processing multiple copies of the same data concurrently) usefully tell me here, since ultimately I'm interested in practical speedups over a fixed data set using parallelisation? – Andy MacKinlay Jan 22 '13 at 00:32