I’m developing a network proxy application using Java 8. For ingress, the main logic is the data-processing loop: take a packet from the inbound queue, process its content (e.g. protocol adaptation), and put it on the send queue. The design allows multiple virtual TCP channels, so each data-processing thread, out of a list of data-processing threads, handles a bunch of channels for a given time slice as its part of the whole job (e.g. the channels with `channel.channelId % NUM_DATA_PROCESSING_THREADS == 0`, as determined by a load-balancing scheduler). Channels are stored in an array and accessed using the `channelId` as the index of the cell. The array is wrapped by a class that provides methods like `register`, `deregister`, `getById`, `size`, etc., and its instance is called `CHANNEL_STORE` in the program.
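For context, here is a minimal sketch of what I mean by the store (the `Channel` and `ChannelStore` names and fields are simplified and illustrative, not my real code):

```java
// Illustrative only: a bare-bones version of the channel store described above.
class Channel {
    final int channelId;
    Channel(int channelId) { this.channelId = channelId; }
}

class ChannelStore {
    private final Channel[] slots;   // indexed directly by channelId
    private int count;

    ChannelStore(int capacity) { this.slots = new Channel[capacity]; }

    void register(Channel ch)  { slots[ch.channelId] = ch; count++; }
    void deregister(int id)    { if (slots[id] != null) { slots[id] = null; count--; } }
    Channel getById(int id)    { return slots[id]; }
    int size()                 { return count; }
}
```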
I need to call these methods inside the main logic (the data-processing loop) from different threads (at least the dispatcher thread, the data-processing threads, and the control thread that destroys a channel from the GUI), so I have to consider concurrency among these threads. I have several candidate approaches:
1. Use `synchronized` or reentrant locks around `register`, `deregister`, `getById`, etc. This is the simplest approach and it's thread-safe, but I have performance concerns about the locking (CAS) mechanism, since I need to perform operations on `CHANNEL_STORE` (especially `getById`) at a very high frequency. (A sketch of this is shown below the list.)
2. Delegate the operations on `CHANNEL_STORE` to a `SingleThreadExecutor` via `executor.execute(runnable)` and/or `executor.submit(callable)`. The concern is the cost of creating a Runnable/Callable at every such delegation point in the data-processing loop (creating the instance and calling `execute`): I have no idea whether this is even more expensive than `synchronized` or reentrant locks. In reality (so far) there is no post-operation, so the data-processing loop only submits a runnable and never needs to wait for a callable's return, although a post-operation is needed in the control loop. (See the second sketch below the list.)
3. Delegate the operations on `CHANNEL_STORE` to a dedicated thread via a pair of `ArrayBlockingQueue`s instead of an Executor. For each access to `CHANNEL_STORE`, put a task indicator together with its parameters as an attachment onto the first queue; the dedicated thread loops on that queue with the blocking `take` method and performs the operation on `CHANNEL_STORE`. It then puts the result onto the second queue so the delegating thread can continue with its post-operation (currently not needed, however). I regard this as the fastest option, assuming the blocking queue in the JVM is lock-free, but my concern is that the code gets very messy and error-prone. (See the third sketch below the list.)
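For approach 1, this is roughly what I have in mind (a sketch only, reusing the illustrative `Channel`/`ChannelStore` shape from above):

```java
// Approach 1 (sketch): guard every store operation with the same intrinsic lock.
class SynchronizedChannelStore {
    private final Channel[] slots;
    private int count;

    SynchronizedChannelStore(int capacity) { this.slots = new Channel[capacity]; }

    synchronized void register(Channel ch) { slots[ch.channelId] = ch; count++; }
    synchronized void deregister(int id)   { if (slots[id] != null) { slots[id] = null; count--; } }
    synchronized Channel getById(int id)   { return slots[id]; }
    synchronized int size()                { return count; }
}
```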
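For approach 2, the delegation would look something like this (also a sketch; the per-call lambda allocation is exactly what I'm worried about, and `deregisterAsync`/`getByIdAsync` are just illustrative names):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Approach 2 (sketch): a single-threaded executor owns the (non-thread-safe) store.
class ExecutorOwnedChannelStore {
    private final ChannelStore store = new ChannelStore(1024);   // the plain sketch from above
    private final ExecutorService owner = Executors.newSingleThreadExecutor();

    // Fire-and-forget from the data-processing loop (no post-operation needed there).
    void deregisterAsync(int id) {
        owner.execute(() -> store.deregister(id));
    }

    // The control thread can block on the Future when it needs the result.
    Future<Channel> getByIdAsync(int id) {
        return owner.submit(() -> store.getById(id));
    }

    void shutdown() { owner.shutdown(); }
}
```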
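And for approach 3, something along these lines (sketch; `Op`, `Task`, and the sentinel are illustrative, and the single shared results queue, as described, is part of why I find this error-prone):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Approach 3 (sketch): a dedicated thread is the only one that touches the store;
// other threads communicate with it through a pair of blocking queues.
class QueueOwnedChannelStore {
    enum Op { REGISTER, DEREGISTER, GET_BY_ID }

    // "Task indicator together with an attachment of parameters".
    static final class Task {
        final Op op; final int channelId; final Channel channel;
        Task(Op op, int channelId, Channel channel) { this.op = op; this.channelId = channelId; this.channel = channel; }
    }

    private static final Channel NOT_FOUND = new Channel(-1);   // sentinel: a BlockingQueue cannot hold null

    private final ChannelStore store = new ChannelStore(1024);  // the plain sketch from above
    private final BlockingQueue<Task> requests = new ArrayBlockingQueue<>(4096);
    private final BlockingQueue<Channel> results = new ArrayBlockingQueue<>(4096);

    void start() {
        Thread owner = new Thread(() -> {
            try {
                while (true) {
                    Task t = requests.take();                   // blocks until a task arrives
                    switch (t.op) {
                        case REGISTER:   store.register(t.channel); break;
                        case DEREGISTER: store.deregister(t.channelId); break;
                        case GET_BY_ID:
                            Channel c = store.getById(t.channelId);
                            results.put(c != null ? c : NOT_FOUND);   // reply on the second queue
                            break;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();             // exit on shutdown
            }
        }, "channel-store-owner");
        owner.setDaemon(true);
        owner.start();
    }

    // Called from the data-processing loop (fire-and-forget, no post-operation).
    void deregister(int id) throws InterruptedException {
        requests.put(new Task(Op.DEREGISTER, id, null));
    }

    // Called from the control loop when the result (post-operation) is needed.
    Channel getById(int id) throws InterruptedException {
        requests.put(new Task(Op.GET_BY_ID, id, null));
        Channel c = results.take();
        return c == NOT_FOUND ? null : c;
    }
}
```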
I think the 2nd and 3rd approaches may be what is called "serialization".
The reason I cannot simply hand tasks to a thread pool for data processing and forget about them is that the TCP stream packets of each channel must not be reordered; processing has to be serial on a per-channel basis.
Questions:

1. What is the performance of the second approach compared to the first?
2. What would you suggest for my situation?
I'm currently using stream I/O for LAN read/write. If I switched to NIO, the coordination between the NIO thread and the data-processing threads might add extra complexity (e.g. post-operations). So I think this question is relevant to time-critical (stream-based, multi-channel network) applications like mine.