I want to write a proxy server for SMB2 based on Asio. I am considering using a cumulative buffer to receive a full message before applying business logic, and introducing a queue to hold multiple messages. The queue would force me to synchronize the following resource accesses:

  • the read and write operations on the queues, because the upstream and downstream queues are shared between the frontend client and the backend server,
  • the backend connection state, because reads on the frontend won't wait for the connect or the writes on the backend server to complete before the next read, and
  • the resource release when an error occurs or a connection is closed normally, because the read and write handlers registered with the event loop for the same socket may not have completed yet, and an asynchronous connect operation can be initiated in a worker thread while its partner socket has already been closed; all of these may run concurrently.

Without the two queues, only one handler (read, write, or connect) is registered with the event loop at a time in the proxy flow for a request, so no synchronization is needed.

At the application level, I think a cumulative buffer is generally a must in order to process a full message packet (e.g. a message in the format | length (4 bytes) | body (variable) |), because one message may only become complete after several API calls (system APIs: recv or read; library APIs: asio::read).
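To make the cumulative buffer idea concrete, here is a minimal sketch of what I have in mind. It is independent of Asio; FrameAccumulator and its members are hypothetical names, and I am assuming the 4-byte length is big-endian and counts only the body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not an Asio type): accumulates raw bytes and yields
// complete frames of the form | length (4 bytes) | body (variable) |.
// Assumption: the length is big-endian and counts only the body.
class FrameAccumulator {
public:
    // Append whatever a single recv/read call (or read handler) delivered.
    void append(const std::uint8_t* data, std::size_t n) {
        assert(data != nullptr || n == 0);
        buf_.insert(buf_.end(), data, data + n);
    }

    // Return the next complete body, or std::nullopt if more bytes are needed.
    std::optional<std::vector<std::uint8_t>> next_frame() {
        constexpr std::size_t kHeader = 4;
        if (buf_.size() < kHeader) return std::nullopt;
        const std::size_t len = (std::size_t(buf_[0]) << 24) |
                                (std::size_t(buf_[1]) << 16) |
                                (std::size_t(buf_[2]) << 8) |
                                std::size_t(buf_[3]);
        if (buf_.size() - kHeader < len) return std::nullopt;
        const auto first = buf_.begin() + static_cast<std::ptrdiff_t>(kHeader);
        const auto last = first + static_cast<std::ptrdiff_t>(len);
        std::vector<std::uint8_t> body(first, last);
        buf_.erase(buf_.begin(), last);  // keep bytes of the following frame
        return body;
    }

    // Number of bytes still waiting to form a frame.
    std::size_t buffered() const { return buf_.size(); }

private:
    std::vector<std::uint8_t> buf_;
};
```

Each read handler would call append() with the bytes it received and then loop on next_frame() until it returns std::nullopt, since one read may deliver several messages or only part of one. A real proxy would probably also reject lengths above some maximum before buffering.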

And then, is it necessary to use a queue to save messages received from clients and pending to be forwarded to the backend server?

I am using the following diagram from http://www.partow.net/programming/tcpproxy/index.html; it turned out to reflect thoughts similar to mine (the upstream concept is the same as in NGINX upstream servers).

                                    ---> upstream --->           +---------------+
                                                     +---->------>               |
                               +-----------+         |           | Remote Server |
                     +--------->          [x]--->----+  +---<---[x]              |
                     |         | TCP Proxy |            |        +---------------+
 +-----------+       |  +--<--[x] Server   <-----<------+
 |          [x]--->--+  |      +-----------+
 |  Client   |          |
 |           <-----<----+
 +-----------+
                <--- downstream <---

   Frontend                                                           Backend

For a request-response protocol without a message ID field (which would be useful for matching each reply to its request), such as HTTP, I can use a single buffer per connection for each of the downstream and upstream flows, and only then continue with the next request. This works because clients always wait for the response after sending a request (they may block or be notified by an asynchronous callback). Note that the first request is slower than subsequent ones, because a connection to the server has to be established first.

However, in a protocol where clients don't wait for the response before sending the next request, a message ID field is used to uniquely identify request-reply pairs; examples are JSON-RPC 2.0 and SMB2. If I strictly complete the two flows above before issuing the next read (so that unread TCP data accumulates in the kernel), subsequent requests on the same connection cannot be processed in a timely manner. After reading What happens if one doesn't call POSIX's recv “fast enough”? I think this approach can still work.
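For comparison, the queue-based alternative I am weighing could look roughly like the sketch below: one queue per direction, at most one write in flight, and a high-water mark that stops re-arming reads so the kernel buffer absorbs the rest. ForwardQueue and all of its members are hypothetical names, not Asio APIs, and I am assuming every handler of a session runs on one thread or one strand, so the class itself takes no lock.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical per-direction forwarding queue (illustrative, not an Asio type).
// Assumption: all calls for one session come from the same thread or strand.
class ForwardQueue {
public:
    explicit ForwardQueue(std::size_t high_water) : high_water_(high_water) {}

    // Called by the read handler with one complete message.
    // Returns true if the caller should start a write now (none in flight).
    bool push(std::string msg) {
        pending_.push_back(std::move(msg));
        if (write_in_flight_) return false;
        write_in_flight_ = true;
        return true;
    }

    // The message currently being written; valid only while a write is in flight.
    const std::string& front() const {
        assert(write_in_flight_ && !pending_.empty());
        return pending_.front();
    }

    // Called by the write-completion handler.
    // Returns true if another write should be started for front().
    bool on_write_done() {
        assert(write_in_flight_ && !pending_.empty());
        pending_.pop_front();
        write_in_flight_ = !pending_.empty();
        return write_in_flight_;
    }

    // The read side re-arms its next read only while this is true; otherwise
    // unread bytes simply wait in the kernel socket buffer.
    bool can_read_more() const { return pending_.size() < high_water_; }

    std::size_t size() const { return pending_.size(); }

private:
    std::deque<std::string> pending_;
    std::size_t high_water_;
    bool write_in_flight_ = false;
};
```

With a high-water mark of 1 this degenerates into the single-buffer flow (read, forward, then read again), so the two designs differ mainly in how many requests may be pending between the frontend read and the backend write.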

I also ran an SMB2 proxy test using a single buffer for the downstream and upstream flows, on Windows and Linux, with the Asio networking library (also available as Boost.Asio). I used smbclient as the client on Linux to create 251 connections with the following command:

 ft=$(date '+%Y%m%d_%H%M%S.%N%z'); for ((i = 2000; i <= 2250; ++i)); do smbclient //10.23.57.158/fromw19 user_password -d 5 -U user$i -t 100 -c "get 1.96M.docx 1.96M-$i.docx" >>smbclient_${i}_${ft}_out.txt 2>>smbclient_${i}_${ft}_err.txt & done

Occasionally, it printed several errors: "Connection to 10.23.57.158 failed (Error NT_STATUS_IO_TIMEOUT)". When I increased the number of connections, the number of errors increased as well, so is there some threshold? In fact, those connections completed within 30 seconds, and I had set the smbclient timeout to 100 (-t 100). What's wrong?

I know those problems need to be resolved. But here, I just want to know: is it necessary to use a queue to save messages received from clients and pending to be forwarded to the backend server? The answer determines my design goal, because the two approaches differ a great deal.

Perhaps because they don't care about the application message format, the following examples request the next read only after completing the write operation to the peer: HexDumpProxyFrontendHandler.java, and tcpproxy based on C++ Asio.
