
Context

We are building a Flux-based web app in which the client (Flux/React/TypeScript) communicates with the server (C#) via WebSockets (not via HTTP GET).

When a client-side action is performed, we send a command request to the server over a WebSocket connection.

The server always responds quickly with an initial Start response (indicating whether it can execute the requested action), and then follows up with one or more Progress responses (possibly several hundred) that contain progress information about the action being performed. When the action on the server has completed, the final Progress response carries a progress fraction of 100% and an "OK" error status.

Note that some server actions take just 100 ms, while others can take up to 10 minutes (hence the Progress responses, so the client can give the user feedback while the action runs).

In the client, we have:

  • a sendCommandRequest function, which we call to send a command request to the server over the WebSocket
  • a handler function handleCommandResponse, which is called from the WebSocket onMessage callback
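
Roughly, the messages and these two functions look like this (a simplified sketch; the exact shapes, names and URL in our code differ):

    // Simplified sketch of the protocol; field names and URL are illustrative only.
    interface CommandRequest {
      commandId: string;
      name: string;          // e.g. "GenerateImage"
      parameters: unknown;
    }

    interface StartResponse {
      commandId: string;
      accepted: boolean;     // can the server execute the requested action?
    }

    interface ProgressResponse {
      commandId: string;
      fraction: number;      // 1.0 (100%) on the final response
      errorStatus: string;   // "OK" when the action completed successfully
      payload?: unknown;     // e.g. the generated image on the final response
    }

    const socket = new WebSocket('wss://example.local/commands'); // placeholder URL

    function sendCommandRequest(request: CommandRequest): void {
      socket.send(JSON.stringify(request));
    }

    function handleCommandResponse(response: StartResponse | ProgressResponse): void {
      // ...this is where we need to get back into the Flux chain (see below)
    }

    socket.onmessage = (event: MessageEvent) => {
      handleCommandResponse(JSON.parse(event.data));
    };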

The question we have is: what is the best way to inject the server communication into a Flux-based application? Where should we make the send calls to the WebSocket, and how should we get from the WebSocket callback back into the Flux chain?

Proposal

Let's say we change a visual-effects parameter in the client GUI (a floating-point number, set in a text field), and this triggers an action on the server that generates a new image. This action takes 1 second to execute on the server.

The way this normally goes is: the GUI handler sends a Flux action to the dispatcher, which invokes the callbacks of the stores that registered with it. The stores update their data based on the action payload and notify the listening components via change events; the components then call setState with the new store data, triggering a re-render.
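
In code, that chain looks roughly like this (a minimal sketch using the facebook flux Dispatcher; store and action names are made up for the example):

    // Minimal sketch of the normal Flux chain; names are illustrative only.
    import { Dispatcher } from 'flux'; // requires the flux package

    type Action = { type: 'PARAMETER_CHANGED'; value: number };

    const dispatcher = new Dispatcher<Action>();

    class ImageStore {
      parameter = 0;
      image: Blob | null = null;
      private listeners: Array<() => void> = [];

      constructor() {
        dispatcher.register((action) => {
          if (action.type === 'PARAMETER_CHANGED') {
            this.parameter = action.value;
            this.emitChange();
          }
        });
      }

      addChangeListener(listener: () => void): void { this.listeners.push(listener); }
      private emitChange(): void { this.listeners.forEach((l) => l()); }
    }

    export const imageStore = new ImageStore();

    // GUI handler: the text field's change handler dispatches the Flux action;
    // listening components call setState with the new store data on change events.
    export function onParameterInput(value: number): void {
      dispatcher.dispatch({ type: 'PARAMETER_CHANGED', value });
    }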

We could let the stores send command requests over the WebSocket to the server at the moment they change their data (such as the new parameter value). At that point the store will have the new parameter value, while the image is still the last one received (the result of a previous command).

From that moment onwards, the server starts sending Progress responses back to the client, which arrive in the WebSocket onMessage callback; the final Progress response will contain the new image (2 seconds later).

In the WebSocket onMessage callback, we could then dispatch a second Flux action, with the newly received image as its payload. The stores update their data with the new image, and the listening components re-render.
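
Put together, that would look something like this (again a sketch; it builds on the illustrative names used above):

    // Sketch of the proposal: the store fires the request when its data changes,
    // and the onMessage callback dispatches a second action when the result arrives.
    declare const dispatcher: {
      dispatch(action:
        | { type: 'PARAMETER_CHANGED'; value: number }
        | { type: 'IMAGE_RECEIVED'; image: unknown }): void;
    };
    declare function sendCommandRequest(request: { commandId: string; name: string; parameters: unknown }): void;

    // 1. Store callback: update the parameter, then send the command to the server.
    function onParameterChangedInStore(value: number): void {
      // store.parameter = value; store.emitChange();  (the store still holds the old image here)
      sendCommandRequest({ commandId: crypto.randomUUID(), name: 'GenerateImage', parameters: { value } });
    }

    // 2. WebSocket callback: turn the final Progress response into a second Flux action.
    function handleCommandResponse(response: { fraction: number; errorStatus: string; payload?: unknown }): void {
      if (response.fraction === 1 && response.errorStatus === 'OK') {
        dispatcher.dispatch({ type: 'IMAGE_RECEIVED', image: response.payload });
      }
      // intermediate Progress responses could dispatch e.g. a PROGRESS_UPDATED action
    }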

The problem with this is:

  • the Flux store can hold partially updated data: until the server sends back the final result, the store contains the new parameter but still the old image
  • we now need two Flux actions: one used by the GUI to signal that the user made a change, and one used by the WebSocket callback to signal that a result arrived
  • if something goes wrong on the server side, the new parameter will already have been set in the Flux store, but we will never receive a new image, so the data in the store becomes corrupt

Are there better ways to fit server communication into the Flux workflow? Any advice or best practices?

KoenT_X

1 Answer

if something goes wrong on the server side, the new parameter will already have been set in the Flux store, but we will never receive a new image, so the data in the store becomes corrupt

Here are a few failsafes:

  • Caching

We have a caching layer on each of the API nodes. This technique not only improves latency (by up to 20%) but also enhances availability when dependency services go down. In addition, the load on dependency systems is reduced significantly (by up to 70% in our case). For maximum flexibility, we place the cache close to the dependency calls, and we use Guava in our implementation.
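
A sketch of the "cache close to the dependency call" idea, here in TypeScript rather than Guava (the TTL, names and URL are placeholders):

    // Sketch: small in-memory cache wrapped directly around a dependency call.
    const cache = new Map<string, { value: unknown; expiresAt: number }>();
    const TTL_MS = 60_000; // placeholder TTL

    async function callDependency(key: string): Promise<unknown> {
      const response = await fetch(`/dependency?key=${encodeURIComponent(key)}`);
      return response.json();
    }

    async function callDependencyCached(key: string): Promise<unknown> {
      const hit = cache.get(key);
      if (hit && hit.expiresAt > Date.now()) {
        return hit.value;                        // fresh hit: lower latency
      }
      try {
        const value = await callDependency(key);
        cache.set(key, { value, expiresAt: Date.now() + TTL_MS });
        return value;
      } catch (error) {
        if (hit) return hit.value;               // dependency down: fall back to the stale value
        throw error;
      }
    }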

  • Dependency protection

Occasionally, we notice patterns where a dependency service degrades, causing threads in the API system to stall. This dramatically reduces throughput and can bring the system down due to thread exhaustion. To avoid this, we implemented aggressive timeouts on dependencies, and an automated mechanism that eliminates calls to dependency services when they are down. This technique improves scalability significantly, as threads can now proceed instead of waiting for a timeout.
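
As a sketch of those two measures, an aggressive timeout plus a crude "stop calling a dependency that keeps failing" switch (thresholds and names are placeholders):

    // Sketch: aggressive timeout + very crude circuit breaker. Thresholds are placeholders.
    function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
      return new Promise<T>((resolve, reject) => {
        const timer = setTimeout(() => reject(new Error('dependency timed out')), ms);
        promise.then(
          (value) => { clearTimeout(timer); resolve(value); },
          (error) => { clearTimeout(timer); reject(error); },
        );
      });
    }

    let consecutiveFailures = 0;
    const FAILURE_THRESHOLD = 5;

    async function protectedCall<T>(call: () => Promise<T>): Promise<T> {
      if (consecutiveFailures >= FAILURE_THRESHOLD) {
        // fail fast instead of letting callers stall on a dead dependency
        throw new Error('dependency temporarily disabled');
      }
      try {
        const result = await withTimeout(call(), 2_000); // aggressive 2 s timeout (placeholder)
        consecutiveFailures = 0;
        return result;
      } catch (error) {
        consecutiveFailures += 1;
        throw error;
      }
    }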

  • Speculative Retry

In our analysis, some dependencies tend to have a disproportionately large p99 (99th-percentile) latency compared to, say, their p95 latency. More generally, we see patterns where there is a steep jump in latency. At the point where the latency jump occurs, we introduced a parallel retry and consume the first response received. The tradeoff is an increase in traffic to our dependency systems, in exchange for lower latency and error rate. In our case, this approach reduced latency by up to 22% and error rate by up to 81%.
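
The "parallel retry at the latency jump, use whichever response comes back first" idea can be sketched like this (the 500 ms hedge delay is a placeholder):

    // Sketch: speculative (hedged) retry. After a delay near the latency jump,
    // fire a second identical request and take the first successful response.
    // The tradeoff is extra traffic to the dependency, as described above.
    function delay(ms: number): Promise<void> {
      return new Promise((resolve) => setTimeout(resolve, ms));
    }

    async function speculativeRetry<T>(call: () => Promise<T>, hedgeAfterMs = 500): Promise<T> {
      const primary = call();
      const backup = delay(hedgeAfterMs).then(() => call());
      // Promise.any resolves with the first fulfilled result and rejects only if both fail,
      // which is what reduces both latency and error rate.
      return Promise.any([primary, backup]);
    }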

  • Reconnection Algorithm

It can happen that an XMPP server goes offline unexpectedly while servicing TCP connections from connected clients and remote servers. Because the number of such connections can be quite large, the reconnection algorithm employed by entities that seek to reconnect can have a significant impact on software performance and network congestion. If an entity chooses to reconnect, it:

  • SHOULD set the number of seconds that expire before reconnecting to an unpredictable number between 0 and 60 (this helps to ensure that not all entities attempt to reconnect at exactly the same number of seconds after being disconnected).
  • SHOULD back off increasingly on the time between subsequent reconnection attempts (e.g., in accordance with "truncated binary exponential backoff" as described in [ETHERNET]) if the first reconnection attempt does not succeed.
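
Applied to a browser WebSocket, that might look like this (the cap, backoff factor and URL are placeholders):

    // Sketch: reconnect with a random initial delay and truncated exponential backoff.
    function connect(url: string, attempt = 0): void {
      const socket = new WebSocket(url);

      socket.onopen = () => {
        attempt = 0;                                              // reset the backoff once connected
      };

      socket.onclose = () => {
        const randomFirstDelay = Math.random() * 60_000;          // 0..60 s, avoids a thundering herd
        const backoff = Math.min(60_000, 1_000 * 2 ** attempt);   // truncated exponential backoff
        const wait = attempt === 0 ? randomFirstDelay : backoff;
        setTimeout(() => connect(url, attempt + 1), wait);
      };
    }

    connect('wss://example.local/commands'); // placeholder URL
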
  • TLS Resumption

This specification describes a mechanism to distribute encrypted session-state information to the client in the form of a ticket and a mechanism to present the ticket back to the server. The ticket is created by a TLS server and sent to a TLS client. The TLS client presents the ticket to the TLS server to resume a session. Implementations of this specification are expected to support both mechanisms.
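
In the browser the TLS stack handles this transparently, but as an illustration, a Node.js TLS client can save the session ticket it receives and present it again on the next connection (host and port are placeholders):

    // Sketch (Node.js): save the session ticket received from the server and
    // present it again on the next connection to resume the session.
    import * as tls from 'node:tls';

    let savedSession: Buffer | undefined;

    function connectWithResumption(): void {
      const socket = tls.connect(
        { host: 'example.local', port: 443, session: savedSession },
        () => {
          console.log('session resumed:', socket.isSessionReused());
        },
      );
      // The server may send several session tickets; keep the latest one.
      socket.on('session', (session: Buffer) => {
        savedSession = session;
      });
    }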

  • Offline support

Traditionally, one of the main differences between web apps and native apps was that, unlike web apps, native apps could run offline. This has changed: technologies such as Service Workers allow a website or web app to cache the assets it needs so that it can still run while offline. This includes things like JavaScript files, CSS and images. Combining this technique with intelligent use of localStorage will allow the client to keep working even if the connection goes down; you just need to sync up all the changes when it gets connected again.
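
A minimal sketch of that approach (the cached file names and cache name are placeholders):

    // sw.ts — sketch of a service worker that caches the core assets for offline use.
    const CACHE_NAME = 'app-shell-v1';
    const ASSETS = ['/', '/index.html', '/app.js', '/app.css'];

    self.addEventListener('install', (event: any) => {
      event.waitUntil(caches.open(CACHE_NAME).then((cache) => cache.addAll(ASSETS)));
    });

    self.addEventListener('fetch', (event: any) => {
      event.respondWith(
        caches.match(event.request).then((cached) => cached ?? fetch(event.request)),
      );
    });

    // main.ts — register the worker; queue unsent changes in localStorage and
    // sync them up again when the connection comes back.
    navigator.serviceWorker?.register('/sw.js');

    window.addEventListener('online', () => {
      const pending = localStorage.getItem('pendingChanges');
      if (pending) {
        // re-send the queued changes to the server here, then clear the queue
        localStorage.removeItem('pendingChanges');
      }
    });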

Paul Sweatte