When you pick up Rx, it becomes this awesome shiny hammer and everything starts looking like a rusty nail just waiting for you to bang in.
Personally, I think the biggest clue is in the name: reactive framework. Given a requirement, you need to reflect on whether a reactive solution truly makes sense.
In any Rx proposition, you are looking to introduce one or more event streams and carry out some action in response to an event.
I think there are two key questions to ask:
- Are you in control of the event stream?
- To what degree must you complete responses at the rate of the event stream?
If you do not have control of the event stream and you must respond at the rate of the event stream, then Rx is a good candidate.
In any other circumstance, it is probably a poor choice.
I have seen many examples where people have jumped through hoops to create the illusion of a lack of control in order to justify Rx - which seems crazy to me. Why give up the control that you have?
Some examples:
You have to extract data from a fixed list of files and store it in a database. You decide to push each file name into a subject and create a reactive pipeline that opens each file and projects the data, then processes the data in some way and finally writes it to the database.
This fails the control test and the rate test. It would be far easier to iterate over the files and pull them in and process them as fast as you can. The phrase "decide to push" is the giveaway here.
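To make the "just iterate" alternative concrete, here is a minimal sketch in Python. The extraction step, the table layout and the function names are all made up for illustration; the only point is that the loop pulls each file when it is ready for it, with no subject or pipeline in sight.

```python
import sqlite3
from pathlib import Path


def extract(path: Path) -> list[tuple[str, int]]:
    """Hypothetical extraction step: one (file name, line length) row per line."""
    return [(path.name, len(line)) for line in path.read_text().splitlines()]


def load_files(paths: list[Path], db: sqlite3.Connection) -> int:
    """Pull each file in turn and store its rows; the caller sets the pace."""
    db.execute("CREATE TABLE IF NOT EXISTS rows (file TEXT, length INTEGER)")
    total = 0
    for path in paths:  # plain iteration: we own the list, so we simply pull
        rows = extract(path)
        db.executemany("INSERT INTO rows VALUES (?, ?)", rows)
        total += len(rows)
    db.commit()
    return total
```

If throughput matters, the same loop can be handed to a worker pool; it is still you deciding when the next file gets opened.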
You need to display stock prices from a stock exchange.
Clearly this is a good choice for Rx. If you can't keep up with the rate of prices in general, you are screwed. It might be the case that you conflate prices (perhaps to provide an update only once every second) - but this still qualifies as keeping up. The one thing you can't do is ask the stock exchange to slow down.
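Conflation is easy to picture with a small sketch. This one is plain Python over timestamped ticks rather than Rx itself, and the tick shape and the `conflate` name are my own assumptions: it keeps only the latest price per symbol in each window and emits one update per window that saw any ticks.

```python
from collections.abc import Iterable, Iterator

Tick = tuple[float, str, float]  # (timestamp in seconds, symbol, price)


def conflate(
    ticks: Iterable[Tick], interval: float = 1.0
) -> Iterator[tuple[float, dict[str, float]]]:
    """Yield (window_end, latest price per symbol) for each window that saw ticks."""
    window_end = None
    latest: dict[str, float] = {}
    for ts, symbol, price in ticks:
        if window_end is None:
            window_end = ts + interval  # first window starts at the first tick
        while ts >= window_end:  # the tick belongs to a later window
            if latest:
                yield window_end, latest
                latest = {}
            window_end += interval
        latest[symbol] = price  # newer prices overwrite older ones
    if latest:
        yield window_end, latest  # flush the final, partial window
```

However many ticks arrive inside a window, the consumer sees at most one price per symbol per window, which is why conflation still counts as keeping up.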
These (real world) examples pretty much fall at opposite ends of the spectrum and don't have much grey area. But there is a lot of grey area out there where control isn't clear.
Sometimes you are wearing the client hat in a client/server system and it can be easy to fall into the trap of sacrificing control, or putting control in the wrong place - which can easily be fixed with correct design. Consider this:
A client application displays news updates from a server.
- News updates are submitted to the server at any time and are created in high volume.
- The client should be refreshed at an interval set by the client.
- Refresh interval can be changed at any time and the user can always request an immediate refresh.
- The client only shows updates tagged with particular keywords, as specified by the user.
- The news updates are sometimes lengthy and the client should not store the full content of news updates, but rather display the headline and summary.
- At user request, the full content of an article can be shown.
Here, the frequency of news updates is not under the client's control. But the desired refresh rate and the tags of interest are.
Having the client receive all the news updates as they arrive and filter them client side isn't going to work. But there are plenty of options, each raising questions of its own:
- Should the server send a data stream of updates taking into account the client refresh rate? What if the client goes offline?
- What if there are thousands of clients? What if the client wants an immediate refresh?
There are lots of valid ways to tackle this problem that include more or less reactive elements. But any good solution should take account of the client's control of tags and desired refresh rate, and the lack of control of news update frequency (by client or server). You might want the server to react to changes in client interest by updating the events that it pushes to the client - which it pushes only as long as the client is listening (detected via a heartbeat). When the user wants a full article, then the client would pull the article down.
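As one possible shape among many, here is a small in-memory sketch of where the control could sit. Every name in it (`NewsServer`, `set_interest`, `refresh` and so on) is hypothetical, and it simplifies the push-while-listening idea by letting the client collect its pending headlines on its own timer. The server filters by the client's tags and keeps only headline and summary for delivery; the full body is pulled on request.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    id: int
    headline: str
    summary: str
    body: str
    tags: frozenset[str]


@dataclass
class NewsServer:
    """Hypothetical server: filters by client tags, serves full bodies on demand."""

    articles: dict[int, Article] = field(default_factory=dict)
    interests: dict[str, set[str]] = field(default_factory=dict)
    pending: dict[str, list[tuple[int, str, str]]] = field(default_factory=dict)

    def set_interest(self, client: str, tags: set[str]) -> None:
        """The client controls its tags and may change them at any time."""
        self.interests[client] = set(tags)
        self.pending.setdefault(client, [])

    def disconnect(self, client: str) -> None:
        """Stand-in for a missed heartbeat: stop queuing for this client."""
        self.interests.pop(client, None)
        self.pending.pop(client, None)

    def publish(self, article: Article) -> None:
        """Updates arrive whenever they arrive; queue summaries for interested clients."""
        self.articles[article.id] = article
        for client, tags in self.interests.items():
            if tags & article.tags:
                self.pending[client].append(
                    (article.id, article.headline, article.summary)
                )

    def refresh(self, client: str) -> list[tuple[int, str, str]]:
        """Called on the client's own timer or on an immediate-refresh request."""
        batch, self.pending[client] = self.pending[client], []
        return batch

    def full_article(self, article_id: int) -> str:
        """The full content is pulled only when the user asks for it."""
        return self.articles[article_id].body
```

The part the client cannot control, the arrival of news, is the only place where something reacts; everything the client does control stays a plain request.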
There is much debate in the Rx community about back-pressure. This is the idea that the client should inform the server when it is overloaded and the server should respond by somehow reducing the event stream. I think this is a misguided approach that can lead to confusing designs.
To my mind, as soon as a client needs to give this feedback, it has failed the response rate test. At this point, you are not in a reactive situation, you are in an async enumerable situation. That is, the client should be saying "I am ready" when it is ready for more and then waiting in a non-blocking fashion for the server to respond.
This would be appropriate if the first scenario were modified to be files arriving in a drop-folder, of varying lengths and complexity to process. The client should make a non-blocking call for the next file, process it, and repeat (adding parallelism as required), rather than responding to a stream of file-arrived events.
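A rough sketch of that pull loop, using Python's asyncio. The queue stands in for the drop-folder, and `process`, `worker` and `drain_drop_folder` are names I have invented for illustration; the sleep merely simulates files that take different amounts of time.

```python
import asyncio


async def process(name: str, size: int) -> str:
    """Hypothetical processing step whose cost varies with the file size."""
    await asyncio.sleep(size / 1000)
    return f"{name}:{size}"


async def worker(inbox: asyncio.Queue, results: list[str]) -> None:
    """Ask for the next file only when ready for it."""
    while True:
        item = await inbox.get()  # "I am ready": waits without blocking the loop
        if item is None:  # sentinel: nothing more will arrive
            break
        results.append(await process(*item))


async def drain_drop_folder(files: list[tuple[str, int]], workers: int = 2) -> list[str]:
    """Run a few workers in parallel; each one pulls at its own pace."""
    inbox: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    tasks = [asyncio.create_task(worker(inbox, results)) for _ in range(workers)]
    for f in files:
        await inbox.put(f)
    for _ in tasks:
        await inbox.put(None)  # one sentinel per worker
    await asyncio.gather(*tasks)
    return results
```

Something like `asyncio.run(drain_drop_folder([("a", 3), ("b", 1)]))` runs it. No worker is ever handed more than it asked for, so there is nothing to push back against.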
Wrap up
I've deliberately avoided other valid concerns such as maintainability of code, performance of Rx itself etc. Mostly because they are addressed elsewhere, and more importantly because I think the ideas here are more divisive than those concerns.
So if you reflect on the elements of control and response rate in your scenario, you will probably stay on the right track.
The response rate issue can be subtle - and the degree aspect is important. Arrival rate can fluctuate, and there is going to be some acceptable degree of fluctuation in response rate - clearly, if you don't ultimately have a way to "catch up" then at some point the client will blow up.