
I am looking for a protocol or standard that fits the following requirements:

  • There is an array of objects that can be represented as JSON, i.e. as text.
  • There is one source of truth, stored as a file or files. (Many files are possible, but I am looking for simplicity.)
  • There are many trusted clients that perform CRUD operations on this data.
  • There is a server that modifies these flat files on request from clients, or gives them the current version of the resources. It can be implemented with a serverless architecture, because it is needed only when a client calls it.

I researched this for a few hours and got stuck, because I found many complicated solutions, such as Paxos and Raft, that require network infrastructure.

My proposal

I want to present my proposal and ask right away whether something similar already exists and can be used in this case. Any feedback about problems with my concept will be appreciated.

Architecture

Clients

There are many clients that can be disconnected from the network and operate on local data. They want to send their changes and receive the changes made by other clients.

Data Storage

There is a single data store such as AWS, Dropbox, Google Drive, or hosted files. Which one does not matter; what matters is that there is no access to advanced and powerful tools like Cassandra. There are two files: the current data, which can be serialized JSON or an SQLite file, and the logs, a file with the history of all operations sorted by timestamp.

Point of access

There is a single point of access to this data: a serverless unit with the following tasks: receive local logs from the client, merge them with the global logs, process them to update the global current data, and send the client instructions about the local modifications that it should apply.


Processing

Let's consider the following scenario. A client modifies its local data and appends every modification to its local logs.

Synchronization means the following:

1) The client sends its local logs to the point of access (the server).
2) The server creates an empty .lock file to prevent modifications from other points of access. If the file already exists, synchronization is refused.
3) The server reads the date of the last synchronization saved in the local log, then downloads the logs from storage and takes every line from that date until now.
4) The server merges the two sets of log lines and calculates two things: a) how to update the global current data, b) how to update the local current data of this client.
5) The server downloads the global data and applies the modifications.
6) The server sends the local modification instructions to the client and removes the .lock file.
7) The client applies the modifications to its local state, clears its local logs, and saves the date of the last synchronization in them.
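To make the log-merging step more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the LogEntry shape, the client_id field, and the merge_logs function are hypothetical names, and conflict resolution between entries is left out.

```python
from dataclasses import dataclass, field

@dataclass
class LogEntry:
    timestamp: float   # time of the operation
    client_id: str     # hypothetical field: which client made the change
    op: str            # "create", "update" or "delete"
    item_id: str
    fields: dict = field(default_factory=dict)  # changed properties

def merge_logs(global_log, local_log, last_sync):
    """Return (merged_log, missed_entries) for one synchronization.

    missed_entries are the global entries newer than the client's last
    synchronization, i.e. what the client still has to apply locally.
    """
    missed = [e for e in global_log if e.timestamp > last_sync]
    # Sort by timestamp; client_id is an assumed tie-breaker for equal times.
    merged = sorted(global_log + local_log,
                    key=lambda e: (e.timestamp, e.client_id))
    return merged, missed
```

The merged list would become the new global log, and the missed entries are the basis for the instructions sent back to the client.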

Exceptions / Merging

Now imagine that two clients deleted the same item. It should be deleted. Ids are randomized, so two clients should not create two elements with the same Id. If that happens anyway, the second one is valid and its creation is changed to an update. If two clients update the same resource, then we should look at its structure. I mentioned the JSON representation, so it is a collection of keys and values. The following strategy should be applied:

First version         {a:1, b:2, c:3}
Update from client 1: {a:4, b:2, c:3}
Update from client 2: {a:1, b:5, c:3}
Merged version        {a:4, b:5, c:3}

This works because the date of last modification is tracked for each property independently.
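A minimal sketch of this per-property merge in Python, assuming each update arrives as a (timestamp, full object) pair; the function names are hypothetical. Only the properties a client actually changed relative to the first version are applied, in timestamp order, so a later change to the same property wins.

```python
def changed_fields(base, updated):
    """Properties whose value differs from the first version."""
    return {k: v for k, v in updated.items() if base.get(k) != v}

def merge_updates(base, updates):
    """Apply updates oldest first, so the newest change to each property wins.

    updates is a list of (timestamp, full_object) pairs.
    """
    merged = dict(base)
    for _, updated in sorted(updates, key=lambda u: u[0]):
        merged.update(changed_fields(base, updated))
    return merged

first = {"a": 1, "b": 2, "c": 3}
client1 = (10, {"a": 4, "b": 2, "c": 3})
client2 = (20, {"a": 1, "b": 5, "c": 3})
print(merge_updates(first, [client1, client2]))  # {'a': 4, 'b': 5, 'c': 3}
```

One limitation of this sketch: a client that deliberately sets a property back to its original value is treated as not having changed it.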

If an exception stops the server during processing and the .lock file is not deleted, the system stops synchronizing and requires manual fixing.
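For reference, this is roughly how I imagine the .lock handling on a local filesystem, as a sketch only. SyncRefused and sync_lock are hypothetical names. The exclusive create is atomic on a local disk; I am assuming, without knowing, that storage such as Dropbox or Google Drive gives no such guarantee. The finally block removes the lock when processing raises an ordinary exception, but a hard crash of the server would still leave the file behind.

```python
import os
from contextlib import contextmanager

class SyncRefused(Exception):
    """Raised when another synchronization already holds the lock."""

@contextmanager
def sync_lock(path):
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise SyncRefused(f"{path} exists, synchronization refused")
    os.close(fd)
    try:
        yield
    finally:
        # Runs on success and on exceptions, but not after a hard crash.
        os.remove(path)
```

Usage would be `with sync_lock(".lock"): ...` around the whole merge-and-write step.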

My questions:

  • Where can I find open source solutions similar to the one presented?
  • Are there any protocols or standards that I should learn?
  • What should I type into Google / DuckDuckGo to find them?

Related topic from 2009:

Client-server synchronization pattern / algorithm?

  • SVN and CVS were recommended there

But since CVS was replaced by SVN, and SVN by Git, should I consider using Git for synchronization like this? On Wikipedia, version control systems are described as "requiring much overhead".

Considering the differences between a blockchain and a chain of blocks, I understand that blockchain can't be applied here.

Daniel
  • You don't mention what you think should happen in the case of merge conflicts. As for "serverless", there absolutely is a server, _something_ grants access to your "one source of truth". If you mean the server is restricted to complete replacement of the dataset on command and want to know how to serialize command access, you'll have to specify exactly what operations are possible on this one thing you don't want to call a server. But none of this is on-topic here, maybe try the software engineering exchange at a guess. – jthill Jan 27 '19 at 05:28
  • I'm voting to close this question as off-topic because it's off topic, I think it might belong on softwareengineering.stackexchange.com. – jthill Jan 27 '19 at 05:30
