3

How do I create a CRDT using Gun?

For instance, if I want to implement an grow-only array, where each element points to the next, how do I solve conflicts?

To simplify, let's create this scenario, where Alice and Bob are cooperating.

The array contains 3 elements, [a, b, d].

The internal representation of this array would be a linked list like this:

a => b => c

(of course, the internal representation would be something like {value: 'a', next: { value: 'b' next: { value: 'c' }}}), but I think you get my point with the terser notation.

Alice now wants to insert element c between b and d.

Concurrently, Bob wants to insert element C between b and d.

Concurrently, they have this internal representation of the array:

Alice: a => b => c => d

Bob: a => b => C => d

When they join the CRDT, they would converge to either one of the following values:

a => b => c => C => d

or

a => b => C => c => d

No matter what, a) both of them would converge on the same value and b) they would not lose each other's data.

Can we achieve this using Gun?

(This question is a simplification and follow-up question to https://github.com/amark/gun/issues/602)

pgte
  • 51
  • 4

1 Answers1

0

Yes.

Here is a demo of code doing just that from a few years ago:

https://youtu.be/rci89p0o2wQ

You can create any other CRDT as a data structure on top of GUN's base CRDT.

We even did a whole cartoon explainer on generic algorithms for this type of stuff:

https://gun.eco/explainers/school/class.html

(Or see a more detailed explanation for a case-specific implementation, by the wonderful Martin Kleppmann starting at https://youtu.be/yCcWpzY8dIA?t=29m36s )

I haven't seen anybody implement RGA specifically on top of GUN, but given how short the code you sent over, it should be pretty easy. (Although the "delete" would need to be a null tombstone, but that is fine)

It is probably easiest to start by looking at the counter CRDT for "how to start saving data to GUN with a custom extension", a grow-only CRDT in just 12 lines of code:

https://gun.eco/docs/Counter

You note that it is very basic, RGA would be similar, you'd probably have some GUN extension (just a JS method/function, that makes it easier for developers to use) like

Gun.chain.rga = function(...

Then internally, like with the counter, you'd call gun.put( or gun.set( or whatever other commands to save data to the graph (put and set are themselves just extensions like RGA is, nothing fancy here) where you'd build/construct a graph/tree/table, or you could be lazy and just do something like:

// fictional code
var myData = {
  rgaTree: {left: {}, right: {}}
}
myData.rgaTree.left.next = myData.rgaTree.right;
myData.rgaTree.right.prev = myData.rgaTree.left;
// yes! circular references are supported!
gun.put(myData);

Obviously you might want to be more detailed and control the UUIDs with the cuids and stuff, but you get the point.

There is no reason to actually replace HAM CRDT directly or "inject" a CRDT along side it, just @pgte would model the CRDT's data structure on GUN (probably with a nice easy API via a GUN extension), then you'd also have that extension support a callback, which if passed you'd gun.get(... through the RGA graph/tree and run the various RGA logic before spitting the result back out to the developer. This tree can be dynamically mutated concurrently inside of GUN by many peers, like Alice and Bob!

Then, GUN will save that dynamically changing and updating data to disk via one of (many) storage engines (such as IPFS!) so IPFS can host the persistence of data over time, and GUN can manage the O(1) tree lookups that are the mutable/changing/dynamic tree/graph/index structures.

marknadal
  • 7,534
  • 4
  • 25
  • 22
  • Hi Mark, thanks for your reply. I'm a bit confused here, hope you can help me out. If I use `.set` I'll be adding items to a set. If, instead, I use `.put`, I'll be overwriting data, and I'll have last writer wins. Is this correct? – pgte Sep 13 '18 at 17:58
  • @pgte Yes. Some edge cases: `.set(gunRef)` uses gunRef ID to dedup item in set (a proper mathematical set). `.set(data)` if data isn't a gunRef then it is hard to tell what the behavior should be so it uses a base 36 timestamp as the ID so you can "loosely" sort items based on it, adds it to the "table". `.set` is built on top of `.put` simply as `.set(data) = .get(id).put(data)`. `.put` is the core operation, it updates/mutates/changes/MERGES data on a property (it doesn't overwrite unless the value on that property is atomic). "Last" write win as determined by the CRDT, not anybody's "last". – marknadal Sep 15 '18 at 08:49
  • Ok. As far as I can understand, this implies that I can’t model an RGA as a linked list, but as a collection of sets. The RGA works by keeping the following sets: – pgte Sep 16 '18 at 13:20
  • each element in the vector has an id. – pgte Sep 16 '18 at 13:23
  • * map of id => value – pgte Sep 16 '18 at 13:23
  • * set of removed ids (tombstones) – pgte Sep 16 '18 at 13:24
  • * edges: map of id => id (this defines the order of the vector) – pgte Sep 16 '18 at 13:24
  • Given this, how would you model the CRDT using Gun? – pgte Sep 16 '18 at 13:24
  • You can do a Linked List in GUN to get strongly ordered data structures. I was just giving an example. You say you have IDs, why not just use those directly? Let me know after you have finished the https://gun.eco/docs/Todo-Dapp and ping me (probably chat, I don't check SO every day) cause it should be pretty straight forward to implement - you've already modeled it! – marknadal Sep 20 '18 at 18:05
  • If I model it like I said, how does this behave when two users change the linked list? Let’s say that, for instance, the list has elements A => D, and Alice changes it to A => B => D and, concurrently, Bob changes it to A => C => D. How can we make Gun resolve this to either A => B => C => D or A => C => B => D (where no data is lost)? – pgte Sep 24 '18 at 09:40
  • If you are modify it in-place, then you get A => C => D with B => D as expected. It is superior to use a "back sort", A <- B & C <- D, if you receive D you won't display it until C (and its dependencies) also exist. You don't load based on the links, you load the whole "set" and stream sort based on back sort. Both B&C will be displayed even tho B not pointed to. What decides B vs C display order? Your sort function, not GUN! While 2 things may have same back sort, you'd use their ID (or something else) to sort remaining letters. Again, covered in: https://gun.eco/explainers/school/class.html – marknadal Sep 25 '18 at 06:47