1

Sharding with replication]1

I have a multi tenant database with 3 tables(store,products,purchases) in 5 server nodes .Suppose I've 3 stores in my store table and I am going to shard it with storeId . I need all data for all shards(1,2,3) available in nodes 1 and 2. But node 3 would contain only shard for store #1 , node 4 would contain only shard for store #2 and node 5 for shard #3. It is like a sharding with 3 replicas. Is this possible at all? What database engines can be used for this purpose(preferably sql dbs)? Did you have any experience?

Regards

Behdad
  • 184
  • 3
  • 12

1 Answers1

1

I have a feeling you have not adequately explained why you are trying this strange topology.

Anyway, I will point out several things relating to MySQL/MariaDB.

  • A Galera cluster already embodies multiple nodes (minimum of 3), but does not directly support "sharding". You can have multiple Galera clusters, one per "shard".
  • As with my comment about Galera, other forms of MySQL/MariaDB can have replication between nodes of each shard.
  • If you are thinking of having a server with all data, but replicate only parts to readonly Replicas, there are settings for replicate_do/ignore_database. I emphasize "readonly" because changes to these pseudo-shards cannot easily be sent back to the Primary server. (However see "multi-source replication")
  • Sharding is used primarily when there is simply too much traffic to handle on a single server. Are you saying that the 3 tenants cannot coexist because of excessive writes? (Excessive reads can be handled by replication.)

A tentative solution:

Have all data on all servers. Use the same Galera cluster for all nodes.

Advantage: When "most" or all of the network is working all data is quickly replicated bidirectionally.

Potential disadvantage: If half or more of the nodes go down, you have to manually step in to get the cluster going again.

Likely solution for the 'disadvantage': "Weight" the nodes differently. Give a height weight to the 3 in HQ; give a much smaller (but non-zero) weight to each branch node. That way, most of the branches could go offline without losing the system as a whole.

But... I fear that an offline branch node will automatically become readonly.

Another plan:

Switch to NDB. The network is allowed to be fragile. Consistency is maintained by "eventual consistency" instead of the "[virtually] synchronous replication" of Galera+InnoDB.

NDB allows you to immediately write on any node. Then the write is sent to the other nodes. If there is a conflict one of the values is declared the "winner". You choose which algorithm for determining the winner. An easy-to-understand one is "whichever write was 'first'".

Rick James
  • 135,179
  • 13
  • 127
  • 222