19

I've read some article say that RDBMS such as MySQL is not good at scalable,but NoSQL such as MongoDB can shard well. I want to know which feature that RDBMS provided make itself can not shard well.

shuitu
  • 253
  • 1
  • 2
  • 7

3 Answers3

57

Most RDBMS systems guarantee the so-called ACID properties. Most of these properties boil down to consistency; every modification on your data will transfer your database from one consistent state to another consistent state.

For example, if you update multiple records in a single transaction, the database will ensure that the records involved will not be modified by other queries, as long as the transaction hasn't completed. So during the transaction, multiple tables may be locked for modification. If those tables are spread across multiple shards/servers, it'll take more time to acquire the appropriate locks, update the data and release the locks.

The CAP theorem states that a distributed (i.e. scalable) system cannot guarantee all of the following properties at the same time:

  • Consistency
  • Availability
  • Partition tolerance

RDBMS systems guarantee consistency. Sharding makes the system tolerant to partitioning. From the theorem follows that the system can therefor not guarantee availability. That's why a standard RDBMS cannot scale very well: it won't be able to guarantee availability. And what good is a database if you can't access it?

NoSQL databases drop consistency in favor of availability. That's why they are better at scalability.

I'm not saying RDBMS systems cannot scale at all, it's just harder. This article outlines some of the possible sharding schemes, and the problems you may encounter. Most approaches sacrifice consistency, which is one of the most important features of RDBMS systems, and which prevents it from scaling.

Niels van der Rest
  • 31,664
  • 16
  • 80
  • 86
  • 6
    According with Wikipedia. "Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions." https://dba.stackexchange.com/questions/31260/consistency-in-acid-and-cap-theorem-are-they-the-same – Enrique Benito Casado Aug 17 '18 at 15:47
  • 5
    Consistency in the CAP Theorem is completely different from Consistency in ACID. – jack klompus Oct 02 '20 at 08:04
  • @jackklompus, care to elaborate why those are different? – Niels van der Rest Oct 05 '20 at 07:10
  • 1
    @NielsvanderRest Sure. In the CAP Theorem, Consistency is referring to "Every read receives the most recent write or an error". In ACID, Consistency is referring to "a transaction can only bring the database from one valid state to another, maintaining database invariants". – jack klompus Oct 07 '20 at 00:41
  • @NielsvanderRest From Martin Keppman's Designing Data Intensive Applications, you can [see](https://imgur.com/a/glES6e5) – jack klompus Oct 07 '20 at 00:43
  • 1
    @jackklompus, OK. Where in the answer do I claim those are the same thing, that apparently warrants a downvote? Read the answer again and interpret every mention of 'consistency' as CAP consistency. – Niels van der Rest Oct 07 '20 at 06:25
  • In CAP theorem, if we take consistency and partition tolerance => how are we going to lose the availability ? any example i know the system becomes slow from the answer explained above. – Hiro_Hamada Jul 04 '22 at 09:55
  • @NielsvanderRest The above example mentions locking multiple shards. What if all the transactions just update single shard at a time, then locking a single shard should be quick(just like without the shards). Does having RDBMS preferred in such case if the db is to be scaled? – asn Jul 02 '23 at 07:21
2

Why NoSQL dudes and dudettes don't like joins: http://www.dbms2.com/2010/05/01/ryw-read-your-writes-consistency/

TTT
  • 2,365
  • 17
  • 16
1

Queries involving multiple shards are complex (f.e. JOINs between tables in different shards)

ggarber
  • 8,300
  • 5
  • 27
  • 32
  • What if i never use join in RDBMS?Will RDBMS become shard easily? If so,it seems no need to implement a new database called NoSQL. – shuitu Aug 06 '10 at 12:01
  • 4
    RDBMS are meant to be management system for 'relational' database. And relations among data is exhibited by JOINs between tables. If you are not using JOINs then it means your database is not relational so you can move NoSQLs which are key/value stores. – Gary Lindahl Sep 15 '11 at 03:11