0

Although I currently do not have it, I'm interested in learning how someone would scale an individual table in MySQL that might have, say 20 million users. Is this something you would use sharding for? What are some strategies one might use to make an individual table of this magnitude "scalable" ?

randombits
  • 47,058
  • 76
  • 251
  • 433
  • Are these users OF the table, or user entries IN the table? – Paul Sonier Jun 17 '11 at 18:44
  • these are 20M users in the users table, not 20 million users USING the table. – randombits Jun 17 '11 at 18:49
  • With proper indexing according to your queries, a 20M (or even 200M) user table is no big deal (and should it become, you tune MySQL accordingly and buy more RAM). Though with one such big table, it's more a queistion of what the application as a whole does and how it'll perform. – Lyke Jun 17 '11 at 19:04

1 Answers1

6

20M records is generally considered "small". Depending on the size of records and the kind of queries performed, you are likely to get very good performance on the lowliest of servers.

Almost all servers can keep such a database in memory. Let's consider that a record takes 1024 bytes, including indexes. This is quite a large record, yet 20M rows is still only 20Gb, which fits comfortably within the RAM of a modest server.

While your database fits in RAM, queries are likely to be very fast.

But in any case, you need to consider what the access patterns are.

Do you have

  • Very high write rates - more than 100 transactions per second?
  • Lots of hard queries / reports?

If the answer to both of these is "no", you probably need no special equipment at all.

Certainly you don't want to shard. It's complicated, it massively complicates your application, and will require a LOT of developer time which is better spent on features (which you can actually sell to customers)

In order to improve performance with big data, in approximate order of preference, you want to:

  • Buy better hardware (within reason)
  • Reduce the amount of data you need to store
  • Use horizontal partitioning
  • Use vertical partitioning / functional partitioning
  • Get a better database engine which can use existing hardware more efficiently (possible examples: Infobright, Tokutek)
  • Shard (you really don't want to do this!)
MarkR
  • 62,604
  • 14
  • 116
  • 151