0

Good day guys,

After browsing a bit in my spare time at work, I stumbled upon the Facebook statistics page.

"More than 30 billions pieces of content ... shared each month."

We can assume there are probably around let's say 20ish billions posts a month. For now, I believe MySQL will have no problem dealing with such amount of informations since the maximum value of an unsigned BIGINT is 18 446 744 073 709 551 615. I assume they use numeric PK for optimisation purpose.

The question that boggle my mind is that in C++, the maximum unsigned long integer value is 4 294 967 295. How are they dealing with their database's primary keys when their's values are now probably too large to be dealt with, in their back-end codes ?

Thank you

Cybrix
  • 3,248
  • 5
  • 42
  • 61
  • 1
    possible duplicate of [C++ handling very large integers](http://stackoverflow.com/questions/124332/c-handling-very-large-integers) – Byron Whitlock Jan 04 '11 at 21:02
  • Just as a comment: even if it's non-standard, the popular C++ implementations (GCC, MSVC, Intel) provide a 64-bit `int`, which is what MySQL's `BIGINT` type is. In case you don't, you could implement one or use a library. – wkl Jan 04 '11 at 21:03
  • Crap, I wasn't aware of the unsigned long long integer. Silly me! – Cybrix Jan 04 '11 at 21:11

3 Answers3

3

What about unsigned long long which is 64-bit long, exactly as BIGINT? :) Also, I believe that Facebook do not use numeric ids for their DB, because it is very hard to do that at multiserver database configuration with parallel inserts.

Victor Haydin
  • 3,518
  • 2
  • 26
  • 41
  • I believe using anything else then numeric values in a database would be a pain. But I have no knowledge into multiserver database structures. I would like to get some info about that though. – Cybrix Jan 04 '11 at 21:14
  • @Cybrix - why would it be such a pain? That is the whole point of relational databases. It shouldn't aside from memory / space consumption matter I JOIN a number or a string. – JonH Jan 04 '11 at 21:28
  • @JonH Sorry my comment wasn't clear. I believe using anything else then numeric values as PK is probably a pain for the server. Using a string as PK is probably very time-consuming: You have to generate a random string and then make sure it's not already used. To be honest, I don't know how the indexing works on non-numeric primary key and if they are as efficent as a numeric value. Those are just my thoughts. – Cybrix Jan 04 '11 at 21:41
  • @Cybrix: You don't have to create unique values by flipping coins. Consider GUIDs. – John Dibling Jan 04 '11 at 21:45
0

Simple you write your own BIGINT datatype class in C++ to handle very very large numbers (that may be internally represented as strings actually).

What I mean by this is think of a linked list

struct myNode{
char num;
myNode* next;
}

Now the linked list can hold a lot more integers than you'd expect a typical long / int to.

That doesnt mean I know what facebook is doing or using but there are so many variations. Also facebook is so large all the data isn't kept in just one database...You also may be making assumptions about technology and programming languages as well as DB vendors.

They could be using GUID :), they could be using Oracle or even SQL Server.
There servers are definately load balanced and spread across many many servers across the US and probably outside of the US as well. It's not a simple app but it sure looks like one.

JonH
  • 32,732
  • 12
  • 87
  • 145
0

If you use 64 bit systems (which is already the default for Windows 7 desktop PC's), you can count to 4 billion times 4 billion.

If 4 billion people are online in facebook, they can all make 4 billion comments, and you can still store all of them in a single table with a 64 bit key.