I'm building a professional social network and using MongoDB as the database. Will I run into problems with the database regarding document size, given that we plan to have a large number of users? I'd appreciate any useful feedback.

Renato Gama
  • Foursquare uses MongoDB and has done several presentations on their experience which will likely be of interest: [MongoNYC 2012: Scaling MongoDB at Foursquare](http://www.10gen.com/presentations/MongoNYC-2012/MongoDB-at-foursquare), [MongoSV 2011: Show & Tell](http://engineering.foursquare.com/2011/12/21/show-and-tell-mongodb-at-foursquare/). There are many more interesting presentations from MongoDB users and the development team at [10gen.com/presentations](http://www.10gen.com/presentations). – Stennie Jul 28 '12 at 14:16

2 Answers

'Large number of users' is somewhat vague; a rough estimate would help. In any case, the document size limit in MongoDB is 16MB, which should be more than enough for storing a user's profile details. However, for a 'networking' use case you are probably planning to store followers/friends. Whether to keep them in the same document as the user profile or in separate documents is a different question in itself. You might want to check these out:

What is a good MongoDB document structure for most efficient querying of user followers/followees?
http://www.10gen.com/events/common-mongodb-use-cases
http://docs.mongodb.org/manual/use-cases/
http://nosql.mypopescu.com/post/316345119/mongodb-usecases
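
To put the 16MB limit in perspective for the followers case, here is a rough back-of-the-envelope sketch. It assumes (my assumption, not something from the question) that followers are embedded in the user document as an array of 12-byte ObjectIds, and it ignores the rest of the profile. The function name `follower_array_bytes` is made up for illustration.

```python
# Rough estimate of the BSON size of an embedded follower array.
# Assumption: each follower is stored as a 12-byte ObjectId.

MAX_BSON_DOC_BYTES = 16 * 1024 * 1024  # MongoDB's 16MB document limit


def follower_array_bytes(n: int) -> int:
    """Approximate BSON size of an array holding n ObjectIds."""
    # A BSON array is encoded like a document keyed by "0", "1", "2", ...
    # Each element: 1 type byte + key as a C string (digits + NUL) + 12-byte ObjectId.
    elements = sum(1 + len(str(i)) + 1 + 12 for i in range(n))
    # Array wrapper: 4-byte length prefix + trailing NUL byte.
    return 4 + elements + 1


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        size = follower_array_bytes(n)
        verdict = "fits" if size < MAX_BSON_DOC_BYTES else "exceeds 16MB"
        print(f"{n} followers: {size} bytes ({verdict})")
```

Under these assumptions a hundred thousand embedded followers fit comfortably, while a million do not, which is one reason to consider a separate collection for the relationships.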

Aafreen Sheikh

One issue you might run into is that MongoDB stores the text of the field name for each field in each document. So if you have a field called "Name" or "Address" that you want in a set of documents, that text will appear in every single document, taking up space. This is different from a relational database, which has a schema and stores the name of a column only once.
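
To make the overhead concrete, here is a minimal sketch that computes the BSON size of a flat document by hand, assuming every value is a string. The helper `bson_size` and the sample documents are made up for illustration; the estimate ignores the `_id` field and any padding or compression applied at the storage level.

```python
def bson_size(doc: dict) -> int:
    """BSON size in bytes of a flat document whose values are all strings."""
    size = 4 + 1  # int32 length prefix + terminating NUL byte
    for name, value in doc.items():
        size += 1                                   # element type byte
        size += len(name.encode("utf-8")) + 1       # field name as a C string
        size += 4 + len(value.encode("utf-8")) + 1  # int32 length + string + NUL
    return size


# Same data, long versus shortened field names (hypothetical examples).
long_names = {"Name": "Ada", "Address": "12 Example Street", "Location 1": "loc1"}
short_names = {"n": "Ada", "a": "12 Example Street", "l1": "loc1"}

if __name__ == "__main__":
    saved = bson_size(long_names) - bson_size(short_names)
    print(f"long: {bson_size(long_names)} bytes, short: {bson_size(short_names)} bytes")
    print(f"saved per document: {saved} bytes")
    print(f"saved over 1,000,000 documents: {saved * 1_000_000} bytes")
```

The per-document saving is small, but it is paid once per document, so it scales linearly with the number of documents.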

A few years ago I worked on a project where the engineers had a bit of a surprise at the size of their data set when they simulated millions of users, because they had not taken this into consideration. They had optimized the data for size (i.e. "loc1" instead of "Location 1") but had not done the same for the field names. This is a common problem when developers used to RDBMS development make assumptions about NoSQL solutions: they counted only the size of their data, not field name plus field value.

They were glad they found this out in a test before they went live, otherwise they would have had to migrate every live document in order to implement the changes they wanted.

It isn't a big deal, and certainly not a reason to avoid MongoDB (being schemaless and treating each document as a unique item is, after all, a feature rather than a bug or design flaw). It's just something to keep in mind.

Cormac Mulhall
  • Data storage is a consideration, but before aggressively shortening field names it is worth considering what the actual cost might be. For example, with [AWS](http://aws.amazon.com/s3/pricing/) the cost per GB (or TB) is still pretty reasonable compared with the loss of readability (i.e. having to work out that `l1` is actually shorthand for `location1` and doing this translation somewhere in your app layer). – Stennie Jul 28 '12 at 14:05
  • True, premature optimization is the root of all evil, as they say. We optimized the data simply because we had massive amounts of very large documents, so for us the trade-off between long field names and translation in the application layer was worth it, but that may not always be the case. Certainly wouldn't recommend shortening data or field names unless there is a valid reason to do so. – Cormac Mulhall Jul 29 '12 at 09:17