14

How does Trello store their data (cards) in MongoDB?

I read (in "How does Trello show history so quickly?") that their Actions collection is the largest one, so I assume they do not store all cards in one gigantic collection.

The only other logical solution seems to be a collection per board or per user. But they have a lot of users, and I read (http://www.mongodb.org/display/DOCS/Using+a+Large+Number+of+Collections) that there is a limit on the number of collections and that some people advise against using a lot of collections.

I hope someone from the Trello team will answer this, and I also hope to get some ideas from the general community on solving this kind of problem.

Ben
  • 2
    Why wouldn't they store it all in one collection and shard on what they query? That way you effectively query only one machine at a time, making the query pretty much as fast as on a small collection. Add to that that they probably push the data to other regions as well so that response times are consistent, and before long you have a fully scalable collection with billions of rows. You also have to remember that the lock is at the DB level, so if they were going to use multiples of something, it would make sense to use multiple DBs, if anything. – Sammaye Oct 19 '12 at 07:56
  • @Sammaye I don't understand the "shard on what they query" part. You can view all cards in a board, search through all boards by card name, ... I understand the basic principle of sharding, but I don't understand how one can accomplish what you are saying. Could you please explain it a bit more? – Ben Oct 19 '12 at 08:08
  • 4
    OK, I have only been using Trello for 10 minutes and already I can see that if I were designing this site I would have three collections: `user`, `board`, `card`. I would shard on `board_id` for `board`, since a board needs to be accessible to many users. This way, when you query for the user (which is sharded on `{_id,username,email}`) and then get all their boards, that would scale out. The lists of a board would be embedded in the board's row, but not the cards. The cards would probably be sharded on `board_id` and timestamp, and maybe `user_id` too. That's my first 10-minute glance. – Sammaye Oct 19 '12 at 08:14
  • For the search they probably don't use MongoDB. I myself use Sphinx for a video site, since MongoDB is not amazing at FTS index storage and efficiency. That being said, for localised search on a single list or board they might use the ideas described here: http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo – Sammaye Oct 19 '12 at 08:21
  • @Sammaye In Trello, each user has a role (permission) in a board. So would you embed participations {board_id, role} in each user? That way getting all of a user's boards would be easy, but if you want to get all of a board's users you would have to query the users collection to see which users have a participation with that board_id. Or would you also maintain a list of user_ids on each board? I see many possibilities, but what would be the "best" one? – Ibraheem Ahmed Jul 04 '20 at 19:54
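
To make the trade-off in the comments above concrete, here is a minimal sketch of the three-collection layout with participations embedded in each user. It uses plain Python lists in place of MongoDB collections; every field name and value (`participations`, `role`, `list_id`, the sample ids) is a hypothetical illustration, not Trello's actual schema.

```python
# Hypothetical document shapes for the user / board / card layout discussed above.
# Plain lists stand in for MongoDB collections; all names and ids are made up.
users = [
    {"_id": "u1", "username": "ben",
     "participations": [{"board_id": "b1", "role": "admin"}]},
    {"_id": "u2", "username": "sam",
     "participations": [{"board_id": "b1", "role": "member"},
                        {"board_id": "b2", "role": "admin"}]},
]

boards = [
    # Lists are embedded in the board document; cards are not.
    {"_id": "b1", "name": "Roadmap", "lists": [{"_id": "l1", "name": "To Do"}]},
    {"_id": "b2", "name": "Bugs", "lists": [{"_id": "l2", "name": "Open"}]},
]

cards = [
    # Each card references its board, which is also the proposed shard key.
    {"_id": "c1", "board_id": "b1", "list_id": "l1", "name": "Write spec"},
]


def boards_for_user(user_id):
    """Cheap direction: read one user document and return its board ids."""
    user = next(u for u in users if u["_id"] == user_id)
    return [p["board_id"] for p in user["participations"]]


def users_for_board(board_id):
    """Expensive direction: inspect every user's embedded participations."""
    return [
        u["_id"]
        for u in users
        if any(p["board_id"] == board_id for p in u["participations"])
    ]
```

With this shape, `boards_for_user("u2")` reads a single document, while `users_for_board("b1")` has to look across the whole users collection. In MongoDB an index on the embedded `board_id` field could speed up the second lookup, and keeping a list of user ids on each board would trade that lookup for a second place to update; which is preferable depends on the access pattern.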

1 Answer

16

Trello stores all of the cards in one collection. The collection is sharded on the card's board id.
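
As a rough illustration of why this works, the sketch below simulates a single cards collection split across shards by board id. It is not Trello's code and it does not reproduce MongoDB's real routing (which assigns ranges of shard-key values to chunks); the shard count, the `board_id` field name, and the class and function names are all assumptions made for the example. In MongoDB itself, the shard key is declared when the collection is sharded, for example with the `sh.shardCollection` shell helper.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(board_id: str) -> int:
    """Map a board id to a shard index using a stable hash.

    Illustrative only: real MongoDB routes by chunk ranges over the shard key.
    """
    digest = hashlib.md5(board_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


class ShardedCards:
    """One logical 'cards' collection, physically split by board id."""

    def __init__(self):
        self.shards = [[] for _ in range(NUM_SHARDS)]

    def insert(self, card: dict) -> None:
        # Every card of a given board lands on the same shard.
        self.shards[shard_for(card["board_id"])].append(card)

    def cards_on_board(self, board_id: str):
        """Targeted query: the shard key is known, so one shard is consulted."""
        idx = shard_for(board_id)
        hits = [c for c in self.shards[idx] if c["board_id"] == board_id]
        return hits, [idx]

    def search_by_name(self, text: str):
        """Scatter-gather query: no shard key, so every shard is consulted."""
        hits = []
        for shard in self.shards:
            hits.extend(c for c in shard if text.lower() in c["name"].lower())
        return hits, list(range(NUM_SHARDS))
```

Loading a board touches exactly one shard no matter how large the collection grows, which is the "shard on what they query" idea from the comments. A search by card name across all boards has no shard key to route on and must ask every shard, which is consistent with the suggestion that cross-board search may be handled by a separate system.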

Daniel LeCheminant
  • Do you know at which point would sharding be necessary? I am relatively new to MongoDB and am making an app that may need to be scaled. – MadPhysicist Jun 27 '16 at 01:41