I have the following situation. There will be a lot of database queries (mostly writing comments, reading profiles, etc.), and I expect more reads than writes. I want to be able to scale the database across several servers, and I like NoSQL :) As I understand from reading blogs and answers on StackOverflow (for example this one), the best choice in this situation is Cassandra.

So, the first question is: is Cassandra the more suitable choice for my purposes? Why?

The second question is about async client libraries for Tornado: do you know of any implementations for Cassandra? As you can see on the wiki page linked above, there are async clients only for MongoDB and CouchDB, and this also holds me back.

Maybe I can use MongoDB for now (because an async library exists, and at first it may be faster than Cassandra on several servers without async), and after some time convert the data from MongoDB to Cassandra. What do you think about it?

Dmitry Belaventsev
  • AFAIK there are no tailor-made (async) Cassandra libs that run inside Tornado's IOLoop. (PS: read Ben's post about threads: https://github.com/facebook/tornado/wiki/Threading-and-concurrency) – Schildmeijer Nov 21 '11 at 15:37
  • Thanks for the link! What do you think: will Cassandra without async be faster than MongoDB with an async module? Or maybe the difference will be very small, and I will have time to write my own async implementation. Maybe I can simply run a dedicated thread for DB interaction, which will communicate with Tornado's thread. – Dmitry Belaventsev Nov 21 '11 at 16:11
  • Tornado supports Twisted, which means you can use the Twisted-based telephus library for async Cassandra support. – koblas Nov 21 '11 at 19:28
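
The dedicated-thread idea from the comments can be sketched roughly as follows. This is only a minimal illustration, assuming a recent Tornado that runs on the asyncio event loop; `blocking_fetch_profile` is a hypothetical stand-in for a synchronous driver call, not a real Cassandra or MongoDB API.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch_profile(user_id):
    """Hypothetical stand-in for a blocking (non-async) database call."""
    time.sleep(0.05)  # simulate network/database latency
    return {"id": user_id, "name": "user-%d" % user_id}


# A small pool of worker threads reserved for database work.
db_executor = ThreadPoolExecutor(max_workers=4)


async def fetch_profile(user_id):
    # Hand the blocking call to a worker thread so the event loop
    # stays free to serve other requests while the query runs.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(db_executor, blocking_fetch_profile, user_id)


async def main():
    # Issue three lookups concurrently; gather() keeps the input order.
    return await asyncio.gather(*(fetch_profile(i) for i in range(3)))


profiles = asyncio.run(main())
db_executor.shutdown()
```

Inside a Tornado handler written as a coroutine, `await fetch_profile(...)` would be used the same way. The pool size bounds how many queries can be in flight at once, so it would need tuning against the real driver.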

1 Answer

Half an answer, since it's not about suitability. Tornado 2.1 supports Twisted as an async pattern, which means that you can use the telephus Cassandra library (Twisted + Cassandra) to get async Cassandra access.

    import tornado.platform.twisted

    # Install Tornado's IOLoop as the Twisted reactor. This must happen
    # before anything imports twisted.internet.reactor.
    tornado.platform.twisted.install()

    from twisted.internet import reactor
    from telephus.pool import CassandraClusterPool

    # HOST and 'XXXX' are placeholders for a seed node address and a keyspace name.
    pool = CassandraClusterPool([HOST], keyspace='XXXX', reactor=reactor)

    pool.startService()

    reactor.run()        # this calls tornado.ioloop.IOLoop.instance().start()

That said, I'm using MongoDB and mongoengine (non-async) for some personal projects at the moment and Cassandra + telephus for work projects. The tradeoff I'm making is flexible data models versus fixed data models and performance.

koblas