8

I want to write a server that handles WebSocket clients while running MySQL selects via SQLAlchemy and scraping several websites at the same time (Scrapy). The received data has to be processed, saved to the db, and then sent to the WebSocket clients.

My question is how this can be done in Python from a logical point of view. How do I need to set up the code structure, and which modules are the best fit for this job? At the moment I'm leaning towards Twisted, with the scraping and select work running in threads. But can this be done in an easier way? I can only find simple Twisted examples, and this is obviously a more complex job. Are there similar examples? How do I start?

trbck

2 Answers

5

Cyclone, a Twisted-based 'network toolkit', based on/similar to facebook/friendfeed's Tornado server, contains support for WebSockets: https://github.com/fiorix/cyclone/blob/master/cyclone/web.py#L908

Here's example code:

Here's an example of using txwebsocket:

You may have a problem using SQLAlchemy with Twisted; from what I have read, they do not work well together (source). Are you married to SQLA, or would another, more compatible OR/M suffice?

Some twisted-friendly OR/Ms include Storm (a fork) and Twistar, and you can always fall back on Twisted's core db abstraction library twisted.enterprise.adbapi. There are also async-friendly db libraries for other products, such as txMySQL, txMongo, and txRedis, and paisley (couchdb).
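
If you go the thread-pool route, here is a minimal sketch of the pattern that adbapi relies on: run the blocking query in a worker thread and hand the result back to the event loop. To keep it runnable without extra installs, it assumes stdlib asyncio as a stand-in for the Twisted reactor and sqlite3 as a stand-in for MySQL. The names blocking_query, run_query, and the prices table are made up for illustration, not taken from any library.

```python
import asyncio
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def blocking_query(db_path, sql, params=()):
    """Run one blocking query on its own connection (called in a worker thread)."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(sql, params).fetchall()
        conn.commit()
        return rows
    finally:
        conn.close()


async def run_query(pool, db_path, sql, params=()):
    """Hand the blocking call to the pool; the event loop stays free meanwhile."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, blocking_query, db_path, sql, params)


async def demo(db_path):
    # Hypothetical schema: one row per scraped value.
    with ThreadPoolExecutor(max_workers=4) as pool:
        await run_query(pool, db_path, "CREATE TABLE prices (site TEXT, value REAL)")
        await run_query(
            pool, db_path, "INSERT INTO prices VALUES (?, ?)", ("example.com", 9.5)
        )
        return await run_query(pool, db_path, "SELECT site, value FROM prices")


def main():
    # A temp file (not ":memory:") so every worker connection sees the same data.
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        return asyncio.run(demo(db_path))
    finally:
        os.remove(db_path)


if __name__ == "__main__":
    print(main())
```

Each call opens its own connection, so no connection object is shared between threads; a real setup would probably reuse pooled connections instead.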

You could conceivably use both Cyclone (or txwebsockets) and Scrapy as child services of the same MultiService, running on different ports, but packaged within the same Application instance. The services may communicate, either through the parent service or some RPC mechanism (like JSONRPC, Perspective Broker, AMP, XML-RPC (2) etc), or you can just write to the db from the scrapy service and read from it using websockets. Redis would be great for this IMO.
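
To make the "scraper writes, websocket side reads" idea concrete, here is a rough in-process sketch of the publish/subscribe fan-out. It is only an illustration under assumptions: stdlib asyncio stands in for the Twisted reactor, and the hypothetical Hub class stands in for a Redis channel or a parent service. The scraper and client coroutines are placeholders for the Scrapy service and the WebSocket connections.

```python
import asyncio


class Hub:
    """In-process publish/subscribe: every subscriber gets its own queue."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self):
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    def publish(self, message):
        # Fan out: each subscriber receives its own copy of the message.
        for queue in self._subscribers:
            queue.put_nowait(message)


async def scraper(hub, items):
    """Stand-in for the scraping service: publishes each processed item."""
    for item in items:
        await asyncio.sleep(0)  # yield to the loop, as real network I/O would
        hub.publish(item)
    hub.publish(None)  # sentinel: no more data


async def client(queue, name, received):
    """Stand-in for one WebSocket connection: records messages as they arrive."""
    while True:
        message = await queue.get()
        if message is None:
            break
        received.append((name, message))


async def demo():
    hub = Hub()
    received = []
    # Subscribe before the scraper starts so no message is missed.
    clients = [client(hub.subscribe(), name, received) for name in ("a", "b")]
    await asyncio.gather(scraper(hub, [1, 2]), *clients)
    return received


if __name__ == "__main__":
    print(sorted(asyncio.run(demo())))
```

The point of the per-subscriber queue is that a slow client only delays itself; swapping Hub for a Redis channel would keep the same shape while letting the two halves run in separate processes.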

mikewaters
  • little bit late for an answer but I did a lot of research and am now thinking/trying to do this with a threaded tornado instance while doing the other stuff in several threads and let them communicate through redis channel subscribe/publish. this seems to be the most elegant solution. – trbck Jun 22 '11 at 21:13
  • maybe I'll put something up on github soon. but I'm not a professional coder so there is nothing to learn from me ;) – trbck Jun 22 '11 at 21:15
  • sorry for being late, just trying to help! ;) hope it works out for you. – mikewaters Jun 23 '11 at 20:46
  • These links have gone stale. Try https://github.com/fiorix/cyclone/tree/master/demos/websocket – Andrew Jan 30 '14 at 22:39
4

Ideally you'll want to avoid writing your own WebSockets server, but since you're running Twisted, you might not be able to do that: there are several WebSockets implementations (see this search on PyPI). Unfortunately, none of them are Twisted-based. [Edit: see @JP-Calderone's comment below.]

Twisted should drive the master server, so you probably want to begin by writing something that can be run via twistd (see here if you're new to this). The WebSocket implementation mentioned by @JP-Calderone and Scrapy are both Twisted-based, so they should be reasonably trivial to drive from your master Twisted-based server. SQLAlchemy will be more difficult; I've commented on this before in this question.

Jacob Oscarson
  • Thank you, Jean-Paul. Still didn't get it to work properly though: ImportError: cannot import name _IdentityTransferDecoder – trbck Jun 06 '11 at 12:34
  • Ok, I fixed it with upgrading twisted. If anyone has more advice for me I would be really grateful. Especially how to plan the structure of the code and an easy solution how to let the threads communicate between each other. – trbck Jun 06 '11 at 12:55
  • Start with being very inspired by https://github.com/rlotun/txWebSocket/blob/master/simple_server.py, but beware: the system you describe is quite complicated and there isn't an easy solution, we're talking major architectural work here, much more than what tends to come out of a typical SO question. – Jacob Oscarson Jun 06 '11 at 13:44
  • The WebSocket server that ships with [cyclone](https://github.com/fiorix/cyclone) works fine also. It even has [examples](https://github.com/fiorix/cyclone/tree/master/demos/websocket) that can help speed up the development. – fiorix May 27 '14 at 03:31