
I'm running a Flask application that pulls tweets from Twitter. Running the app with the embedded Flask server causes no trouble, but when I run it under Gunicorn I get duplicated tweets, apparently because 2 threads are receiving the callback from Twitter.

For instance, if I start my app using

python app.py

then I get the expected output when receiving tweets. Note that I've added the thread ID (first field) to the logger output:

140721974449920 2015-03-12 17:59:13,030 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]
140721974449920 2015-03-12 17:59:14,646 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]
140721974449920 2015-03-12 17:59:49,031 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]

As you can see, the timestamps look valid too, and the documents in the Mongo collection where I store the tweets are OK. Then, if I start the app using Gunicorn:

gunicorn app:app -b localhost:8000 --debug

and then check the logs, I can see that 2 different threads are getting the same data:

139883969844992 2015-03-12 17:52:05,104 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]
139883961452288 2015-03-12 17:52:05,106 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]
139883969844992 2015-03-12 17:53:36,480 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]
139883961452288 2015-03-12 17:53:36,481 INFO: Got message from streaming Twitter API! [in /home/mosquito/git/opencoast_streamer/app.py:83]

Something weird is clearly going on, so I checked the running Gunicorn processes:

ps aux | grep gunicorn

mosquito 25035  3.1  0.3  54612 12516 pts/1    S    15:31   0:01 /home/mosquito/www/env/bin/python /home/mosquito/www/env/bin/gunicorn app:app -b localhost:8000
mosquito 25606  0.0  0.4  66904 17016 pts/1    R    15:32   0:00 /home/mosquito/www/env/bin/python /home/mosquito/www/env/bin/gunicorn app:app -b localhost:8000
mosquito 25610  0.0  0.0  13220   956 pts/3    S+   15:32   0:00 grep --color=auto gunicorn

So I'm starting to think this has to do with Gunicorn. Any ideas why Gunicorn is spawning 2 processes for my Flask app?

Thanks!

AlejandroVK

3 Answers


I believe this is not Gunicorn's fault but rather the intended behavior of Werkzeug. Werkzeug has a "reloader" process that monitors your .py files and restarts the app when it detects a change.

For more information on the reloader go here.

To get past this, I believe passing use_reloader=False in your call to app.run, that is app.run(use_reloader=False), should do the trick.
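As a minimal sketch of that suggestion (the route and the main() helper are hypothetical placeholders, not taken from the asker's app.py, and it assumes the app is started through app.run with debug mode on):

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # Hypothetical placeholder route so the app has something to serve.
    return "ok"


def main():
    # debug=True normally enables Werkzeug's reloader, which runs the app
    # in a second process. use_reloader=False keeps the debugger but
    # switches the reloader off, so only one process runs the app.
    app.run(debug=True, use_reloader=False)


if __name__ == "__main__":
    # Runs only when the file is executed directly (python app.py).
    main()
```

With the reloader disabled you have to restart the server by hand after editing code, which is the trade-off for running a single process.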

You can also see this SO answer for more information.

Carlos
  • That seems to work, although to be honest I fixed it by moving the async callback method for Tweepy's streaming API into another module. It makes more sense this way. In any case, thanks for the detailed response Carlos +1 – AlejandroVK Mar 13 '15 at 22:53
  • Anytime, @AlejandroVK! – Carlos Mar 13 '15 at 23:10

Gunicorn always spawns at least 2 processes, even when you set --workers=1. One of them is the master process, which spawns and manages the worker processes that actually handle requests.
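To illustrate the master/worker split, here is a rough sketch of a pre-fork model using os.fork (Unix only). It is not Gunicorn's actual code, and run_master is a hypothetical name; it only shows why one worker still means two PIDs.

```python
import os


def run_master(num_workers):
    """Fork num_workers children and return (master_pid, worker_pids)."""
    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child process: a real worker would serve requests here.
            # This sketch exits immediately instead.
            os._exit(0)
        # Parent (master) process: remember the worker's PID.
        worker_pids.append(pid)
    # Wait for the workers so no zombie processes are left behind.
    for pid in worker_pids:
        os.waitpid(pid, 0)
    return os.getpid(), worker_pids


if __name__ == "__main__":
    master_pid, worker_pids = run_master(1)
    print("master:", master_pid, "workers:", worker_pids)
```

Even with a single worker there are two distinct PIDs, which matches the two gunicorn lines in the ps output from the question.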

vuamitom
Start Gunicorn with the worker count set explicitly:

gunicorn --workers=1 app:app -b localhost:8000 --debug

Source

Smart Manoj