When running my flask app, which uses Python's subprocess to use scrapy within a flask app as specified here (How to integrate Flask & Scrapy?), from a Docker Container and calling the appropriate endpoints specified in my flask app, I receive the error message: ERR_EMPTY_RESPONSE. Executing the flask app outside of my docker container (python app.py, where app.py has my flask code), everything works as intended and my spiders are called using subprocess within the flask app.
Instead of using flask & subprocess to call my spiders within a web app, I tried using twisted & twisted-klein python libraries, with the same result when called from a docker Container. I have also created a new, clean scrapy project, meaning no specific code of my own, just the standard scrapy code and project structure upon creation. This resulted in the same error. I am not quite certain whether my approach is anti-pattern, since flask and scrapy are in bundled into run image, resulting in one container for two purposes.
Here is my server.py code. When executing outside a container (using python interpreter) everything works as intended. When running it from a container, then I receive the error message (ERR_EMPTY_RESPONSE).
# server.py
import subprocess
from flask import Flask
from medien_crawler.spiders.firstclassspider import FirstClassSpider
app = Flask(__name__)
@app.route("/")
def return_hello():
return "Hello!"
@app.route("/firstclass")
def return_firstclass_comments():
spider_name = "firstclass"
response = subprocess.call(['scrapy', 'crawl', spider_name, '-a', 'start_url=https://someurl.com'])
return "OK!"
if __name__ == "__main__":
app.run(debug=True)
FROM python:3
WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
CMD [ "python", "./server.py" ]
Finally I run docker run -p 5000:5000 . It does not work. Any ideas?