
I am running Docker + Python + a Scrapy spider (driven by Celery).

My spider only runs as many times as my Celery concurrency limit allows; can someone help me understand why?

My docker-compose.yml:

celery:
    build:
      context: .
      dockerfile: ./celery-queue/Dockerfile
    entrypoint: celery
    command: -A tasksSpider worker --loglevel=info  --concurrency=5 -n myuser@%n
    env_file:
    - .env
    depends_on:
    - redis

My spider code:

from pydispatch import dispatcher
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# groupSpider is defined elsewhere in the project, e.g.:
# from myproject.spiders.group import groupSpider


def spider_results_group():
    results = []

    # collect every scraped item via the item_passed signal
    def crawler_results(signal, sender, item, response, spider):
        results.append(item)

    dispatcher.connect(crawler_results, signal=signals.item_passed)

    process = CrawlerProcess(get_project_settings())
    process.crawl(groupSpider)
    process.start()  # the script will block here until the crawling is finished
    process.stop()
    return results

With this code I can run the spider multiple times, but only 5 times in total. When I checked, I think this is because my concurrency is only 5, and when it runs again (the 6th time), it gets stuck.
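For context, the Celery task in tasksSpider (referenced by `-A tasksSpider` in the compose command) presumably wraps the function above roughly like this. This is only a sketch: the broker URL, module names, and task name are assumptions, since that file was not posted.

```python
# tasksSpider.py -- hypothetical sketch; the real module was not posted,
# so the broker URL, import paths and task name are assumptions.
from celery import Celery

# wherever spider_results_group() from the snippet above actually lives
from crawl_helpers import spider_results_group  # module name is an assumption

app = Celery("tasksSpider", broker="redis://redis:6379/0")


@app.task
def run_group_spider():
    # Each call starts a CrawlerProcess (and therefore a Twisted reactor)
    # inside whichever Celery worker child process picked up the task.
    return spider_results_group()
```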

If you need any other code, please ask.


1 Answer


Solved by using this command:

    command: -A tasksSpider worker --loglevel=info  --concurrency=5 --max-tasks-per-child=1 -n myuser@%n

Got the answer from: Running Scrapy spiders in a Celery task

The `--max-tasks-per-child=1` option makes each worker child process exit after finishing a single task, so every crawl starts in a fresh process with a fresh Twisted reactor. Without it, each of the 5 prefork children can only run `CrawlerProcess.start()` once (the reactor cannot be restarted in the same process), which is why the 6th run hung.
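The same limit can also be set in the Celery app configuration instead of on the worker command line. A minimal sketch, assuming the app object lives in tasksSpider.py with the broker URL shown earlier (both assumptions):

```python
# Equivalent of --max-tasks-per-child=1, set in code rather than on the
# worker command line (sketch; app name and broker URL are assumptions).
from celery import Celery

app = Celery("tasksSpider", broker="redis://redis:6379/0")

# Recycle each worker child after a single task, so every crawl gets a
# fresh process and a fresh Twisted reactor.
app.conf.worker_max_tasks_per_child = 1
```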

  • note this "maybe" different issue because if iam trying outside docker, my code perfectly run, but when it run in docker+celery it got only maximum limit of celery concurency.. –  Mar 18 '19 at 08:25
  • i mean about duplicate thing.. XD –  Mar 18 '19 at 10:26