I am fairly new to the world of distributed Scrapy crawls, but I found out about scrapy-redis and have been using it. I am running it on a Raspberry Pi to scrape a large number of URLs that I push to Redis. What I have been doing is opening multiple SSH sessions into the Pi, and in each one running
scrapy crawl myspider
to have the spider start up and "wait" for URLs. I then open another SSH session and run redis-cli lpush myspider:start_urls <url> to push my links (scrapy-redis watches the <spider name>:start_urls list by default). The crawlers then run, although I'm not sure whether they are actually running concurrently.
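For context, here is roughly what my setup looks like. This is a minimal sketch, not my exact code; the parse logic is a placeholder and only the scrapy-redis wiring matters:

    # myspider.py - a minimal scrapy-redis spider
    from scrapy_redis.spiders import RedisSpider

    class MySpider(RedisSpider):
        name = 'myspider'
        # scrapy-redis pops start URLs from this Redis list;
        # left unset, it defaults to '<name>:start_urls'
        redis_key = 'myspider:start_urls'

        def parse(self, response):
            # placeholder extraction; my real spider does more here
            yield {
                'url': response.url,
                'title': response.css('title::text').get(),
            }

    # settings.py - the pieces scrapy-redis needs
    SCHEDULER = "scrapy_redis.scheduler.Scheduler"
    DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
    REDIS_URL = "redis://localhost:6379"

With that in place, every scrapy crawl myspider process blocks on the same Redis list, so each URL I lpush gets picked up by whichever worker pops it first.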
I hope this is clear; if not, please let me know and I can clarify. I'm really just looking for a "next step" after implementing this barebones version of scrapy-redis.
Edit: I based my starting point on this answer: Extract text from 200k domains with scrapy. The answerer said he spun up 64 spiders using scrapy-redis.