I am getting the above-mentioned error when trying to crawl a website. There are many posts on SO with a similar issue, most notably this one: Scrapy: HTTP status code is not handled or not allowed?, where it is suggested to change the user agent to prevent this error. However, my issue is a bit different. I did change the user agent and I am still unable to run the `scrapy crawl spidername` command, but I am able to run `scrapy shell "website.com"` without an issue, and I am even able to get the response from the website inside the shell and parse the HTML. The error only happens when I try to run the `crawl` command.
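For context, a user agent override in a Scrapy project is usually a `USER_AGENT` entry in the project's `settings.py`. The following is only a minimal sketch of that kind of change: the user agent string and the extra headers are placeholders chosen for illustration, not the actual values from this project.

```python
# settings.py (sketch; every value below is an illustrative placeholder)

# Browser-like user agent sent with each request made by `scrapy crawl`.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

# Headers added to every request unless a request overrides them.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
}
```

Note that `scrapy shell` only picks up these project settings when it is started from inside the project directory, so the shell and the `crawl` command may not be sending identical requests.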
What could be the issue? Here is my error message:
I am even able to run the spider object from inside the shell without any errors.