I'd like to verify that Tor is working before I start crawling with Python Scrapy. I am using Polipo, Tor, and Scrapy on Linux.
With this setup, Scrapy correctly uses Tor for its crawls. The way I currently check that Scrapy is using Tor is to crawl this page in my spider:
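import logging
import scrapy

class mySpider(scrapy.Spider):
    name = 'myspider'  # placeholder name; Scrapy requires every spider to have one
    def start_requests(self):
        yield scrapy.Request('https://check.torproject.org/', self.parse)
    def parse(self, response):
        # .get() returns the heading text rather than the SelectorList repr
        logging.info("Check tor page: %s", response.css('.content h1::text').get(default='').strip())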
However, I think there might be a better, cleaner way of doing it. I know I can check the Tor service status or my IP address, but I want to verify that the Tor connection is actually established.
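
To make the goal concrete, here is a minimal sketch of the kind of pre-flight check I have in mind, run before the crawl starts rather than inside a spider. It rests on assumptions that are not part of my setup description: that Polipo listens on its default address `http://127.0.0.1:8123`, and that `https://check.torproject.org/api/ip` returns JSON with an `IsTor` field. The function names `parse_tor_check` and `tor_is_working` are hypothetical.

```python
import json
import urllib.request

# Assumption: Polipo listens on its default port and forwards traffic to Tor.
PROXY_URL = "http://127.0.0.1:8123"
# Assumption: this endpoint returns JSON such as {"IsTor": true, "IP": "..."}.
CHECK_URL = "https://check.torproject.org/api/ip"


def parse_tor_check(body):
    """Return True only if the JSON body reports IsTor as true."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (or not a str/bytes value): treat as "not Tor".
        return False
    return isinstance(data, dict) and data.get("IsTor") is True


def tor_is_working(proxy_url=PROXY_URL, check_url=CHECK_URL, timeout=30):
    """Fetch the check URL through the proxy and report whether the exit is Tor."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    )
    try:
        with opener.open(check_url, timeout=timeout) as resp:
            return parse_tor_check(resp.read())
    except OSError:
        # Covers URLError, timeouts, and refused connections.
        return False
```

The idea would be to call `tor_is_working()` once before launching the crawl and abort if it returns `False`. This only tests the proxy chain itself, so it does not prove that Scrapy is configured to use the same proxy.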