I'm working with Scrapy. I have a pipeline that starts with:
class DynamicSQLlitePipeline(object):

    @classmethod
    def from_crawler(cls, crawler):
        # Here, you get whatever value was passed through the "table" parameter
        table = getattr(crawler.spider, "table")
        return cls(table)

    def __init__(self, table):
        try:
            db_path = "sqlite:///" + settings.SETTINGS_PATH + "\\data.db"
            db = dataset.connect(db_path)
            table_name = table[0:3]  # FIRST 3 LETTERS
            self.my_table = db[table_name]
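For context, the `table` value is set on the spider instance itself, roughly like this (names simplified; I pass the value as a spider argument, e.g. `scrapy crawl my_spider -a table=products`):

    import scrapy


    class MySpider(scrapy.Spider):
        # Simplified sketch: the real spider does more, but the point is that
        # `table` ends up as an instance attribute, which the pipeline then
        # reads via getattr(crawler.spider, "table").
        name = "my_spider"

        def __init__(self, table=None, *args, **kwargs):
            super(MySpider, self).__init__(*args, **kwargs)
            self.table = table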
I've been reading through https://doc.scrapy.org/en/latest/topics/api.html#crawler-api, which contains:
The main entry point to Scrapy API is the Crawler object, passed to extensions through the from_crawler class method. This object provides access to all Scrapy core components, and it’s the only way for extensions to access them and hook their functionality into Scrapy.
but I still do not understand the from_crawler method or the crawler object. What is the relationship between the crawler object and the spider and pipeline objects? How and when is a crawler instantiated? Is a spider a subclass of crawler? I've also asked "Passing scrapy instance (not class) attribute to pipeline", but I still don't understand how the pieces fit together.
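My current (possibly wrong) reading of the docs is that Scrapy itself calls from_crawler on each component and hands it the crawler, so a component would hook in roughly like this (MyExtension and SOME_SETTING are just made-up names for illustration):

    from scrapy import signals


    class MyExtension(object):
        def __init__(self, some_setting):
            self.some_setting = some_setting

        @classmethod
        def from_crawler(cls, crawler):
            # Scrapy calls this, not my code; the crawler exposes the core
            # components: crawler.settings, crawler.signals, crawler.spider, ...
            ext = cls(crawler.settings.get("SOME_SETTING"))
            crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
            return ext

        def spider_opened(self, spider):
            spider.logger.info("spider opened: %s", spider.name)

Is that roughly what happens, and if so, where does the crawler come from in the first place?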