In some Stack Overflow questions I've seen accepted answers where the __init__ method of the scrapy.Spider superclass is overridden by the user-defined spider, for example: selenium with scrapy for dynamic page.
My question is: what are the risks of doing so? The __init__ of the superclass looks like this:
class Spider(object_ref):
    """Base class for scrapy spiders. All spiders must inherit from this
    class.
    """

    name = None
    custom_settings = None

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        elif not getattr(self, 'name', None):
            raise ValueError("%s must have a name" % type(self).__name__)
        self.__dict__.update(kwargs)
        if not hasattr(self, 'start_urls'):
            self.start_urls = []
So, if I were to define an __init__ in my spider that inherits from this class and didn't include a call to the superclass's __init__, would I be breaking Scrapy functionality? And how can I mitigate that risk? By calling the super's __init__ in my spider, as in the sketch below? I'm looking for best practices for Scrapy and also a better understanding of __init__ calls in the context of class inheritance.
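To make the question concrete, here is a minimal sketch of the kind of override I have in mind; the spider name, start URL, and the category argument are made up for illustration:

import scrapy

class MySpider(scrapy.Spider):
    # hypothetical name and start URL, purely for illustration
    name = 'my_spider'
    start_urls = ['http://example.com']

    def __init__(self, category=None, *args, **kwargs):
        # forward the name handling and any -a command-line kwargs
        # to scrapy.Spider.__init__ so its setup still runs
        super(MySpider, self).__init__(*args, **kwargs)
        # then do my own initialisation on top of it
        self.category = category

My understanding is that if the super call were omitted, the name validation, the kwargs update, and the default start_urls assignment shown in the superclass source above would simply never run. Is forwarding *args and **kwargs like this the right way to avoid that?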