I want to access the variable self.cursor to make use of the active PostgreSQL connection, but I am unable to figure out how to access Scrapy's instance of the pipeline class.
import os

import psycopg2


class ScrapenewsPipeline(object):

    def open_spider(self, spider):
        # Open one connection per spider run, using credentials from the environment.
        self.connection = psycopg2.connect(
            host=os.environ['HOST_NAME'],
            user=os.environ['USERNAME'],
            database=os.environ['DATABASE_NAME'],
            password=os.environ['PASSWORD'])
        self.cursor = self.connection.cursor()
        self.connection.set_session(autocommit=True)

    def close_spider(self, spider):
        self.cursor.close()
        self.connection.close()

    def process_item(self, item, spider):
        print("Some Magic Happens Here")
        return item  # pass the item on to any later pipeline stages

    def checkUrlExist(self, item):
        print("I want to call this function from my spider to access the self.cursor variable")
Please note, I realise I can get access to process_item by using yield item, but that function is doing other stuff, and I want access to the connection via self.cursor in checkUrlExist, and to be able to call the instance of the class from my spiders at will!
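To make the goal concrete, here is a minimal sketch of the kind of check checkUrlExist is meant to perform and how it would be called. It is only an illustration under stated assumptions: the UrlChecker class, the check_url_exist method, the articles table, and its url column are all hypothetical names, and sqlite3 stands in for PostgreSQL so the snippet runs without a database server. It does not show how to reach the pipeline instance from a spider, which is the open question.

```python
import sqlite3


class UrlChecker:
    """Hypothetical stand-in for the pipeline: owns one connection and one cursor."""

    def __init__(self, connection):
        self.connection = connection
        self.cursor = connection.cursor()

    def check_url_exist(self, url):
        # Parameterized lookup; True when the URL is already stored.
        # (sqlite3 uses "?" placeholders; psycopg2 would use "%s".)
        self.cursor.execute(
            "SELECT 1 FROM articles WHERE url = ? LIMIT 1", (url,))
        return self.cursor.fetchone() is not None


# In-memory database so the sketch runs on its own (assumed schema).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE articles (url TEXT PRIMARY KEY)")
connection.execute(
    "INSERT INTO articles (url) VALUES (?)", ("https://example.com/a",))

checker = UrlChecker(connection)
seen = checker.check_url_exist("https://example.com/a")
unseen = checker.check_url_exist("https://example.com/b")
```

What I am after is being able to make a call like checker.check_url_exist(...) from inside a spider callback, but against the cursor that the pipeline already opened.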
Thank you.