I'm running Scrapy as an AWS Lambda function. Inside my function I need a timer to check whether it has been running for longer than 1 minute and, if so, run some logic. Here is my code:
def handler():
    x = 60
    watchdog = Watchdog(x)
    try:
        runner = CrawlerRunner()
        runner.crawl(MySpider1)
        runner.crawl(MySpider2)
        d = runner.join()
        d.addBoth(lambda _: reactor.stop())
        reactor.run()
    except Watchdog:
        print('Timeout error: process takes longer than %s seconds.' % x)
        # some other logic here
    watchdog.stop()
I took the Watchdog timer class from this answer. The problem is that the code never reaches the except Watchdog block; instead, the exception is raised outside of it:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 1182, in run
    self.function(*self.args, **self.kwargs)
  File "./functions/python/my_scrapy/index.py", line 174, in defaultHandler
    raise self
functions.python.my_scrapy.index.Watchdog: 1
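The same behaviour seems to be reproducible without Scrapy. Below is a minimal sketch, assuming the Watchdog class is essentially a threading.Timer whose callback raises the exception (which is what the traceback suggests). The simplified class, the run_with_watchdog helper, and the time.sleep call standing in for reactor.run() are all hypothetical, made up for illustration only:

```python
import threading
import time


class Watchdog(Exception):
    """Simplified, hypothetical stand-in for the Watchdog class I am using."""

    def __init__(self, timeout):
        super().__init__(timeout)
        self.timeout = timeout
        # The callback runs in the Timer's own thread, not in the caller's thread.
        self.timer = threading.Timer(timeout, self.default_handler)
        self.timer.start()

    def stop(self):
        # Cancel the timer if it has not fired yet.
        self.timer.cancel()

    def default_handler(self):
        # Raised inside the timer thread.
        raise self


def run_with_watchdog(work_seconds, timeout):
    """Return True if the `except Watchdog` block was reached, else False."""
    caught = False
    watchdog = Watchdog(timeout)
    try:
        time.sleep(work_seconds)  # stands in for the blocking reactor.run()
    except Watchdog:
        caught = True
    watchdog.stop()
    return caught
```

Calling run_with_watchdog(0.3, 0.1) prints an "Exception in thread ..." traceback to stderr and returns False, so the except block is skipped here as well.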
I need to catch the exception inside the function. How would I go about that? PS: I'm very new to Python.