There's an important detail missing from your question: the library you're looking at is built on top of Twisted, an asynchronous networking framework. With its decorator included, the method actually starts like this:
@defer.inlineCallbacks
def open_spider(self, spider, start_requests=(), close_if_idle=True):
    assert self.has_capacity(), "No free spider slot when opening %r" % \
        spider.name
    log.msg("Spider opened", spider=spider)
    nextcall = CallLaterOnce(self._next_request, spider)
    scheduler = self.scheduler_cls.from_crawler(self.crawler)
    start_requests = yield self.scraper.spidermw.process_start_requests(start_requests, spider)
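The defer.inlineCallbacks decorator changes what every yield in the function means: yielding a Deferred suspends the function until that Deferred has a result, and the result then becomes the value of the yield expression. Essentially, it lets you write asynchronous code that would normally need callbacks in a way that looks synchronous. In the words of the Twisted documentation: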
inlineCallbacks helps you write Deferred-using code that looks like a regular sequential function. This function uses features of Python 2.5 generators. If you need to be compatible with Python 2.4 or before, use the deferredGenerator function instead, which accomplishes the same thing, but with somewhat more boilerplate. For example:
@inlineCallbacks
def thingummy():
    thing = yield makeSomeRequestResultingInDeferred()
    print(thing)  # the result! hoorj!
When you call anything that results in a Deferred, you can simply yield it; your generator will automatically be resumed when the Deferred's result is available. The generator will be sent the result of the Deferred with the send method on generators, or if the result was a failure, throw.
Your inlineCallbacks-enabled generator will return a Deferred object, which will result in the return value of the generator (or will fail with a failure object if your generator raises an unhandled exception). Note that you can't use return result to return a value; use returnValue(result) instead. Falling off the end of the generator, or simply using return will cause the Deferred to have a result of None.
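To make the send mechanism described above more concrete, here is a minimal pure-Python sketch of the idea. It is not Twisted's implementation: ToyDeferred and toy_inline_callbacks are hypothetical stand-ins invented for this illustration, error handling (the throw path) is left out, and the generator uses a plain Python 3 return rather than the returnValue call that the quoted Python 2 era documentation requires.

```python
class ToyDeferred:
    """Hypothetical stand-in for a Deferred: holds a result that arrives later."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self.result = None

    def add_callback(self, fn):
        # Run immediately if the result is already known, else remember fn.
        if self._fired:
            fn(self.result)
        else:
            self._callbacks.append(fn)

    def fire(self, result):
        self._fired = True
        self.result = result
        for fn in self._callbacks:
            fn(result)


def toy_inline_callbacks(gen_func):
    """Hypothetical decorator: resume the generator with each yielded result."""

    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        done = ToyDeferred()  # what the caller gets back right away

        def step(value):
            try:
                # Resume the generator; `value` becomes the yield expression.
                yielded = gen.send(value)
            except StopIteration as stop:
                # Generator finished: its return value is the final result.
                done.fire(stop.value)
                return
            # Suspend until the yielded ToyDeferred has a result.
            yielded.add_callback(step)

        step(None)  # start the generator
        return done

    return wrapper


pending = ToyDeferred()
seen = []


@toy_inline_callbacks
def thingummy():
    thing = yield pending  # suspends here until pending fires
    seen.append(thing)
    return thing.upper()


outcome = thingummy()   # runs up to the first yield, then returns
pending.fire("hello")   # the "network result" arrives; generator resumes
```

Calling thingummy() returns before the generator has finished, which is the same reason open_spider hands back a Deferred rather than a finished result.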
If you dig into the process_start_requests call, you'll find it ultimately calls scrapy.utils.defer.process_chain, which returns a Deferred:
def process_chain(callbacks, input, *a, **kw):
    """Return a Deferred built by chaining the given callbacks"""
    d = defer.Deferred()
    for x in callbacks:
        d.addCallback(x, *a, **kw)
    d.callback(input)
    return d
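As a rough illustration of what that chaining does, the sketch below runs the same process_chain body against a tiny stand-in class instead of Twisted's Deferred. MiniDeferred and the two middleware-like functions (drop_empty, tag_with_spider) are hypothetical names made up for this example; a real Deferred also handles errbacks and callbacks that themselves return Deferreds, which this sketch ignores.

```python
class MiniDeferred:
    """Hypothetical, heavily simplified stand-in for twisted's Deferred."""

    def __init__(self):
        self._callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, fn, *args, **kwargs):
        self._callbacks.append((fn, args, kwargs))
        if self.called:
            self._run()
        return self

    def callback(self, result):
        self.called = True
        self.result = result
        self._run()

    def _run(self):
        # Each callback receives the previous callback's return value.
        while self._callbacks:
            fn, args, kwargs = self._callbacks.pop(0)
            self.result = fn(self.result, *args, **kwargs)


def process_chain(callbacks, input, *a, **kw):
    """Same body as the function above, but using the stand-in class."""
    d = MiniDeferred()
    for x in callbacks:
        d.addCallback(x, *a, **kw)
    d.callback(input)
    return d


# Two made-up processing steps; `spider` is the extra positional argument.
def drop_empty(requests, spider):
    return [r for r in requests if r]


def tag_with_spider(requests, spider):
    return ["%s:%s" % (spider, r) for r in requests]


d = process_chain([drop_empty, tag_with_spider], ["a", "", "b"], "myspider")
```

The input flows through each callback in order, and the Deferred ends up holding the last callback's return value. That final value is what the yield in open_spider evaluates to.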