Where is item returned to when I yield an item while scraping data in Python using scrapy?

Question

When I yield an item in the parse function, where is it returned to, and where can I access it from? See the sample code below.

from scrapy import Spider, Selector
from scrapy.item import Item, Field


class StackItem(Item):

    title = Field()
    url = Field()

class StackSpider(Spider):
    name = "stack"
    allowed_domains = ["stackoverflow.com"]
    start_urls = [
        "http://stackoverflow.com/questions?pagesize=50&sort=newest"
    ]

    def parse(self, response):
        questions = Selector(response).xpath('//*[@class="summary"]/h3')
        for question in questions:
            item = StackItem()
            item['title'] = question.xpath(
                'a[@class="question-hyperlink"]/text()').extract()
            item['url'] = question.xpath(
                'a[@class="question-hyperlink"]/@href').extract()
            yield item

I am confused about where this item is returned to, and how I can access it later on. Any help would be appreciated. Thanks.

Possible duplicate of [What does the yield keyword do in Python?](http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do-in-python) — juanpa.arrivillaga, Jun 30 '16 at 21:41

score 1 · Accepted Answer · answered Jun 30 '16 at 22:21

The items yielded in a Scrapy callback method are consumed by the Scrapy engine, which forwards each item to the Item Pipelines.
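To make that flow concrete, here is a toy sketch in plain Python. It is not Scrapy's actual code: `run`, `uppercase_title`, and the list used as a "response" are all made-up stand-ins. It only illustrates the idea that the caller iterates over the generator returned by the callback and hands each yielded item to the pipeline steps in order.

```python
def parse(response):
    # Stand-in for a spider callback: a generator yielding one item per title.
    for title in response:
        yield {"title": title}


def uppercase_title(item):
    # Stand-in for a pipeline step: receives an item and returns it.
    item["title"] = item["title"].upper()
    return item


def run(callback, response, pipelines):
    # Hypothetical toy "engine": consumes the generator and passes
    # every yielded item through each pipeline step in order.
    results = []
    for item in callback(response):
        for process in pipelines:
            item = process(item)
        results.append(item)
    return results


processed = run(parse, ["first question", "second question"], [uppercase_title])
print(processed)
# [{'title': 'FIRST QUESTION'}, {'title': 'SECOND QUESTION'}]
```

The point is that the item is never "returned" to your own code. Whatever called `parse` receives it, and in Scrapy that caller is the framework.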

So, if you want to do further actions on your items (such as data validation or database persistence), you have to create an Item Pipeline and configure it in your Scrapy project. Check out the Item Pipeline example and the architecture overview in the Scrapy documentation.
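As a minimal sketch of such a pipeline: the class below writes every item it receives to a JSON Lines file. The class name `JsonLinesPipeline`, the default file name `items.jl`, and the module path `myproject.pipelines` are assumptions chosen for illustration. The method names `open_spider`, `close_spider`, and `process_item` are the hooks Scrapy calls on a pipeline.

```python
import json


class JsonLinesPipeline:
    """Hypothetical pipeline: writes each yielded item to a JSON Lines file."""

    def __init__(self, path="items.jl"):
        # The default path is an assumption; Scrapy builds the pipeline
        # without arguments, so the default is what gets used.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # Called for every item yielded by a spider callback.
        self.file.write(json.dumps(dict(item)) + "\n")
        # Return the item so that later pipelines also receive it.
        return item


# In settings.py (the module path is an assumed project layout):
# ITEM_PIPELINES = {"myproject.pipelines.JsonLinesPipeline": 300}
```

With this enabled, each `StackItem` yielded in `parse` arrives in `process_item`, which is where you can access it. If all you need is the scraped data in a file, running `scrapy crawl stack -o items.json` exports the items without a custom pipeline.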

Valdir Stumm Junior

Thank you! It was very helpful. – Waqar Joyia Jun 30 '16 at 22:22
