
I am trying to use the link parsing structure described by "warwaruk" in this SO thread: Following links, Scrapy web crawler framework

This works great when grabbing only a single item from each page. However, when I try to use a for loop to scrape all items within each page, it appears that the parse_item function terminates upon reaching the first yield statement. I have a custom pipeline set up to handle each item, but currently it only receives one item per page.

Let me know if I need to include more code, or clarification. THANKS!

def parse_item(self, response):
    hxs = HtmlXPathSelector(response)
    prices = hxs.select("//div[contains(@class, 'item')]/script/text()").extract()
    for prices in prices:
        item = WalmartSampleItem()
        ...
        yield items
Tyler

1 Answer


You should yield the single item you created in the for loop: yield item, not yield items:

for prices in prices:
    item = WalmartSampleItem()
    ...
    yield item
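To see why this works for multiple items per page, note that parse_item is a generator: it does not terminate at the first yield, it pauses there and resumes on the next iteration, so every pass through the loop hands one item to the pipeline. A minimal sketch (no Scrapy required; the list of price strings and the dict stand in for the extracted data and WalmartSampleItem):

```python
# A generator function keeps running after each yield, so a single
# call can produce one item per element of the loop.
def parse_item(prices):
    for price in prices:
        item = {"price": price}  # stand-in for WalmartSampleItem()
        yield item

# Scrapy consumes the generator the same way list() does here:
items = list(parse_item(["$10", "$20", "$30"]))
print(len(items))  # 3
```

If only one item reaches the pipeline, the problem is usually that the XPath matched only one node, or that an early return (rather than yield) ended the generator.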
alecxe
  • It still seems to have the same issue; I just accidentally added the s when I pasted the code in. – Tyler Apr 19 '14 at 00:07