
The `raise DropItem` below creates too much noise: it outputs the complete item object.

Question: How can we make it output just the message string? Or is there another way to drop items in pipelines?

The log now shows the whole item with all its values, which clutters the output. Ideally we would drop an item silently. We used delete() before, but that caused errors in later pipelines. Help appreciated.

    # Duplicate checker based on https://scrapy2.readthedocs.io/en/latest/topics/item-pipeline.html
    if item['sku'] in self.skus_seen:
        if "url" not in item or not item['url']:
            item['url'] = '???, plz store item url in spider'
        raise DropItem(f"Duplicate products {item['sku']} at {item['url']}")
snh_nl

1 Answer


A popular question and answer ;)

It is given in the question referenced below.

Implement a custom log formatter and point the `LOG_FORMATTER` setting at it:

    import logging

    from scrapy import logformatter


    class PoliteLogFormatter(logformatter.LogFormatter):
        def dropped(self, item, exception, response, spider):
            # Log dropped items at INFO instead of the default WARNING level.
            return {
                'level': logging.INFO,
                'msg': logformatter.DROPPEDMSG,
                'args': {
                    'exception': exception,
                    'item': item,
                }
            }
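Note that `logformatter.DROPPEDMSG` still includes the item in its template, so the formatter above only lowers the log level. To log just the `DropItem` message, a variant could supply its own template. This is a minimal sketch, not part of the linked answer: the class name `QuietDropFormatter` and the template text are assumptions, and the `ImportError` fallback is there only so the snippet can be tried without Scrapy installed.

```python
import logging

try:
    from scrapy.logformatter import LogFormatter
except ImportError:
    # Fallback only for trying the sketch outside a Scrapy environment.
    LogFormatter = object


class QuietDropFormatter(LogFormatter):
    """Log only the DropItem message, never the dropped item itself."""

    def dropped(self, item, exception, response, spider):
        return {
            "level": logging.INFO,
            # Assumed custom template: no %(item)s placeholder, so the
            # item's fields are not rendered into the log line.
            "msg": "Dropped: %(exception)s",
            "args": {"exception": exception},
        }
```

To activate it, set `LOG_FORMATTER` in settings.py to the class's import path, for example `LOG_FORMATTER = "myproject.logformatter.QuietDropFormatter"`, where `myproject.logformatter` is a placeholder for wherever the class lives in your project.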

Scrapy - Silently drop an item

snh_nl