
This is my spider code:

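    # Import paths as used by Scrapy releases of this era (BaseSpider / HtmlXPathSelector).
    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
    # DmozItem is the project's own Item class; import it from wherever it is defined.

    class DmozSpider(BaseSpider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "file:///home/ubuntu/xxx/test.html",
        ]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            sites = hxs.select("//li")
            items = []
            for site in sites:
                item = DmozItem()
                # extract() returns a list of matched strings for each field
                item['title'] = site.select('a/text()').extract()
                item['link'] = site.select('a/@href').extract()
                item['desc'] = site.select('text()').extract()
                items.append(item)
            return items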

Now I want to write the data to a log file in a form like name: {{name}}, link={{link}} for testing, as it crawls the site live.

How can I do that?

Mirage
What have you tried? Writing formatted output to a text file is `trivial`, and so is using simple log APIs, as indicated in the answers to a recent question of yours (http://stackoverflow.com/questions/13304325/how-can-i-log-into-website-and-do-stuff-in-python). While this site occasionally entertains basic questions, particularly when searching the web for the obvious keywords doesn't yield good insight, I'm afraid this question doesn't meet that minimal expectation... Voting to close. – mjv Nov 12 '12 at 04:32

1 Answer


Here's the answer, but I assume you just copied the code you already have; otherwise you'd know how to use file I/O, or at least be able to research a topic that has been covered a million times on this site alone.

    ...
    item['title'] = site.select('a/text()').extract()
    item['link'] = site.select('a/@href').extract()
    item['desc'] = site.select('text()').extract()
    items.append(item)
    # Append one line per item as it is scraped. extract() returns lists,
    # so title and link are written in their list form.
    with open('log.txt', 'a') as f:
        f.write('name: {0}, link: {1}\n'.format(item['title'], item['link']))
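
The snippet above reopens the file for every item and writes the raw lists returned by extract(). As an alternative, here is a minimal sketch using Python's standard logging module, assuming Python 3. The helper names (build_item_logger, format_item, log_item) and the logger name "dmoz.items" are hypothetical and not part of Scrapy; a plain dict stands in for DmozItem so the sketch runs without Scrapy installed.

```python
import logging


def build_item_logger(path):
    """Return a logger that appends one plain line per item to `path`."""
    logger = logging.getLogger("dmoz.items")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep item lines out of the root logger's output
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.FileHandler(path, mode="a", encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger


def format_item(item):
    """Join the lists produced by extract() into plain strings."""
    title = " ".join(item["title"]).strip()
    link = " ".join(item["link"]).strip()
    return "name: {0}, link: {1}".format(title, link)


def log_item(logger, item):
    """Write one formatted line for `item` to the log file."""
    logger.info(format_item(item))
```

With this in place, the spider could create the logger once (for example at module level with build_item_logger('log.txt')) and call log_item(logger, item) right after items.append(item). Each line is written as soon as the item is scraped, so the file can be followed while the crawl is running.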
Aesthete