
I am new to Python and attempting to learn Scrapy. I am following the Scrapy v1.5 tutorial and running my code within Anaconda v4.3.1 with Python 3.6.4.

I am attempting to run the following code:

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)

Within Anaconda I run: %%cmd scrapy crawl quotes

and it produces the result:

Microsoft Windows [Version 10.0.16299.248]
(c) 2017 Microsoft Corporation. All rights reserved.

C:\Users\tom\Desktop\Python 36 Projects\tutorial>
C:\Users\tom\Desktop\Python 36 Projects\tutorial>
C:\Users\tom\Desktop\Python 36 Projects\tutorial>

The expected output is that it downloads the two pages listed in start_requests and saves them as quotes-1.html and quotes-2.html, but nothing shows up in the current working directory.
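To make the expected filenames concrete, here is a small sketch that applies the same expression used in parse() to the two start URLs, without involving Scrapy at all. The helper name filename_for is only for illustration and is not part of the tutorial.

```python
# The two start URLs from the spider above.
urls = [
    'http://quotes.toscrape.com/page/1/',
    'http://quotes.toscrape.com/page/2/',
]


def filename_for(url):
    # Same expression as in parse(): each URL ends with "/", so the last
    # element of the split is an empty string and [-2] is the page number.
    page = url.split("/")[-2]
    return 'quotes-%s.html' % page


# filename_for is a hypothetical helper, used here only to show the result.
filenames = [filename_for(u) for u in urls]
print(filenames)
```

If the crawl had run, these are the two files that should have appeared in the directory the command was started from.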

The tutorial I linked shows what is supposed to appear, but I do not know what I am doing wrong, as the HTML files are not being placed in my directory. If someone could point me in the right direction, I would really appreciate it.
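Since the %%cmd cell above shows only empty prompts, one way to see whether the command printed anything is to run it through Python's subprocess module and capture both output streams. This is a minimal diagnostic sketch, not a fix: the helper name run_and_capture is hypothetical, and it assumes the scrapy executable is on the PATH of the environment the notebook runs in. Scrapy writes its log to standard error by default, so that stream is captured as well.

```python
import subprocess
import sys


def run_and_capture(cmd, cwd=None):
    """Run a command and return (exit code, stdout, stderr) as text."""
    result = subprocess.run(
        cmd,
        cwd=cwd,                    # directory to run in, e.g. the project folder
        stdout=subprocess.PIPE,     # capture normal output
        stderr=subprocess.PIPE,     # capture errors and log output
        universal_newlines=True,    # decode bytes to str (works on Python 3.6)
    )
    return result.returncode, result.stdout, result.stderr


# Harmless stand-in command so the sketch runs anywhere.
code, out, err = run_and_capture([sys.executable, "-c", "print('ok')"])
print(code, out.strip())
```

To try it on the crawl, the stand-in command would be replaced with ["scrapy", "crawl", "quotes"] and cwd set to the project directory shown in the prompt above. A non-zero exit code or text in the error stream would indicate where the run stopped.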

Masoud Rahimi
Tom H
  • Have you checked the robots.txt? https://stackoverflow.com/a/37278895/1011724 – Dan Mar 01 '18 at 09:56
  • I had not, and I will try that. However, since I was following a tutorial on Scrapy and the website was designed for scraping, I do not see (1) why that would not be included in the tutorial and (2) why they would see a need to put that obstacle in my path. I tried it in a normal Python file outside of Anaconda (i.e., in IDLE and with a text editor) and it scraped the files just fine. Although I figured out the workaround, it would be helpful to know why Scrapy isn't working within Anaconda. I see a lot of benefits to using it. – Tom H Mar 07 '18 at 04:07
  • Seems like what @Dan says. – Ami Hollander Mar 09 '18 at 21:07
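
For reference, the robots.txt check raised in the comments is controlled in Scrapy by the ROBOTSTXT_OBEY setting in the project's settings.py. The fragment below is only a sketch of how that setting could be switched off to test the suggestion. Whether it is related to the problem described here is not established by this thread.

```python
# settings.py of the Scrapy project (fragment).
# Assumption: this is changed only to test the suggestion from the comments.
# When True, Scrapy fetches robots.txt first and skips disallowed URLs.
ROBOTSTXT_OBEY = False
```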

0 Answers