2

When I set start_urls inside a Scrapy spider class, the following code works:

class InfoSpider(scrapy.Spider):
    name = 'info'
    allowed_domains = ['isbn.szmesoft.com']
    isbns = list(set(pd.read_csv('E:/books.csv')['ISBN']))
    url = 'http://isbn.szmesoft.com/isbn/query?isbn='
    start_urls = [url + isbns[0]]

But I get the error NameError: name 'url' is not defined when I rewrite my code as follows:

class InfoSpider(scrapy.Spider):
    name = 'info'
    allowed_domains = ['isbn.szmesoft.com']
    isbns = list(set(pd.read_csv('E:/books.csv')['ISBN']))
    url = 'http://isbn.szmesoft.com/isbn/query?isbn='
    start_urls = [url + isbn for isbn in isbns[:3]]

Maybe I can solve this problem in other ways, but I want to know the reason for the error.

MJ_0826

3 Answers

2

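Python resolves names through only four scopes: LEGB (Local, Enclosing, Global, Built-in). The Enclosing scope exists only for nested functions. A class body is not a function, so its local scope does not become an enclosing scope for the list comprehension, which runs in its own local scope.

Therefore, they are two separate local scopes, and the comprehension cannot see names defined in the class body. The one exception is the first iterable (isbns[:3] here), which is evaluated in the class scope before the comprehension starts. That is why the error names url and not isbns.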
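The scoping rule above can be reproduced without Scrapy or pandas. The following is a minimal sketch, not the asker's actual spider: the class names Broken and Working, the helper define_broken_class, and the short ISBN strings are made up for illustration. It shows the comprehension failing inside a class body, and one possible workaround, a plain for loop, which runs directly in the class scope.

```python
def define_broken_class():
    """Return the NameError message raised while the class body executes."""
    try:
        class Broken:
            url = 'http://isbn.szmesoft.com/isbn/query?isbn='
            isbns = ['111', '222', '333', '444']  # placeholder values
            # isbns[:3] is evaluated in the class scope, but url + isbn is
            # evaluated in the comprehension's own scope, which cannot see url.
            start_urls = [url + isbn for isbn in isbns[:3]]
    except NameError as exc:
        return str(exc)
    return None


class Working:
    url = 'http://isbn.szmesoft.com/isbn/query?isbn='
    isbns = ['111', '222', '333', '444']  # placeholder values
    # A plain for loop executes in the class scope itself, so url is visible.
    start_urls = []
    for isbn in isbns[:3]:
        start_urls.append(url + isbn)


error_message = define_broken_class()
print(error_message)
print(Working.start_urls)
```

Note that the loop leaves isbn behind as a class attribute; if that matters, it can be removed with del isbn at the end of the class body.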

MJ_0826
0

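Try setting the attributes in __init__:

class InfoSpider(scrapy.Spider):
    # name stays at class level so that "scrapy crawl info" can find the spider
    name = 'info'

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.allowed_domains = ['isbn.szmesoft.com']
        self.isbns = list(set(pd.read_csv('E:/books.csv')['ISBN']))
        self.url = 'http://isbn.szmesoft.com/isbn/query?isbn='
        # Inside a method the comprehension can reach self, so use self.url and self.isbns
        self.start_urls = [self.url + isbn for isbn in self.isbns[:3]]

Then, whenever you refer to one of these attributes, put self. in front of it.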

U13-Forward
0

You need to convert each ISBN to a string. Also try printing the URLs so that you can open them in a browser and check whether they actually exist.

start_urls = [url + str(isbn) for isbn in isbns[:3]]
print(start_urls)
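
As a rough illustration of why the str() call can matter, here is a small sketch that assumes the ISBN column was parsed as integers (pandas may do this for all-digit columns). The two ISBN values are made up, and the code runs at module level rather than inside a class body.

```python
url = 'http://isbn.szmesoft.com/isbn/query?isbn='
isbns = [9787111111111, 9787222222222]  # made-up integer ISBNs

# Concatenating a str with an int raises TypeError.
try:
    url + isbns[0]
    concat_error = None
except TypeError as exc:
    concat_error = exc

# Converting each ISBN with str() first makes the concatenation valid.
start_urls = [url + str(isbn) for isbn in isbns]
print(start_urls)
```
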
Upasana Mittal