
I am scraping a page. I wrote two nested loops, but only the first one takes effect: I only get the transcription_price value, not the other two. Why does this happen, and how can I solve it?

def start_requests(self):
    links = {'transcription_page': 'https://www.rev.com/freelancers/transcription',
             'captions_page': 'https://www.rev.com/freelancers/captions',
             'subtitles_page': 'https://www.rev.com/freelancers/subtitles'
            }
    call = [self.parse_transcription,self.parse_caption,self.parse_subtitles]

    for link in links.values():
        for n in range(0,3):
            return [scrapy.Request(link, callback=call[n])]
Dania

1 Answer


Because the return statement, well, returns the value and terminates [1] the function, passing control back to the caller. As a result, your inner loop ends on its very first iteration, before it can go over the remaining values.
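A minimal sketch of that behaviour, using made-up function names (`first_only` and `collect_all` are only for illustration): the first function hits `return` inside the loop and stops immediately, while the second gathers everything in a list and returns once the loop is done.

```python
def first_only():
    for x in (1, 2, 3):
        # `return` leaves the function on the first iteration,
        # so 2 and 3 are never reached.
        return x


def collect_all():
    results = []
    for x in (1, 2, 3):
        results.append(x)
    # A single `return` after the loop hands back every value.
    return results


print(first_only())
print(collect_all())
```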

Perhaps what you wanted was yield:

>>> def f():
...  for x in (1, 2, 3):
...   yield x
...
>>> list(f())
[1, 2, 3]
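Applied to the spider in the question, this might look roughly like the sketch below. To keep it runnable without Scrapy installed, `Request` here is a hypothetical stand-in (a namedtuple) for `scrapy.Request`, and the three parse functions are empty placeholders; in the real spider you would yield `scrapy.Request(link, callback=...)` with the spider's own methods. Note that the two nested loops pair every link with every callback, so swapping `return` for `yield` there would produce nine requests; pairing each link with its own callback via `zip` is probably what was intended.

```python
from collections import namedtuple

# Stand-in for scrapy.Request (an assumption, so the sketch runs on its own).
Request = namedtuple("Request", ["url", "callback"])


# Placeholder callbacks; the real ones live on the spider class.
def parse_transcription(response): ...
def parse_caption(response): ...
def parse_subtitles(response): ...


def start_requests():
    links = [
        "https://www.rev.com/freelancers/transcription",
        "https://www.rev.com/freelancers/captions",
        "https://www.rev.com/freelancers/subtitles",
    ]
    callbacks = [parse_transcription, parse_caption, parse_subtitles]

    # zip pairs the first link with the first callback, and so on,
    # and `yield` lets the loop continue after each request.
    for link, callback in zip(links, callbacks):
        yield Request(link, callback=callback)


requests = list(start_requests())
for request in requests:
    print(request.url, request.callback.__name__)
```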

Besides, hard-coding unnamed constants such as the 3 in range(0, 3) is an easy way to plant a bug that is often not obvious, and it is not very Pythonic either. Iterate over the collection itself instead:

items = ["a", "b", "c"]

# stops after 3 elements even when `items` is longer,
# and raises IndexError when `items` is shorter
def poor():
  for i in range(0, 3):
    yield items[i]

# works for any length of `items`
def appropriate():
  for item in items:
    yield item

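[1] Unless you are inside a try/except/finally block, in which case the finally clause is always executed before the return takes place. If finally contains its own return, that value replaces the earlier one, so the function below returns 1: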

def return_one():
  try:
    1/0
  except ZeroDivisionError:
    return 0
  finally:
    return 1
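The more common situation is a finally clause without its own return. A small sketch, with the hypothetical names `with_cleanup` and `log`, suggests what happens then: the cleanup still runs before control goes back to the caller, but the value from the try block is kept.

```python
def with_cleanup(log):
    try:
        return "result"
    finally:
        # Runs after the return value is computed,
        # but before the caller receives it.
        log.append("cleanup")


log = []
value = with_cleanup(log)
print(value, log)
```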
a small orange
  • I always forget about `yield`. Perhaps a generator is exactly what OP wanted! – Reedinationer May 08 '19 at 22:06
  • I didn't understand. – Dania May 08 '19 at 22:33
  • @Dania the way you initially wrote it, the first executed `return` of your function will break the loop and return a single value from it. To obtain multiple values, you need to either insert them in a list variable and return it instead or use a generator with `yield` keyword in place of `return`, as shown in my example. – a small orange May 08 '19 at 22:37
  • I yield it and I also insert them in a list variable but it didn't work. – Dania May 08 '19 at 23:10