I have an array

myArray = array(url1,url2,...,url90)

I want to execute this command 3 times in parallel:

scrapy crawl mySpider -a links=url

each time with one url:

scrapy crawl mySpider -a links=url1
scrapy crawl mySpider -a links=url2
scrapy crawl mySpider -a links=url3

and when the first one finishes its job, it should pick up the next url, like

scrapy crawl mySpider -a links=url4

I read this question, and this one, and I tried this:

import threading
from threading import Thread

def func1(url):

    scrapy crawl mySpider links=url

if __name__ == '__main__':
    myArray = array(url1,url2,...,url90)
    for(url in myArray):
        Thread(target = func1(url)).start()
parik

1 Answer

When you write target = func1(url), you are actually calling func1 and passing its result to Thread, not a reference to the function. This means the functions run one after another inside the loop, not in separate threads.

You need to rewrite it like this:

if __name__ == '__main__':
    myArray = array(url1,url2,...,url90)
    for url in myArray:
        Thread(target=func1, args=(url,)).start()

This tells Thread to run func1 with the arguments (url,).
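
The difference can be seen in a small self-contained sketch. The func1 below is a hypothetical stand-in for the real crawl that only records which thread it ran on, and the urls are placeholders:

```python
import threading

calls = []  # (url, thread name) pairs recorded by the worker


def func1(url):
    # Hypothetical stand-in for the real crawl: just note where we ran.
    calls.append((url, threading.current_thread().name))


main_name = threading.current_thread().name

# Wrong: func1("url1") is called right here, in the calling thread, and its
# return value (None) becomes the thread's target.
t_wrong = threading.Thread(target=func1("url1"))
t_wrong.start()
t_wrong.join()

# Right: pass the function itself plus its arguments; the new thread calls it.
t_right = threading.Thread(target=func1, args=("url2",), name="worker")
t_right.start()
t_right.join()

ran_in = dict(calls)
print(ran_in["url1"] == main_name)  # did the "wrong" call run in the caller?
print(ran_in["url2"])               # name of the thread that ran the "right" call
```

The first call never leaves the thread that builds the Thread object, which is why the original loop ran the crawls one by one.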

Also, you should wait for the threads to finish after the loop (by calling join() on each one); otherwise the main thread carries on immediately after starting them, before any of the crawls has finished.
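
A minimal sketch of waiting for the threads, assuming a placeholder func1 and placeholder urls (neither is the real crawl): keep the Thread objects in a list and call join() on each one after the loop.

```python
import threading
import time

finished = []  # urls whose work has completed


def func1(url):
    # Placeholder for the real work; sleeps briefly to simulate a crawl.
    time.sleep(0.05)
    finished.append(url)


urls = ["url1", "url2", "url3"]  # placeholder values

threads = []
for url in urls:
    t = threading.Thread(target=func1, args=(url,))
    t.start()
    threads.append(t)

# Block until every thread has finished before moving on.
for t in threads:
    t.join()

print(sorted(finished))
```

Note that this starts one thread per url at once; it does not limit how many run at the same time.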

EDIT: if you want only 3 threads running at the same time, you may want to use a ThreadPool:

if __name__ == '__main__':
    from multiprocessing.pool import ThreadPool

    # func is the worker function (func1 above); myArray is the list of urls
    pool = ThreadPool(processes=3)
    pool.map(func, myArray)
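
Putting the pieces together might look like the sketch below. It assumes the command line from the question (scrapy crawl mySpider -a links=url) and launches it with the standard subprocess module; build_command, crawl_all and the runner parameter are hypothetical names introduced here for illustration.

```python
import subprocess
from multiprocessing.pool import ThreadPool


def build_command(url):
    # Command line taken from the question: scrapy crawl mySpider -a links=<url>
    return ["scrapy", "crawl", "mySpider", "-a", "links=" + url]


def func1(url, runner=subprocess.call):
    # Blocks this worker thread until the crawl process exits, then returns
    # its exit code. `runner` is injectable only to allow a dry run.
    return runner(build_command(url))


def crawl_all(urls, worker=func1, processes=3):
    # At most `processes` workers run at once; map returns results in input order.
    pool = ThreadPool(processes=processes)
    try:
        return pool.map(worker, urls)
    finally:
        pool.close()
        pool.join()
```

Calling crawl_all(myArray) inside the if __name__ == '__main__': block would then start the crawls, three at a time, and return their exit codes. This assumes scrapy is on the PATH and the script is run from the Scrapy project directory.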
Pax0r
  • When the first one finishes its job, does it receive another url? In your edit you didn't write the "for loop" – parik Sep 09 '16 at 09:35
  • Yes. ThreadPool creates 3 threads, and `pool.map` runs `func` for every element of `myArray` using these threads. So when one thread finishes its job, it runs `func` with the next url from `myArray`. In this case you don't need to write a 'for loop', as `map` runs one internally. – Pax0r Sep 09 '16 at 09:50