I have a list of files, file_list, and a function get_id that I pass to pool.map. Now I want to change the proxy for all Python traffic every 300 tasks. I set the proxy through an environment variable, as recommended in "How to pass all Python's traffics through a http proxy?".

So I think I need something like a global variable that counts the finished tasks. But how do I set up such a variable?

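import multiprocessing as mp
import os
import pickle
import time


def get_id(file_list):
    # pool.map calls this once per element, so file_list here is a single file path.
    with open(file_list, 'rb') as fp:
        tweet = pickle.load(fp)

    # The double quote around the search query is now closed.
    os.system('snscrape twitter-search "(to:' + tweet + ')"')


NUM_CPUS = mp.cpu_count()


def mp_handler(file_list, proxies):
    # proxies is accepted but not used yet; rotating it is the open question.
    with mp.Pool(NUM_CPUS) as pool:
        pool.map(get_id, file_list)


if __name__ == "__main__":

    proxies = ['111.11.111.11:1212',
               '222.22.222.22:1212']

    os.environ['http_proxy'] = proxies[0]
    os.system("echo $http_proxy")

    file_list = ...  # placeholder: built by a function that creates file_list (not shown)

    start = time.time()
    mp_handler(file_list, proxies)
    end = time.time()
    print(end - start)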
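
One direction that might avoid a shared counter altogether, sketched under assumptions: split file_list into batches of 300 and set the proxy in the main process before each batch, so that the worker processes started for that batch inherit it. The names batches and run_in_batches are hypothetical helpers, not part of the code above, and worker stands in for get_id.

```python
import multiprocessing as mp
import os


def batches(items, size):
    # Yield consecutive slices of at most `size` items.
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(worker, file_list, proxies, batch_size=300, processes=None):
    # `worker` must be a picklable top-level function such as get_id.
    for index, batch in enumerate(batches(file_list, batch_size)):
        # Set the proxy before the pool starts so the new workers inherit it.
        os.environ['http_proxy'] = proxies[index % len(proxies)]
        # A fresh pool per batch; processes=None lets Pool use os.cpu_count().
        with mp.Pool(processes) as pool:
            pool.map(worker, batch)
```

The trade-off is that every batch waits for its slowest task before the next proxy is used, so some cores may sit idle briefly at each batch boundary.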
    
padul
  • 1. I guess your code has some errors. For example, get_id should receive a single file, not the whole file_list. 2. You can use the `.pop()` method in get_id so your list shrinks. Then you can check the list length with `len(file_list) % 300 == 0` and do something with your proxies. – viilpe Dec 06 '20 at 15:54
  • @viilpe Thank you. On point 1: the code works as written. In the beginning I also tried a list as input, as you suggested. To my surprise this did not work. On point 2: I want to count the finished tasks across all cores and then change the proxy for all of them. Your suggestion would count the tasks for one core only. Or am I wrong about this? – padul Dec 06 '20 at 19:09
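
A count shared across all cores, as discussed in the comments above, could be sketched with a multiprocessing.Value handed to every worker through the pool initializer. This is only a sketch under assumptions: init_worker, proxy_for and run are hypothetical names, the proxy addresses are the placeholders from the question, and the counter counts tasks as they start rather than as they finish. Environment variables are per process, so each worker sets http_proxy for itself just before doing its work.

```python
import multiprocessing as mp
import os

ROTATE_EVERY = 300  # tasks per proxy, taken from the question
PROXIES = ['111.11.111.11:1212', '222.22.222.22:1212']  # placeholders from the question

_counter = None  # set in each worker process by init_worker


def init_worker(counter):
    # Runs once in every worker; keeps a reference to the shared counter.
    global _counter
    _counter = counter


def proxy_for(task_number, proxies=PROXIES, every=ROTATE_EVERY):
    # Tasks 0..every-1 map to proxies[0], the next block to proxies[1],
    # and so on, wrapping around at the end of the list.
    return proxies[(task_number // every) % len(proxies)]


def get_id(file_path):
    # Take the next global task number while holding the counter's lock.
    with _counter.get_lock():
        task_number = _counter.value
        _counter.value += 1
    os.environ['http_proxy'] = proxy_for(task_number)
    # The real work (for example the snscrape call) would go here; commands
    # started from this worker inherit its environment.
    return task_number, os.environ['http_proxy']


def run(file_list, processes=2):
    counter = mp.Value('i', 0)  # shared integer, starts at 0
    with mp.Pool(processes, initializer=init_worker, initargs=(counter,)) as pool:
        return pool.map(get_id, file_list)
```

run(file_list) would be called under the `if __name__ == "__main__":` guard. Because tasks near a multiple of 300 can overlap in time, a few of them may still be running on the old proxy while others have already moved to the next one.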

0 Answers