if __name__ == '__main__':

    team_one = ['512191', '386271', '935881']

    for item in team_one:
        p = PlayerFree(item)

    team_two = ['288211', '1066118', '532424', '494230']

    for item in team_two:
        p = PlayerFree(item)

I have a for loop that initialises my PlayerFree instances one at a time. Since I already have the full list of items, I want the instances to be created simultaneously for all the items in the list.

I know how to do this with a function, but can it be done for a class directly?

martineau
johnrao07
  • Not sure what you are asking here. There is no such thing as _"running a class"_. Do you want to initialise objects simultaneously? What is your use case? – Selcuk Oct 21 '19 at 05:52
  • @Selcuk Yes, initialise objects. And what do you mean by use case? – johnrao07 Oct 21 '19 at 05:55
  • Can we do this or not? – johnrao07 Oct 21 '19 at 06:04
  • You "can" but what are you trying to gain? Do these take a long time to initialise? – Tim Oct 21 '19 at 06:06
  • @Tim Yes each one takes around 5 minutes to complete. And when I have a list of 25 items, it takes about 40 minutes. – johnrao07 Oct 21 '19 at 06:07
  • Do you want **all** the players in the same list, or in **two different** lists? – Wololo Oct 21 '19 at 06:11
  • @magnus Two separate lists; the two squads go to different databases. – johnrao07 Oct 21 '19 at 06:12
  • The important question is whether you are I/O or CPU bound. If you are I/O bound, you can use threads; if you are CPU bound, you will need multiprocessing. Search the standard library for `ProcessPool` and `ThreadPool`. – Tim Oct 21 '19 at 06:13
  • @Tim CPU bound, as I only fetch one URL in each and do the rest of the operations locally. I use BeautifulSoup. – johnrao07 Oct 21 '19 at 06:15
  • The answer proposed by @pyd should do the trick. I'd suggest trying both and measuring. – Tim Oct 21 '19 at 06:20
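
One rough way to tell the two cases apart, offered only as a sketch: time a single initialisation with both a wall clock and a CPU clock. `time.perf_counter` and `time.process_time` are standard-library functions; `cpu_fraction`, `simulated_io`, and `simulated_cpu` are hypothetical names invented for this illustration, and the two simulated workloads merely stand in for the real class. A ratio near 1 suggests CPU-bound work, while a ratio near 0 suggests most of the time is spent waiting.

```python
import time


def cpu_fraction(func, *args):
    """Return CPU time divided by wall-clock time for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def simulated_io(item):
    # Hypothetical stand-in for waiting on a URL: sleeping uses almost no CPU.
    time.sleep(0.2)


def simulated_cpu(item):
    # Hypothetical stand-in for local parsing: a busy loop keeps the CPU occupied.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == '__main__':
    print('I/O-like ratio:', round(cpu_fraction(simulated_io, '512191'), 2))
    print('CPU-like ratio:', round(cpu_fraction(simulated_cpu, '512191'), 2))
```

In practice one would pass the real constructor and one real item to `cpu_fraction` instead of the simulated workloads.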

1 Answer

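import concurrent.futures

team_one = ['512191', '386271', '935881']

# PlayerFree is the asker's class; define or import it at module level.
# The __main__ guard matters for ProcessPoolExecutor on platforms that
# start worker processes by re-importing this module (e.g. Windows).
if __name__ == '__main__':
    # Multithreading: suits I/O-bound work such as waiting on URLs
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = list(executor.map(PlayerFree, team_one))

    # Multiprocessing: suits CPU-bound work; instances must be picklable
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = list(executor.map(PlayerFree, team_one))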
Pyd
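
Because the question's comments mention two squads that must stay in separate lists, the same `executor.map` pattern can be wrapped in a small helper and called once per squad. This is a minimal sketch under stated assumptions: `DemoPlayer` is a hypothetical stand-in for `PlayerFree` (whose code is not shown in the thread), and `build_squad` is a name made up for this example.

```python
import concurrent.futures
import time


class DemoPlayer:
    """Hypothetical stand-in for PlayerFree; the real class is not shown."""

    def __init__(self, player_id):
        time.sleep(0.1)  # pretend to fetch and parse a page
        self.player_id = player_id


def build_squad(player_ids, executor_cls=concurrent.futures.ThreadPoolExecutor):
    """Create one DemoPlayer per id concurrently, preserving input order."""
    with executor_cls() as executor:
        return list(executor.map(DemoPlayer, player_ids))


if __name__ == '__main__':
    team_one = ['512191', '386271', '935881']
    team_two = ['288211', '1066118', '532424', '494230']

    # Each squad keeps its own list of instances.
    squad_one = build_squad(team_one)
    squad_two = build_squad(team_two)
    print([p.player_id for p in squad_one])
    print([p.player_id for p in squad_two])
```

Passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` should also work, provided the class is defined at module level and its instances can be pickled, since the created objects have to travel back from the worker processes.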
  • The asker claims to be CPU bound, so you probably want `ProcessPoolExecutor`. – Tim Oct 21 '19 at 06:19
  • When I use ProcessPoolExecutor() I get this error: `concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.` – johnrao07 Oct 21 '19 at 06:33
  • And when I use ThreadPoolExecutor(), for some of the items it is not doing what it is supposed to do, but for others it works perfectly. What could be going wrong here? And what if I am both I/O and CPU bound? – johnrao07 Oct 21 '19 at 06:39
  • @pyd I think it is skipping the operations for some items. – johnrao07 Oct 21 '19 at 06:40
  • You cannot be both I/O and CPU bound at the same time. – Jan Christoph Terasa Oct 21 '19 at 06:41
  • @pyd What is the overhead of executing this using multiprocessing? It fires up a new interpreter, transfers/shares some state and has to collect the results. This probably only makes sense if `team_one` is sufficiently large? – Jan Christoph Terasa Oct 21 '19 at 06:42
  • @JanChristophTerasa Okay, as I make one URL call and the rest is mostly local operations, this has to be CPU bound! – johnrao07 Oct 21 '19 at 06:43
  • @johnrao07 It's easy to see in `top` or Task Manager. If CPU is at 100%, you are CPU bound. If memory usage is at 100% (and your system likely starts to swap), you are memory bound. If you are neither memory nor CPU bound, you are I/O bound. – Jan Christoph Terasa Oct 21 '19 at 06:45
  • @JanChristophTerasa CPU usage went up from 30 to 65 when I ran it, so it is confirmed to be CPU bound! – johnrao07 Oct 21 '19 at 06:49
  • @pyd The class code is just normal Python scraping using BeautifulSoup and then storing the data in a database. – johnrao07 Oct 21 '19 at 06:52
  • @johnrao If it runs at 65% CPU, you are either memory or I/O limited. I think you should not try to optimize prematurely before you fully understand what the "problem" (if there really is one) is. What is wrong with your simple code above? Keep in mind that **you** (and your future selves) and **your colleagues** need to be able to understand what is happening. Simple code is usually better, especially for beginners. And since you use Python, you have already decided that readability and maintainability matter more than speed. – Jan Christoph Terasa Oct 21 '19 at 06:55
  • Check [here](https://stackoverflow.com/questions/15900366/all-example-concurrent-futures-code-is-failing-with-brokenprocesspool) once. – Pyd Oct 21 '19 at 07:11
  • @pyd Okay, I have many things going on, but `ThreadPoolExecutor()` seems to be working as expected for me. But why is it working for some items and skipping others? By skipping I mean it only does part of the work for some items. – johnrao07 Oct 21 '19 at 11:44
  • @pyd check the exact question listed here bro https://stackoverflow.com/questions/58819905/brokenprocesspool-a-process-in-the-process-pool-was-terminated-abruptly-while-t – johnrao07 Nov 12 '19 at 13:37
  • @pyd can you have a look at this problem? It is related to this question https://stackoverflow.com/questions/59040311/update-variable-while-working-with-processpoolexecutor?noredirect=1#comment104324638_59040311 – johnrao07 Nov 26 '19 at 15:03
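
The thread never establishes why some items were only partly processed. One possible cause, offered as an assumption rather than a diagnosis, is an exception raised inside a worker that goes unnoticed: with `executor.map`, an exception surfaces only when the corresponding result is retrieved from the iterator. The sketch below uses `submit` and `as_completed` so that every failure is recorded against the item that caused it. `process_item` and `run_all` are hypothetical names, and the failure condition is invented purely for demonstration.

```python
import concurrent.futures


def process_item(item):
    """Hypothetical worker: fails for ids longer than six characters."""
    if len(item) > 6:
        raise ValueError(f'unexpected id length: {item}')
    return int(item)


def run_all(items):
    """Run process_item on every item; return (results, errors) dicts."""
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Map each future back to its input so failures can be attributed.
        future_to_item = {executor.submit(process_item, i): i for i in items}
        for future in concurrent.futures.as_completed(future_to_item):
            item = future_to_item[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # record the failure instead of losing it
                errors[item] = exc
    return results, errors


if __name__ == '__main__':
    ok, failed = run_all(['288211', '1066118', '532424', '494230'])
    print('succeeded:', sorted(ok))
    print('failed:', {k: repr(v) for k, v in failed.items()})
```

Logging the contents of `errors` after a run would show whether the "skipped" items actually raised, which is usually easier than guessing from partial database rows.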