
Based on this answer (https://stackoverflow.com/a/20192251/9024698), I have to do this:

from multiprocessing import Pool

from astropy.io import fits  # assumed source of `fits`; the original snippet omits this import

def process_image(name):
    sci = fits.open('{}.fits'.format(name))
    # <process>

if __name__ == '__main__':
    pool = Pool()                         # Create a multiprocessing Pool
    pool.map(process_image, data_inputs)  # process data_inputs iterable with pool

to multi-process a for loop.

However, how can I get the output of this and process it further if I want to?

Presumably it would look like this:

if __name__ == '__main__':
    pool = Pool()                         # Create a multiprocessing Pool
    output = pool.map(process_image, data_inputs)  # process data_inputs iterable with pool
    # further processing

But does this mean that I have to put all the rest of my code under __main__, unless I write everything in functions which are called from __main__?

The notion of __main__ has always been pretty confusing to me.

Outcast
  • I would just "write everything in functions which are called by `__main__`". In languages like Java or C++ that *require* a `main` entry point, the entire program is `main` calling other functions. – Carcigenicate Nov 04 '19 at 23:21
  • @Carcigenicate, ok I see. So literally everything should be encapsulated in functions simply because I want to do multi-processing. That does not sound that reasonable to me from a higher-level viewpoint, but it must make sense for Python at the lower level. – Outcast Nov 04 '19 at 23:34
  • That's a proper way to have a program anyways. Ideally your program should already be broken down into functions, and you just need to call them from `main` when switching to using multiprocessing. Having everything as a top level script is messy and causes problems as the code grows. – Carcigenicate Nov 04 '19 at 23:36
  • @Carcigenicate, sure I agree from the viewpoint of a finalised code - I am at the phase of prototyping though for now haha. But yes I see your point. – Outcast Nov 04 '19 at 23:38
  • Ya, if you're developing in a REPL or something, I'll admit, it is a bit of a pain. – Carcigenicate Nov 04 '19 at 23:41
  • @Carcigenicate, sure, thank you :) – Outcast Nov 04 '19 at 23:42

1 Answer


if __name__ == '__main__': is literally just "if this file is being run as a script, as opposed to being imported as a module, then do this". __name__ is a module-level variable that Python sets automatically: it is '__main__' when the file is run as a script, and the module's own name when the file is imported. Why it works this way is beyond the scope of this discussion, but suffice it to say it has to do with how Python evaluates source files top to bottom.
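To make the difference concrete, here is a minimal sketch that writes a one-line file and then both imports it and runs it as a script, recording what __name__ was each time. The file name demo_module.py and the helper observe_names are made up for illustration; only the standard library is used.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical one-line source file: it just records the value of __name__.
SOURCE = "seen_name = __name__\n"


def observe_names():
    """Return (__name__ when imported, __name__ when run as a script)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_module.py"
        path.write_text(SOURCE)

        # Case 1: load the file as a module called "demo_module".
        spec = importlib.util.spec_from_file_location("demo_module", str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Case 2: execute the same file the way `python demo_module.py` would.
        script_globals = runpy.run_path(str(path), run_name="__main__")

    return module.seen_name, script_globals["seen_name"]


if __name__ == "__main__":
    print(observe_names())
```

The same file sees two different values, which is all the guard checks for.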

In other words, you can put the other two lines (creating the pool and calling pool.map) anywhere you want, most likely in a function that you call elsewhere in the program. That function can return the output, or process it further, whatever you happen to need.
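A rough sketch of that layout, assuming a trivial stand-in for process_image so it runs without any FITS files. The names process_item, run_pool, summarise and main are hypothetical, not from the question:

```python
from multiprocessing import Pool


def process_item(value):
    """Hypothetical stand-in for process_image: return the result so pool.map can collect it."""
    return value * value


def run_pool(inputs):
    """Create the pool inside a function and hand the results back to the caller."""
    with Pool() as pool:
        # pool.map returns a list of results in the same order as the inputs
        return pool.map(process_item, inputs)


def summarise(results):
    """Further processing is just another ordinary function."""
    return sum(results)


def main():
    output = run_pool([1, 2, 3, 4])
    print(output, summarise(output))


if __name__ == "__main__":
    main()
```

Only the single call to main() sits under the guard; everything else is plain functions. The guard itself still matters for multiprocessing: with start methods such as spawn (the default on Windows), each worker process imports the main module, so pool creation left at the top level would be executed again in every worker.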

Green Cloak Guy
  • Thanks, I see. So basically ALL my code should be put into functions which are called by `__main__`? As I said above, I find it a bit excessive simply because I want to do multi-processing but it must make sense for python somehow. – Outcast Nov 04 '19 at 23:36
  • @PoeteMaudit either all your code is reachable from `if __name__ == '__main__':` or reachable from the things that are reachable from that, or etc. etc. Your program has one entrypoint, and anything that isn't connected by at least *some* execution chain just won't run. It's like `int main() {}` in C or Java, it's where the program starts, but it's not special beyond that. – Green Cloak Guy Nov 05 '19 at 02:12