
I looked through some old questions trying to find if someone had encountered this, but didn't find any that were relevant, so my question is this:

I have some code that runs fine before I compile it with Cython, but after compiling, it exits between starting the first worker process and the second.

My code looks like:

import dummyC as dummy
import numpy as np
import math
import logging
import time
import concurrent.futures as cf
from multiprocessing import freeze_support
start=time.time()
log_format = "%(asctime)s: %(message)s"  # renamed from 'format' to avoid shadowing the builtin
logging.basicConfig(format=log_format, level=logging.INFO, datefmt="%H:%M:%S")
logging.info("Here we go!")
#path1="Y:/data/remotesensing/satellite/VIIRS/L1B/Pilot Areas/Fire/Observation/VNP02IMG.A2012164.1742.001.2017287025318.nc"
#path2="Y:/data/remotesensing/satellite/VIIRS/L1B/Pilot Areas/Fire/Geolocation/VNP03IMG.A2012164.1742.001.2017287005326.nc"
path1= "Y:/data/remotesensing/satellite/VIIRS/L1B/Pilot Areas/Fire/Observation/VNP02IMG.A2012152.1000.001.2017287002855.nc"
path2= "Y:/data/remotesensing/satellite/VIIRS/L1B/Pilot Areas/Fire/Geolocation/VNP03IMG.A2012152.1000.001.2017286230211.nc"

fd = dummy.fireData(path1,path2)

def thread_me(arr):
    #The Algorithm here is proprietary, but also not the issue as it works pre-Cython
    x=1 #Dummy code that produces the same issue
def start():
    if __name__ == '__main__':
        freeze_support()
        # Tile boundaries for a 4x4 split; lines and pix (image dimensions)
        # are defined earlier in the real script
        rows = [0, lines//4, lines//2, lines//2 + lines//4, lines]
        cols = [0, pix//4, pix//2, pix//2 + pix//4, pix]
        arg = [[rows[i], rows[i+1], cols[j], cols[j+1]]
               for i in range(4) for j in range(4)]



        with cf.ProcessPoolExecutor(max_workers=4) as executor:
            output = executor.map(thread_me, arg)
            executor.shutdown(wait=True)  # redundant: the with-block already waits on exit
        output = np.array(list(output))
        numFire=0
        file = open("NoFirePoints.txt","w")
        for i in range(len(output)):
            for j in range(len(output[i])):
                numFire+=1
                print(output[i][j][0],output[i][j][1],output[i][j][2])
                tempStr = ' '.join([str(elem) for elem in output[i][j]])
                tempStr = tempStr+'\n'
                file.write(tempStr)

        print(fd.firePoints,"Fire points")
        logging.info("%d fires detected"%(numFire))
        file.close()

        # note: 'start' here refers to the start() function defined above,
        # which shadows the start timestamp set at the top of the module
        print(str((time.time()-start)/60))
start()

I then call `start()` from another program so that I can compile this one. Any ideas as to why it exits after the first iteration?
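For comparison, the conventional layout puts the `if __name__ == '__main__'` guard and `freeze_support()` at module level rather than inside the function, so that workers spawned by a frozen or compiled executable can re-import the module without re-entering the pool setup. A minimal, self-contained sketch with a dummy worker (the worker body and tile values are placeholders, not my real code):

```python
import concurrent.futures as cf
from multiprocessing import freeze_support

def thread_me(arr):
    # dummy worker standing in for the proprietary per-tile algorithm
    return sum(arr)

def start():
    # two dummy tiles standing in for the [row0, row1, col0, col1] bounds
    arg = [[0, 1, 2, 3], [4, 5, 6, 7]]
    with cf.ProcessPoolExecutor(max_workers=2) as executor:
        # the with-block waits for the workers; no explicit shutdown() needed
        output = list(executor.map(thread_me, arg))
    return output

if __name__ == '__main__':
    freeze_support()  # only has an effect in frozen Windows executables
    print(start())    # the guard keeps spawned workers from re-running this
```

With this shape, each spawned worker imports the module, finds the guard false, and only ever executes `thread_me`.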

Image of what my execution looks like:

Execution Screenshot

Note that the "Here we go" message is a logging trace and will be removed as soon as I figure out why it won't execute further; each message after it marks a process starting. Note also that Untitled0.py is the program running the compiled version and FireDetection.py is the uncompiled version.

Thanks in advance for your help!

Glenn Driver
  • Can you reproduce the problem with some dummy code in `thread_me` that is *not* proprietary? Otherwise it's not clear you can be helped. – Iguananaut Jan 08 '20 at 17:13
  • I will say you have this a little bit backwards. You should write `if __name__ == '__main__': multiprocessing.freeze_support(); start()` in the module body, not in the `start()` function itself. – Iguananaut Jan 08 '20 at 17:14
  • @Iguananaut If I take out all the code and have it run `x=1`, it still only runs once. And when I had it the way you suggest, it never uses `freeze_support` because I have to call the start method after it's compiled, unless there's a way to call `__main__`? – Glenn Driver Jan 08 '20 at 17:16
  • Depending on what multiprocessing model you're using you also won't see the log message more than once; if you want to be sure some code is executed per-process you should move that code into the function you are parallelizing. Are you sure `thread_me` itself is only running once? – Iguananaut Jan 08 '20 at 17:16
  • I suggest, if you are having trouble with parallel processing, reduce the problem to the minimum possible code; just calling a dummy function, etc. All the rest is noise. If you can't reproduce the problem then start adding things back in until you can. See https://stackoverflow.com/help/minimal-reproducible-example – Iguananaut Jan 08 '20 at 17:18
  • Also, try breaking your code into multiple modules--Cythonize just those functions that need to be high performance, but not things like your main function. It will be easier to debug that way. – Iguananaut Jan 08 '20 at 17:19
  • See [this answer](https://stackoverflow.com/questions/59124525/python-concurrent-futures-processpoolexecutor-fail-to-work/59320867#59320867) for a simple example of how to structure your code around `ProcessPoolExecutor`. If you want your `thread_me` function to be higher performance I would suggest moving it to a separate module, and then Cythonizing just that module. There's nothing to gain by running Cython over the rest of the code you've given. – Iguananaut Jan 08 '20 at 17:23
  • Also in the code following `file = open("NoFirePoints.txt","w")` it looks like you're writing some custom code to output a Numpy array to a plain text file, but Numpy already has higher-performance built-in functions to do that, such as [`numpy.savetxt`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html) – Iguananaut Jan 08 '20 at 17:25
  • The Cython code may be encountering a race condition and or writing to protected memory, which sometimes doesn't give a proper run-time error and results in premature program exit. – Edge Jan 10 '20 at 22:30
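Following up on the `numpy.savetxt` comment above, a minimal sketch of replacing the manual write loop; the (n, 3) array shape and the values are assumptions for illustration:

```python
import numpy as np

# hypothetical stand-in for the (n, 3) array of detected points
output = np.array([[1, 2, 300.5],
                   [4, 5, 310.2]])

# writes one space-separated row per point, like the manual loop above
np.savetxt("NoFirePoints.txt", output, fmt="%g")
```

`fmt="%g"` drops trailing zeros, so integer-valued floats print as plain integers.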

0 Answers