I'm starting a process with multiprocessing that creates a CSV file. When I run the code with d.daemon = False it works fine, i.e. it creates the file in the same folder. But when I run it with d.daemon = True, the file is not created. Why is that?

My Code

I have a seed list of URLs from which I need to scrape data.

for url in config.SEED_LIST:
    # starting a new process for each category.
    d = multiprocessing.Process(target=workers.scrape, args=())
    d.daemon = True
    d.start()


# workers module
import csv
import time


def scrape():
    # Stand-in for the real work: scraping a page and applying some logic
    # takes a while, so a 5-second sleep simulates it. With daemon = True
    # the file is not created; with daemon = False it works fine.
    time.sleep(5)

    data = [[1, 2, 3, 4], [2224, 34, 34, 34, 34]]
    # Python 3: open in text mode with newline="". (On Python 2, use "wb".)
    with open('1.csv', "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerows(data)
Praful Bagai

2 Answers

According to the multiprocessing documentation for the daemon flag, setting d.daemon = True means that when your script finishes, it terminates all of its daemonic child processes. Here that happens before the children reach the write, so no output is produced.
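The behaviour can be observed with a small experiment. This is only a sketch under stated assumptions: the helper names (slow_write, file_created), the temporary script and the 1-second delay are made up for illustration and are not part of the question's code. A throwaway parent script starts one child and returns without calling join(); the helper then reports whether the child's output file exists once the parent has exited.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical parent script: starts one child process and then simply
# reaches the end of the script without joining it.
PARENT_SCRIPT = textwrap.dedent('''
    import multiprocessing
    import sys
    import time

    def slow_write(path):
        time.sleep(1.0)  # placeholder for slow work such as scraping
        with open(path, "w") as f:
            f.write("done")

    if __name__ == "__main__":
        p = multiprocessing.Process(target=slow_write, args=(sys.argv[1],))
        p.daemon = (sys.argv[2] == "daemon")
        p.start()
        # No join(): the parent finishes right away.
''')


def file_created(mode):
    """Run the parent script with mode 'daemon' or 'normal' and report
    whether the child's output file exists after the parent has exited."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "parent.py")
        out = os.path.join(tmp, "out.txt")
        with open(script, "w") as f:
            f.write(PARENT_SCRIPT)
        subprocess.run([sys.executable, script, out, mode], check=True)
        return os.path.exists(out)


if __name__ == "__main__":
    print("non-daemon child wrote file:", file_created("normal"))
    print("daemon child wrote file:", file_created("daemon"))
```

With a non-daemon child the parent waits for it at exit, so the file should be there; with a daemon child the parent terminates it on exit, before the sleep finishes, so the file should be missing.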

Michele d'Amico

d.daemon = True means that the child process is automatically terminated when the parent process ends, to prevent orphan processes. Calling d.join() after d.start() fixes this: the parent process then waits until the child process ends instead of exiting before it.
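A minimal sketch of the join() approach, loosely adapted from the question's code. The run_workers helper, the path argument to scrape and the shortened sleep are assumptions made for illustration. One detail worth noting: if join() is called inside the loop right after start(), each worker finishes before the next one starts, so this sketch starts all workers first and joins them afterwards.

```python
import csv
import multiprocessing
import os
import tempfile
import time


def scrape(path):
    """Stand-in for the real scraper: wait briefly, then write a CSV file."""
    time.sleep(0.2)  # placeholder for the slow scraping work
    data = [[1, 2, 3, 4], [2224, 34, 34, 34, 34]]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(data)


def run_workers(paths):
    """Start one daemon process per output path, then wait for all of them."""
    procs = []
    for path in paths:
        p = multiprocessing.Process(target=scrape, args=(path,))
        p.daemon = True
        p.start()
        procs.append(p)
    # Join only after every worker has started, so they run in parallel.
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    paths = [os.path.join(out_dir, "%d.csv" % i) for i in range(3)]
    print("exit codes:", run_workers(paths))
    print("files:", sorted(os.listdir(out_dir)))
```

Because the parent blocks in join() until each child has finished, the daemon flag no longer cuts the writes short; it only matters if the parent exits early for some other reason.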

xxllxx666