I run several Python subprocesses to migrate data to S3. I noticed that my Python subprocesses often drop to 0% CPU and that this condition lasts for more than a minute. This significantly decreases the performance of the migration process.
Here is a picture of the subprocess:
The subprocess does these things:
1. Query all tables from a database.
2. Spawn a subprocess for each table:

```
processes = []
for table in tables:
    print "Spawn process to process {0} table".format(table)
    process = multiprocessing.Process(name="Process " + table,
                                      target=target_def,
                                      args=(args, table))
    process.daemon = True
    process.start()
    processes.append(process)

for process in processes:
    process.join()
```
3. Query data from the database using LIMIT and OFFSET. I used the PyMySQL library to query the data (a rough sketch of this pagination pattern appears after the list).
4. Transform the returned data into another structure. `construct_structure_def()` is a function that transforms a row into another format:

```
buffer_string = []
for i, row_file in enumerate(row_files):
    if i == num_of_rows:
        buffer_string.append(
            json.dumps(construct_structure_def(row_file))
        )
    else:
        buffer_string.append(
            json.dumps(construct_structure_def(row_file)) + "\n"
        )
content = ''.join(buffer_string)
```
5. Write the transformed data into a file and compress it using gzip:

```
with gzip.open(file_path, 'wb') as outfile:
    outfile.write(content)
return file_name
```
6. Upload the file to S3 (see the upload sketch after this list).
7. Repeat steps 3-6 until there are no more rows to fetch.
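For reference, here is a minimal sketch of how the step-3 pagination looks; the connection details, table name handling, and page size below are placeholders, not the exact values from my script:

```
import pymysql

def fetch_pages(table, page_size=10000):
    # Placeholder connection parameters, not the real ones.
    conn = pymysql.connect(host="localhost", user="user",
                           password="secret", db="mydb")
    try:
        offset = 0
        while True:
            with conn.cursor() as cursor:
                cursor.execute(
                    "SELECT * FROM {0} LIMIT %s OFFSET %s".format(table),
                    (page_size, offset))
                rows = cursor.fetchall()
            if not rows:
                break              # step 7: stop when no rows come back
            yield rows             # hand one page to the transform step
            offset += page_size
    finally:
        conn.close()
```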
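The step-6 upload itself is a single call; assuming a boto3 client (the actual upload code is not shown above, and the bucket name is a placeholder), it looks roughly like this:

```
import boto3

def upload_to_s3(file_path, file_name):
    s3 = boto3.client("s3")
    # upload_file streams the gzipped file from disk to the given bucket/key.
    s3.upload_file(file_path, "my-migration-bucket", file_name)
```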
To speed things up, I create a subprocess for each table using the built-in `multiprocessing.Process` class.
I ran my script in a virtual machine. Following are the specs:
- Processor: Intel(R) Xeon(R) CPU E5-2690 @ 2.90 GHz (2 processors)
- Virtual processors: 4
- Installed RAM: 32 GB
- OS: Windows Enterprise Edition.
I saw a post here that said one of the main possibilities is a memory/IO limitation, so I tried running only one subprocess to test that theory, but to no avail.
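To narrow down which stage stalls, I could wrap a rough timer around each step; the helper below is only an illustrative sketch, not part of my actual script:

```
import time

def timed(label, func, *args, **kwargs):
    # Run one stage (query, transform, write, or upload) and log its duration,
    # so a stage that stalls for over a minute shows up in the output.
    start = time.time()
    result = func(*args, **kwargs)
    print "{0} took {1:.2f} seconds".format(label, time.time() - start)
    return result
```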
Any idea why this is happening? Let me know if you guys need more information.
Thank you in advance!