Threading problem with threadpool and objects

Question

I have a problem when using objects and threads. A simplified example of the code follows.

I am using a threadpool to loop over a list of jobs.

class File(object):
    def __init__(self, name, streams = []):
        self.name = name
        self.streams = streams

    def appendStream(self, stream):
        self.streams.append(stream)

class Job(object):
    def __init__(self, file):
        self.file = file

def main():
    ...
    jobs = []

    for f in input_files:
        f_obj = File(f)
        jobs.append(Job(f_obj))

    with ThreadPool(processes = 2, initializer = init, initargs = (log, p_lock)) as pool:
        pool.map(func = process_job, iterable = jobs, chunksize = 1)
    ...

The function (process_job) used by the thread pool resides in the same .py file.

def process_job(job):
    ...
    get_info(job.file)
    ...

This function in turn uses a function (get_info) from a self-defined package. That function builds an argument list and then calls subprocess.check_output(). The subprocess returns a JSON structure, which is looped over to update the contents of the input object.

def get_info(file):
    ...
    args = ["ffprobe", ..., "-i", file.name]
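    try:
        output = subprocess.check_output(args)
    except Exception as e:
        print(e)
        return  # no output to parse if ffprobe failed

    # check_output() returns bytes; decode before parsing the JSON
    data = output.decode('utf8')
    json_data = json.loads(data)

    for item in json_data:
        file.appendStream(item["stream"])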
    ...

The problem is that when I run this code, the threads spawned by the pool update each other's file objects. For example, when running this with 5 input files, the 5th job.file.streams contains 5 streams, i.e. its own plus the 4 previous streams that belong to the other files. Why is this happening, and how can I solve it?

Best regards!

Without looking at any of the rest of it, this is a classic bug: `def __init__(self, name, streams = [])`. See https://stackoverflow.com/q/19497879/1256452 — torek, Dec 04 '18 at 23:57
That might be the case. I will try a different approach for creating the object's list. — Bregell, Dec 05 '18 at 07:01

score 0 · Accepted Answer · answered Dec 05 '18 at 07:34

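As @torek spotted, this is a case of the "Mutable Default Argument". The default `streams = []` is evaluated once, when `__init__` is defined, so every File created without an explicit streams argument shares the same list. The thread pool is not the cause; the same thing happens in single-threaded code.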

“Least Astonishment” and the Mutable Default Argument
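
A minimal sketch of the effect and the usual fix, with no threads involved. The class names `SharedFile` and `FixedFile`, the method name `append_stream`, and the file names are made up for illustration; the fix shown (a `None` sentinel) is the common idiom, not necessarily the only way to solve it.

```python
class SharedFile:
    """Reproduces the bug: the default list is created once, at definition time."""

    def __init__(self, name, streams=[]):
        self.name = name
        self.streams = streams  # every default-constructed instance gets the same list

    def append_stream(self, stream):
        self.streams.append(stream)


class FixedFile:
    """Uses None as a sentinel so each instance gets its own list."""

    def __init__(self, name, streams=None):
        self.name = name
        self.streams = [] if streams is None else streams

    def append_stream(self, stream):
        self.streams.append(stream)


# Buggy version: both objects append to one shared list.
a, b = SharedFile("a.mkv"), SharedFile("b.mkv")
a.append_stream("video")
b.append_stream("audio")
shared_result = b.streams            # contains both "video" and "audio"
same_list = a.streams is b.streams   # the two attributes are the same object

# Fixed version: each object has its own list.
c, d = FixedFile("a.mkv"), FixedFile("b.mkv")
c.append_stream("video")
d.append_stream("audio")
fixed_result = d.streams             # contains only "audio"
```

Applied to the question, this would mean changing the signature to `def __init__(self, name, streams = None)` and creating the list inside the method body.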

Bregell
