I have the following program:

daytime_images = os.listdir("D:/TR/Daytime/")
number_of_day_images = len(daytime_images)
day_value = 27

def find_RGB_day(clouds, red, green, blue):
    img = Image.open(clouds)
    img = img.convert('RGB')
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            h, s, v, = img.getpixel((x, y))
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x,y))
    return pixels_single_photo

number = 0

for _ in range(number_of_day_images):
    world_image = ("D:/TR/Daytime/" + daytime_images[number])
    pixels_found = find_RGB_day(world_image, day_value, day_value, day_value)
    coordinates.append(pixels_found)
    number = number+1

Edit: I want to run the function with multiprocessing, so I tried:

for number in range(number_of_day_images):
    p = multiprocessing.Process(
        target=find_RGB_day,
        args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number],27, 27, 27))
    p.start()
    p.join()
    number = number+1
    coordinates.append(p)

When I run it, an AttributeError is raised and I don't know how to solve it:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'find_RGB_day' on <module '__main__' (built-in)>

I think this error may be related to the way I feed the images into the program: I get all the file names from a folder and then select them one by one with number = number+1.

2 Answers

I was working on this in parallel with wagnifico and tried Pool (nice guess). I did not have any image files, so I built a similar function, using a namedtuple as a stand-in for the image, to get as close as possible.

from multiprocessing import Pool, TimeoutError
import random
import time
from collections import namedtuple

# Stand-in for a PIL image. Defined at module level so that spawned worker
# processes (the default on Windows) can see it as well.
Image = namedtuple('Image', ['size'])


def find_RGB_day(red, green, blue):
    img = Image((256, 256))  # loading your image goes here, e.g. 256 x 256
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            # random channel values in 0..255 imitate img.getpixel((x, y))
            h, s, v = random.randrange(256), random.randrange(256), random.randrange(256)
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x, y))
    time.sleep(5)  # delete this line; it imitates loading a big file
    return pixels_single_photo


if __name__ == '__main__':
    number_of_day_images = 5
    day_value = 27

    with Pool(processes=4) as pool:  # you can play with the number of processes
        multiple_results = [pool.apply_async(find_RGB_day, args=(day_value, day_value, day_value))
                            for _ in range(number_of_day_images)]
        try:
            # get() blocks until the result is ready; TimeoutError is only
            # possible if you pass a timeout, e.g. res.get(timeout=60)
            for res in multiple_results:
                print(res.get())
        except TimeoutError:
            print("We lacked patience and got a multiprocessing.TimeoutError")

I followed the 'Using a pool of workers' section of the Python multiprocessing documentation.

The pool takes responsibility for all the computations, so instead of creating the processes yourself in a loop like this:

for number in range(number_of_day_images):
    p = multiprocessing.Process(target=find_RGB_day, args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number], 27, 27, 27))
    p.start()
    p.join()

You could have something like this:

with Pool(processes=4) as pool:
    multiple_results = [pool.apply_async(find_RGB_day, args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number], 27, 27, 27))
                        for number in range(number_of_day_images)]
    for res in multiple_results:
        print(res.get())

In my code example, the first 4 tasks run at the same time and finish after around 5 seconds (because there are 4 workers); the fifth one starts once a worker is free and finishes after another 5 seconds.
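The timing above can be estimated with a little arithmetic. This is only a rough sketch: estimated_wall_time is a hypothetical helper, and it assumes every task takes the same time and ignores process start-up and data transfer overhead.

```python
import math


def estimated_wall_time(tasks, workers, seconds_per_task):
    """Rough wall-clock estimate when equal-length tasks share a fixed pool."""
    # Workers process the tasks in "rounds" of at most `workers` tasks each.
    rounds = math.ceil(tasks / workers)
    return rounds * seconds_per_task


# 5 tasks on 4 workers need 2 rounds of 5 seconds each.
print(estimated_wall_time(tasks=5, workers=4, seconds_per_task=5))
```

With 4 images or fewer, a single round is enough, so adding workers beyond the number of images brings no further gain.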

[EDIT] I have downloaded some images and used the code below to get all the coordinates of the desired pixels. (27, 27, 27) was too low for my images, so I used different thresholds: (31, 90, 170).

Enjoy.

from multiprocessing import Pool, TimeoutError
import os
from PIL import Image


def find_RGB_day(clouds, red, green, blue):
    img = Image.open(clouds)
    img = img.convert('RGB')  # force three channels so every pixel unpacks into three values
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            # print(img.getpixel((x, y)))
            h, s, v, = img.getpixel((x, y))
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x,y))
    return pixels_single_photo


def create_pool():
    coordinates = []
    with Pool(processes=4) as pool:
        files_to_precess = [pool.apply_async(find_RGB_day,
                                             args=("D:/TR/Daytime/" + daytime_images[number], 31, 90, 170))
                            for number in range(number_of_day_images)]
        try:
            # each res.get() blocks until that worker has returned its list of pixels
            coordinates = [res.get() for res in files_to_precess]
        except TimeoutError:
            print("We lacked patience and got a multiprocessing.TimeoutError")
    return coordinates

if __name__ == '__main__':
    daytime_images = os.listdir("D:/TR/Daytime/")
    number_of_day_images = len(daytime_images)
    print(number_of_day_images)
    day_value = 27
    coordinates = create_pool()
    for res in coordinates:
        print(res)
  • Thanks for the answer. In my program I was looking for certain pixels that were added to pixels_single_photo and then, at the end of each loop, those pixels were kept in a list called coordinates (a list of lists). How can I implement this in this program? – Joan Carles Montero Jiménez Oct 25 '20 at 07:15
  • You already have all the pixels inside the multiple_results list of lists. Instead of just printing them out like I did, you can do whatever you please with them. Just try my code and you will see them all in there :) – Marcin Mukosiej Oct 25 '20 at 14:43
  • Sorry for the inconvenience, how can you store all the coords in a list called coordinates? When I try [coordinates.append(res.get()) for res in coordinates] I get AttributeError: 'list' object has no attribute 'get'. Thanks – Joan Carles Montero Jiménez Oct 27 '20 at 22:02
  • Not a problem. I have corrected a few lines in that last edit, so paste everything after 'Enjoy'. It should be coordinates = [res.get() for res in files_to_precess], where the files_to_precess list just stores the pending result of each task, so you loop through it and res.get() them out one by one. – Marcin Mukosiej Oct 28 '20 at 01:48
  • Thanks for your help, I really appreciate it. – Joan Carles Montero Jiménez Oct 28 '20 at 06:50
1

Do what the error message points at: define the function at module level and put your main code under an if __name__ == '__main__': guard:

import multiprocessing

def fun(inputs):
    # your function
    return outputs

if __name__ == '__main__':
    # your main code
    p = multiprocessing.Process(target=fun, args=(inputs,))
    p.start()
    p.join()

More details are available in the answers to this question.

Also, there is an error in your code. You must pass the function and its arguments separately when creating the process: target=fun, args=(inputs,), as I did in the example above. If you pass the function together with its arguments, target=fun(inputs), fun runs immediately in the parent and you hand its return value, not the function itself, to the process as the target, so nothing is actually run in a separate process. This will normally fail, because the output of your function is not callable (it is not a function itself).
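The difference between passing a function and passing the result of calling it can be checked without starting any process. A minimal sketch, where fun and the value 3 are placeholders chosen only for illustration:

```python
def fun(inputs):
    # Placeholder for the real work.
    return inputs * 2


passed_correctly = fun      # the function object itself, as in target=fun
passed_wrongly = fun(3)     # fun has already run here; this is its return value

print(callable(passed_correctly))  # a function can be used as a target
print(callable(passed_wrongly))    # an int cannot be called
```

The same reasoning applies to any return value that is not itself a function, such as the list of coordinates returned by find_RGB_day.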

In order to adapt your call, with multiple arguments, you can use:

for number in range(number_of_day_images):
    p = multiprocessing.Process(
        target=find_RGB_day,
        args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number],
              27, 27, 27)
    )
    # rest of your code ...

Moreover, I recommend using multiprocessing.Pool.map, which splits a list of arguments among the desired number of workers and blocks until all the results are ready. A nice description of how to use it for a function with multiple arguments is available here.
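For a function with several arguments, one option is Pool.starmap, which unpacks each tuple of arguments for you. The sketch below is an illustration under assumptions, not the original program: find_dark_pixels and build_tasks are hypothetical helpers, and the images are made-up in-memory lists of (r, g, b) tuples so that no files are needed.

```python
from multiprocessing import Pool


def find_dark_pixels(pixels, red, green, blue):
    """Return the indices of pixels whose channels are all at or below the thresholds.

    Hypothetical stand-in for find_RGB_day: it takes a list of (r, g, b)
    tuples instead of an image path.
    """
    return [i for i, (r, g, b) in enumerate(pixels)
            if r <= red and g <= green and b <= blue]


def build_tasks(images, red, green, blue):
    """Build one argument tuple per image, in the order starmap unpacks them."""
    return [(pixels, red, green, blue) for pixels in images]


if __name__ == '__main__':
    # Made-up data: three tiny "images" of two pixels each.
    images = [
        [(10, 10, 10), (200, 200, 200)],
        [(30, 5, 5), (0, 0, 0)],
        [(27, 27, 27), (28, 27, 27)],
    ]
    tasks = build_tasks(images, 27, 27, 27)
    with Pool(processes=2) as pool:
        # starmap blocks until every task is done and keeps the input order,
        # so coordinates ends up as a list of lists, one per image.
        coordinates = pool.starmap(find_dark_pixels, tasks)
    print(coordinates)
```

In the real program each task tuple would hold the image path and the three thresholds, and the worker would be find_RGB_day itself.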

wagnifico