
I have a JSON user database. I want to check whether each user has a valid id and, if not, remove that user from the database. I am using threads for this, but each thread starts from the beginning of the database, and I don't want that.

Example: if thread A starts from 10, then thread B should start from 20. Also, when thread A finishes, I want it to continue from 30 instead of 20.

I am a beginner, so a detailed guide and explanation would be great!

Thanks for your help.

  • can u show what u have tried? – Pratyush Arora Sep 26 '21 at 11:09
  • Hi, here is the [sample of current code](https://pastebin.com/NSTxrDea) – YourjuniorDev Sep 26 '21 at 11:25
  • the link is full of animations, spots & co... could you please just edit your question with your code? – cards Sep 26 '21 at 11:34
  • I don't really know why you can't use a single thread for all of it, or why you need threads for that, judging from the code you posted, but [Threading](https://www.geeksforgeeks.org/multithreading-in-python-set-2-synchronization/) should help as a reference. From what I understand, you can use `lock.acquire()` and `lock.release()` with the help of `lock = threading.Lock()` – Pratyush Arora Sep 26 '21 at 11:43
  • Please provide enough code so others can better understand or reproduce the problem. – Community Oct 04 '21 at 09:56

2 Answers


Here is an example:

import threading
import time
import typing

MAX_NUMBER = 57  # assumed to be inclusive
JOB_SIZE = 10

indexes = tuple(
    tuple(range(0, MAX_NUMBER + 1, JOB_SIZE)) + (MAX_NUMBER + 1,)
)
jobs_spans = tuple(zip(indexes, indexes[1:]))  # cf https://stackoverflow.com/a/21303286/11384184

print(jobs_spans)
# ((0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 58))

jobs_left = list(jobs_spans)  # list.pop() is atomic in CPython (thanks to the GIL), so the threads can share this list


def process_user(user_id: int) -> None:
    sleep_duration = ((user_id // JOB_SIZE) % 3) * 0.4 + 0.1  # just to add some variance to each job
    time.sleep(sleep_duration)


def process_users() -> None:  # returns normally once no job is left, so NoReturn would be wrong here
    while True:
        try:
            job = jobs_left.pop()
        except IndexError:
            break  # no job left
        else:
            print(f"{threading.current_thread().name!r} processing users from {job[0]} to {job[1]} (exclusive) ...")
            for user_id in range(job[0], job[1]):
                process_user(user_id)
                print(f"user {user_id} processed")
    print(f"{threading.current_thread().name!r} finished")


if __name__ == "__main__":
    thread1 = threading.Thread(target=process_users)
    thread2 = threading.Thread(target=process_users)
    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

I started by computing the spans that the jobs will cover, using only the highest user id and the size of each job.
I use these spans to build a queue of jobs left. It is actually a plain list that the threads pop from. Note that `pop()` takes the last span first; use `pop(0)` if the jobs should be handed out in ascending order.
I have two different functions:

  • one to process a user given its id, which has nothing to do with threading; I could use it the exact same way in a completely sequential program
  • one to handle the threading. It is the target of the threads, meaning it is the code each thread executes once it is started. It loops, taking a new job each time, until there are no jobs left.

I join each thread to wait for its completion before the script exits.
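
To connect this to the question, here is a minimal sketch of how the same job-queue idea might be applied to checking user ids. It is only an illustration under assumptions: the `users` list stands in for the JSON database, and `is_valid_id`, `worker` and `find_invalid_indexes` are hypothetical names, not part of the code above. It also swaps the plain list for the standard library's `queue.Queue`, which hands the spans out in ascending order.

```python
import queue
import threading

JOB_SIZE = 10  # users per job, same idea as above

# Hypothetical stand-in for the JSON database: every 7th user has no id.
users = [{"id": i if i % 7 else None, "name": f"user{i}"} for i in range(58)]


def is_valid_id(user: dict) -> bool:
    """Hypothetical validity rule: the id must be an int."""
    return isinstance(user.get("id"), int)


def worker(jobs: queue.Queue, invalid: list, lock: threading.Lock) -> None:
    while True:
        try:
            start, stop = jobs.get_nowait()  # take the next unclaimed span
        except queue.Empty:
            return  # no job left
        bad = [i for i in range(start, stop) if not is_valid_id(users[i])]
        with lock:  # protect the shared result list
            invalid.extend(bad)


def find_invalid_indexes(thread_count: int = 2) -> list:
    jobs = queue.Queue()
    for start in range(0, len(users), JOB_SIZE):
        jobs.put((start, min(start + JOB_SIZE, len(users))))
    invalid = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, invalid, lock))
        for _ in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(invalid)


# Remove the invalid users only after all threads have finished.
invalid_indexes = find_invalid_indexes()
invalid_set = set(invalid_indexes)
cleaned_users = [u for i, u in enumerate(users) if i not in invalid_set]
```

The threads only collect the indexes of invalid users; the list is rebuilt once, after the joins, so no thread ever deletes from a list that another thread is still reading.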

Lenormju

If you don't have time to understand the original answer's code, you can use this instead. It's short and easy.

Original Source

from multiprocessing.dummy import Pool as ThreadPool

# Make the Pool of workers
pool = ThreadPool(4)

# Execute function in their own Threads
results = pool.map(func, arg)

# Close the pool and wait for the work to finish
pool.close()
pool.join()

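`func` is the function you want to execute.
`arg` is an iterable of arguments: `pool.map` calls `func` once for each item in it and returns the results as a list, in the same order as the input.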

Example:

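from multiprocessing.dummy import Pool as ThreadPool

names = ["John", "David", "Bob"]

def printNames(name):
  print(name)

pool = ThreadPool(4)  # a fresh pool: a closed pool cannot be reused
results = pool.map(printNames, names)
pool.close()
pool.join()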

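This prints every name in the `names` list by calling `printNames` once per name, spread across the pool's threads, so the order of the printed lines is not guaranteed.
Here `names` plays the role of `arg`.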
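
For the question's use case, the same `pool.map` call could return a validity flag per user instead of printing. The following is a rough sketch under assumptions: the `users` list and the `has_valid_id` rule are hypothetical and only stand in for the real JSON database and the real check.

```python
from multiprocessing.dummy import Pool as ThreadPool

# Hypothetical sample data standing in for the JSON user database.
users = [
    {"id": 1, "name": "John"},
    {"id": None, "name": "David"},
    {"id": 3, "name": "Bob"},
]


def has_valid_id(user):
    # Hypothetical rule: an id counts as valid when it is an int.
    return isinstance(user.get("id"), int)


with ThreadPool(4) as pool:
    # map() returns one result per user, in the same order as the input list
    flags = pool.map(has_valid_id, users)

# Keep only the users whose flag is True.
valid_users = [user for user, ok in zip(users, flags) if ok]
print(valid_users)
```

The pool decides which thread handles which user, so there is no need to assign start positions to the threads by hand.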
