# pip3 install sseclient-py json5 requests
import sseclient  
import requests
import concurrent.futures
import json5


def task1(event):
    # task code here
    print("[+] Got Event: ", event.event)
    print(json5.loads(event.data))

def task2(event):
    # task code here
    print("[+] Got Event: ", event.event)
    print(json5.loads(event.data))

def task3(event):
    # task code here
    print("[+] Got Event: ", event.event)
    print(json5.loads(event.data))

my_api = "https://example.com/lol"
response = requests.get(my_api, stream=True)
if response.status_code == 200:
    executor = concurrent.futures.ThreadPoolExecutor()
 
    client = sseclient.SSEClient(response)
    # Loop forever (while connection "open")
    for event in client.events():
        if event.event == "Task1":
            executor.submit(task1, event)

        elif event.event == "Task2":
            executor.submit(task2, event)

        elif event.event == "Task3":
            executor.submit(task3, event)


Above is the example, and it works. The problem I'm facing is that dispatching each `event.event` through the if/else ladder takes a fraction of a second, and this delay keeps growing: after 2-3 minutes I'm seeing delays of around 6-11 seconds, which is problematic in my case.

I'm using the `concurrent.futures` module so that processing the event data doesn't block the for loop, but it still isn't helping much.

What should I do so that the delay stays at around 1 second at most?

Thanks in advance
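For reference, the if/elif dispatch in the snippet above can also be expressed as a dict lookup, which keeps the read loop tight as the number of event types grows. A minimal self-contained sketch: `FakeEvent` is a hypothetical stand-in for sseclient-py's event objects (which expose `.event` and `.data`), the shared `task` handler stands in for the real 50-line task bodies, and stdlib `json` replaces `json5` to avoid the extra dependency.

```python
import concurrent.futures
import json

class FakeEvent:
    # Stand-in for sseclient-py's event objects (.event and .data attributes).
    def __init__(self, event, data):
        self.event = event
        self.data = data

def task(event):
    # Shared handler body; the real task1/task2/task3 would differ.
    return json.loads(event.data)

# Dispatch table: one dict lookup per event instead of an if/elif ladder.
handlers = {"Task1": task, "Task2": task, "Task3": task}

executor = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def dispatch(event):
    # Look up the handler and hand the event to the pool; unknown
    # event types are ignored rather than raising.
    handler = handlers.get(event.event)
    if handler is not None:
        return executor.submit(handler, event)
    return None
```

In the real loop, `for event in client.events(): dispatch(event)` replaces the ladder. Note that the lookup itself is not where multi-second delays come from; if handlers run slower than events arrive, work queues up in the pool regardless of how dispatch is written.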

  • Sounds like you're getting events faster than you can process them. Which means there are a *lot* of events or they have *huge* payloads. Python tends to be single process. You may need to figure out how to offload event processing to other processes (maybe via [multiprocessing](https://docs.python.org/3/library/multiprocessing.html)). It may also be possible that you simply do not have enough computing power or bandwidth to keep up. – Ouroborus Feb 13 '22 at 06:40
  • In my case bandwidth and computing power are not the issue, as I've deployed the code on an AWS EC2 server with 1 GB RAM, and I've tested the code on my local system as well, with 16 GB (i7). There are around 8 types of event and they are coming at quite a high rate. Can you please post an example solution – Pushpender Singh Feb 13 '22 at 06:45
  • Are you sure? The evidence shows that, in this situation, your code is unable to keep up. If your EC2 instance can keep up, you'll need to look at the differences between that system and the problem system. Note that "1GB ram" has nothing to do with computing power or bandwidth. – Ouroborus Feb 13 '22 at 06:48
  • Maybe 1 GB is quite low for the code, I don't know. But is there a way to increase the speed without upgrading the server configuration? – Pushpender Singh Feb 13 '22 at 06:53
  • As I said, look into something like [`multiprocessing`](https://docs.python.org/3/library/multiprocessing.html). This would allow you to utilize more CPU cores in Python than might otherwise be used. – Ouroborus Feb 13 '22 at 06:56
  • Using a different multithreading module doesn't help in my case; is there a way to yield event data without a for loop? – Pushpender Singh Feb 13 '22 at 07:14
  • Multi-threading and multi-process aren't the same thing. Your `for` loop and `if`/`else` ladder produce no significant overhead. Changing to, say, a callback or event based system simply hides that `for` loop in the providing module. Regardless, there does not appear to be any alternative to `SSEClient`. – Ouroborus Feb 13 '22 at 07:46
  • Looks suspiciously like your code would run faster without the thread pool executor, which does add overhead. – jwal Feb 13 '22 at 09:10
  • @jwal This is sample code; the `task1()`, `task2()`, `task3()` functions have around 50 lines of code each, so I think the thread pool executor is necessary. @Ouroborus I think an event based system or callback will work, but I'm not able to figure out how I should implement that – Pushpender Singh Feb 13 '22 at 09:41
  • @Pushpender, threads use a single process on a single CPU. So every attempt to use threads to spread the load fails. Multi-processing, on the other hand, may help. Personally I prefer asyncio to threading as the overhead is smaller and the lack of multi-processing is more obvious. – jwal Feb 14 '22 at 03:02
  • Many good explanations for this on stack overflow, one being https://stackoverflow.com/a/4496918/6242321 – jwal Feb 14 '22 at 03:16
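The multiprocessing route suggested in the comments can be sketched as follows. This is not the asker's code: `Event` is a hypothetical stand-in for sseclient-py's event objects, `handle` stands in for the real per-task processing, and stdlib `json` replaces `json5`. The point is that CPU-bound processing moves to separate processes, so the GIL cannot let slow handlers starve the SSE read loop.

```python
import concurrent.futures
import json
from dataclasses import dataclass

@dataclass
class Event:
    # Stand-in for sseclient-py's event objects.
    event: str
    data: str

def handle(event_name, payload):
    # CPU-bound parsing/processing; runs in a worker process, so it
    # cannot block the thread reading from the SSE connection.
    return event_name, json.loads(payload)

def drain(events):
    # Submit every event to the process pool, then collect results
    # in submission order.
    with concurrent.futures.ProcessPoolExecutor() as pool:
        futures = [pool.submit(handle, e.event, e.data) for e in events]
        return [f.result() for f in futures]

if __name__ == "__main__":
    # Guard is required for process pools on platforms that spawn workers.
    demo = [Event("Task1", '{"n": 1}'), Event("Task2", '{"n": 2}')]
    print(drain(demo))
```

In the real loop one would keep the pool open and submit each event as it arrives instead of batching; only picklable values (here, plain strings) should cross the process boundary.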
