
I am new to developing REST APIs and am trying to deploy a machine learning model for image segmentation using Python and REST APIs.
On the server side I am using FastAPI, while on the client side I use the Python requests library. The client already resizes the image to the model's required input size, so it doesn't send unnecessarily large images. The server feeds the received image to the model and returns the binary segmentation mask. The image and the mask are converted from numpy arrays to lists, which are then sent as JSON data.
Below is some code representing what I've just described. Since I can't provide the model here, the server in this minimal reproducible example just returns the same image it received.

server.py

import uvicorn
from fastapi import FastAPI
import numpy as np
from datetime import datetime

app = FastAPI()

@app.get('/test')
def predict_and_process(data: dict = None):
    start = datetime.now()
    if data:
        # rebuild the numpy array from the nested list in the JSON body
        image = np.asarray(data['image'])
        print("Time to run: ", datetime.now() - start)
        # stand-in for the model: return the received image unchanged
        return {'prediction': np.squeeze(image).tolist()}
    else:
        return {'msg': "Model or data not available"}

def run():
    PORT = 27010
    uvicorn.run(
        app,
        host="127.0.0.1", 
        port=PORT,
    )


if __name__ == '__main__':
    run()

client.py

import requests
import numpy as np
from matplotlib.pyplot import imread 
from skimage.transform import resize
from datetime import datetime

def test_speed():
    path_to_img = r"path_to_some_image"

    # load the image and resize it to the model's fixed input size
    image = imread(path_to_img)
    image = resize(image, (1024, 1024))
    img_list = image.tolist()

    data = {'image': img_list}
    start = datetime.now()
    response = requests.get('http://127.0.0.1:27010/test', json=data)

    prediction = response.json()['prediction']
    print("time for prediction: {}".format(datetime.now() - start))

if __name__ == '__main__':
    test_speed()

The output from the server is:

(cera) PS C:\Users\user_name\Desktop\MRM\REST> python .\server.py
INFO:     Started server process [20448]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:27010 (Press CTRL+C to quit)
Time to run:  0:00:00.337099
INFO:     127.0.0.1:61631 - "GET /test HTTP/1.1" 200 OK

and the output from the client is:

(cera) PS C:\Users\user_name\Desktop\MRM\REST> python .\client.py
time for prediction: 0:00:16.845123

Since the code running on the server takes less than a second, the time needed to transfer the image from the client to the server (or back) must be around 8 seconds each way, which is definitely too long.
I can't send smaller images since the input size of the model needs to stay the same.

So, for a deployment/REST newbie: what would be a professional / best-practice way to get my predictions from a REST API faster? I assume there are limits since I'm using Python, but 16 seconds still seems way too long to me.
Thank you in advance!

frfritz

2 Answers


I would suggest reading through this documentation and trying the examples provided for your image upload route.

https://fastapi.tiangolo.com/tutorial/request-files/
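
A minimal sketch of what that tutorial's approach could look like here, sending the raw array bytes as a file upload instead of a JSON list (the route name and the dtype/shape handling below are illustrative assumptions, not taken from the docs):

# server side: accept the raw bytes as a file upload
import numpy as np
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post('/predict')
async def predict(file: UploadFile = File(...)):
    raw = await file.read()
    # client and server must agree on dtype and shape beforehand
    image = np.frombuffer(raw, dtype=np.float64).reshape(1024, 1024)
    return {'prediction': np.squeeze(image).tolist()}

# client side: post the array's buffer instead of a nested list
import io
import requests

buf = io.BytesIO(image.tobytes())  # 'image' is the resized numpy array
response = requests.post('http://127.0.0.1:27010/predict',
                         files={'file': ('image.bin', buf)})

This avoids serializing every pixel as a JSON number, which is most likely where the 16 seconds go.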

  • If I understand correctly that might work for the MRE, but in reality I don't upload files from the hard drive. The starting point is always going to be a numpy.ndarray... – frfritz Feb 17 '21 at 06:57

As @slizb pointed out, encoding the image to base64 makes everything so much faster. Instead of img_list = image.tolist(), use

data = {'shape': image.shape, 'img': base64.b64encode(image.tobytes()).decode('utf-8')}

and on the server

image = np.frombuffer(base64.b64decode(data['img'])).reshape(data['shape'])

Make sure to send the shape as well, because numpy isn't going to "remember" the shape from the buffer, so I needed to manually .reshape() the image.
The overall time went down to about 1 second, which is mostly the inference time of my model.
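
For completeness, here are the changed parts as a self-contained sketch (same /test endpoint and float64 arrays as in the question; the decode() and dict-access details are my reconstruction of what makes this run end to end):

# client side: base64-encode the raw buffer so it fits in a JSON body
import base64
data = {'shape': image.shape,
        'img': base64.b64encode(image.tobytes()).decode('utf-8')}
response = requests.get('http://127.0.0.1:27010/test', json=data)

# server side: decode, then restore dtype and shape
import base64
image = np.frombuffer(base64.b64decode(data['img']),
                      dtype=np.float64).reshape(data['shape'])

Base64 adds only about 33% to the raw buffer size, whereas a JSON list stores every float as text, so the payload shrinks dramatically.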

frfritz
  • I would **not** suggest using `base64` encoding to upload files to FastAPI backend - please have a look at Method 5 of [this answer](https://stackoverflow.com/a/70640522/17865804) as to why. I would also recommend having a look at [this answer](https://stackoverflow.com/a/73443824/17865804), which explains the reason for the delay you faced when uploading the file, as well as provides solutions that would **increase the speed** of file uploading in FastAPI. – Chris Apr 28 '23 at 18:53