
I would like to use Gunicorn's preload feature to save memory while running PaddlePaddle GPU for inference. With preload enabled, CUDA cannot be initialised properly, because CUDA is initialised as soon as paddle is imported.

From the paddle GitHub repository:

initialization operation of CUDA occurs before the process is forked, causing the newly started process to not obtain the result of CUDA initialization, thus causing a crash during prediction.

They recommended using Flask's server instead, but I would like to keep using the preload feature in Gunicorn.

Is there any workaround for this problem? Thanks!
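One direction I have considered (a sketch only, not verified against paddle): keep preload enabled but defer the CUDA-touching import until after each worker has forked, using Gunicorn's `post_fork` server hook in `gunicorn.conf.py`. The module-level lazy import is an assumption on my part; the actual app layout would need to match it.

```python
# gunicorn.conf.py -- minimal sketch, assuming the app module itself
# does NOT import paddle at top level (that import is what triggers
# CUDA initialisation in the master before fork()).

preload_app = True  # app code is still imported once in the master


def post_fork(server, worker):
    # Runs inside each worker process after fork(), so CUDA is
    # initialised fresh per worker instead of being inherited
    # (broken) from the master process.
    import paddle  # deferred import: CUDA init happens here

    worker.log.info("paddle imported in worker pid %s", worker.pid)
```

The trade-off is that the paddle model itself is then loaded per worker, so the memory saving from preload only applies to the rest of the application, not to the CUDA/model state.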
