I'm using the Python requests package to make a large number of requests to an API. At some point, however, my program crashes with a 'too many open files' error. Since I explicitly close my session, I don't understand how this can happen.

I use the following code:

import time
import multiprocessing

import numpy as np
import requests

s = requests.session()
s.keep_alive = False


def request(i, mapId, minx, maxx, miny, maxy):
    print(i)
    try:
        with requests.Session() as s:
            r = s.post(
                url + "metadata/polygons",
                timeout=10,
                json={
                    "mapId": mapId,
                    "layer": "percelen",
                    "xMin": minx,
                    "xMax": maxx,
                    "yMin": miny,
                    "yMax": maxy,
                },
            )
            out = r.json()
            s.close()

    except:
        print("something went wrong with: " + str(i))


for i in np.arange(10000):
    time.sleep(1)
    multiprocessing.Process(target=request, args=argsList[i]).start()

Any help or insights would be greatly appreciated as I'm out of ideas.

jdhao
Daan

1 Answer

"Too many open files" most likely means that each Session, with its single POST request, holds on to a TCP socket and therefore a file descriptor, until the process runs into its open-file limit.

First solution:

Use a single Session instance with a customized HTTPAdapter and pass a beefed up argument to its pool_connections parameter.
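As a minimal sketch of what that could look like: the helper name make_session and the pool size of 100 are illustrative assumptions, not values from the question. In requests, pool_connections is the number of per-host connection pools the adapter caches, and pool_maxsize is the number of connections kept in each pool; both default to 10.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_size=100):
    """Build one Session whose adapter keeps a larger connection pool.

    pool_size=100 is an arbitrary illustrative value; tune it to the
    number of concurrent requests you actually expect.
    """
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    # Mount the adapter for both schemes so every request goes through it.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


# Create the session once and reuse it for every request,
# instead of opening a new Session per call.
session = make_session()
```

If all requests go to one host, as in the question, pool_maxsize is probably the value that matters most, since only a single pool is in use.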

Side note 1: you don't need to call s.close(). That's already called when the context manager calls .__exit__().

Side note 2: consider using threading or asyncio/aiohttp. Multiprocessing is not ideal for an IO-bound task like this.
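A rough sketch of the threading variant, assuming a thread pool from the standard library and one shared session. The names fetch and fetch_all, the max_workers value, and the shape of the payloads are hypothetical. Sharing a Session across threads is common in practice, but requests does not formally guarantee it is thread-safe, so one session per thread is a more cautious alternative.

```python
from concurrent.futures import ThreadPoolExecutor

import requests


def fetch(session, url, payload):
    """POST one payload and return the decoded JSON, or None on failure."""
    try:
        r = session.post(url, json=payload, timeout=10)
        r.raise_for_status()
        return r.json()
    except (requests.RequestException, ValueError) as exc:
        print(f"something went wrong with {payload!r}: {exc}")
        return None


def fetch_all(session, url, payloads, max_workers=8):
    """Run all POSTs on a bounded thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: fetch(session, url, p), payloads))
```

Because the pool is bounded by max_workers, the number of sockets open at once stays bounded as well, no matter how many payloads are queued.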

Second solution:

Increase the number of open files permitted. On Linux, you'll need to do something like:

sudo vim /etc/security/limits.conf
# Add these lines
root    soft    nofile  100000
root    hard    nofile  100000
ubuntu    soft    nofile  100000
ubuntu    hard    nofile  100000

sudo vim /etc/sysctl.conf
# Add this line
fs.file-max = 2097152

sudo sysctl -p

sudo vim /etc/pam.d/common-session
# Add this line
session required pam_limits.so

sudo reboot

I think this second solution could be characterized as "fixing the symptom rather than the problem," but try it if you must and are feeling bold.
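Before editing system files, it may help to check which limit the process is actually running under. The sketch below uses the standard-library resource module (Unix only); the function names are illustrative. It only affects the current process and its children, and without root it can raise the soft limit no higher than the hard limit.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft open-file limit toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft  # already at least as high as requested
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


soft, hard = nofile_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```

If the soft limit printed here is something like 1024 while the hard limit is much higher, calling raise_soft_limit at program start may be enough without touching /etc at all.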

Brad Solomon
  • I'm having a similar problem as the OP. Can you please explain what you mean by passing "a beefed up argument to its pool_connections parameter"? An example would be greatly appreciated! – Karl Schneider Jun 02 '22 at 17:28