The solution is complex on several levels:
The inproc:// transport class requires the communicating sockets to share a common Context() instance ( so it is natural to have just one ), and the signalling / messaging then goes without any data-transfers at all: just zero-copy pointer manipulations over in-RAM blocks of memory, which is extremely fast.
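A minimal pyzmq sketch of this setup ( the endpoint name and PAIR sockets are illustrative assumptions, not taken from your code ):

    import zmq

    # an inproc-only design may run with zero I/O threads: there is no data-pump
    ctx = zmq.Context(io_threads=0)

    # both endpoints MUST be created from the same Context() instance for inproc://
    receiver = ctx.socket(zmq.PAIR)
    receiver.bind("inproc://agent-signals")    # bind before connect ( a pre-4.2 libzmq rule )

    sender = ctx.socket(zmq.PAIR)
    sender.connect("inproc://agent-signals")

    sender.send(b"tick")                       # handed over by pointer, zero-copy
    print(receiver.recv())                     # -> b'tick'

    sender.close(); receiver.close(); ctx.term()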
I had started to assemble ZeroMQ-related facts about having some 70,000 ~ 200,000 file descriptors available for "sockets", as supported by O/S kernel settings, but your published aims are higher. Much higher.
Given your git-published multi-agent ABCE Project paper refers to nanosecond-shaving, HPC-domain-grade solutions so as to handle ( cit. / emphasis added: ) "the whopping number of 1.073.545.225, many more agents than fit into the memory of even the most sophisticated supercomputer", some small hundreds of thousands of file descriptors are not worth spending much time on.
Your Project faces multiple troubles at the same time.
Let's peel the problem layers off, step by step:
File Descriptors (FD) -- Linux O/S level -- System-wide Limits:
To see the actual as-is state:

# cat /proc/sys/fs/file-max

To raise the system-wide limit, edit the /etc/sysctl.conf file:

# vi /etc/sysctl.conf
Append a config directive as follows:
fs.file-max = 100000
Save and close the file.
Users need to log out and log back in again for the changes to take effect, or just type the following command:
# sysctl -p
Verify your settings with the command:
# cat /proc/sys/fs/file-max
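For completeness, the same figures can be read programmatically; a small Python sketch ( Linux-specific /proc paths, stdlib only ):

    # /proc/sys/fs/file-max : the system-wide ceiling
    with open("/proc/sys/fs/file-max") as f:
        print("system-wide max FDs:", int(f.read()))

    # /proc/sys/fs/file-nr : three values - allocated, unused, maximum
    with open("/proc/sys/fs/file-nr") as f:
        allocated, unused, maximum = map(int, f.read().split())
        print(f"in use: {allocated - unused} of {maximum}")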
( Max ) User-specific File Descriptors (FD) Limits:
Each user additionally has a set of ( soft-limit, hard-limit ) values:
# su - ABsinthCE
$ ulimit -Hn
$ ulimit -Sn
However, you can restrict the ABsinthCE user ( or any other ) to specific limits by editing the /etc/security/limits.conf file:

# vi /etc/security/limits.conf
There, set the respective soft- and hard-limits for the ABsinthCE user as needed:
ABsinthCE soft nofile 123456
ABsinthCE hard nofile 234567
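A process can also inspect, and lift, its own soft-limit up to the hard-limit at runtime; a Python stdlib sketch ( no root needed as long as you stay below the hard-limit ):

    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft = {soft}, hard = {hard}")    # mirrors ulimit -Sn / ulimit -Hn

    # lift the soft limit as far as the hard limit permits
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))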
All that is not for free: each file descriptor takes up some kernel memory, so at some point you may, and you will, exhaust it. A few hundred thousand file descriptors are no trouble for server deployments, where event-based ( epoll on Linux ) server architectures are used. But simply forget about trying to grow this anywhere near the said 1.073.545.225 level.
Today, one can have a private HPC machine ( not a Cloud illusion ) with ~ 50-500 TB of RAM.
But still, the multi-agent Project's application architecture ought to be re-defined, so as not to fail on extreme resource allocations ( merely because the syntax makes them deceptively easy to write ).
Professional multi-agent simulators are, precisely because of their extreme scaling, very, VERY CONSERVATIVE about per-agent-instance resource locking.
So the best results are to be expected ( both performance-wise and latency-wise ) from direct memory-mapped operations. The ZeroMQ inproc:// transport class fits well here and does not require the Context() instance to allocate any IO-threads ( as there is no data-pump at all if using just the inproc:// transport class ), which is very efficient for a fast prototyping phase. The same approach becomes risky when growing the scales much higher, towards the levels expected in production.
Latency-shaving and accelerated-time simulator operations throughput are the next set of targets, both for raising the static scales of the multi-agent-based simulations and for increasing the simulator performance.
For serious nanoseconds hunting, follow the excellent insights on HPC from Bloomberg's guru, John Lakos. Either pre-allocate ( a common Best Practice in the RTOS domain ) and never allocate at all during the run, or follow John's fabulous, testing-supported insights, as presented at ACCU 2017.
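To make that pre-allocate-and-never-allocate discipline concrete, a hedged Python sketch of a fixed buffer pool ( sizes and names are illustrative ):

    from collections import deque

    POOL_SIZE = 1 << 10            # pre-allocate 1024 buffers up-front, once
    BUF_BYTES = 4096               # fixed frame size per buffer

    pool = deque(bytearray(BUF_BYTES) for _ in range(POOL_SIZE))

    def acquire() -> bytearray:
        return pool.popleft()      # O(1), no allocation on the hot path

    def release(buf: bytearray) -> None:
        pool.append(buf)           # recycle instead of freeing

    buf = acquire()
    buf[:4] = b"tick"              # write in place; no new objects created
    release(buf)

The point is the discipline, not the container: all memory is claimed before the hot loop starts, so latency is never hit by an allocator call mid-run.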