5

I have a socket server written in C on Linux. Each time I create a new socket it is assigned a file descriptor. I want to use these FDs as unique IDs for each client. If they are guaranteed to always be assigned in increasing order (which is the case on the Ubuntu system I am running), then I could just use them as array indices.

So the question: are the file descriptors assigned to Linux sockets guaranteed to always be in increasing order?

Josh Brittain

3 Answers

10

Let's look at how this works internally (I'm using kernel 4.1.20). File descriptors in Linux are allocated by __alloc_fd. When you make an open syscall, do_sys_open is called. This routine gets a free file descriptor from get_unused_fd_flags:

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{ 
    ...
    fd = get_unused_fd_flags(flags);
    if (fd >= 0) {
        struct file *f = do_filp_open(dfd, tmp, &op);

get_unused_fd_flags calls __alloc_fd, passing the minimum and maximum fd:

int get_unused_fd_flags(unsigned flags)
{
    return __alloc_fd(current->files, 0, rlimit(RLIMIT_NOFILE), flags);
}
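The upper bound passed here is the process's soft RLIMIT_NOFILE limit, which can also be read from user space. A minimal sketch, where `fd_soft_limit` is a hypothetical helper name and not something from the kernel source:

```c
#include <sys/resource.h>

/* Hypothetical helper: stores the soft limit on open descriptors in *out.
 * Returns 0 on success, -1 if getrlimit() fails. */
int fd_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;   /* soft limit; rl.rlim_max holds the hard limit */
    return 0;
}
```

No descriptor number handed to the process can reach this value, so it also bounds any fd-indexed array.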

__alloc_fd gets the file descriptor table for the process and starts from next_fd, a value saved by the previous allocation:

int __alloc_fd(struct files_struct *files,
           unsigned start, unsigned end, unsigned flags)
{
    ...
    fd = files->next_fd;
    ...
    if (start <= files->next_fd)
        files->next_fd = fd + 1;

So you can see that file descriptors do grow monotonically, but only up to a point. next_fd is just a starting hint: closing a lower-numbered descriptor moves it back down, and __alloc_fd scans the open-descriptor bitmap from that hint for the lowest free slot:

if (fd < fdt->max_fds)
    fd = find_next_zero_bit(fdt->open_fds, fdt->max_fds, fd);

At this point the file descriptors will not be growing monotonically anymore, but instead will jump trying to find free file descriptors. After this, if the table gets full, it will be expanded:

error = expand_files(files, fd);

Once the table has been expanded, the descriptors grow monotonically again.
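The reuse of freed numbers is easy to observe from user space. The sketch below is only an illustration: `freed_fd_is_reused` is a hypothetical helper, and it assumes a single-threaded process with `/dev/null` available.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens three descriptors, closes the middle one,
 * then opens a fourth. Returns 1 if the fourth reused the freed number,
 * 0 if it did not, and -1 if any open() failed. */
int freed_fd_is_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);
    int c = open("/dev/null", O_RDONLY);
    int d, reused;

    if (a < 0 || b < 0 || c < 0)
        return -1;

    close(b);                          /* frees a number below c */
    d = open("/dev/null", O_RDONLY);   /* lowest free number is now b */
    if (d < 0)
        return -1;

    reused = (d == b);
    close(a);
    close(c);
    close(d);
    return reused;
}
```

Here the fourth descriptor is numbered lower than the third, so the sequence is not increasing as soon as anything has been closed.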

Hope this helps

Jay Medina
4

FD's are guaranteed to be unique, for the lifetime of the socket. So yes, in theory, you could probably use the FD as an index into an array of clients. However, I'd caution against this for at least a couple of reasons:

  • As has already been said, there is no guarantee that FDs will be allocated monotonically. accept() would be within its rights to return a highly-numbered FD, which would then make your array inefficient. So short answer to your question: no, they are not guaranteed to be monotonic.

  • Your server is likely to end up with lots of other open FDs - stdin, stdout and stderr to name but three - so again, your array is wasting space.

I'd recommend some other way of mapping from FDs to clients. Indeed, unless you're going to be dealing with thousands of clients, searching through a list of clients should be fine - it's not really an operation that you should need to do a huge amount.
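A list of clients searched by FD could look roughly like this. It is a sketch under assumptions: the `struct client` layout, the `group_id` field, `MAX_CLIENTS` and the `list_*` names are all made up for illustration.

```c
#include <stddef.h>

#define MAX_CLIENTS 64   /* assumption: small enough that a scan is cheap */

struct client {
    int fd;
    int group_id;        /* hypothetical per-client data */
};

static struct client clients[MAX_CLIENTS];
static size_t nclients;

/* Append a client; returns NULL when the list is full. */
struct client *list_add(int fd, int group_id)
{
    if (nclients == MAX_CLIENTS)
        return NULL;
    clients[nclients].fd = fd;
    clients[nclients].group_id = group_id;
    return &clients[nclients++];
}

/* Linear search by descriptor; returns NULL if the fd is unknown. */
struct client *list_find(int fd)
{
    for (size_t i = 0; i < nclients; i++)
        if (clients[i].fd == fd)
            return &clients[i];
    return NULL;
}

/* Remove by overwriting the entry with the last one.
 * Returns 0 if the fd was found, -1 otherwise. */
int list_remove(int fd)
{
    struct client *c = list_find(fd);

    if (!c)
        return -1;
    *c = clients[--nclients];
    return 0;
}
```

The array is sized by the number of clients rather than by the largest FD value, so stdin, stdout, stderr and other unrelated descriptors cost nothing.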

richvdh
  • 2
    Also, sockets (and fds) are a valuable resource, and you should release (i.e. `close`) them when they are no longer needed. A typical Linux process can usually have only a few thousand fds (that limit can be raised). If your application `close`s an fd, the kernel will probably reassign it later... so in the long run fds are *not* monotonically allocated... – Basile Starynkevitch Feb 21 '12 at 11:15
  • Thank you. The server code is intended for a few thousand users. A linear search isn't really possible. I'm essentially trying to solve a design and data structures problem efficiently. My post about it is here http://stackoverflow.com/questions/9373739/what-is-the-fastest-way-to-find-an-integer-in-an-array p.s. someone changed my post title to something silly. – Josh Brittain Feb 21 '12 at 11:16
  • To summarize, I need to organize users into groups of 15 and then quickly determine what group a user is in. The only method I can think of is double hashing, first on the userID and then second on the groupID. – Josh Brittain Feb 21 '12 at 11:19
2

Do not depend on the monotonicity of file descriptors. Always refer to the remote system via an address:port pair.
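For a connected socket, the address:port pair can be obtained with `getpeername`. A minimal sketch, assuming IPv4 peers only; `peer_name` is a hypothetical helper name:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: writes "a.b.c.d:port" for the IPv4 peer of fd
 * into buf. Returns 0 on success, -1 on error or for a non-IPv4 peer. */
int peer_name(int fd, char *buf, size_t len)
{
    struct sockaddr_storage ss = {0};
    socklen_t slen = sizeof ss;
    char ip[INET_ADDRSTRLEN];
    const struct sockaddr_in *in;

    if (getpeername(fd, (struct sockaddr *)&ss, &slen) != 0)
        return -1;
    if (ss.ss_family != AF_INET)
        return -1;                     /* IPv6 is left out of this sketch */

    in = (const struct sockaddr_in *)&ss;
    if (!inet_ntop(AF_INET, &in->sin_addr, ip, sizeof ip))
        return -1;

    snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(in->sin_port));
    return 0;
}
```

The resulting string stays the same for the lifetime of the connection and does not depend on which descriptor number the kernel happened to hand out.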

Ignacio Vazquez-Abrams
  • Could you go into a little more detail on this? Are FDs not guaranteed to be unique? If they aren't unique, wouldn't this result in errors when waiting on an FD's I/O? – Josh Brittain Feb 21 '12 at 10:56
  • 1
    @Josh Brittain: What happens when someone ports your software to Windows with cygwin or some other porting layer? Keep in mind, that someone could be you. Use a hash table since the fd has to be unique to work. – JimR Feb 21 '12 at 11:18
  • Thanks JimR. I was thinking about a hash table and thought this may be better. It seems I am back to hashing :) – Josh Brittain Feb 21 '12 at 11:20
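The hash table suggested in the comments could be as simple as a chained table keyed by the descriptor. This is a sketch under assumptions, not a tuned implementation: `NBUCKETS`, the `struct client` layout, the `group_id` field and the `map_*` names are all hypothetical.

```c
#include <stdlib.h>

#define NBUCKETS 1024    /* assumption: on the order of the expected client count */

struct client {
    int fd;
    int group_id;        /* hypothetical per-client data */
    struct client *next; /* next entry in the same bucket */
};

static struct client *buckets[NBUCKETS];

static unsigned bucket_of(int fd)
{
    return (unsigned)fd % NBUCKETS;
}

/* Insert a client. The caller must not insert the same fd twice. */
struct client *map_put(int fd, int group_id)
{
    struct client *c = malloc(sizeof *c);

    if (!c)
        return NULL;
    c->fd = fd;
    c->group_id = group_id;
    c->next = buckets[bucket_of(fd)];
    buckets[bucket_of(fd)] = c;
    return c;
}

/* Look up a client by descriptor; returns NULL if absent. */
struct client *map_get(int fd)
{
    for (struct client *c = buckets[bucket_of(fd)]; c; c = c->next)
        if (c->fd == fd)
            return c;
    return NULL;
}

/* Remove and free a client; call this when the fd is closed.
 * Returns 0 if the fd was present, -1 otherwise. */
int map_del(int fd)
{
    for (struct client **pp = &buckets[bucket_of(fd)]; *pp; pp = &(*pp)->next) {
        if ((*pp)->fd == fd) {
            struct client *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}
```

Because the kernel reuses descriptor numbers, the entry has to be deleted whenever the socket is closed; otherwise a later client could inherit stale data under the same key.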