8

I am using C on Linux to read the files in a directory. I have included <dirent.h> in my code and am using the function readdir().

According to the Linux man page, I should not call free() on the returned pointer to a dirent structure because it may be statically allocated.

Can you help me understand how that works? I don't understand why we would not have to delete the struct dirent. When is it deleted and who deletes it?

Here is the excerpt I am talking about:

On success, readdir() returns a pointer to a dirent structure. (This structure may be statically allocated; do not attempt to free(3) it.) If the end of the directory stream is reached, NULL is returned and errno is not changed. If an error occurs, NULL is returned and errno is set appropriately.

cadaniluk
Matthew
  • 2
    Because that is a pointer to a `static` variable within the function, it is not yours to free. The clue is that the data gets overwritten on subsequent calls. – Weather Vane Dec 31 '15 at 19:22

5 Answers

13

man readdir literally says:

On success, readdir() returns a pointer to a dirent structure. (This structure may be statically allocated; do not attempt to free(3) it.)

(Code formatters added.)

That means the space for it is not allocated at run time, like stack or free-store memory, but is static: it is part of the executable itself, comparable to string literals, with the difference that writing to a string literal is undefined behavior.

Imagine the implementation to be something like this:

struct dirent *readdir(DIR *dirp) {
    static struct dirent dir;

    /* Fill dir with appropriate values. */

    return &dir;
}

dir is statically allocated here. Returning its address isn't wrong because it exists throughout the whole runtime of the program.

Here is the actual source code of readdir on my glibc 2.22 implementation (the path is /sysdeps/posix/readdir.c):

DIRENT_TYPE *
__READDIR (DIR *dirp)
{
  DIRENT_TYPE *dp;
  int saved_errno = errno;

#if IS_IN (libc)
  __libc_lock_lock (dirp->lock);
#endif

  do
    {
      size_t reclen;

      if (dirp->offset >= dirp->size)
        {
          /* We've emptied out our buffer.  Refill it.  */

          size_t maxread;
          ssize_t bytes;

#ifndef _DIRENT_HAVE_D_RECLEN
          /* Fixed-size struct; must read one at a time (see below).  */
          maxread = sizeof *dp;
#else
          maxread = dirp->allocation;
#endif

          bytes = __GETDENTS (dirp->fd, dirp->data, maxread);
          if (bytes <= 0)
            {
              /* On some systems getdents fails with ENOENT when the
                 open directory has been rmdir'd already.  POSIX.1
                 requires that we treat this condition like normal EOF.  */
              if (bytes < 0 && errno == ENOENT)
                bytes = 0;

              /* Don't modifiy errno when reaching EOF.  */
              if (bytes == 0)
                __set_errno (saved_errno);
              dp = NULL;
              break;
            }
          dirp->size = (size_t) bytes;

          /* Reset the offset into the buffer.  */
          dirp->offset = 0;
        }

      dp = (DIRENT_TYPE *) &dirp->data[dirp->offset];

#ifdef _DIRENT_HAVE_D_RECLEN
      reclen = dp->d_reclen;
#else
      /* The only version of `struct dirent*' that lacks `d_reclen'
         is fixed-size.  */
      assert (sizeof dp->d_name > 1);
      reclen = sizeof *dp;
      /* The name is not terminated if it is the largest possible size.
         Clobber the following byte to ensure proper null termination.  We
         read jst one entry at a time above so we know that byte will not
         be used later.  */
      dp->d_name[sizeof dp->d_name] = '\0';
#endif

      dirp->offset += reclen;

#ifdef _DIRENT_HAVE_D_OFF
      dirp->filepos = dp->d_off;
#else
      dirp->filepos += reclen;
#endif

      /* Skip deleted files.  */
    } while (dp->d_ino == 0);

#if IS_IN (libc)
  __libc_lock_unlock (dirp->lock);
#endif

  return dp;
}

I don't know much about glibc but the line

dp = (DIRENT_TYPE *) &dirp->data[dirp->offset];

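seems the most interesting to us. The returned pointer points into dirp->data, a buffer that belongs to the DIR object itself, so as far as I can tell it is memory the library manages and not memory you allocated.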


That is why readdir is not reentrant and why the reentrant alternative readdir_r exists.
Imagine two threads concurrently executing readdir. In the simplified implementation above, both would attempt to fill dir, which is shared among all readdir invocations, at the same time, resulting in a data race.

cadaniluk
4

The man page you reference cautions that the struct dirent may be statically allocated. Therefore free() is not necessary, and must not be called on it.

free() is designed exclusively for use with memory obtained from malloc(), calloc(), or realloc(), all of which request memory from the heap (as opposed to the stack).

ryyker
2

The struct dirent is logically part of the DIR. It may be reused in a subsequent readdir() call on the same DIR (but not in a readdir() call on a different DIR) and will be freed upon closedir().

Some documentation states that readdir() is not thread-safe. In all implementations except really exotic ones, it is thread-safe as long as the last access to the previous struct dirent happens-before the next readdir() call. Using readdir_r() is not advisable because it is very hard to determine the NAME_MAX limit properly and the readdir_r() function does not know the value used by its caller.

More details about the problems with readdir_r() are at http://austingroupbugs.net/view.php?id=696.

jilles
1

You only need to free() memory that was previously allocated by a call to malloc() and family. This is dynamically allocated memory, from the heap.

OTOH, non-dynamic allocation (see automatic allocation) does not need to be released explicitly. Once the variable goes out of scope, its stack memory is reclaimed and reused as needed.

Sourav Ghosh
1

You don't need to free or release entries fetched by readdir(). It returns a pointer into an internal buffer (possibly a static one) that you did not allocate, so it is not yours to free. Only one entry is handed out at a time, so a single buffer is enough to hold the result, which is also why readdir() is not reentrant. There is a readdir_r() which takes a caller-allocated buffer and is reentrant.

You need to call closedir() on the DIR * pointer in order to free the resources used by opendir().

Iharob Al Asimi