readdir_r considered harmful

Issued by Ben Hutchings <ben@decadent.org.uk>, 2005-11-02.

This is revision 6 (2013-06-14), which makes the following change:

Thanks to David Bartley for pointing out the open issue.

Revision 5 made the following changes:

Revision 4 made the following change:

Thanks to Kevin Bracey of Broadcom for pointing out this possibility.

Revision 3 made the following change:

Revision 2 made the following changes:

Thanks to Dave Butenhof of HP for the information on HP-UX and Tru64.

Background

The POSIX readdir_r function is a thread-safe version of the readdir function used to read directory entries. Whereas readdir returns a pointer to a system-allocated buffer and may use global state without mutual exclusion, readdir_r uses a user-supplied buffer and is guaranteed to be reentrant. Its use is therefore preferable or even essential in portable multithreaded programs.

(The next version of POSIX may require that readdir is thread-safe so long as use of each DIR handle is serialised. This would make the problems with readdir_r entirely moot. See Austin Group issue 696.)

Problem Description

The length of the user-supplied buffer passed to readdir_r is implicit; it is assumed to be long enough to hold any directory entry read from the given directory stream. The length of a directory entry obviously depends on the length of the name, and the maximum name length may vary between filesystems. The standard means to determine the maximum name length within a directory is to call pathconf(dir_name, _PC_NAME_MAX). This method unfortunately results in a race condition between the opendir and pathconf calls, which could in some cases be exploited to cause a buffer overflow. For example, suppose a setuid program "rd" includes code like this:

#include <dirent.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char ** argv)
{
    DIR * dir;
    long name_max;
    struct dirent * buf, * de;

    /* Vulnerable: argv[1] may name a different directory when
       pathconf is called than it did when opendir was called. */
    if ((dir = opendir(argv[1]))
        && (name_max = pathconf(argv[1], _PC_NAME_MAX)) > 0
        && (buf = (struct dirent *)malloc(
                offsetof(struct dirent, d_name) + name_max + 1)))
    {
        while (readdir_r(dir, buf, &de) == 0 && de)
        {
            /* process entry */
        }
    }
    return 0;
}

Then an attacker could run:

ln -sf exploit link && (rd link & ln -sfn /fat link)

where the "exploit" directory is on a filesystem that allows a maximum of 255 bytes in a name, whereas the "/fat" directory is the root of a FAT filesystem that allows a maximum of 12 bytes.

Depending on the timing of operations, "rd" may open the "exploit" directory but allocate a buffer only long enough for names in the "/fat" directory. Then names of entries in the "exploit" directory may overflow the allocated buffer by up to 243 bytes. Depending on the heap allocation behaviour of the target program, it may be possible to construct a name that will overwrite sensitive data following the buffer. If the target program uses alloca or a variable length array to create the buffer, a classic stack overflow exploit is possible.
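
The figure of 243 bytes follows from the two buffer-size calculations. A minimal sketch of the arithmetic, in which overflow_bytes is a hypothetical helper introduced only for illustration:

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper: how many bytes of a directory entry (fixed
 * fields, name and terminating NUL) do not fit in a buffer that was
 * sized for names of at most assumed_name_max bytes, when the entry's
 * name is actual_name_len bytes long. */
size_t overflow_bytes(size_t assumed_name_max, size_t actual_name_len)
{
    size_t allocated = offsetof(struct dirent, d_name) + assumed_name_max + 1;
    size_t required = offsetof(struct dirent, d_name) + actual_name_len + 1;

    return required > allocated ? required - allocated : 0;
}
```

With the values from the example, overflow_bytes(12, 255) gives 243. An implementation of readdir_r may copy padding as well, so this should be read as the overflow caused by the name alone.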

A similar attack could be mounted on a daemon that reads user-controllable directories, for example a web server.

Attacks are easier where a program assumes that all directories will have the same or smaller maximum name length than, for instance, its initial current directory.

Impact

The impact depends greatly on how an application uses readdir_r and on the configuration of the host system. At worst, a user with limited access to the local filesystem could cause a privileged process to execute arbitrary code. However, there are no known exploits.

Mitigation

Many systems don't have any variation in maximum name lengths among mounted and user-mountable filesystems.

Directory entry buffers for readdir_r are usually allocated on the heap, and it is relatively hard to inject code into a process through a heap buffer overflow, though denial-of-service may be more easily achievable.

Many programmers who use readdir_r erroneously calculate the buffer size as sizeof(struct dirent) + pathconf(dir_name, _PC_NAME_MAX) + 1 or similar. On Linux (with glibc) and most versions of Unix, struct dirent is large enough to hold maximum-length names from most filesystems, so this is safe (though wasteful). This is not true of Solaris and BeOS, where the d_name member is an array of length 1.
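
The difference between the two calculations can be sketched as follows. Both function names are hypothetical and exist only for this comparison:

```c
#include <dirent.h>
#include <stddef.h>

/* The common but wasteful calculation: the whole structure, including
 * whatever d_name array it declares, plus room for the name again. */
size_t oversized_dirent_buf(size_t name_max)
{
    return sizeof(struct dirent) + name_max + 1;
}

/* The smallest size that holds the fixed fields, a name of name_max
 * bytes and the terminating NUL. */
size_t minimal_dirent_buf(size_t name_max)
{
    return offsetof(struct dirent, d_name) + name_max + 1;
}
```

The first result exceeds the second by sizeof(struct dirent) - offsetof(struct dirent, d_name) bytes. That surplus masks a wrong name_max only where the d_name array is itself long; where d_name has length 1 it is negligible.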

Affected software

The following software appears to be exploitable when compiled for a system that defines struct dirent with a short d_name array, such as Solaris or BeOS:

The following software may also be exploitable:

Some proprietary software may also be vulnerable, but I have no way of testing this. I provided a draft of this advisory to Sun Security earlier this year on the basis that applications running on Solaris are most likely to be exploitable, but I have not received any substantive response. A brief search through the OpenSolaris source code suggests that it may include exploitable applications, but apparently no-one at Sun could spare the time to investigate this.

Recommendations

Many POSIX systems implement the dirfd function from BSD, which returns the file descriptor used by a directory stream. This allows pathconf(dir_name, _PC_NAME_MAX) to be replaced by fpathconf(dirfd(dir), _PC_NAME_MAX), eliminating the race condition. However, current versions of HP-UX and Tru64 do not implement this function.

Some systems, including Solaris, implement the fdopendir function, which creates a directory stream from a given file descriptor. This allows the opendir, pathconf sequence to be replaced by open, fpathconf, fdopendir. However, this function is much less widely available than dirfd.
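
The open, fpathconf, fdopendir sequence might look like the sketch below. The function name open_dir_with_name_max is hypothetical, and the code assumes a system that provides fdopendir and the O_DIRECTORY open flag (both are in POSIX.1-2008):

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open the directory at path and query the name
 * limit through the same file descriptor, so both refer to the same
 * directory even if path is re-pointed in between.  Returns NULL on
 * failure.  *name_max receives the fpathconf result, which is -1 if
 * the limit is indeterminate or the query failed. */
DIR *open_dir_with_name_max(const char *path, long *name_max)
{
    DIR *dirp;
    int fd = open(path, O_RDONLY | O_DIRECTORY);

    if (fd < 0)
        return NULL;
    *name_max = fpathconf(fd, _PC_NAME_MAX);
    dirp = fdopendir(fd);
    if (dirp == NULL)
        close(fd);      /* on success the stream owns fd instead */
    return dirp;
}
```

After a successful call the descriptor belongs to the directory stream, so the caller releases it with closedir rather than close.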

Programs using readdir_r may be able to use readdir instead. According to POSIX, the buffer readdir uses is not shared between directory streams. However, readdir is not guaranteed to be thread-safe and some implementations may use global state, so for portability the use of readdir in a multithreaded program should be controlled using a mutex.
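
One way to serialise readdir is sketched below. The helper name next_name_locked is hypothetical, and the single process-wide lock is a deliberately conservative assumption that covers implementations with global state:

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One lock for all directory streams, on the assumption that readdir
 * may touch state shared between them. */
static pthread_mutex_t readdir_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: return a malloc'd copy of the next entry name,
 * or NULL at end of directory or on error.  The caller frees it. */
char *next_name_locked(DIR *dirp)
{
    char *copy = NULL;
    struct dirent *ent;

    pthread_mutex_lock(&readdir_lock);
    ent = readdir(dirp);
    if (ent != NULL)
        copy = strdup(ent->d_name);   /* copy before releasing the lock */
    pthread_mutex_unlock(&readdir_lock);
    return copy;
}
```

The name is copied while the lock is held because the buffer returned by readdir may be overwritten by a later call on the same stream.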

Suggested code for calculating the required buffer size for readdir_r follows:

#include <sys/types.h>
#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

/* Calculate the required buffer size (in bytes) for directory       *
 * entries read from the given directory handle.  Return -1 if this  *
 * cannot be done.                                                   *
 *                                                                   *
 * This code does not trust values of NAME_MAX that are less than    *
 * 255, since some systems (including at least HP-UX) incorrectly    *
 * define it to be a smaller value.                                  *
 *                                                                   *
 * If you use autoconf, include fpathconf and dirfd in your          *
 * AC_CHECK_FUNCS list.  Otherwise use some other method to detect   *
 * and use them where available.                                     */

size_t dirent_buf_size(DIR * dirp)
{
    long name_max;
    size_t name_end;
#   if defined(HAVE_FPATHCONF) && defined(HAVE_DIRFD) \
       && defined(_PC_NAME_MAX)
        name_max = fpathconf(dirfd(dirp), _PC_NAME_MAX);
        if (name_max == -1)
#           if defined(NAME_MAX)
                name_max = (NAME_MAX > 255) ? NAME_MAX : 255;
#           else
                return (size_t)(-1);
#           endif
#   else
#       if defined(NAME_MAX)
            name_max = (NAME_MAX > 255) ? NAME_MAX : 255;
#       else
#           error "buffer size for readdir_r cannot be determined"
#       endif
#   endif
    name_end = (size_t)offsetof(struct dirent, d_name) + name_max + 1;
    return (name_end > sizeof(struct dirent)
            ? name_end : sizeof(struct dirent));
}

An example of how to use the above function:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char ** argv)
{
    DIR * dirp;
    size_t size;
    struct dirent * buf, * ent;
    int error;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s path\n", argv[0]);
        return 2;
    }

    dirp = opendir(argv[1]);
    if (dirp == NULL)
    {
        perror("opendir");
        return 1;
    }
    size = dirent_buf_size(dirp);
    printf("size = %lu\n" "sizeof(struct dirent) = %lu\n",
           (unsigned long)size, (unsigned long)sizeof(struct dirent));
    if (size == (size_t)(-1))
    {
        perror("dirent_buf_size");
        return 1;
    }
    buf = (struct dirent *)malloc(size);
    if (buf == NULL)
    {
        perror("malloc");
        return 1;
    }
    while ((error = readdir_r(dirp, buf, &ent)) == 0 && ent != NULL)
        puts(ent->d_name);
    if (error)
    {
        errno = error;
        perror("readdir_r");
        return 1;
    }
    return 0;
}

The Austin Group should amend POSIX and the SUS in one or more of the following ways:

  1. Standardise the dirfd function from BSD and recommend its use in determining the buffer size for readdir_r. (This has now been done in POSIX 2013, SUS version 4.)
  2. Specify a new variant of readdir in which the buffer size is explicit and the function returns an error code if the buffer is too small.
  3. Specify that NAME_MAX must be defined as the length of the longest name that can be used on any filesystem. (This seems to be what many or most implementations attempt to do at present, although POSIX currently specifies otherwise.)
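
To make option 2 concrete, the sketch below shows what such an interface might look like. The name readdir_sized, its signature and the choice of ERANGE are all hypothetical and are not part of POSIX. It is layered on readdir purely for illustration, so it is no more thread-safe than readdir itself:

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: like readdir_r, but the caller states the
 * size of buf.  Returns 0 on success, with *result set to buf, or to
 * NULL at end of directory.  Returns ERANGE if the entry does not fit,
 * or another errno value if readdir fails. */
int readdir_sized(DIR *dirp, struct dirent *buf, size_t buf_size,
                  struct dirent **result)
{
    struct dirent *ent;
    size_t needed;

    *result = NULL;
    errno = 0;
    ent = readdir(dirp);
    if (ent == NULL)
        return errno;       /* 0 at end of directory */
    needed = offsetof(struct dirent, d_name) + strlen(ent->d_name) + 1;
    if (needed > buf_size)
        return ERANGE;      /* nothing is written to buf */
    memcpy(buf, ent, needed);
    *result = buf;
    return 0;
}
```

In this sketch an entry that does not fit is consumed and lost. A standardised function would also need to specify whether the caller can retry with a larger buffer.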

Licence

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following condition:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.