From: François Tigeot
Date: Mon, 30 Jul 2012 20:39:46 +0000 (+0200)
Subject: kernel: Implement O_CLOEXEC
X-Git-Tag: v3.2.0~497
X-Git-Url: https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff_plain/6e4ea98e8a4f403762582bd657180d3bd3636505

kernel: Implement O_CLOEXEC

* Using fcntl(2) just after open(2) is not enough to avoid race conditions
  in programs doing fork+exec sequences. Child processes may be created
  before fcntl() is run and inherit the parent's file descriptors.

* In some circumstances this behavior may even create security issues.

* O_CLOEXEC can be used to atomically set the close-on-exec flag for
  new file descriptors, avoiding the whole mess in the first place.

* Fixes issue #2356

Inspired-from: NetBSD
---

diff --git a/lib/libc/sys/open.2 b/lib/libc/sys/open.2
index 63eb916d95..99aa306246 100644
--- a/lib/libc/sys/open.2
+++ b/lib/libc/sys/open.2
@@ -120,6 +120,7 @@ O_DIRECT	eliminate or reduce cache effects
 O_FSYNC	synchronous writes
 O_NOFOLLOW	do not follow symlinks
 O_DIRECTORY	error if file is not a directory
+O_CLOEXEC	set FD_CLOEXEC upon open
 .Ed
 .Pp
 Opening a file with
@@ -201,6 +202,11 @@ from opening files which are even unsafe to open with
 .Dv O_RDONLY ,
 such as device nodes.
 .Pp
+.Dv O_CLOEXEC
+may be used to atomically set the
+.Dv FD_CLOEXEC
+flag for the newly returned file descriptor.
+.Pp
 If successful,
 .Fn open
 and
@@ -213,12 +219,18 @@ file is set to the beginning of the file.
 When a new file is created it is given the group of the directory
 which contains it.
 .Pp
-The new descriptor is set to remain open across
+Unless
+.Dv
+O_CLOEXEC
+was specified,
+the new descriptor is set to remain open across
 .Xr execve 2
 system calls; see
-.Xr close 2
+.Xr close 2 ,
+.Xr fcntl 2
 and
-.Xr fcntl 2 .
+.Dv O_CLOEXEC
+description.
 .Pp
 The system imposes a limit on the number of file descriptors
 open simultaneously by one process.
diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
index 4e9fe492aa..7d39c666d4 100644
--- a/sys/kern/vfs_syscalls.c
+++ b/sys/kern/vfs_syscalls.c
@@ -1811,7 +1811,7 @@ kern_open(struct nlookupdata *nd, int oflags, int mode, int *res)
 	struct file *nfp;
 	struct file *fp;
 	struct vnode *vp;
-	int type, indx, error;
+	int type, indx, error = 0;
 	struct flock lf;
 
 	if ((oflags & O_ACCMODE) == O_ACCMODE)
@@ -1930,7 +1930,9 @@ kern_open(struct nlookupdata *nd, int oflags, int mode, int *res)
 	fsetfd(fdp, fp, indx);
 	fdrop(fp);
 	*res = indx;
-	return (0);
+	if (oflags & O_CLOEXEC)
+		error = fsetfdflags(fdp, *res, UF_EXCLOSE);
+	return (error);
 }
 
 /*
@@ -4157,7 +4159,7 @@ sys_fhopen(struct fhopen_args *uap)
 	struct vattr vat;
 	struct vattr *vap = &vat;
 	struct flock lf;
-	int fmode, mode, error, type;
+	int fmode, mode, error = 0, type;
 	struct file *nfp;
 	struct file *fp;
 	int indx;
@@ -4308,7 +4310,9 @@ sys_fhopen(struct fhopen_args *uap)
 	fsetfd(fdp, fp, indx);
 	fdrop(fp);
 	uap->sysmsg_result = indx;
-	return (0);
+	if (uap->flags & O_CLOEXEC)
+		error = fsetfdflags(fdp, indx, UF_EXCLOSE);
+	return (error);
 
 bad_drop:
 	fsetfd(fdp, NULL, indx);
diff --git a/sys/sys/fcntl.h b/sys/sys/fcntl.h
index 5f2b1908e8..00fab35afe 100644
--- a/sys/sys/fcntl.h
+++ b/sys/sys/fcntl.h
@@ -96,6 +96,9 @@
 
 /* Attempt to bypass the buffer cache */
 #define	O_DIRECT	0x00010000
+#if __BSD_VISIBLE || __POSIX_VISIBLE >= 200809
+#define	O_CLOEXEC	0x00020000	/* atomically set FD_CLOEXEC */
+#endif
 #define	O_FBLOCKING	0x00040000	/* force blocking I/O */
 #define	O_FNONBLOCKING	0x00080000	/* force non-blocking I/O */
 #define	O_FAPPEND	0x00100000	/* force append mode for write */
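
Userland illustration (not part of the patch). The sketch below contrasts the two-step open(2) + fcntl(2) sequence described in the commit message with a single open(2) call that passes O_CLOEXEC. It assumes a system whose headers expose O_CLOEXEC; the helper names open_cloexec, open_then_fcntl and has_cloexec are hypothetical and do not appear in the DragonFly tree.

```c
#define _POSIX_C_SOURCE 200809L	/* assumption: needed on some libcs to expose O_CLOEXEC */

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: request close-on-exec as part of the open itself. */
int
open_cloexec(const char *path)
{
	return (open(path, O_RDONLY | O_CLOEXEC));
}

/*
 * Hypothetical helper: the older two-step sequence. Another thread may
 * fork() and exec between the two calls, and the child then inherits
 * the descriptor without the close-on-exec flag.
 */
int
open_then_fcntl(const char *path)
{
	int fd, flags;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	/* Race window: the descriptor exists but FD_CLOEXEC is not set yet. */
	flags = fcntl(fd, F_GETFD);
	if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error. */
int
has_cloexec(int fd)
{
	int flags;

	assert(fd >= 0);	/* callers must pass a descriptor they opened */
	flags = fcntl(fd, F_GETFD);
	if (flags < 0)
		return (-1);
	return ((flags & FD_CLOEXEC) != 0);
}
```

Both helpers return a descriptor for which has_cloexec() reports 1; the difference is that open_then_fcntl() leaves an interval in which a concurrently forked child can still inherit the descriptor.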