| 1 | .\" Copyright (c) 1991, 1993 |
| 2 | .\" The Regents of the University of California. All rights reserved. |
| 3 | .\" |
| 4 | .\" Redistribution and use in source and binary forms, with or without |
| 5 | .\" modification, are permitted provided that the following conditions |
| 6 | .\" are met: |
| 7 | .\" 1. Redistributions of source code must retain the above copyright |
| 8 | .\" notice, this list of conditions and the following disclaimer. |
| 9 | .\" 2. Redistributions in binary form must reproduce the above copyright |
| 10 | .\" notice, this list of conditions and the following disclaimer in the |
| 11 | .\" documentation and/or other materials provided with the distribution. |
| 12 | .\" 3. All advertising materials mentioning features or use of this software |
| 13 | .\" must display the following acknowledgement: |
| 14 | .\" This product includes software developed by the University of |
| 15 | .\" California, Berkeley and its contributors. |
| 16 | .\" 4. Neither the name of the University nor the names of its contributors |
| 17 | .\" may be used to endorse or promote products derived from this software |
| 18 | .\" without specific prior written permission. |
| 19 | .\" |
| 20 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND |
| 21 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE |
| 22 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE |
| 23 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE |
| 24 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL |
| 25 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS |
| 26 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) |
| 27 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| 28 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY |
| 29 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| 30 | .\" SUCH DAMAGE. |
| 31 | .\" |
| 32 | .\" @(#)mmap.2 8.4 (Berkeley) 5/11/95 |
| 33 | .\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $ |
| 34 | .\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.7 2007/01/08 03:33:34 dillon Exp $ |
| 35 | .\" |
| 36 | .Dd December 11, 2006 |
| 37 | .Dt MMAP 2 |
| 38 | .Os |
| 39 | .Sh NAME |
| 40 | .Nm mmap |
| 41 | .Nd allocate memory, or map files or devices into memory |
| 42 | .Sh LIBRARY |
| 43 | .Lb libc |
| 44 | .Sh SYNOPSIS |
| 45 | .In sys/types.h |
| 46 | .In sys/mman.h |
| 47 | .Ft void * |
| 48 | .Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset" |
| 49 | .Sh DESCRIPTION |
| 50 | The |
| 51 | .Fn mmap |
| 52 | function causes the pages starting at |
| 53 | .Fa addr |
| 54 | and continuing for at most |
| 55 | .Fa len |
| 56 | bytes to be mapped from the object described by |
| 57 | .Fa fd , |
| 58 | starting at byte offset |
| 59 | .Fa offset . |
| 60 | If |
| 61 | .Fa len |
| 62 | is not a multiple of the pagesize, the mapped region may extend past the |
| 63 | specified range. |
| 64 | Any such extension beyond the end of the mapped object will be zero-filled. |
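The rounding described above can be sketched in C as follows.
This is a minimal illustration rather than the kernel's actual code:
round_up_to_page is a hypothetical helper, and the page size is passed in by the caller (for instance the value returned by getpagesize(3)).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of any system API: round len up to the
 * next multiple of pagesize.  Overflow for len close to SIZE_MAX is not
 * handled in this sketch.
 */
size_t
round_up_to_page(size_t len, size_t pagesize)
{
	assert(pagesize != 0);
	return ((len + pagesize - 1) / pagesize * pagesize);
}
```

Assuming a 4096-byte page, a request for 5000 bytes would occupy 8192 bytes of address space under this calculation.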
| 65 | .Pp |
| 66 | If |
| 67 | .Fa addr |
| 68 | is non-zero, it is used as a hint to the system. |
| 69 | (As a convenience to the system, the actual address of the region may differ |
| 70 | from the address supplied.) |
| 71 | If |
| 72 | .Fa addr |
| 73 | is zero, an address will be selected by the system. |
| 74 | The actual starting address of the region is returned. |
| 75 | A successful |
| 76 | .Fn mmap |
| 77 | deletes any previous mapping in the allocated address range. |
| 78 | .Pp |
| 79 | The protections (region accessibility) are specified in the |
| 80 | .Fa prot |
| 81 | argument by |
| 82 | .Em or Ns 'ing |
| 83 | the following values: |
| 84 | .Pp |
| 85 | .Bl -tag -width PROT_WRITE -compact |
| 86 | .It Dv PROT_NONE |
| 87 | Pages may not be accessed. |
| 88 | .It Dv PROT_READ |
| 89 | Pages may be read. |
| 90 | .It Dv PROT_WRITE |
| 91 | Pages may be written. |
| 92 | .It Dv PROT_EXEC |
| 93 | Pages may be executed. |
| 94 | .El |
| 95 | .Pp |
| 96 | The |
| 97 | .Fa flags |
| 98 | parameter specifies the type of the mapped object, mapping options and |
| 99 | whether modifications made to the mapped copy of the page are private |
| 100 | to the process or are to be shared with other references. |
| 101 | Sharing, mapping type and options are specified in the |
| 102 | .Fa flags |
| 103 | argument by |
| 104 | .Em or Ns 'ing |
| 105 | the following values: |
| 106 | .Bl -tag -width MAP_HASSEMAPHORE |
| 107 | .It Dv MAP_ANON |
| 108 | Map anonymous memory not associated with any specific file. |
| 109 | The file descriptor used for creating |
| 110 | .Dv MAP_ANON |
| 111 | must be \-1. |
| 112 | The |
| 113 | .Fa offset |
| 114 | parameter is ignored. |
| 115 | .\".It Dv MAP_FILE |
| 116 | .\"Mapped from a regular file or character-special device memory. |
| 117 | .It Dv MAP_FIXED |
| 118 | Do not permit the system to select a different address than the one |
| 119 | specified. |
| 120 | If the specified address cannot be used, |
| 121 | .Fn mmap |
| 122 | will fail. |
| 123 | If |
| 124 | .Dv MAP_FIXED |
| 125 | is specified, |
| 126 | .Fa addr |
| 127 | must be a multiple of the pagesize. |
| 128 | Use of this option is discouraged. |
| 129 | .It Dv MAP_HASSEMAPHORE |
| 130 | Notify the kernel that the region may contain semaphores and that special |
| 131 | handling may be necessary. |
| 132 | .It Dv MAP_NOCORE |
| 133 | Region is not included in a core file. |
| 134 | .It Dv MAP_NOSYNC |
| 135 | Causes data dirtied via this VM map to be flushed to physical media |
| 136 | only when necessary (usually by the pager) rather than gratuitously. |
| 137 | Typically this prevents the update daemons from flushing pages dirtied |
| 138 | through such maps and thus allows efficient sharing of memory across |
| 139 | unassociated processes using a file-backed shared memory map. Without |
| 140 | this option any VM pages you dirty may be flushed to disk every so often |
| 141 | (every 30-60 seconds usually) which can create performance problems if you |
| 142 | do not need that to occur (such as when you are using shared file-backed |
| 143 | mmap regions for IPC purposes). Note that VM/filesystem coherency is |
| 144 | maintained whether you use |
| 145 | .Dv MAP_NOSYNC |
| 146 | or not. This option is not portable |
| 147 | across |
| 148 | .Ux |
| 149 | platforms (yet), though some may implement the same behavior |
| 150 | by default. |
| 151 | .Pp |
| 152 | .Em WARNING ! |
| 153 | Extending a file with |
| 154 | .Xr ftruncate 2 , |
| 155 | thus creating a big hole, and then filling the hole by modifying a shared |
| 156 | .Fn mmap |
| 157 | can lead to severe file fragmentation. |
| 158 | In order to avoid such fragmentation you should always pre-allocate the |
| 159 | file's backing store by |
| 160 | .Fn write Ns ing |
| 161 | zeros into the newly extended area prior to modifying the area via your |
| 162 | .Fn mmap . |
| 163 | The fragmentation problem is especially sensitive to |
| 164 | .Dv MAP_NOSYNC |
| 165 | pages, because pages may be flushed to disk in a totally random order. |
| 166 | .Pp |
| 167 | The same applies when using |
| 168 | .Dv MAP_NOSYNC |
| 169 | to implement a file-based shared memory store. |
| 170 | It is recommended that you create the backing store by |
| 171 | .Fn write Ns ing |
| 172 | zeros to the backing file rather than |
| 173 | .Fn ftruncate Ns ing |
| 174 | it. |
| 175 | You can test file fragmentation by observing the KB/t (kilobytes per |
| 176 | transfer) results from an |
| 177 | .Dq Li iostat 1 |
| 178 | while reading a large file sequentially, e.g. using |
| 179 | .Dq Li dd if=filename of=/dev/null bs=32k . |
| 180 | .Pp |
| 181 | The |
| 182 | .Xr fsync 2 |
| 183 | function will flush all dirty data and metadata associated with a file, |
| 184 | including dirty NOSYNC VM data, to physical media. The |
| 185 | .Xr sync 8 |
| 186 | command and |
| 187 | .Xr sync 2 |
| 188 | system call generally do not flush dirty NOSYNC VM data. |
| 189 | The |
| 190 | .Xr msync 2 |
| 191 | system call is obsolete since |
| 192 | .Bx |
| 193 | implements a coherent filesystem buffer cache. However, it may be |
| 194 | used to associate dirty VM pages with filesystem buffers and thus cause |
| 195 | them to be flushed to physical media sooner rather than later. |
| 196 | .It Dv MAP_PRIVATE |
| 197 | Modifications are private. |
| 198 | .It Dv MAP_SHARED |
| 199 | Modifications are shared. |
| 200 | .It Dv MAP_STACK |
| 201 | This option is only available if the kernel was compiled with |
| 202 | .Dv VM_STACK |
| 203 | defined. |
| 204 | This is the default for |
| 205 | i386 only. |
| 206 | Consider adding |
| 207 | .Li -DVM_STACK |
| 208 | to |
| 209 | .Va COPTFLAGS |
| 210 | in your |
| 211 | .Pa /etc/make.conf |
| 212 | to enable this option for other architectures. |
| 213 | .Dv MAP_STACK |
| 214 | implies |
| 215 | .Dv MAP_ANON , |
| 216 | and |
| 217 | .Fa offset |
| 218 | of 0. |
| 219 | .Fa fd |
| 220 | must be \-1 and |
| 221 | .Fa prot |
| 222 | must include at least |
| 223 | .Dv PROT_READ |
| 224 | and |
| 225 | .Dv PROT_WRITE . |
| 226 | This option creates |
| 227 | a memory region that grows to at most |
| 228 | .Fa len |
| 229 | bytes in size, starting from the stack top and growing down. The |
| 230 | stack top is the starting address returned by the call, plus |
| 231 | .Fa len |
| 232 | bytes. The bottom of the stack at maximum growth is the starting |
| 233 | address returned by the call. |
| 234 | .It Dv MAP_VPAGETABLE |
| 235 | Memory accessed via this map is not linearly mapped and will be governed |
| 236 | by a virtual page table. The base address of the virtual page table may |
| 237 | be set using |
| 238 | .Xr mcontrol 2 |
| 239 | with |
| 240 | .Dv MADV_SETMAP . |
| 241 | Virtual page tables work with anonymous memory but there |
| 242 | is no way to populate the page table so for all intents and purposes |
| 243 | .Dv MAP_VPAGETABLE |
| 244 | can only be used when mapping file descriptors. Since the kernel will |
| 245 | update the VPTE_M bit in the virtual page table, the mapping must be R+W |
| 246 | even though actual access to the memory will be properly governed by |
| 247 | the virtual page table. |
| 248 | .Pp |
| 249 | Addressable backing store is limited by the range supported in the virtual |
| 250 | page table entries. The kernel may implement a page table abstraction capable |
| 251 | of addressing a larger range within the backing store than could otherwise |
| 252 | be mapped into memory. |
| 253 | .El |
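The advice under
MAP_NOSYNC
to build the backing store by writing zeros, rather than by extending the file with ftruncate(2), might be followed along these lines.
This is only a sketch: create_zeroed_file is a hypothetical helper, and the 64KB buffer size and the 0600 file mode are arbitrary choices for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or truncate) path and fill it with size
 * zero bytes using write(2), so that the filesystem allocates the blocks
 * sequentially.  Returns an open read/write descriptor, or -1 on error.
 */
int
create_zeroed_file(const char *path, off_t size)
{
	char buf[65536];
	int fd;

	assert(path != NULL && size >= 0);
	memset(buf, 0, sizeof(buf));
	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
	if (fd == -1)
		return (-1);
	while (size > 0) {
		size_t chunk = sizeof(buf);
		ssize_t n;

		if (size < (off_t)chunk)
			chunk = (size_t)size;
		n = write(fd, buf, chunk);
		if (n == -1 && errno == EINTR)
			continue;	/* interrupted before writing; retry */
		if (n <= 0) {
			close(fd);
			return (-1);
		}
		size -= n;	/* short writes are simply continued */
	}
	return (fd);
}
```

The returned descriptor can then be passed to mmap with MAP_SHARED, optionally combined with MAP_NOSYNC on systems that define it.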
| 254 | .Pp |
| 255 | The |
| 256 | .Xr close 2 |
| 257 | function does not unmap pages; see |
| 258 | .Xr munmap 2 |
| 259 | for further information. |
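Because closing the descriptor leaves the mapping in place, a common pattern is to close it immediately after a successful mmap call.
The following sketch illustrates this under the assumption of a non-empty regular file; map_file_readonly is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the whole of a non-empty file read-only.
 * On success the length is stored in *lenp and the mapping is returned;
 * on failure NULL is returned.  The caller releases it with munmap(2).
 */
void *
map_file_readonly(const char *path, size_t *lenp)
{
	struct stat st;
	void *p;
	int fd;

	assert(path != NULL && lenp != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (NULL);
	if (fstat(fd, &st) == -1 || st.st_size <= 0) {
		close(fd);
		return (NULL);
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping, if any, remains valid */
	if (p == MAP_FAILED)
		return (NULL);
	*lenp = (size_t)st.st_size;
	return (p);
}
```

The pages stay mapped until the caller invokes munmap(2) on the returned address and length.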
| 260 | .Pp |
| 261 | The current design does not allow a process to specify the location of |
| 262 | swap space. |
| 263 | In the future we may define an additional mapping type, |
| 264 | .Dv MAP_SWAP , |
| 265 | in which |
| 266 | the file descriptor argument specifies a file or device to which swapping |
| 267 | should be done. |
| 268 | .Sh RETURN VALUES |
| 269 | Upon successful completion, |
| 270 | .Fn mmap |
| 271 | returns a pointer to the mapped region. |
| 272 | Otherwise, a value of |
| 273 | .Dv MAP_FAILED |
| 274 | is returned and |
| 275 | .Va errno |
| 276 | is set to indicate the error. |
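As an illustration of the return convention, the sketch below allocates anonymous memory and compares the result against MAP_FAILED rather than NULL.
The helper name alloc_anon is hypothetical, and the _DEFAULT_SOURCE definition is an assumption needed only to expose MAP_ANON on glibc-based systems.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc only: expose MAP_ANON */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: allocate len bytes of private, zero-filled
 * anonymous memory.  Returns NULL on failure, leaving errno as set
 * by mmap().
 */
void *
alloc_anon(size_t len)
{
	void *p;

	assert(len > 0);
	/* MAP_ANON requires fd to be -1; the offset is ignored. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (NULL);
	return (p);
}
```

Translating MAP_FAILED to NULL is a design choice of this sketch; callers that need the error code should inspect errno before making another system call.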
| 277 | .Sh ERRORS |
| 278 | .Fn mmap |
| 279 | will fail if: |
| 280 | .Bl -tag -width Er |
| 281 | .It Bq Er EACCES |
| 282 | The flag |
| 283 | .Dv PROT_READ |
| 284 | was specified as part of the |
| 285 | .Fa prot |
| 286 | parameter and |
| 287 | .Fa fd |
| 288 | was not open for reading. |
| 289 | The flags |
| 290 | .Dv MAP_SHARED |
| 291 | and |
| 292 | .Dv PROT_WRITE |
| 293 | were specified as part of the |
| 294 | .Fa flags |
| 295 | and |
| 296 | .Fa prot |
| 297 | parameters and |
| 298 | .Fa fd |
| 299 | was not open for writing. |
| 300 | .It Bq Er EBADF |
| 301 | .Fa fd |
| 302 | is not a valid open file descriptor. |
| 303 | .It Bq Er EINVAL |
| 304 | .Dv MAP_FIXED |
| 305 | was specified and the |
| 306 | .Fa addr |
| 307 | parameter was not page aligned, or part of the desired address space |
| 308 | resides out of the valid address space for a user process. |
| 309 | .It Bq Er EINVAL |
| 310 | .Fa len |
| 311 | was negative. |
| 312 | .It Bq Er EINVAL |
| 313 | .Dv MAP_ANON |
| 314 | was specified and the |
| 315 | .Fa fd |
| 316 | parameter was not \-1. |
| 317 | .It Bq Er EINVAL |
| 318 | .Dv MAP_ANON |
| 319 | has not been specified and |
| 320 | .Fa fd |
| 321 | did not reference a regular or character special file. |
| 322 | .It Bq Er EINVAL |
| 323 | .Fa offset |
| 324 | was not page-aligned. |
| 325 | (See |
| 326 | .Sx BUGS |
| 327 | below.) |
| 328 | .It Bq Er ENOMEM |
| 329 | .Dv MAP_FIXED |
| 330 | was specified and the |
| 331 | .Fa addr |
| 332 | parameter wasn't available. |
| 333 | .Dv MAP_ANON |
| 334 | was specified and insufficient memory was available. |
| 335 | The system has reached the per-process mmap limit specified in the |
| 336 | .Va vm.max_proc_mmap |
| 337 | sysctl. |
| 338 | .El |
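A caller can distinguish the conditions listed above by examining errno after a failed call.
The sketch below uses a hypothetical helper, mmap_errno, that attempts a shared read-only file mapping and reports the error code; passing a descriptor of \-1 without MAP_ANON is expected to produce EBADF.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: try to map len bytes of fd read-only.
 * Returns 0 on success (after unmapping again) or the errno value
 * set by mmap() on failure.
 */
int
mmap_errno(int fd, size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return (errno);
	munmap(p, len);
	return (0);
}
```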
| 339 | .Sh SEE ALSO |
| 340 | .Xr madvise 2 , |
| 341 | .Xr mincore 2 , |
| 342 | .Xr mlock 2 , |
| 343 | .Xr mprotect 2 , |
| 344 | .Xr msync 2 , |
| 345 | .Xr munlock 2 , |
| 346 | .Xr munmap 2 , |
| 347 | .Xr getpagesize 3 |
| 348 | .Sh BUGS |
| 349 | .Fa len |
| 350 | is limited to 2GB. Mmapping slightly more than 2GB doesn't work, but |
| 351 | it is possible to map a window of size (filesize % 2GB) for file sizes |
| 352 | of slightly less than 2GB, 4GB, 6GB and 8GB. |
| 353 | .Pp |
| 354 | The limit is imposed for a variety of reasons. |
| 355 | Most of them have to do |
| 356 | with |
| 357 | .Dx |
| 358 | not wanting to use 64-bit offsets in the VM system due to |
| 359 | the extreme performance penalty. |
| 360 | So |
| 361 | .Dx |
| 362 | uses 32-bit page indexes and |
| 363 | this gives |
| 364 | .Dx |
| 365 | a maximum of 8TB filesizes. |
| 366 | Bugs in |
| 367 | the filesystem code further restrict the limit to |
| 368 | 1TB (loss of precision when doing blockno calculations). |
| 369 | .Pp |
| 370 | Another reason for the 2GB limit is that filesystem metadata can |
| 371 | reside at negative offsets. |