   1 .\"
   2 .\"  Must use -- tbl -- with this one
   3 .\"
   4 .\" @(#)xdr.rfc.ms      2.2 88/08/05 4.0 RPCSRC
   5 .\" $FreeBSD: src/lib/libc/rpc/PSD.doc/xdr.rfc.ms,v 1.1.14.1 2000/11/24 09:36:30 ru Exp $
   6 .\" $DragonFly: src/lib/libc/rpc/PSD.doc/xdr.rfc.ms,v 1.2 2003/06/17 04:26:45 dillon Exp $
   7 .\"
   8 .so stubs
   9 .de BT
  10 .if \\n%=1 .tl ''- % -''
  11 ..
  12 .ND
  13 .\" prevent excess underlining in nroff
  14 .if n .fp 2 R
  15 .OH 'External Data Representation Standard''Page %'
  16 .EH 'Page %''External Data Representation Standard'
  17 .if \n%=1 .bp
  18 .SH
  19 \&External Data Representation Standard: Protocol Specification
  20 .IX "External Data Representation"
  21 .IX XDR RFC
  22 .IX XDR "protocol specification"
  23 .LP
  24 .NH 0
  25 \&Status of this Standard
  26 .nr OF 1
  27 .IX XDR "RFC status"
  28 .LP
  29 Note: This chapter specifies a protocol that Sun Microsystems, Inc., and
  30 others are using.  It has been designated RFC1014 by the ARPA Network
  31 Information Center.
  32 .NH 1
  33 Introduction
  34 \&
  35 .LP
  36 XDR is a standard for the description and encoding of data.  It is
  37 useful for transferring data between different computer
  38 architectures, and has been used to communicate data between such
  39 diverse machines as the Sun Workstation, VAX, IBM-PC, and Cray.
  40 XDR fits into the ISO presentation layer, and is roughly analogous in
  41 purpose to X.409, ISO Abstract Syntax Notation.  The major difference
  42 between these two is that XDR uses implicit typing, while X.409 uses
  43 explicit typing.
  44 .LP
  45 XDR uses a language to describe data formats.  The language can be
  46 used only to describe data; it is not a programming language.
  47 This language allows one to describe intricate data formats in a
  48 concise manner. The alternative of using graphical representations
  49 (itself an informal language) quickly becomes incomprehensible when
  50 faced with complexity.  The XDR language itself is similar to the C
  51 language [1], just as Courier [4] is similar to Mesa. Protocols such
  52 as Sun RPC (Remote Procedure Call) and the NFS (Network File System)
  53 use XDR to describe the format of their data.
  54 .LP
  55 The XDR standard makes the following assumption: that bytes (or
  56 octets) are portable, where a byte is defined to be 8 bits of data.
  57 A given hardware device should encode the bytes onto the various
  58 media in such a way that other hardware devices may decode the bytes
  59 without loss of meaning.  For example, the Ethernet standard
  60 suggests that bytes be encoded in "little-endian" style [2], or least
  61 significant bit first.
  62 .NH 2
  63 \&Basic Block Size
  64 .IX XDR "basic block size"
  65 .IX XDR "block size"
  66 .LP
  67 The representation of all items requires a multiple of four bytes (or
  68 32 bits) of data.  The bytes are numbered 0 through n-1.  The bytes
  69 are read or written to some byte stream such that byte m always
  70 precedes byte m+1.  If the n bytes needed to contain the data are not
  71 a multiple of four, then the n bytes are followed by enough (0 to 3)
  72 residual zero bytes, r, to make the total byte count a multiple of 4.
  73 .LP
  74 We include the familiar graphic box notation for illustration and
  75 comparison.  In most illustrations, each box (delimited by a plus
  76 sign at the 4 corners and vertical bars and dashes) depicts a byte.
  77 Ellipses (...) between boxes show zero or more additional bytes where
  78 required.
  79 .ie t .DS
  80 .el .DS L
  81 \fIA Block\fP
  82
  83 \f(CW+--------+--------+...+--------+--------+...+--------+
  84 | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
  85 +--------+--------+...+--------+--------+...+--------+
  86 |<-----------n bytes---------->|<------r bytes------>|
   87 |<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP
  88
  89 .DE
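.LP
The padding rule can be sketched in C as follows.  This is only an
illustration and not part of the standard; the names xdr_pad_len and
xdr_put_block are hypothetical, and the caller is assumed to supply a
large enough output buffer.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>

/* Number of zero bytes (0 to 3) needed after n data bytes. */
size_t xdr_pad_len(size_t n)
{
    return (4 - n % 4) % 4;
}

/*
 * Write n data bytes followed by r zero bytes into out and return
 * n + r.  The caller must supply room for n + 3 bytes.
 */
size_t xdr_put_block(unsigned char *out, const unsigned char *data, size_t n)
{
    size_t i, r = xdr_pad_len(n);

    assert(out != NULL && (data != NULL || n == 0));
    for (i = 0; i < n; i++)
        out[i] = data[i];
    for (i = 0; i < r; i++)
        out[n + i] = 0;
    return n + r;
}
```
.DE
.LP
With n = 5, for example, r is 3 and the block occupies 8 bytes.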
  90 .NH 1
  91 \&XDR Data Types
  92 .IX XDR "data types"
  93 .IX "XDR data types"
  94 .LP
  95 Each of the sections that follow describes a data type defined in the
  96 XDR standard, shows how it is declared in the language, and includes
  97 a graphic illustration of its encoding.
  98 .LP
  99 For each data type in the language we show a general paradigm
 100 declaration.  Note that angle brackets (< and >) denote
 101 variable length sequences of data and square brackets ([ and ]) denote
 102 fixed-length sequences of data.  "n", "m" and "r" denote integers.
 103 For the full language specification and more formal definitions of
 104 terms such as "identifier" and "declaration", refer to
 105 .I "The XDR Language Specification" ,
 106 below.
 107 .LP
 108 For some data types, more specific examples are included.
 109 A more extensive example of a data description is in
 110 .I "An Example of an XDR Data Description"
 111 below.
 112 .NH 2
 113 \&Integer
 114 .IX XDR integer
 115 .LP
 116 An XDR signed integer is a 32-bit datum that encodes an integer in
 117 the range [-2147483648,2147483647].  The integer is represented in
 118 two's complement notation.  The most and least significant bytes are
  119 0 and 3, respectively.  An integer is declared as "int identifier;" and is encoded as follows:
 120 .ie t .DS
 121 .el .DS L
 122 \fIInteger\fP
 123
 124 \f(CW(MSB)                   (LSB)
 125 +-------+-------+-------+-------+
 126 |byte 0 |byte 1 |byte 2 |byte 3 |
 127 +-------+-------+-------+-------+
 128 <------------32 bits------------>\fP
 129 .DE
 130 .NH 2
 131 \&Unsigned Integer
 132 .IX XDR "unsigned integer"
 133 .IX XDR "integer, unsigned"
 134 .LP
 135 An XDR unsigned integer is a 32-bit datum that encodes a nonnegative
 136 integer in the range [0,4294967295].  It is represented by an
 137 unsigned binary number whose most and least significant bytes are 0
  138 and 3, respectively.  An unsigned integer is declared as "unsigned int identifier;" and is encoded as follows:
 139 .ie t .DS
 140 .el .DS L
 141 \fIUnsigned Integer\fP
 142
 143 \f(CW(MSB)                   (LSB)
 144 +-------+-------+-------+-------+
 145 |byte 0 |byte 1 |byte 2 |byte 3 |
 146 +-------+-------+-------+-------+
 147 <------------32 bits------------>\fP
 148 .DE
 149 .NH 2
 150 \&Enumeration
 151 .IX XDR enumeration
 152 .LP
 153 Enumerations have the same representation as signed integers.
 154 Enumerations are handy for describing subsets of the integers.
 155 Enumerated data is declared as follows:
 156 .ft CW
 157 .DS
 158 enum { name-identifier = constant, ... } identifier;
 159 .DE
 160 For example, the three colors red, yellow, and blue could be
 161 described by an enumerated type:
 162 .DS
 163 .ft CW
 164 enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;
 165 .DE
 166 It is an error to encode as an enum any other integer than those that
 167 have been given assignments in the enum declaration.
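.LP
An encoder might enforce this rule as sketched below in C, using the
colors example.  The helper names are hypothetical, and the sketch
relies on every declared value fitting in the least significant byte.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>

/* The example enumeration from the text. */
enum colors { RED = 2, YELLOW = 3, BLUE = 5 };

/* Return 1 if v is one of the values assigned in the declaration. */
int colors_is_valid(int v)
{
    return v == RED || v == YELLOW || v == BLUE;
}

/*
 * Encode v with the same representation as a signed integer.
 * Returns 0 on success, or -1 if v has no assignment in the enum.
 */
int xdr_put_colors(unsigned char buf[4], int v)
{
    assert(buf != NULL);
    if (!colors_is_valid(v))
        return -1;
    buf[0] = 0;                 /* declared values fit in byte 3 */
    buf[1] = 0;
    buf[2] = 0;
    buf[3] = (unsigned char)v;
    return 0;
}
```
.DE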
 168 .NH 2
 169 \&Boolean
 170 .IX XDR boolean
 171 .LP
 172 Booleans are important enough and occur frequently enough to warrant
 173 their own explicit type in the standard.  Booleans are declared as
 174 follows:
 175 .DS
 176 .ft CW
 177 bool identifier;
 178 .DE
 179 This is equivalent to:
 180 .DS
 181 .ft CW
 182 enum { FALSE = 0, TRUE = 1 } identifier;
 183 .DE
 184 .NH 2
 185 \&Hyper Integer and Unsigned Hyper Integer
 186 .IX XDR "hyper integer"
 187 .IX XDR "integer, hyper"
 188 .LP
 189 The standard also defines 64-bit (8-byte) numbers called hyper
 190 integer and unsigned hyper integer.  Their representations are the
 191 obvious extensions of integer and unsigned integer defined above.
 192 They are represented in two's complement notation.  The most and
  193 least significant bytes are 0 and 7, respectively.  They are declared as
  194 "hyper identifier;" and "unsigned hyper identifier;" and are encoded as follows:
 195 .ie t .DS
 196 .el .DS L
 197 \fIHyper Integer\fP
 198 \fIUnsigned Hyper Integer\fP
 199
 200 \f(CW(MSB)                                                   (LSB)
 201 +-------+-------+-------+-------+-------+-------+-------+-------+
 202 |byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
 203 +-------+-------+-------+-------+-------+-------+-------+-------+
 204 <----------------------------64 bits---------------------------->\fP
 205 .DE
 206 .NH 2
 207 \&Floating-point
 208 .IX XDR "integer, floating point"
 209 .IX XDR "floating-point integer"
 210 .LP
 211 The standard defines the floating-point data type "float" (32 bits or
 212 4 bytes).  The encoding used is the IEEE standard for normalized
 213 single-precision floating-point numbers [3].  The following three
 214 fields describe the single-precision floating-point number:
 215 .RS
 216 .IP \fBS\fP:
 217 The sign of the number.  Values 0 and  1 represent  positive and
 218 negative, respectively.  One bit.
 219 .IP \fBE\fP:
 220 The exponent of the number, base 2.  8  bits are devoted to this
 221 field.  The exponent is biased by 127.
 222 .IP \fBF\fP:
 223 The fractional part of the number's mantissa,  base 2.   23 bits
 224 are devoted to this field.
 225 .RE
 226 .LP
 227 Therefore, the floating-point number is described by:
 228 .DS
 229 (-1)**S * 2**(E-Bias) * 1.F
 230 .DE
  231 It is declared as "float identifier;" and is encoded as follows:
 232 .ie t .DS
 233 .el .DS L
 234 \fISingle-Precision Floating-Point\fP
 235
 236 \f(CW+-------+-------+-------+-------+
 237 |byte 0 |byte 1 |byte 2 |byte 3 |
 238 S|   E   |           F          |
 239 +-------+-------+-------+-------+
 240 1|<- 8 ->|<-------23 bits------>|
 241 <------------32 bits------------>\fP
 242 .DE
 243 Just as the most and least significant bytes of a number are 0 and 3,
  244 the most and least significant bits of a single-precision
  245 floating-point number are 0 and 31.  The beginning bit (and most significant
 246 bit) offsets of S, E, and F are 0, 1, and 9, respectively.  Note that
 247 these numbers refer to the mathematical positions of the bits, and
 248 NOT to their actual physical locations (which vary from medium to
 249 medium).
 250 .LP
 251 The IEEE specifications should be consulted concerning the encoding
 252 for signed zero, signed infinity (overflow), and denormalized numbers
 253 (underflow) [3].  According to IEEE specifications, the "NaN" (not a
 254 number) is system dependent and should not be used externally.
 255 .NH 2
 256 \&Double-precision Floating-point
 257 .IX XDR "integer, double-precision floating point"
 258 .IX XDR "double-precision floating-point integer"
 259 .LP
  260 The standard defines the encoding for the double-precision
  261 floating-point data type "double" (64 bits or 8 bytes).  The encoding used is
 262 the IEEE standard for normalized double-precision floating-point
 263 numbers [3].  The standard encodes the following three fields, which
 264 describe the double-precision floating-point number:
 265 .RS
 266 .IP \fBS\fP:
 267 The sign of the number.  Values  0 and 1  represent positive and
 268 negative, respectively.  One bit.
 269 .IP \fBE\fP:
 270 The exponent of the number, base 2.  11 bits are devoted to this
 271 field.  The exponent is biased by 1023.
 272 .IP \fBF\fP:
 273 The fractional part of the number's  mantissa, base 2.   52 bits
 274 are devoted to this field.
 275 .RE
 276 .LP
 277 Therefore, the floating-point number is described by:
 278 .DS
 279 (-1)**S * 2**(E-Bias) * 1.F
 280 .DE
  281 It is declared as "double identifier;" and is encoded as follows:
 282 .ie t .DS
 283 .el .DS L
 284 \fIDouble-Precision Floating-Point\fP
 285
 286 \f(CW+------+------+------+------+------+------+------+------+
 287 |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
 288 S|    E   |                    F                        |
 289 +------+------+------+------+------+------+------+------+
 290 1|<--11-->|<-----------------52 bits------------------->|
 291 <-----------------------64 bits------------------------->\fP
 292 .DE
 293 Just as the most and least significant bytes of a number are 0 and 3,
  294 the most and least significant bits of a double-precision
  295 floating-point number are 0 and 63.  The beginning bit (and most significant
  296 bit) offsets of S, E, and F are 0, 1, and 12, respectively.  Note
 297 that these numbers refer to the mathematical positions of the bits,
 298 and NOT to their actual physical locations (which vary from medium to
 299 medium).
 300 .LP
 301 The IEEE specifications should be consulted concerning the encoding
 302 for signed zero, signed infinity (overflow), and denormalized numbers
 303 (underflow) [3].  According to IEEE specifications, the "NaN" (not a
 304 number) is system dependent and should not be used externally.
 305 .NH 2
 306 \&Fixed-length Opaque Data
 307 .IX XDR "fixed-length opaque data"
 308 .IX XDR "opaque data, fixed length"
 309 .LP
 310 At times, fixed-length uninterpreted data needs to be passed among
 311 machines.  This data is called "opaque" and is declared as follows:
 312 .DS
 313 .ft CW
 314 opaque identifier[n];
 315 .DE
 316 where the constant n is the (static) number of bytes necessary to
 317 contain the opaque data.  If n is not a multiple of four, then the n
 318 bytes are followed by enough (0 to 3) residual zero bytes, r, to make
 319 the total byte count of the opaque object a multiple of four.
 320 .ie t .DS
 321 .el .DS L
 322 \fIFixed-Length Opaque\fP
 323
 324 \f(CW0        1     ...
 325 +--------+--------+...+--------+--------+...+--------+
 326 | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
 327 +--------+--------+...+--------+--------+...+--------+
 328 |<-----------n bytes---------->|<------r bytes------>|
 329 |<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP
 330 .DE
 331 .NH 2
 332 \&Variable-length Opaque Data
 333 .IX XDR "variable-length opaque data"
 334 .IX XDR "opaque data, variable length"
 335 .LP
  336 The standard also provides for variable-length (counted) opaque data.
  337 A sequence of n (numbered 0 through n-1) arbitrary bytes is encoded as
  338 the number n, itself encoded as an unsigned integer (as described
  339 above), followed by the n bytes of the sequence.
 340 .LP
 341 Byte m of the sequence always precedes byte m+1 of the sequence, and
 342 byte 0 of the sequence always follows the sequence's length (count).
  343 If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count
 344 a multiple of four.  Variable-length opaque data is declared in the
 345 following way:
 346 .DS
 347 .ft CW
 348 opaque identifier<m>;
 349 .DE
 350 or
 351 .DS
 352 .ft CW
 353 opaque identifier<>;
 354 .DE
 355 The constant m denotes an upper bound of the number of bytes that the
 356 sequence may contain.  If m is not specified, as in the second
 357 declaration, it is assumed to be (2**32) - 1, the maximum length.
 358 The constant m would normally be found in a protocol specification.
 359 For example, a filing protocol may state that the maximum data
 360 transfer size is 8192 bytes, as follows:
 361 .DS
 362 .ft CW
 363 opaque filedata<8192>;
 364 .DE
 365 This can be illustrated as follows:
 366 .ie t .DS
 367 .el .DS L
 368 \fIVariable-Length Opaque\fP
 369
 370 \f(CW0     1     2     3     4     5   ...
 371 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 372 |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
 373 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 374 |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
  375                         |<----n+r (where (n+r) mod 4 = 0)---->|\fP
 376 .DE
 377 .LP
  378 It is an error to encode a length greater than the maximum
 379 described in the specification.
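.LP
The following C sketch shows one way to produce this encoding,
including the check against the declared maximum.  The function name
and its parameters are hypothetical; cap is the size of the output
buffer and max is the bound m from the declaration.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode n bytes of opaque data: a 4-byte length, the data, then the
 * zero padding.  Returns the number of bytes written, or 0 if n
 * exceeds max or the output buffer of cap bytes is too small.
 */
size_t xdr_put_opaque(unsigned char *out, size_t cap,
                      const unsigned char *data, uint32_t n, uint32_t max)
{
    size_t len = n;
    size_t pad = (4 - len % 4) % 4;
    size_t i;

    assert(out != NULL && (data != NULL || n == 0));
    if (n > max || cap < 4 || cap - 4 < len || cap - 4 - len < pad)
        return 0;
    out[0] = (unsigned char)(n >> 24);    /* length, MSB first */
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
    for (i = 0; i < len; i++)
        out[4 + i] = data[i];
    for (i = 0; i < pad; i++)
        out[4 + len + i] = 0;             /* residual zero bytes */
    return 4 + len + pad;
}
```
.DE
.LP
Five bytes of data declared as filedata<8192> therefore occupy 12
bytes: 4 for the length, 5 for the data, and 3 of padding.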
 380 .NH 2
 381 \&String
 382 .IX XDR string
 383 .LP
 384 The standard defines a string of n (numbered 0 through n-1) ASCII
 385 bytes to be the number n encoded as an unsigned integer (as described
 386 above), and followed by the n bytes of the string.  Byte m of the
 387 string always precedes byte m+1 of the string, and byte 0 of the
 388 string always follows the string's length.  If n is not a multiple of
 389 four, then the n bytes are followed by enough (0 to 3) residual zero
 390 bytes, r, to make the total byte count a multiple of four.  Counted
 391 byte strings are declared as follows:
 392 .DS
 393 .ft CW
 394 string object<m>;
 395 .DE
 396 or
 397 .DS
 398 .ft CW
 399 string object<>;
 400 .DE
 401 The constant m denotes an upper bound of the number of bytes that a
 402 string may contain.  If m is not specified, as in the second
 403 declaration, it is assumed to be (2**32) - 1, the maximum length.
 404 The constant m would normally be found in a protocol specification.
 405 For example, a filing protocol may state that a file name can be no
 406 longer than 255 bytes, as follows:
 407 .DS
 408 .ft CW
 409 string filename<255>;
 410 .DE
  411 This can be illustrated as follows:
 412 .ie t .DS
 413 .el .DS L
 414 \fIA String\fP
 415
 416 \f(CW0     1     2     3     4     5   ...
 417 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 418 |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
 419 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 420 |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
  421                         |<----n+r (where (n+r) mod 4 = 0)---->|\fP
 422 .DE
 423 .LP
  424 It is an error to encode a length greater than the maximum
 425 described in the specification.
 426 .NH 2
 427 \&Fixed-length Array
 428 .IX XDR "fixed-length array"
 429 .IX XDR "array, fixed length"
 430 .LP
 431 Declarations for fixed-length arrays of homogeneous elements are in
 432 the following form:
 433 .DS
 434 .ft CW
 435 type-name identifier[n];
 436 .DE
 437 Fixed-length arrays of elements numbered 0 through n-1 are encoded by
 438 individually encoding the elements of the array in their natural
 439 order, 0 through n-1.  Each element's size is a multiple of four
 440 bytes. Though all elements are of the same type, the elements may
 441 have different sizes.  For example, in a fixed-length array of
 442 strings, all elements are of type "string", yet each element will
 443 vary in its length.
 444 .ie t .DS
 445 .el .DS L
 446 \fIFixed-Length Array\fP
 447
 448 \f(CW+---+---+---+---+---+---+---+---+...+---+---+---+---+
 449 |   element 0   |   element 1   |...|  element n-1  |
 450 +---+---+---+---+---+---+---+---+...+---+---+---+---+
 451 |<--------------------n elements------------------->|\fP
 452 .DE
 453 .NH 2
 454 \&Variable-length Array
 455 .IX XDR "variable-length array"
 456 .IX XDR "array, variable length"
 457 .LP
 458 Counted arrays provide the ability to encode variable-length arrays
 459 of homogeneous elements.  The array is encoded as the element count n
 460 (an unsigned integer) followed by the encoding of each of the array's
  461 elements, starting with element 0 and progressing through element
  462 n-1.  The declaration for variable-length arrays follows this form:
 463 .DS
 464 .ft CW
 465 type-name identifier<m>;
 466 .DE
 467 or
 468 .DS
 469 .ft CW
 470 type-name identifier<>;
 471 .DE
 472 The constant m specifies the maximum acceptable element count of an
  473 array; if m is not specified, as in the second declaration, it is
 474 assumed to be (2**32) - 1.
 475 .ie t .DS
 476 .el .DS L
 477 \fICounted Array\fP
 478
 479 \f(CW0  1  2  3
 480 +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
 481 |     n     | element 0 | element 1 |...|element n-1|
 482 +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
 483 |<-4 bytes->|<--------------n elements------------->|\fP
 484 .DE
  485 It is an error to encode a value of n that is greater than the
 486 maximum described in the specification.
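.LP
As an illustration, a counted array of signed integers could be
encoded in C as sketched below.  The names are hypothetical; the
elements are integers only to keep the example short, and any other
element type would be encoded in the same position.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as four bytes, most significant byte first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Encode n signed integers as a counted array declared with maximum
 * max.  Returns the number of bytes written, or 0 if n exceeds max
 * or the buffer of cap bytes is too small.
 */
size_t xdr_put_int_array(unsigned char *out, size_t cap,
                         const int32_t *elems, uint32_t n, uint32_t max)
{
    size_t i, count = n;

    assert(out != NULL && (elems != NULL || n == 0));
    if (n > max || cap < 4 || (cap - 4) / 4 < count)
        return 0;
    put_u32(out, n);                          /* element count */
    for (i = 0; i < count; i++)
        put_u32(out + 4 + 4 * i, (uint32_t)elems[i]);
    return 4 + 4 * count;
}
```
.DE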
 487 .NH 2
 488 \&Structure
 489 .IX XDR structure
 490 .LP
 491 Structures are declared as follows:
 492 .DS
 493 .ft CW
 494 struct {
 495         component-declaration-A;
 496         component-declaration-B;
 497         \&...
 498 } identifier;
 499 .DE
 500 The components of the structure are encoded in the order of their
 501 declaration in the structure.  Each component's size is a multiple of
 502 four bytes, though the components may be different sizes.
 503 .ie t .DS
 504 .el .DS L
 505 \fIStructure\fP
 506
 507 \f(CW+-------------+-------------+...
 508 | component A | component B |...
 509 +-------------+-------------+...\fP
 510 .DE
 511 .NH 2
 512 \&Discriminated Union
 513 .IX XDR "discriminated union"
  514 .IX XDR "union, discriminated"
 515 .LP
 516 A discriminated union is a type composed of a discriminant followed
 517 by a type selected from a set of prearranged types according to the
 518 value of the discriminant.  The type of discriminant is either "int",
 519 "unsigned int", or an enumerated type, such as "bool".  The component
 520 types are called "arms" of the union, and are preceded by the value
 521 of the discriminant which implies their encoding.  Discriminated
 522 unions are declared as follows:
 523 .DS
 524 .ft CW
 525 union switch (discriminant-declaration) {
 526         case discriminant-value-A:
 527         arm-declaration-A;
 528         case discriminant-value-B:
 529         arm-declaration-B;
 530         \&...
 531         default: default-declaration;
 532 } identifier;
 533 .DE
 534 Each "case" keyword is followed by a legal value of the discriminant.
 535 The default arm is optional.  If it is not specified, then a valid
 536 encoding of the union cannot take on unspecified discriminant values.
 537 The size of the implied arm is always a multiple of four bytes.
 538 .LP
 539 The discriminated union is encoded as its discriminant followed by
 540 the encoding of the implied arm.
 541 .ie t .DS
 542 .el .DS L
 543 \fIDiscriminated Union\fP
 544
 545 \f(CW0   1   2   3
 546 +---+---+---+---+---+---+---+---+
 547 |  discriminant |  implied arm  |
 548 +---+---+---+---+---+---+---+---+
 549 |<---4 bytes--->|\fP
 550 .DE
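.LP
The C sketch below encodes a small union to show how the discriminant
selects the arm.  The union itself is hypothetical and is not taken
from any real protocol; it is written out in the comment.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as four bytes, most significant byte first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Hypothetical union used for illustration:
 *
 *     union switch (unsigned int status) {
 *     case 0:
 *             int value;
 *     default:
 *             void;
 *     } result;
 *
 * Returns the number of bytes written (4 or 8).
 */
size_t xdr_put_result(unsigned char out[8], uint32_t status, int32_t value)
{
    assert(out != NULL);
    put_u32(out, status);               /* discriminant comes first */
    if (status != 0)
        return 4;                       /* void arm adds no bytes */
    put_u32(out + 4, (uint32_t)value);  /* implied arm follows */
    return 8;
}
```
.DE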
 551 .NH 2
 552 \&Void
 553 .IX XDR void
 554 .LP
 555 An XDR void is a 0-byte quantity.  Voids are useful for describing
 556 operations that take no data as input or no data as output. They are
 557 also useful in unions, where some arms may contain data and others do
 558 not.  The declaration is simply as follows:
 559 .DS
 560 .ft CW
 561 void;
 562 .DE
 563 Voids are illustrated as follows:
 564 .ie t .DS
 565 .el .DS L
 566 \fIVoid\fP
 567
 568 \f(CW  ++
 569   ||
 570   ++
 571 --><-- 0 bytes\fP
 572 .DE
 573 .NH 2
 574 \&Constant
 575 .IX XDR constant
 576 .LP
 577 The data declaration for a constant follows this form:
 578 .DS
 579 .ft CW
 580 const name-identifier = n;
 581 .DE
 582 "const" is used to define a symbolic name for a constant; it does not
 583 declare any data.  The symbolic constant may be used anywhere a
 584 regular constant may be used.  For example, the following defines a
 585 symbolic constant DOZEN, equal to 12.
 586 .DS
 587 .ft CW
 588 const DOZEN = 12;
 589 .DE
 590 .NH 2
 591 \&Typedef
 592 .IX XDR typedef
 593 .LP
 594 "typedef" does not declare any data either, but serves to define new
 595 identifiers for declaring data. The syntax is:
 596 .DS
 597 .ft CW
 598 typedef declaration;
 599 .DE
 600 The new type name is actually the variable name in the declaration
 601 part of the typedef.  For example, the following defines a new type
 602 called "eggbox" using an existing type called "egg":
 603 .DS
 604 .ft CW
 605 typedef egg eggbox[DOZEN];
 606 .DE
 607 Variables declared using the new type name have the same type as the
 608 new type name would have in the typedef, if it was considered a
 609 variable.  For example, the following two declarations are equivalent
 610 in declaring the variable "fresheggs":
 611 .DS
 612 .ft CW
 613 eggbox  fresheggs;
 614 egg     fresheggs[DOZEN];
 615 .DE
 616 When a typedef involves a struct, enum, or union definition, there is
 617 another (preferred) syntax that may be used to define the same type.
 618 In general, a typedef of the following form:
 619 .DS
 620 .ft CW
 621 typedef <<struct, union, or enum definition>> identifier;
 622 .DE
 623 may be converted to the alternative form by removing the "typedef"
 624 part and placing the identifier after the "struct", "union", or
 625 "enum" keyword, instead of at the end.  For example, here are the two
 626 ways to define the type "bool":
 627 .DS
 628 .ft CW
 629 typedef enum {    /* \fIusing typedef\fP */
 630         FALSE = 0,
 631         TRUE = 1
 632         } bool;
 633
 634 enum bool {       /* \fIpreferred alternative\fP */
 635         FALSE = 0,
 636         TRUE = 1
 637         };
 638 .DE
  639 This syntax is preferred because one does not have to wait
 640 until the end of a declaration to figure out the name of the new
 641 type.
 642 .NH 2
 643 \&Optional-data
 644 .IX XDR "optional data"
 645 .IX XDR "data, optional"
 646 .LP
  647 Optional-data is a kind of union that occurs so frequently that
  648 it has a special declaration syntax of its own.  It is declared
  649 as follows:
 650 .DS
 651 .ft CW
 652 type-name *identifier;
 653 .DE
 654 This is equivalent to the following union:
 655 .DS
 656 .ft CW
 657 union switch (bool opted) {
 658         case TRUE:
 659         type-name element;
 660         case FALSE:
 661         void;
 662 } identifier;
 663 .DE
 664 It is also equivalent to the following variable-length array
 665 declaration, since the boolean "opted" can be interpreted as the
 666 length of the array:
 667 .DS
 668 .ft CW
 669 type-name identifier<1>;
 670 .DE
 671 Optional-data is not so interesting in itself, but it is very useful
 672 for describing recursive data-structures such as linked-lists and
 673 trees.  For example, the following defines a type "stringlist" that
 674 encodes lists of arbitrary length strings:
 675 .DS
 676 .ft CW
 677 struct *stringlist {
 678         string item<>;
 679         stringlist next;
 680 };
 681 .DE
 682 It could have been equivalently declared as the following union:
 683 .DS
 684 .ft CW
 685 union stringlist switch (bool opted) {
 686         case TRUE:
 687                 struct {
 688                         string item<>;
 689                         stringlist next;
 690                 } element;
 691         case FALSE:
 692                 void;
 693 };
 694 .DE
 695 or as a variable-length array:
 696 .DS
 697 .ft CW
 698 struct stringlist<1> {
 699         string item<>;
 700         stringlist next;
 701 };
 702 .DE
 703 Both of these declarations obscure the intention of the stringlist
 704 type, so the optional-data declaration is preferred over both of
 705 them.  The optional-data type also has a close correlation to how
 706 recursive data structures are represented in high-level languages
 707 such as Pascal or C by use of pointers. In fact, the syntax is the
 708 same as that of the C language for pointers.
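.LP
To make the encoding of the stringlist type concrete, the C sketch
below writes each list entry as the boolean TRUE followed by the
string, and ends the list with the boolean FALSE.  The function name
and the use of an array of C strings as input are assumptions made
for the example.
.DS
.ft CW
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store v as four bytes, most significant byte first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Encode count strings as a stringlist.  Returns the number of bytes
 * written, or 0 if the buffer of cap bytes is too small.
 */
size_t xdr_put_stringlist(unsigned char *out, size_t cap,
                          const char *const *items, size_t count)
{
    size_t pos = 0, i;

    assert(out != NULL && (items != NULL || count == 0));
    for (i = 0; i < count; i++) {
        size_t n = strlen(items[i]);
        size_t pad = (4 - n % 4) % 4;

        if (n > 0xffffffffu || cap - pos < 8 || cap - pos - 8 < n + pad)
            return 0;
        put_u32(out + pos, 1);                 /* opted = TRUE */
        put_u32(out + pos + 4, (uint32_t)n);   /* string length */
        memcpy(out + pos + 8, items[i], n);
        memset(out + pos + 8 + n, 0, pad);     /* residual zero bytes */
        pos += 8 + n + pad;
    }
    if (cap - pos < 4)
        return 0;
    put_u32(out + pos, 0);                     /* opted = FALSE */
    return pos + 4;
}
```
.DE
.LP
A list holding the strings "ab" and "cdef" occupies 28 bytes: 12 for
each entry and 4 for the terminating FALSE.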
 709 .NH 2
 710 \&Areas for Future Enhancement
 711 .IX XDR futures
 712 .LP
 713 The XDR standard lacks representations for bit fields and bitmaps,
 714 since the standard is based on bytes.  Also missing are packed (or
 715 binary-coded) decimals.
 716 .LP
 717 The intent of the XDR standard was not to describe every kind of data
 718 that people have ever sent or will ever want to send from machine to
 719 machine. Rather, it only describes the most commonly used data-types
 720 of high-level languages such as Pascal or C so that applications
 721 written in these languages will be able to communicate easily over
 722 some medium.
 723 .LP
 724 One could imagine extensions to XDR that would let it describe almost
 725 any existing protocol, such as TCP.  The minimum necessary for this
 726 are support for different block sizes and byte-orders.  The XDR
 727 discussed here could then be considered the 4-byte big-endian member
 728 of a larger XDR family.
 729 .NH 1
 730 \&Discussion
 731 .sp 2
 732 .NH 2
 733 \&Why a Language for Describing Data?
 734 .IX XDR language
 735 .LP
 736 There are many advantages in using a data-description language such
  737 as XDR versus using diagrams.  Languages are more formal than
  738 diagrams and lead to less ambiguous descriptions of data.
  739 Languages are also easier to understand and allow one to think of
  740 other issues instead of the low-level details of bit-encoding.
  741 Also, there is a close analogy between the types of XDR and a
  742 high-level language such as C or Pascal.  This makes the
  743 implementation of XDR encoding and decoding modules an easier task.
  744 Finally, the language specification itself is an ASCII string that
  745 can be passed from machine to machine to perform on-the-fly data
 746 interpretation.
 747 .NH 2
 748 \&Why Only one Byte-Order for an XDR Unit?
 749 .IX XDR "byte order"
 750 .LP
 751 Supporting two byte-orderings requires a higher level protocol for
 752 determining in which byte-order the data is encoded.  Since XDR is
 753 not a protocol, this can't be done.  The advantage of this, though,
 754 is that data in XDR format can be written to a magnetic tape, for
 755 example, and any machine will be able to interpret it, since no
 756 higher level protocol is necessary for determining the byte-order.
 757 .NH 2
 758 \&Why does XDR use Big-Endian Byte-Order?
 759 .LP
 760 Yes, it is unfair, but having only one byte-order means you have to
 761 be unfair to somebody.  Many architectures, such as the Motorola
 762 68000 and IBM 370, support the big-endian byte-order.
 763 .NH 2
 764 \&Why is the XDR Unit Four Bytes Wide?
 765 .LP
 766 There is a tradeoff in choosing the XDR unit size.  Choosing a small
 767 size such as two makes the encoded data small, but causes alignment
 768 problems for machines that aren't aligned on these boundaries.  A
 769 large size such as eight means the data will be aligned on virtually
 770 every machine, but causes the encoded data to grow too big.  We chose
 771 four as a compromise.  Four is big enough to support most
 772 architectures efficiently, except for rare machines such as the
 773 eight-byte aligned Cray.  Four is also small enough to keep the
 774 encoded data restricted to a reasonable size.
 775 .NH 2
 776 \&Why must Variable-Length Data be Padded with Zeros?
 777 .IX XDR "variable-length data"
 778 .LP
 779 It is desirable that the same data encode into the same thing on all
 780 machines, so that encoded data can be meaningfully compared or
 781 checksummed.  Forcing the padded bytes to be zero ensures this.
 782 .NH 2
 783 \&Why is there No Explicit Data-Typing?
 784 .LP
 785 Data-typing has a relatively high cost for what small advantages it
 786 may have.  One cost is the expansion of data due to the inserted type
 787 fields.  Another is the added cost of interpreting these type fields
 788 and acting accordingly.  And most protocols already know what type
 789 they expect, so data-typing supplies only redundant information.
 790 However, one can still get the benefits of data-typing using XDR. One
 791 way is to encode two things: first a string which is the XDR data
 792 description of the encoded data, and then the encoded data itself.
 793 Another way is to assign a value to all the types in XDR, and then
 794 define a universal type which takes this value as its discriminant
 795 and for each value, describes the corresponding data type.
 796 .NH 1
 797 \&The XDR Language Specification
 798 .IX XDR language
 799 .sp 1
 800 .NH 2
 801 \&Notational Conventions
 802 .IX "XDR language" notation
 803 .LP
  804 This specification uses an extended Backus-Naur Form notation for
  805 describing the XDR language.  Here is a brief description of the
 806 notation:
 807 .IP  1.
 808 The characters
 809 .I | ,
 810 .I ( ,
 811 .I ) ,
 812 .I [ ,
 813 .I ] ,
 814 .I " ,
 815 and
 816 .I *
 817 are special.
 818 .IP  2.
  819 Terminal symbols are strings of any characters surrounded by
 820 double quotes.
 821 .IP  3.
 822 Non-terminal symbols are strings of non-special characters.
 823 .IP  4.
 824 Alternative items are separated by a vertical bar ("\fI|\fP").
 825 .IP  5.
 826 Optional items are enclosed in brackets.
 827 .IP  6.
 828 Items are grouped together by enclosing them in parentheses.
 829 .IP  7.
 830 A
 831 .I *
 832 following an item means  0 or more  occurrences of that item.
 833 .LP
 834 For example,  consider  the  following pattern:
 835 .DS L
"a " "very" ("," " very")* [" cold " "and"] " rainy " ("day" | "night")
.DE
.LP
An infinite number of strings match this pattern.  A few of them
are:
.DS
"a very rainy day"
"a very, very rainy day"
"a very cold and rainy day"
"a very, very, very cold and rainy night"
 846 .DE
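.LP
As a rough cross-check, the example pattern can be rendered as a
regular expression.  The Python sketch below is not part of the
notation; it assumes that runs of blanks are collapsed to a single
blank before matching, and the function name matches is made up for
the example.

```python
import re

# Regex rendering of the example pattern.  Grouping, the optional item and
# the repetition map directly onto (), ? and * in regex syntax.
PATTERN = re.compile(r"a very(, very)*( cold and)? rainy (day|night)")

def matches(text):
    """Return True if text matches the pattern after collapsing blanks."""
    normalized = re.sub(r" +", " ", text)
    return PATTERN.fullmatch(normalized) is not None

for candidate in ["a very rainy day",
                  "a very, very, very cold and rainy night",
                  "a rainy day"]:
    print(candidate, "->", matches(candidate))
```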
 847 .NH 2
 848 \&Lexical Notes
 849 .IP  1.
 850 Comments begin with '/*' and terminate with '*/'.
 851 .IP  2.
 852 White space serves to separate items and is otherwise ignored.
 853 .IP  3.
An identifier is a letter followed by an optional sequence of
letters, digits, or underscores ('_').  Identifiers are
case-sensitive.
 857 .IP  4.
 858 A  constant is  a  sequence  of  one  or  more decimal digits,
 859 optionally preceded by a minus-sign ('-').
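.LP
The four lexical notes can be sketched as a small tokenizer.  The
Python below is illustrative only: the names TOKEN and tokenize and the
token kinds "ident", "const" and "punct" are assumptions of this
example, and keywords are reported as plain identifiers.

```python
import re

# Note 3: identifier = letter, then letters, digits or underscores.
# Note 4: constant = optional minus sign, then one or more decimal digits.
# Anything else is treated as a one-character punctuation token.
TOKEN = re.compile(
    r"(?P<ident>[A-Za-z][A-Za-z0-9_]*)|(?P<const>-?[0-9]+)|(?P<punct>.)")

def tokenize(source):
    """Split XDR language text into (kind, text) pairs."""
    # Note 1: comments run from '/*' to '*/' and are dropped.
    without_comments = re.sub(r"/[*].*?[*]/", " ", source, flags=re.S)
    tokens = []
    # Note 2: white space separates items and is otherwise ignored.
    for word in without_comments.split():
        for m in TOKEN.finditer(word):
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("const MAXNAMELEN = 255; /* max length of a file name */"))
```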
 860 .NH 2
 861 \&Syntax Information
 862 .IX "XDR language" syntax
 863 .DS
 864 .ft CW
 865 declaration:
 866         type-specifier identifier
 867         | type-specifier identifier "[" value "]"
 868         | type-specifier identifier "<" [ value ] ">"
 869         | "opaque" identifier "[" value "]"
 870         | "opaque" identifier "<" [ value ] ">"
 871         | "string" identifier "<" [ value ] ">"
 872         | type-specifier "*" identifier
 873         | "void"
 874 .DE
 875 .DS
 876 .ft CW
 877 value:
 878         constant
 879         | identifier
 880
 881 type-specifier:
 882           [ "unsigned" ] "int"
 883         | [ "unsigned" ] "hyper"
 884         | "float"
 885         | "double"
 886         | "bool"
 887         | enum-type-spec
 888         | struct-type-spec
 889         | union-type-spec
 890         | identifier
 891 .DE
 892 .DS
 893 .ft CW
 894 enum-type-spec:
 895         "enum" enum-body
 896
 897 enum-body:
 898         "{"
 899         ( identifier "=" value )
 900         ( "," identifier "=" value )*
 901         "}"
 902 .DE
 903 .DS
 904 .ft CW
 905 struct-type-spec:
 906         "struct" struct-body
 907
 908 struct-body:
 909         "{"
 910         ( declaration ";" )
 911         ( declaration ";" )*
 912         "}"
 913 .DE
 914 .DS
 915 .ft CW
 916 union-type-spec:
 917         "union" union-body
 918
 919 union-body:
 920         "switch" "(" declaration ")" "{"
 921         ( "case" value ":" declaration ";" )
 922         ( "case" value ":" declaration ";" )*
 923         [ "default" ":" declaration ";" ]
 924         "}"
 925
 926 constant-def:
 927         "const" identifier "=" constant ";"
 928 .DE
 929 .DS
 930 .ft CW
 931 type-def:
 932         "typedef" declaration ";"
 933         | "enum" identifier enum-body ";"
 934         | "struct" identifier struct-body ";"
 935         | "union" identifier union-body ";"
 936
 937 definition:
 938         type-def
 939         | constant-def
 940
 941 specification:
 942         definition *
 943 .DE
 944 .NH 3
 945 \&Syntax Notes
 946 .IX "XDR language" syntax
 947 .LP
 948 .IP  1.
 949 The following are keywords and cannot be used as identifiers:
 950 "bool", "case", "const", "default", "double", "enum", "float",
 951 "hyper", "opaque", "string", "struct", "switch", "typedef", "union",
 952 "unsigned" and "void".
 953 .IP  2.
 954 Only unsigned constants may be used as size specifications for
 955 arrays.  If an identifier is used, it must have been declared
 956 previously as an unsigned constant in a "const" definition.
 957 .IP  3.
 958 Constant and type identifiers within the scope of a specification
 959 are in the same name space and must be declared uniquely within this
 960 scope.
 961 .IP  4.
 962 Similarly, variable names must  be unique within  the scope  of
 963 struct and union declarations. Nested struct and union declarations
 964 create new scopes.
 965 .IP  5.
The discriminant of a union must be of a type that evaluates to
an integer.  That is, "int", "unsigned int", "bool", an enumerated
type, or any typedef'd type that evaluates to one of these is legal.
Also, each case value must be one of the legal values of the
discriminant.  Finally, a case value may not be specified more than
once within the scope of a union declaration.
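.LP
The two rules on case values in note 5 could be checked along the
following lines.  This Python sketch is only an illustration: the
function name check_union_cases and the enumeration values used are
made up for the example.

```python
def check_union_cases(legal_values, case_values):
    """Return a list of problems with a union's case values (empty if none)."""
    problems = []
    seen = set()
    for value in case_values:
        if value not in legal_values:
            problems.append("case %r is not a legal discriminant value" % (value,))
        if value in seen:
            problems.append("case %r is specified more than once" % (value,))
        seen.add(value)
    return problems

# Hypothetical enumerated discriminant with three legal values.
legal = {2, 3, 5}
print(check_union_cases(legal, [2, 3, 5]))
print(check_union_cases(legal, [2, 3, 3, 7]))
```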
 972 .NH 1
 973 \&An Example of an XDR Data Description
 974 .LP
 975 Here is a short XDR data description of a thing called a "file",
 976 which might be used to transfer files from one machine to another.
 977 .ie t .DS
 978 .el .DS L
 979 .ft CW
 980
 981 const MAXUSERNAME = 32;     /*\fI max length of a user name \fP*/
 982 const MAXFILELEN = 65535;   /*\fI max length of a file      \fP*/
 983 const MAXNAMELEN = 255;     /*\fI max length of a file name \fP*/
 984
 985 .ft I
 986 /*
 987  * Types of files:
 988  */
 989 .ft CW
 990
 991 enum filekind {
 992         TEXT = 0,       /*\fI ascii data \fP*/
 993         DATA = 1,       /*\fI raw data   \fP*/
 994         EXEC = 2        /*\fI executable \fP*/
 995 };
 996
 997 .ft I
 998 /*
 999  * File information, per kind of file:
1000  */
1001 .ft CW
1002
1003 union filetype switch (filekind kind) {
1004         case TEXT:
1005                 void;                           /*\fI no extra information \fP*/
1006         case DATA:
1007                 string creator<MAXNAMELEN>;     /*\fI data creator         \fP*/
1008         case EXEC:
1009                 string interpretor<MAXNAMELEN>; /*\fI program interpretor  \fP*/
1010 };
1011
1012 .ft I
1013 /*
1014  * A complete file:
1015  */
1016 .ft CW
1017
1018 struct file {
1019         string filename<MAXNAMELEN>; /*\fI name of file \fP*/
1020         filetype type;               /*\fI info about file \fP*/
1021         string owner<MAXUSERNAME>;   /*\fI owner of file   \fP*/
1022         opaque data<MAXFILELEN>;     /*\fI file data       \fP*/
1023 };
1024 .DE
1025 .LP
Suppose now that there is a user named "john" who wants to store
his Lisp program "sillyprog", which contains just the data "(quit)".
His file would be encoded as follows:
1029 .TS
1030 box tab (&) ;
1031 lfI lfI lfI lfI
1032 rfL rfL rfL l .
1033 Offset&Hex Bytes&ASCII&Description
1034 _
1035 0&00 00 00 09&....&Length of filename = 9
1036 4&73 69 6c 6c&sill&Filename characters
1037 8&79 70 72 6f&ypro& ... and more characters ...
1038 12&67 00 00 00&g...& ... and 3 zero-bytes of fill
1039 16&00 00 00 02&....&Filekind is EXEC = 2
1040 20&00 00 00 04&....&Length of interpretor = 4
1041 24&6c 69 73 70&lisp&Interpretor characters
1042 28&00 00 00 04&....&Length of owner = 4
1043 32&6a 6f 68 6e&john&Owner characters
1044 36&00 00 00 06&....&Length of file data = 6
1045 40&28 71 75 69&(qui&File data bytes ...
1046 44&74 29 00 00&t)..& ... and 2 zero-bytes of fill
1047 .TE
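.LP
The table can be reproduced with a few lines of Python.  This is a
sketch under stated assumptions: the helper names xdr_uint, xdr_bytes
and encode_exec_file are invented for the example, only the EXEC arm
of the union is handled, and the maximum lengths declared in the data
description are not enforced.

```python
import struct

EXEC = 2  # filekind value taken from the data description

def xdr_uint(n):
    """Encode an unsigned integer as four big-endian bytes."""
    return struct.pack(">I", n)

def xdr_bytes(raw):
    """Length word, the bytes, then zero fill up to a multiple of four."""
    return xdr_uint(len(raw)) + raw + bytes((4 - len(raw) % 4) % 4)

def encode_exec_file(filename, interpretor, owner, data):
    """Encode a 'file' whose filetype arm is EXEC, in declaration order."""
    return (xdr_bytes(filename.encode("ascii"))
            + xdr_uint(EXEC)
            + xdr_bytes(interpretor.encode("ascii"))
            + xdr_bytes(owner.encode("ascii"))
            + xdr_bytes(data))

encoded = encode_exec_file("sillyprog", "lisp", "john", b"(quit)")
for offset in range(0, len(encoded), 4):
    print(offset, encoded[offset:offset + 4].hex(" "))
```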
1048 .NH 1
1049 \&References
1050 .LP
1051 [1]  Brian W. Kernighan & Dennis M. Ritchie, "The C Programming
1052 Language", Bell Laboratories, Murray Hill, New Jersey, 1978.
1053 .LP
1054 [2]  Danny Cohen, "On Holy Wars and a Plea for Peace", IEEE Computer,
1055 October 1981.
1056 .LP
1057 [3]  "IEEE Standard for Binary Floating-Point Arithmetic", ANSI/IEEE
1058 Standard 754-1985, Institute of Electrical and Electronics
1059 Engineers, August 1985.
1060 .LP
1061 [4]  "Courier: The Remote Procedure Call Protocol", XEROX
1062 Corporation, XSIS 038112, December 1981.