   1 .\"
   2 .\"  Must use -- tbl -- with this one
   3 .\"
   4 .\" @(#)xdr.rfc.ms      2.2 88/08/05 4.0 RPCSRC
   5 .\" $FreeBSD: src/lib/libc/rpc/PSD.doc/xdr.rfc.ms,v 1.1.14.1 2000/11/24 09:36:30 ru Exp $
   6 .\"
   7 .so stubs
   8 .de BT
   9 .if \\n%=1 .tl ''- % -''
  10 ..
  11 .ND
  12 .\" prevent excess underlining in nroff
  13 .if n .fp 2 R
  14 .OH 'External Data Representation Standard''Page %'
  15 .EH 'Page %''External Data Representation Standard'
  16 .if \n%=1 .bp
  17 .SH
  18 \&External Data Representation Standard: Protocol Specification
  19 .IX "External Data Representation"
  20 .IX XDR RFC
  21 .IX XDR "protocol specification"
  22 .LP
  23 .NH 0
  24 \&Status of this Standard
  25 .nr OF 1
  26 .IX XDR "RFC status"
  27 .LP
  28 Note: This chapter specifies a protocol that Sun Microsystems, Inc., and
  29 others are using.  It has been designated RFC1014 by the ARPA Network
  30 Information Center.
  31 .NH 1
  32 Introduction
  33 \&
  34 .LP
  35 XDR is a standard for the description and encoding of data.  It is
  36 useful for transferring data between different computer
  37 architectures, and has been used to communicate data between such
  38 diverse machines as the Sun Workstation, VAX, IBM-PC, and Cray.
  39 XDR fits into the ISO presentation layer, and is roughly analogous in
  40 purpose to X.409, ISO Abstract Syntax Notation.  The major difference
  41 between these two is that XDR uses implicit typing, while X.409 uses
  42 explicit typing.
  43 .LP
  44 XDR uses a language to describe data formats.  The language can be
  45 used only to describe data; it is not a programming language.
  46 This language allows one to describe intricate data formats in a
  47 concise manner. The alternative of using graphical representations
  48 (itself an informal language) quickly becomes incomprehensible when
  49 faced with complexity.  The XDR language itself is similar to the C
  50 language [1], just as Courier [4] is similar to Mesa. Protocols such
  51 as Sun RPC (Remote Procedure Call) and the NFS (Network File System)
  52 use XDR to describe the format of their data.
  53 .LP
  54 The XDR standard makes the following assumption: that bytes (or
  55 octets) are portable, where a byte is defined to be 8 bits of data.
  56 A given hardware device should encode the bytes onto the various
  57 media in such a way that other hardware devices may decode the bytes
  58 without loss of meaning.  For example, the Ethernet standard
  59 suggests that bytes be encoded in "little-endian" style [2], or least
  60 significant bit first.
  61 .NH 2
  62 \&Basic Block Size
  63 .IX XDR "basic block size"
  64 .IX XDR "block size"
  65 .LP
  66 The representation of all items requires a multiple of four bytes (or
  67 32 bits) of data.  The bytes are numbered 0 through n-1.  The bytes
  68 are read or written to some byte stream such that byte m always
  69 precedes byte m+1.  If the n bytes needed to contain the data are not
  70 a multiple of four, then the n bytes are followed by enough (0 to 3)
  71 residual zero bytes, r, to make the total byte count a multiple of 4.
  72 .LP
  73 We include the familiar graphic box notation for illustration and
  74 comparison.  In most illustrations, each box (delimited by a plus
  75 sign at the 4 corners and vertical bars and dashes) depicts a byte.
  76 Ellipses (...) between boxes show zero or more additional bytes where
  77 required.
  78 .ie t .DS
  79 .el .DS L
  80 \fIA Block\fP
  81
  82 \f(CW+--------+--------+...+--------+--------+...+--------+
  83 | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
  84 +--------+--------+...+--------+--------+...+--------+
  85 |<-----------n bytes---------->|<------r bytes------>|
  86 |<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP
  87
  88 .DE
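.LP
The padding rule can be sketched in a few lines of Python.  This is an
illustration only, not part of the standard; the function name xdr_pad
is hypothetical.
.DS
.ft CW
```python
def xdr_pad(data: bytes) -> bytes:
    """Return data followed by r zero bytes, where (n + r) mod 4 = 0."""
    n = len(data)
    r = (4 - n % 4) % 4  # r is always in the range 0 to 3
    return data + bytes(r)  # bytes(r) is a run of r zero bytes
```
.DE
.LP
For instance, a 5-byte item is followed by 3 zero bytes, while a
4-byte item is left unchanged.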
  89 .NH 1
  90 \&XDR Data Types
  91 .IX XDR "data types"
  92 .IX "XDR data types"
  93 .LP
  94 Each of the sections that follow describes a data type defined in the
  95 XDR standard, shows how it is declared in the language, and includes
  96 a graphic illustration of its encoding.
  97 .LP
  98 For each data type in the language we show a general paradigm
  99 declaration.  Note that angle brackets (< and >) denote
 100 variable-length sequences of data and square brackets ([ and ]) denote
 101 fixed-length sequences of data.  "n", "m" and "r" denote integers.
 102 For the full language specification and more formal definitions of
 103 terms such as "identifier" and "declaration", refer to
 104 .I "The XDR Language Specification" ,
 105 below.
 106 .LP
 107 For some data types, more specific examples are included.
 108 A more extensive example of a data description is in
 109 .I "An Example of an XDR Data Description"
 110 below.
 111 .NH 2
 112 \&Integer
 113 .IX XDR integer
 114 .LP
 115 An XDR signed integer is a 32-bit datum that encodes an integer in
 116 the range [-2147483648,2147483647].  The integer is represented in
 117 two's complement notation.  The most and least significant bytes are
 118 0 and 3, respectively.  An integer is declared as \f(CWint identifier;\fP and encoded as follows:
 119 .ie t .DS
 120 .el .DS L
 121 \fIInteger\fP
 122
 123 \f(CW(MSB)                   (LSB)
 124 +-------+-------+-------+-------+
 125 |byte 0 |byte 1 |byte 2 |byte 3 |
 126 +-------+-------+-------+-------+
 127 <------------32 bits------------>\fP
 128 .DE
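.LP
A minimal sketch of this encoding, assuming Python's standard struct
module.  The names encode_int and decode_int are illustrative and are
not defined by the standard.
.DS
.ft CW
```python
import struct

def encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer, most significant byte first."""
    if not -2147483648 <= value <= 2147483647:
        raise ValueError("value does not fit in an XDR int")
    return struct.pack(">i", value)  # ">" selects big-endian order

def decode_int(data: bytes) -> int:
    """Decode the first four bytes of data as an XDR int."""
    return struct.unpack(">i", data[:4])[0]
```
.DE
.LP
Under these assumptions, encode_int(1) yields the bytes 00 00 00 01
and encode_int(-1) yields ff ff ff ff.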
 129 .NH 2
 130 \&Unsigned Integer
 131 .IX XDR "unsigned integer"
 132 .IX XDR "integer, unsigned"
 133 .LP
 134 An XDR unsigned integer is a 32-bit datum that encodes a nonnegative
 135 integer in the range [0,4294967295].  It is represented by an
 136 unsigned binary number whose most and least significant bytes are 0
 137 and 3, respectively.  An unsigned integer is declared as \f(CWunsigned int identifier;\fP and encoded as follows:
 138 .ie t .DS
 139 .el .DS L
 140 \fIUnsigned Integer\fP
 141
 142 \f(CW(MSB)                   (LSB)
 143 +-------+-------+-------+-------+
 144 |byte 0 |byte 1 |byte 2 |byte 3 |
 145 +-------+-------+-------+-------+
 146 <------------32 bits------------>\fP
 147 .DE
 148 .NH 2
 149 \&Enumeration
 150 .IX XDR enumeration
 151 .LP
 152 Enumerations have the same representation as signed integers.
 153 Enumerations are handy for describing subsets of the integers.
 154 Enumerated data is declared as follows:
 155 .ft CW
 156 .DS
 157 enum { name-identifier = constant, ... } identifier;
 158 .DE
 159 For example, the three colors red, yellow, and blue could be
 160 described by an enumerated type:
 161 .DS
 162 .ft CW
 163 enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;
 164 .DE
 165 It is an error to encode as an enum any integer other than those that
 166 have been given assignments in the enum declaration.
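.LP
The rule that only assigned values may be encoded can be checked when
decoding.  The following Python sketch uses the "colors" example; the
table COLORS and both function names are hypothetical.
.DS
.ft CW
```python
import struct

# Values taken from the "colors" enumeration example.
COLORS = {"RED": 2, "YELLOW": 3, "BLUE": 5}

def encode_color(name: str) -> bytes:
    """Encode an enumeration member exactly like a signed integer."""
    return struct.pack(">i", COLORS[name])

def decode_color(data: bytes) -> str:
    """Decode four bytes and reject integers with no assignment."""
    value = struct.unpack(">i", data[:4])[0]
    for name, assigned in COLORS.items():
        if assigned == value:
            return name
    raise ValueError("integer %d is not assigned in the enum" % value)
```
.DE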
 167 .NH 2
 168 \&Boolean
 169 .IX XDR boolean
 170 .LP
 171 Booleans are important enough and occur frequently enough to warrant
 172 their own explicit type in the standard.  Booleans are declared as
 173 follows:
 174 .DS
 175 .ft CW
 176 bool identifier;
 177 .DE
 178 This is equivalent to:
 179 .DS
 180 .ft CW
 181 enum { FALSE = 0, TRUE = 1 } identifier;
 182 .DE
 183 .NH 2
 184 \&Hyper Integer and Unsigned Hyper Integer
 185 .IX XDR "hyper integer"
 186 .IX XDR "integer, hyper"
 187 .LP
 188 The standard also defines 64-bit (8-byte) numbers called hyper
 189 integer and unsigned hyper integer.  Their representations are the
 190 obvious extensions of integer and unsigned integer defined above.
 191 They are represented in two's complement notation.  The most and
 192 least significant bytes are 0 and 7, respectively.  They are declared as
 193 \f(CWhyper identifier;\fP and \f(CWunsigned hyper identifier;\fP and encoded as follows:
 194 .ie t .DS
 195 .el .DS L
 196 \fIHyper Integer\fP
 197 \fIUnsigned Hyper Integer\fP
 198
 199 \f(CW(MSB)                                                   (LSB)
 200 +-------+-------+-------+-------+-------+-------+-------+-------+
 201 |byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
 202 +-------+-------+-------+-------+-------+-------+-------+-------+
 203 <----------------------------64 bits---------------------------->\fP
 204 .DE
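.LP
The 64-bit forms can be sketched the same way as the 32-bit ones,
again assuming Python's struct module.  The function names are
illustrative only.
.DS
.ft CW
```python
import struct

def encode_hyper(value: int) -> bytes:
    """Encode a signed 64-bit integer in two's complement, MSB first."""
    return struct.pack(">q", value)

def encode_unsigned_hyper(value: int) -> bytes:
    """Encode an unsigned 64-bit integer, MSB first."""
    return struct.pack(">Q", value)
```
.DE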
 205 .NH 2
 206 \&Floating-point
 207 .IX XDR "floating point"
 208 .IX XDR "floating point, single-precision"
 209 .LP
 210 The standard defines the floating-point data type "float" (32 bits or
 211 4 bytes).  The encoding used is the IEEE standard for normalized
 212 single-precision floating-point numbers [3].  The following three
 213 fields describe the single-precision floating-point number:
 214 .RS
 215 .IP \fBS\fP:
 216 The sign of the number.  Values 0 and 1 represent positive and
 217 negative, respectively.  One bit.
 218 .IP \fBE\fP:
 219 The exponent of the number, base 2.  8 bits are devoted to this
 220 field.  The exponent is biased by 127.
 221 .IP \fBF\fP:
 222 The fractional part of the number's mantissa, base 2.  23 bits
 223 are devoted to this field.
 224 .RE
 225 .LP
 226 Therefore, the floating-point number is described by:
 227 .DS
 228 (-1)**S * 2**(E-Bias) * 1.F
 229 .DE
 230 It is declared as \f(CWfloat identifier;\fP and encoded as follows:
 231 .ie t .DS
 232 .el .DS L
 233 \fISingle-Precision Floating-Point\fP
 234
 235 \f(CW+-------+-------+-------+-------+
 236 |byte 0 |byte 1 |byte 2 |byte 3 |
 237 S|   E   |           F          |
 238 +-------+-------+-------+-------+
 239 1|<- 8 ->|<-------23 bits------>|
 240 <------------32 bits------------>\fP
 241 .DE
 242 Just as the most and least significant bytes of a number are 0 and 3,
 243 the most and least significant bits of a single-precision
 244 floating-point number are 0 and 31.  The beginning bit (and most significant
 245 bit) offsets of S, E, and F are 0, 1, and 9, respectively.  Note that
 246 these numbers refer to the mathematical positions of the bits, and
 247 NOT to their actual physical locations (which vary from medium to
 248 medium).
 249 .LP
 250 The IEEE specifications should be consulted concerning the encoding
 251 for signed zero, signed infinity (overflow), and denormalized numbers
 252 (underflow) [3].  According to IEEE specifications, the "NaN" (not a
 253 number) is system dependent and should not be used externally.
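.LP
The S, E, and F fields can be extracted from an encoded value with
shifts and masks.  The sketch below assumes Python's struct module,
whose ">f" format produces the big-endian IEEE single-precision
layout.  The function names are hypothetical, and float_value handles
normalized numbers only.
.DS
.ft CW
```python
import struct

def float_fields(value: float):
    """Return (S, E, F) for the single-precision encoding of value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    s = bits >> 31            # bit 0: sign
    e = (bits >> 23) & 0xFF   # bits 1 to 8: exponent, biased by 127
    f = bits & 0x7FFFFF       # bits 9 to 31: fraction
    return s, e, f

def float_value(s: int, e: int, f: int) -> float:
    """Evaluate (-1)**S * 2**(E-Bias) * 1.F for a normalized number."""
    return (-1) ** s * 2.0 ** (e - 127) * (1 + f / 2 ** 23)
```
.DE
.LP
For example, -2.5 is encoded as the bytes c0 20 00 00, which gives
S = 1, E = 128, and F = 2097152.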
 254 .NH 2
 255 \&Double-precision Floating-point
 256 .IX XDR "double-precision floating point"
 257 .IX XDR "floating point, double-precision"
 258 .LP
 259 The standard defines the encoding for the double-precision
 260 floating-point data type "double" (64 bits or 8 bytes).  The encoding used is
 261 the IEEE standard for normalized double-precision floating-point
 262 numbers [3].  The standard encodes the following three fields, which
 263 describe the double-precision floating-point number:
 264 .RS
 265 .IP \fBS\fP:
 266 The sign of the number.  Values 0 and 1 represent positive and
 267 negative, respectively.  One bit.
 268 .IP \fBE\fP:
 269 The exponent of the number, base 2.  11 bits are devoted to this
 270 field.  The exponent is biased by 1023.
 271 .IP \fBF\fP:
 272 The fractional part of the number's mantissa, base 2.  52 bits
 273 are devoted to this field.
 274 .RE
 275 .LP
 276 Therefore, the floating-point number is described by:
 277 .DS
 278 (-1)**S * 2**(E-Bias) * 1.F
 279 .DE
 280 It is declared as \f(CWdouble identifier;\fP and encoded as follows:
 281 .ie t .DS
 282 .el .DS L
 283 \fIDouble-Precision Floating-Point\fP
 284
 285 \f(CW+------+------+------+------+------+------+------+------+
 286 |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
 287 S|    E   |                    F                        |
 288 +------+------+------+------+------+------+------+------+
 289 1|<--11-->|<-----------------52 bits------------------->|
 290 <-----------------------64 bits------------------------->\fP
 291 .DE
 292 Just as the most and least significant bytes of a number are 0 and 3,
 293 the most and least significant bits of a double-precision
 294 floating-point number are 0 and 63.  The beginning bit (and most significant
 295 bit) offsets of S, E, and F are 0, 1, and 12, respectively.  Note
 296 that these numbers refer to the mathematical positions of the bits,
 297 and NOT to their actual physical locations (which vary from medium to
 298 medium).
 299 .LP
 300 The IEEE specifications should be consulted concerning the encoding
 301 for signed zero, signed infinity (overflow), and denormalized numbers
 302 (underflow) [3].  According to IEEE specifications, the "NaN" (not a
 303 number) is system dependent and should not be used externally.
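.LP
The double-precision fields can be separated in the same manner.
This sketch assumes Python's struct module and its ">d" format; the
function name double_fields is hypothetical.
.DS
.ft CW
```python
import struct

def double_fields(value: float):
    """Return (S, E, F) for the double-precision encoding of value."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    s = bits >> 63               # bit 0: sign
    e = (bits >> 52) & 0x7FF     # bits 1 to 11: exponent, biased by 1023
    f = bits & ((1 << 52) - 1)   # bits 12 to 63: fraction
    return s, e, f
```
.DE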
 304 .NH 2
 305 \&Fixed-length Opaque Data
 306 .IX XDR "fixed-length opaque data"
 307 .IX XDR "opaque data, fixed length"
 308 .LP
 309 At times, fixed-length uninterpreted data needs to be passed among
 310 machines.  This data is called "opaque" and is declared as follows:
 311 .DS
 312 .ft CW
 313 opaque identifier[n];
 314 .DE
 315 where the constant n is the (static) number of bytes necessary to
 316 contain the opaque data.  If n is not a multiple of four, then the n
 317 bytes are followed by enough (0 to 3) residual zero bytes, r, to make
 318 the total byte count of the opaque object a multiple of four.
 319 .ie t .DS
 320 .el .DS L
 321 \fIFixed-Length Opaque\fP
 322
 323 \f(CW0        1     ...
 324 +--------+--------+...+--------+--------+...+--------+
 325 | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
 326 +--------+--------+...+--------+--------+...+--------+
 327 |<-----------n bytes---------->|<------r bytes------>|
 328 |<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP
 329 .DE
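.LP
A sketch of the fixed-length case in Python.  No length is
transmitted, so both sides must already agree on n.  The function name
is illustrative only.
.DS
.ft CW
```python
def encode_fixed_opaque(data: bytes, n: int) -> bytes:
    """Encode exactly n opaque bytes, padded to a multiple of four."""
    if len(data) != n:
        raise ValueError("expected exactly %d bytes" % n)
    r = (4 - n % 4) % 4  # number of residual zero bytes, 0 to 3
    return data + bytes(r)
```
.DE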
 330 .NH 2
 331 \&Variable-length Opaque Data
 332 .IX XDR "variable-length opaque data"
 333 .IX XDR "opaque data, variable length"
 334 .LP
 335 The standard also provides for variable-length (counted) opaque data,
 336 defined as a sequence of n (numbered 0 through n-1) arbitrary bytes
 337 to be the number n encoded as an unsigned integer (as described
 338 above), and followed by the n bytes of the sequence.
 339 .LP
 340 Byte m of the sequence always precedes byte m+1 of the sequence, and
 341 byte 0 of the sequence always follows the sequence's length (count).
 342 If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count
 343 a multiple of four.  Variable-length opaque data is declared in the
 344 following way:
 345 .DS
 346 .ft CW
 347 opaque identifier<m>;
 348 .DE
 349 or
 350 .DS
 351 .ft CW
 352 opaque identifier<>;
 353 .DE
 354 The constant m denotes an upper bound of the number of bytes that the
 355 sequence may contain.  If m is not specified, as in the second
 356 declaration, it is assumed to be (2**32) - 1, the maximum length.
 357 The constant m would normally be found in a protocol specification.
 358 For example, a filing protocol may state that the maximum data
 359 transfer size is 8192 bytes, as follows:
 360 .DS
 361 .ft CW
 362 opaque filedata<8192>;
 363 .DE
 364 This can be illustrated as follows:
 365 .ie t .DS
 366 .el .DS L
 367 \fIVariable-Length Opaque\fP
 368
 369 \f(CW0     1     2     3     4     5   ...
 370 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 371 |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
 372 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 373 |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
 374                         |<----n+r (where (n+r) mod 4 = 0)---->|\fP
 375 .DE
 376 .LP
 377 It is an error to encode a length greater than the maximum
 378 described in the specification.
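.LP
The counted form can be sketched as follows, assuming Python's struct
module.  The parameter "maximum" plays the role of the constant m; the
function names are hypothetical.
.DS
.ft CW
```python
import struct

def encode_opaque(data: bytes, maximum: int = 2**32 - 1) -> bytes:
    """Encode the length n, then the n bytes, then r zero bytes."""
    n = len(data)
    if n > maximum:
        raise ValueError("length exceeds the declared maximum")
    r = (4 - n % 4) % 4
    return struct.pack(">I", n) + data + bytes(r)

def decode_opaque(buf: bytes, maximum: int = 2**32 - 1) -> bytes:
    """Read the length, check it against the maximum, return the bytes."""
    (n,) = struct.unpack(">I", buf[:4])
    if n > maximum:
        raise ValueError("length exceeds the declared maximum")
    return buf[4:4 + n]
```
.DE
.LP
With the "filedata" declaration, a caller would pass 8192 as the
maximum to both functions.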
 379 .NH 2
 380 \&String
 381 .IX XDR string
 382 .LP
 383 The standard defines a string of n (numbered 0 through n-1) ASCII
 384 bytes to be the number n encoded as an unsigned integer (as described
 385 above), and followed by the n bytes of the string.  Byte m of the
 386 string always precedes byte m+1 of the string, and byte 0 of the
 387 string always follows the string's length.  If n is not a multiple of
 388 four, then the n bytes are followed by enough (0 to 3) residual zero
 389 bytes, r, to make the total byte count a multiple of four.  Counted
 390 byte strings are declared as follows:
 391 .DS
 392 .ft CW
 393 string object<m>;
 394 .DE
 395 or
 396 .DS
 397 .ft CW
 398 string object<>;
 399 .DE
 400 The constant m denotes an upper bound of the number of bytes that a
 401 string may contain.  If m is not specified, as in the second
 402 declaration, it is assumed to be (2**32) - 1, the maximum length.
 403 The constant m would normally be found in a protocol specification.
 404 For example, a filing protocol may state that a file name can be no
 405 longer than 255 bytes, as follows:
 406 .DS
 407 .ft CW
 408 string filename<255>;
 409 .DE
 410 This can be illustrated as follows:
 411 .ie t .DS
 412 .el .DS L
 413 \fIA String\fP
 414
 415 \f(CW0     1     2     3     4     5   ...
 416 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 417 |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
 418 +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
 419 |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
 420                         |<----n+r (where (n+r) mod 4 = 0)---->|\fP
 421 .DE
 422 .LP
 423 It is an error to encode a length greater than the maximum
 424 described in the specification.
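.LP
When decoding, the residual zero bytes must be skipped to find the
start of the next item.  The Python sketch below returns both the
string and the offset that follows its padding.  The function name and
its parameters are assumptions made for illustration.
.DS
.ft CW
```python
import struct

def decode_string(buf: bytes, offset: int = 0, maximum: int = 2**32 - 1):
    """Return (text, next_offset) for the string encoded at offset."""
    (n,) = struct.unpack_from(">I", buf, offset)
    if n > maximum:
        raise ValueError("length exceeds the declared maximum")
    start = offset + 4
    text = buf[start:start + n].decode("ascii")
    r = (4 - n % 4) % 4  # residual zero bytes after the string
    return text, start + n + r
```
.DE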
 425 .NH 2
 426 \&Fixed-length Array
 427 .IX XDR "fixed-length array"
 428 .IX XDR "array, fixed length"
 429 .LP
 430 Declarations for fixed-length arrays of homogeneous elements are in
 431 the following form:
 432 .DS
 433 .ft CW
 434 type-name identifier[n];
 435 .DE
 436 Fixed-length arrays of elements numbered 0 through n-1 are encoded by
 437 individually encoding the elements of the array in their natural
 438 order, 0 through n-1.  Each element's size is a multiple of four
 439 bytes. Though all elements are of the same type, the elements may
 440 have different sizes.  For example, in a fixed-length array of
 441 strings, all elements are of type "string", yet each element will
 442 vary in its length.
 443 .ie t .DS
 444 .el .DS L
 445 \fIFixed-Length Array\fP
 446
 447 \f(CW+---+---+---+---+---+---+---+---+...+---+---+---+---+
 448 |   element 0   |   element 1   |...|  element n-1  |
 449 +---+---+---+---+---+---+---+---+...+---+---+---+---+
 450 |<--------------------n elements------------------->|\fP
 451 .DE
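.LP
A fixed-length array carries no count; its elements are simply
concatenated.  The Python sketch below encodes an array of strings to
show that elements of one type may differ in size.  All names are
hypothetical.
.DS
.ft CW
```python
import struct

def encode_string(text: str) -> bytes:
    """Encode one counted, zero-padded ASCII string."""
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + bytes(r)

def encode_fixed_array(items, n, encode_item) -> bytes:
    """Encode exactly n elements in their natural order, 0 to n-1."""
    if len(items) != n:
        raise ValueError("expected exactly %d elements" % n)
    return b"".join(encode_item(item) for item in items)
```
.DE
.LP
Encoding the two strings "a" and "bcdef" yields an 8-byte element
followed by a 12-byte element.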
 452 .NH 2
 453 \&Variable-length Array
 454 .IX XDR "variable-length array"
 455 .IX XDR "array, variable length"
 456 .LP
 457 Counted arrays provide the ability to encode variable-length arrays
 458 of homogeneous elements.  The array is encoded as the element count n
 459 (an unsigned integer) followed by the encoding of each of the array's
 460 elements, starting with element 0 and progressing through element
 461 n-1.  The declaration for variable-length arrays follows this form:
 462 .DS
 463 .ft CW
 464 type-name identifier<m>;
 465 .DE
 466 or
 467 .DS
 468 .ft CW
 469 type-name identifier<>;
 470 .DE
 471 The constant m specifies the maximum acceptable element count of an
 472 array; if m is not specified, as in the second declaration, it is
 473 assumed to be (2**32) - 1.
 474 .ie t .DS
 475 .el .DS L
 476 \fICounted Array\fP
 477
 478 \f(CW0  1  2  3
 479 +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
 480 |     n     | element 0 | element 1 |...|element n-1|
 481 +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
 482 |<-4 bytes->|<--------------n elements------------->|\fP
 483 .DE
 484 It is an error to encode a value of n that is greater than the
 485 maximum described in the specification.
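.LP
A counted array can be sketched as the element count followed by the
encoded elements.  This Python example assumes the struct module; the
names encode_array and encode_int are illustrative.
.DS
.ft CW
```python
import struct

def encode_int(value: int) -> bytes:
    """Encode one signed 32-bit element."""
    return struct.pack(">i", value)

def encode_array(items, encode_item, maximum: int = 2**32 - 1) -> bytes:
    """Encode the count n, then elements 0 through n-1."""
    n = len(items)
    if n > maximum:
        raise ValueError("element count exceeds the declared maximum")
    return struct.pack(">I", n) + b"".join(encode_item(i) for i in items)
```
.DE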
 486 .NH 2
 487 \&Structure
 488 .IX XDR structure
 489 .LP
 490 Structures are declared as follows:
 491 .DS
 492 .ft CW
 493 struct {
 494         component-declaration-A;
 495         component-declaration-B;
 496         \&...
 497 } identifier;
 498 .DE
 499 The components of the structure are encoded in the order of their
 500 declaration in the structure.  Each component's size is a multiple of
 501 four bytes, though the components may be different sizes.
 502 .ie t .DS
 503 .el .DS L
 504 \fIStructure\fP
 505
 506 \f(CW+-------------+-------------+...
 507 | component A | component B |...
 508 +-------------+-------------+...\fP
 509 .DE
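.LP
As an illustration, consider a hypothetical structure with an "int"
component named id followed by a "string" component named name.  Its
encoding is the concatenation of the two component encodings, as this
Python sketch shows.  The structure and all function names are
invented for the example.
.DS
.ft CW
```python
import struct

def encode_int(value: int) -> bytes:
    return struct.pack(">i", value)

def encode_string(text: str) -> bytes:
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + bytes(r)

def encode_record(ident: int, name: str) -> bytes:
    """Encode the components in the order of their declaration."""
    return encode_int(ident) + encode_string(name)
```
.DE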
 510 .NH 2
 511 \&Discriminated Union
 512 .IX XDR "discriminated union"
 513 .IX XDR union discriminated
 514 .LP
 515 A discriminated union is a type composed of a discriminant followed
 516 by a type selected from a set of prearranged types according to the
 517 value of the discriminant.  The type of discriminant is either "int",
 518 "unsigned int", or an enumerated type, such as "bool".  The component
 519 types are called "arms" of the union, and are preceded by the value
 520 of the discriminant which implies their encoding.  Discriminated
 521 unions are declared as follows:
 522 .DS
 523 .ft CW
 524 union switch (discriminant-declaration) {
 525         case discriminant-value-A:
 526         arm-declaration-A;
 527         case discriminant-value-B:
 528         arm-declaration-B;
 529         \&...
 530         default: default-declaration;
 531 } identifier;
 532 .DE
 533 Each "case" keyword is followed by a legal value of the discriminant.
 534 The default arm is optional.  If it is not specified, then a valid
 535 encoding of the union cannot take on unspecified discriminant values.
 536 The size of the implied arm is always a multiple of four bytes.
 537 .LP
 538 The discriminated union is encoded as its discriminant followed by
 539 the encoding of the implied arm.
 540 .ie t .DS
 541 .el .DS L
 542 \fIDiscriminated Union\fP
 543
 544 \f(CW0   1   2   3
 545 +---+---+---+---+---+---+---+---+
 546 |  discriminant |  implied arm  |
 547 +---+---+---+---+---+---+---+---+
 548 |<---4 bytes--->|\fP
 549 .DE
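.LP
The following Python sketch encodes a hypothetical union whose
discriminant is an "int": value 0 selects an "int" arm, value 1
selects a "string" arm, and the default arm is void.  The union and
the function names are assumptions made for illustration.
.DS
.ft CW
```python
import struct

def encode_string(text: str) -> bytes:
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + bytes(r)

def encode_union(kind: int, value=None) -> bytes:
    """Encode the discriminant, then the arm that it implies."""
    head = struct.pack(">i", kind)
    if kind == 0:
        return head + struct.pack(">i", value)  # int arm
    if kind == 1:
        return head + encode_string(value)      # string arm
    return head                                 # default arm: void
```
.DE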
 550 .NH 2
 551 \&Void
 552 .IX XDR void
 553 .LP
 554 An XDR void is a 0-byte quantity.  Voids are useful for describing
 555 operations that take no data as input or no data as output. They are
 556 also useful in unions, where some arms may contain data and others do
 557 not.  The declaration is simply as follows:
 558 .DS
 559 .ft CW
 560 void;
 561 .DE
 562 Voids are illustrated as follows:
 563 .ie t .DS
 564 .el .DS L
 565 \fIVoid\fP
 566
 567 \f(CW  ++
 568   ||
 569   ++
 570 --><-- 0 bytes\fP
 571 .DE
 572 .NH 2
 573 \&Constant
 574 .IX XDR constant
 575 .LP
 576 The data declaration for a constant follows this form:
 577 .DS
 578 .ft CW
 579 const name-identifier = n;
 580 .DE
 581 "const" is used to define a symbolic name for a constant; it does not
 582 declare any data.  The symbolic constant may be used anywhere a
 583 regular constant may be used.  For example, the following defines a
 584 symbolic constant DOZEN, equal to 12.
 585 .DS
 586 .ft CW
 587 const DOZEN = 12;
 588 .DE
 589 .NH 2
 590 \&Typedef
 591 .IX XDR typedef
 592 .LP
 593 "typedef" does not declare any data either, but serves to define new
 594 identifiers for declaring data. The syntax is:
 595 .DS
 596 .ft CW
 597 typedef declaration;
 598 .DE
 599 The new type name is actually the variable name in the declaration
 600 part of the typedef.  For example, the following defines a new type
 601 called "eggbox" using an existing type called "egg":
 602 .DS
 603 .ft CW
 604 typedef egg eggbox[DOZEN];
 605 .DE
 606 Variables declared using the new type name have the same type as the
 607 new type name would have in the typedef, if it were considered a
 608 variable.  For example, the following two declarations are equivalent
 609 in declaring the variable "fresheggs":
 610 .DS
 611 .ft CW
 612 eggbox  fresheggs;
 613 egg     fresheggs[DOZEN];
 614 .DE
 615 When a typedef involves a struct, enum, or union definition, there is
 616 another (preferred) syntax that may be used to define the same type.
 617 In general, a typedef of the following form:
 618 .DS
 619 .ft CW
 620 typedef <<struct, union, or enum definition>> identifier;
 621 .DE
 622 may be converted to the alternative form by removing the "typedef"
 623 part and placing the identifier after the "struct", "union", or
 624 "enum" keyword, instead of at the end.  For example, here are the two
 625 ways to define the type "bool":
 626 .DS
 627 .ft CW
 628 typedef enum {    /* \fIusing typedef\fP */
 629         FALSE = 0,
 630         TRUE = 1
 631         } bool;
 632
 633 enum bool {       /* \fIpreferred alternative\fP */
 634         FALSE = 0,
 635         TRUE = 1
 636         };
 637 .DE
 638 This syntax is preferred because one does not have to wait
 639 until the end of a declaration to figure out the name of the new
 640 type.
 641 .NH 2
 642 \&Optional-data
 643 .IX XDR "optional data"
 644 .IX XDR "data, optional"
 645 .LP
 646 Optional-data is one kind of union that occurs so frequently that we
 647 give it a special syntax of its own for declaring it.  It is declared
 648 as follows:
 649 .DS
 650 .ft CW
 651 type-name *identifier;
 652 .DE
 653 This is equivalent to the following union:
 654 .DS
 655 .ft CW
 656 union switch (bool opted) {
 657         case TRUE:
 658         type-name element;
 659         case FALSE:
 660         void;
 661 } identifier;
 662 .DE
 663 It is also equivalent to the following variable-length array
 664 declaration, since the boolean "opted" can be interpreted as the
 665 length of the array:
 666 .DS
 667 .ft CW
 668 type-name identifier<1>;
 669 .DE
 670 Optional-data is not so interesting in itself, but it is very useful
 671 for describing recursive data-structures such as linked-lists and
 672 trees.  For example, the following defines a type "stringlist" that
 673 encodes lists of arbitrary length strings:
 674 .DS
 675 .ft CW
 676 struct *stringlist {
 677         string item<>;
 678         stringlist next;
 679 };
 680 .DE
 681 It could have been equivalently declared as the following union:
 682 .DS
 683 .ft CW
 684 union stringlist switch (bool opted) {
 685         case TRUE:
 686                 struct {
 687                         string item<>;
 688                         stringlist next;
 689                 } element;
 690         case FALSE:
 691                 void;
 692 };
 693 .DE
 694 or as a variable-length array:
 695 .DS
 696 .ft CW
 697 struct stringlist<1> {
 698         string item<>;
 699         stringlist next;
 700 };
 701 .DE
 702 Both of these declarations obscure the intention of the stringlist
 703 type, so the optional-data declaration is preferred over both of
 704 them.  The optional-data type also has a close correlation to how
 705 recursive data structures are represented in high-level languages
 706 such as Pascal or C by use of pointers. In fact, the syntax is the
 707 same as that of the C language for pointers.
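.LP
On the wire, a "stringlist" is a chain of TRUE markers, each followed
by one item, and ends with a FALSE marker.  The Python sketch below
encodes a list this way.  It assumes the struct module, and the
function names are hypothetical.
.DS
.ft CW
```python
import struct

def encode_string(text: str) -> bytes:
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + bytes(r)

def encode_stringlist(items) -> bytes:
    """Encode each item after opted = TRUE, then end with FALSE."""
    out = b""
    for item in items:
        out += struct.pack(">i", 1)  # opted = TRUE: an element follows
        out += encode_string(item)
    return out + struct.pack(">i", 0)  # opted = FALSE: end of list
```
.DE
.LP
An empty list is therefore encoded as the four bytes 00 00 00 00.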
 708 .NH 2
 709 \&Areas for Future Enhancement
 710 .IX XDR futures
 711 .LP
 712 The XDR standard lacks representations for bit fields and bitmaps,
 713 since the standard is based on bytes.  Also missing are packed (or
 714 binary-coded) decimals.
 715 .LP
 716 The intent of the XDR standard was not to describe every kind of data
 717 that people have ever sent or will ever want to send from machine to
 718 machine. Rather, it only describes the most commonly used data-types
 719 of high-level languages such as Pascal or C so that applications
 720 written in these languages will be able to communicate easily over
 721 some medium.
 722 .LP
 723 One could imagine extensions to XDR that would let it describe almost
 724 any existing protocol, such as TCP.  The minimum necessary for this
 725 is support for different block sizes and byte-orders.  The XDR
 726 discussed here could then be considered the 4-byte big-endian member
 727 of a larger XDR family.
 728 .NH 1
 729 \&Discussion
 730 .sp 2
 731 .NH 2
 732 \&Why a Language for Describing Data?
 733 .IX XDR language
 734 .LP
 735 There are many advantages in using a data-description language such
 736 as XDR versus using diagrams.  Languages are more formal than
 737 diagrams and lead to less ambiguous descriptions of data.
 738 Languages are also easier to understand and allow one to think of
 739 other issues instead of the low-level details of bit-encoding.
 740 Also, there is a close analogy between the types of XDR and a
 741 high-level language such as C or Pascal.  This makes the
 742 implementation of XDR encoding and decoding modules an easier task.
 743 Finally, the language specification itself is an ASCII string that
 744 can be passed from machine to machine to perform on-the-fly data
 745 interpretation.
 746 .NH 2
 747 \&Why Only one Byte-Order for an XDR Unit?
 748 .IX XDR "byte order"
 749 .LP
 750 Supporting two byte-orderings requires a higher-level protocol for
 751 determining in which byte-order the data is encoded.  Since XDR is
 752 not a protocol, this can't be done.  The advantage of this, though,
 753 is that data in XDR format can be written to a magnetic tape, for
 754 example, and any machine will be able to interpret it, since no
 755 higher-level protocol is necessary for determining the byte-order.
 756 .NH 2
 757 \&Why does XDR use Big-Endian Byte-Order?
 758 .LP
 759 Yes, it is unfair, but having only one byte-order means you have to
 760 be unfair to somebody.  Many architectures, such as the Motorola
 761 68000 and IBM 370, support the big-endian byte-order.
 762 .NH 2
 763 \&Why is the XDR Unit Four Bytes Wide?
 764 .LP
 765 There is a tradeoff in choosing the XDR unit size.  Choosing a small
 766 size such as two makes the encoded data small, but causes alignment
 767 problems for machines that aren't aligned on these boundaries.  A
 768 large size such as eight means the data will be aligned on virtually
 769 every machine, but causes the encoded data to grow too big.  We chose
 770 four as a compromise.  Four is big enough to support most
 771 architectures efficiently, except for rare machines such as the
 772 eight-byte aligned Cray.  Four is also small enough to keep the
 773 encoded data restricted to a reasonable size.
 774 .NH 2
 775 \&Why must Variable-Length Data be Padded with Zeros?
 776 .IX XDR "variable-length data"
 777 .LP
 778 It is desirable that the same data encode into the same thing on all
 779 machines, so that encoded data can be meaningfully compared or
 780 checksummed.  Forcing the padded bytes to be zero ensures this.
 781 .NH 2
 782 \&Why is there No Explicit Data-Typing?
 783 .LP
 784 Data-typing has a relatively high cost for what small advantages it
 785 may have.  One cost is the expansion of data due to the inserted type
 786 fields.  Another is the added cost of interpreting these type fields
 787 and acting accordingly.  And most protocols already know what type
 788 they expect, so data-typing supplies only redundant information.
 789 However, one can still get the benefits of data-typing using XDR. One
 790 way is to encode two things: first a string which is the XDR data
 791 description of the encoded data, and then the encoded data itself.
 792 Another way is to assign a value to all the types in XDR, and then
 793 define a universal type which takes this value as its discriminant
 794 and for each value, describes the corresponding data type.
 795 .NH 1
 796 \&The XDR Language Specification
 797 .IX XDR language
 798 .sp 1
 799 .NH 2
 800 \&Notational Conventions
 801 .IX "XDR language" notation
 802 .LP
 803 This specification uses an extended Backus-Naur Form notation for
 804 describing the XDR language.  Here is a brief description of the
 805 notation:
 806 .IP  1.
 807 The characters
 808 .I | ,
 809 .I ( ,
 810 .I ) ,
 811 .I [ ,
 812 .I ] ,
 813 .I " ,
 814 and
 815 .I *
 816 are special.
 817 .IP  2.
 818 Terminal symbols are strings of any characters surrounded by
 819 double quotes.
 820 .IP  3.
 821 Non-terminal symbols are strings of non-special characters.
 822 .IP  4.
 823 Alternative items are separated by a vertical bar ("\fI|\fP").
 824 .IP  5.
 825 Optional items are enclosed in brackets.
 826 .IP  6.
 827 Items are grouped together by enclosing them in parentheses.
 828 .IP  7.
 829 A
 830 .I *
 831 following an item means 0 or more occurrences of that item.
 832 .LP
 833 For example, consider the following pattern:
 834 .DS L
 835 "a " "very" (", " "very")* [" cold " "and "]  " rainy " ("day" | "night")
 836 .DE
 837 .LP
 838 An infinite number of strings match this pattern.  A few of them
 839 are:
 840 .DS
 841 "a very rainy day"
 842 "a very, very rainy day"
 843 "a very cold and  rainy day"
 844 "a very, very, very cold and  rainy night"
 845 .DE
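.LP
One way to check the notation is to transliterate the example into a
regular expression.  The sketch below is an illustration only; the
terminals are spelled as needed to produce the four example strings, with
grouping, the optional item and repetition mapped onto the corresponding
regular-expression operators.

```python
import re

# Each quoted terminal is escaped and concatenated in order.
very_more = re.escape(", ") + re.escape("very")
cold_and = re.escape(" cold ") + re.escape("and ")

pattern = re.compile(
    re.escape("a ") + re.escape("very")
    + "(?:" + very_more + ")*"      # ( ... )*  zero or more occurrences
    + "(?:" + cold_and + ")?"       # [ ... ]   optional item
    + re.escape(" rainy ")
    + "(?:" + re.escape("day") + "|" + re.escape("night") + ")"  # alternatives
)

examples = [
    "a very rainy day",
    "a very, very rainy day",
    "a very cold and  rainy day",
    "a very, very, very cold and  rainy night",
]
all_match = all(pattern.fullmatch(s) is not None for s in examples)
```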
 846 .NH 2
 847 \&Lexical Notes
 848 .IP  1.
 849 Comments begin with '/*' and terminate with '*/'.
 850 .IP  2.
 851 White space serves to separate items and is otherwise ignored.
 852 .IP  3.
 853 An identifier is a letter followed by an optional sequence of
 854 letters, digits or underbars ('_').  Identifiers are
 855 case-sensitive.
 856 .IP  4.
 857 A  constant is  a  sequence  of  one  or  more decimal digits,
 858 optionally preceded by a minus-sign ('-').
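.LP
The four lexical notes can be sketched as a small tokenizer.  This is a
minimal illustration, not a reference implementation: the names TOKEN and
tokenize are assumptions, the punctuation set is taken from the grammar in
this chapter, and keywords are returned as identifiers for a later stage
to classify.

```python
import re

TOKEN = re.compile(r"""
    (?P<comment>/\*.*?\*/)                 # note 1: begins with /*, ends with */
  | (?P<space>\s+)                         # note 2: separates items, else ignored
  | (?P<identifier>[A-Za-z][A-Za-z0-9_]*)  # note 3: a letter, then letters, digits, _
  | (?P<constant>-?[0-9]+)                 # note 4: optional minus sign, then digits
  | (?P<punct>[{}\[\]()<>*=;:,])           # punctuation used by the grammar
""", re.VERBOSE | re.DOTALL)

def tokenize(text):
    """Return (kind, text) pairs, dropping comments and white space."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup not in ("comment", "space"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```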
 859 .NH 2
 860 \&Syntax Information
 861 .IX "XDR language" syntax
 862 .DS
 863 .ft CW
 864 declaration:
 865         type-specifier identifier
 866         | type-specifier identifier "[" value "]"
 867         | type-specifier identifier "<" [ value ] ">"
 868         | "opaque" identifier "[" value "]"
 869         | "opaque" identifier "<" [ value ] ">"
 870         | "string" identifier "<" [ value ] ">"
 871         | type-specifier "*" identifier
 872         | "void"
 873 .DE
 874 .DS
 875 .ft CW
 876 value:
 877         constant
 878         | identifier
 879
 880 type-specifier:
 881           [ "unsigned" ] "int"
 882         | [ "unsigned" ] "hyper"
 883         | "float"
 884         | "double"
 885         | "bool"
 886         | enum-type-spec
 887         | struct-type-spec
 888         | union-type-spec
 889         | identifier
 890 .DE
 891 .DS
 892 .ft CW
 893 enum-type-spec:
 894         "enum" enum-body
 895
 896 enum-body:
 897         "{"
 898         ( identifier "=" value )
 899         ( "," identifier "=" value )*
 900         "}"
 901 .DE
 902 .DS
 903 .ft CW
 904 struct-type-spec:
 905         "struct" struct-body
 906
 907 struct-body:
 908         "{"
 909         ( declaration ";" )
 910         ( declaration ";" )*
 911         "}"
 912 .DE
 913 .DS
 914 .ft CW
 915 union-type-spec:
 916         "union" union-body
 917
 918 union-body:
 919         "switch" "(" declaration ")" "{"
 920         ( "case" value ":" declaration ";" )
 921         ( "case" value ":" declaration ";" )*
 922         [ "default" ":" declaration ";" ]
 923         "}"
 924
 925 constant-def:
 926         "const" identifier "=" constant ";"
 927 .DE
 928 .DS
 929 .ft CW
 930 type-def:
 931         "typedef" declaration ";"
 932         | "enum" identifier enum-body ";"
 933         | "struct" identifier struct-body ";"
 934         | "union" identifier union-body ";"
 935
 936 definition:
 937         type-def
 938         | constant-def
 939
 940 specification:
 941         definition *
 942 .DE
 943 .NH 3
 944 \&Syntax Notes
 945 .IX "XDR language" syntax
 946 .LP
 947 .IP  1.
 948 The following are keywords and cannot be used as identifiers:
 949 "bool", "case", "const", "default", "double", "enum", "float",
 950 "hyper", "int", "opaque", "string", "struct", "switch", "typedef",
 951 "union", "unsigned" and "void".
 952 .IP  2.
 953 Only unsigned constants may be used as size specifications for
 954 arrays.  If an identifier is used, it must have been declared
 955 previously as an unsigned constant in a "const" definition.
 956 .IP  3.
 957 Constant and type identifiers within the scope of a specification
 958 are in the same name space and must be declared uniquely within this
 959 scope.
 960 .IP  4.
 961 Similarly, variable names must  be unique within  the scope  of
 962 struct and union declarations. Nested struct and union declarations
 963 create new scopes.
 964 .IP  5.
 965 The discriminant of a union must be of a type that evaluates to
 966 an integer. That is, "int", "unsigned int", "bool", an enumerated
 967 type or any typedefed type that evaluates to one of these is legal.
 968 Also, the case values must be one of the legal values of the
 969 discriminant.  Finally, a case value may not be specified more than
 970 once within the scope of a union declaration.
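.LP
Notes 1 and 5 lend themselves to mechanical checks.  The following sketch
assumes a parser has already extracted the identifier and the case values;
the function names are hypothetical, and "int" is treated as a keyword
here because the grammar uses it as a terminal symbol.

```python
import re

KEYWORDS = {
    "bool", "case", "const", "default", "double", "enum", "float", "hyper",
    "int",  # assumption: reserved, since the grammar uses it as a terminal
    "opaque", "string", "struct", "switch", "typedef", "union",
    "unsigned", "void",
}

def is_legal_identifier(name: str) -> bool:
    """Note 1: an identifier must not be a keyword."""
    well_formed = re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name) is not None
    return well_formed and name not in KEYWORDS

def union_case_errors(legal_values, case_values):
    """Note 5: each case value must be legal for the discriminant and unique."""
    errors, seen = [], set()
    for value in case_values:
        if value not in legal_values:
            errors.append(f"case {value} is not a legal discriminant value")
        elif value in seen:
            errors.append(f"case {value} is specified more than once")
        seen.add(value)
    return errors
```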
 971 .NH 1
 972 \&An Example of an XDR Data Description
 973 .LP
 974 Here is a short XDR data description of a thing called a "file",
 975 which might be used to transfer files from one machine to another.
 976 .ie t .DS
 977 .el .DS L
 978 .ft CW
 979
 980 const MAXUSERNAME = 32;     /*\fI max length of a user name \fP*/
 981 const MAXFILELEN = 65535;   /*\fI max length of a file      \fP*/
 982 const MAXNAMELEN = 255;     /*\fI max length of a file name \fP*/
 983
 984 .ft I
 985 /*
 986  * Types of files:
 987  */
 988 .ft CW
 989
 990 enum filekind {
 991         TEXT = 0,       /*\fI ascii data \fP*/
 992         DATA = 1,       /*\fI raw data   \fP*/
 993         EXEC = 2        /*\fI executable \fP*/
 994 };
 995
 996 .ft I
 997 /*
 998  * File information, per kind of file:
 999  */
1000 .ft CW
1001
1002 union filetype switch (filekind kind) {
1003         case TEXT:
1004                 void;                           /*\fI no extra information \fP*/
1005         case DATA:
1006                 string creator<MAXNAMELEN>;     /*\fI data creator         \fP*/
1007         case EXEC:
1008                 string interpretor<MAXNAMELEN>; /*\fI program interpretor  \fP*/
1009 };
1010
1011 .ft I
1012 /*
1013  * A complete file:
1014  */
1015 .ft CW
1016
1017 struct file {
1018         string filename<MAXNAMELEN>; /*\fI name of file \fP*/
1019         filetype type;               /*\fI info about file \fP*/
1020         string owner<MAXUSERNAME>;   /*\fI owner of file   \fP*/
1021         opaque data<MAXFILELEN>;     /*\fI file data       \fP*/
1022 };
1023 .DE
1024 .LP
1025 Suppose now that there is a user named "john" who wants to store
1026 his Lisp program "sillyprog", which contains just the data "(quit)".
1027 His file would be encoded as follows:
1028 .TS
1029 box tab (&) ;
1030 lfI lfI lfI lfI
1031 rfL rfL rfL l .
1032 Offset&Hex Bytes&ASCII&Description
1033 _
1034 0&00 00 00 09&....&Length of filename = 9
1035 4&73 69 6c 6c&sill&Filename characters
1036 8&79 70 72 6f&ypro& ... and more characters ...
1037 12&67 00 00 00&g...& ... and 3 zero-bytes of fill
1038 16&00 00 00 02&....&Filekind is EXEC = 2
1039 20&00 00 00 04&....&Length of interpretor = 4
1040 24&6c 69 73 70&lisp&Interpretor characters
1041 28&00 00 00 04&....&Length of owner = 4
1042 32&6a 6f 68 6e&john&Owner characters
1043 36&00 00 00 06&....&Length of file data = 6
1044 40&28 71 75 69&(qui&File data bytes ...
1045 44&74 29 00 00&t)..& ... and 2 zero-bytes of fill
1046 .TE
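.LP
The table can be reproduced with a short encoder.  The sketch below is an
illustration in Python rather than a reference implementation: the
function names are assumptions, and the maximum-length limits (MAXNAMELEN
and so on) are not enforced.

```python
import struct

TEXT, DATA, EXEC = 0, 1, 2  # enum filekind

def xdr_int(value: int) -> bytes:
    return struct.pack(">i", value)  # four bytes, most significant first

def xdr_opaque(data: bytes) -> bytes:
    """Variable-length opaque: length, bytes, zero fill to a multiple of 4."""
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)

def xdr_string(text: str) -> bytes:
    return xdr_opaque(text.encode("ascii"))

def xdr_filetype(kind: int, extra: str = "") -> bytes:
    """Discriminated union: the discriminant, then the selected arm."""
    if kind == TEXT:
        return xdr_int(kind)                  # void arm, nothing follows
    return xdr_int(kind) + xdr_string(extra)  # creator or interpretor

def xdr_file(filename, kind, extra, owner, data) -> bytes:
    """Structure: the components are encoded in declaration order."""
    return (xdr_string(filename) + xdr_filetype(kind, extra)
            + xdr_string(owner) + xdr_opaque(data))

encoded = xdr_file("sillyprog", EXEC, "lisp", "john", b"(quit)")
```

.LP
Here encoded is 48 bytes long, matching the twelve four-byte rows of the
table.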
1047 .NH 1
1048 \&References
1049 .LP
1050 [1]  Brian W. Kernighan & Dennis M. Ritchie, "The C Programming
1051 Language", Bell Laboratories, Murray Hill, New Jersey, 1978.
1052 .LP
1053 [2]  Danny Cohen, "On Holy Wars and a Plea for Peace", IEEE Computer,
1054 October 1981.
1055 .LP
1056 [3]  "IEEE Standard for Binary Floating-Point Arithmetic", ANSI/IEEE
1057 Standard 754-1985, Institute of Electrical and Electronics
1058 Engineers, August 1985.
1059 .LP
1060 [4]  "Courier: The Remote Procedure Call Protocol", XEROX
1061 Corporation, XSIS 038112, December 1981.