contrib/ntp/html/exec.htm

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" &amp;lt;html>
   2 <html>
   3 <head>
   4 <meta name="generator" content="HTML Tidy, see www.w3.org">
   5 <title>Executive Summary - Computer Network Time
   6 Synchronization</title>
   7 </head>
   8 <body>
   9 <h3>Executive Summary - Computer Network Time Synchronization</h3>
  10
  11 <img align="left" src="pic/alice12.gif" alt="gif"><a href=
  12 "pictures.htm">from <i>Alice's Adventures in Wonderland</i>, Lewis
  13 Carroll</a>
  14
  15 <p>The executive is the one on the left.<br clear="left">
  16 </p>
  17
  18 <hr>
  19 <h4>Introduction</h4>
  20
  21 <p>The standard timescale used by most nations of the world is
  22 Coordinated UniversalTime (UTC), which is based on the Earth's
  23 rotation about its axis, and the Gregorian Calendar, which is based
  24 on the Earth's rotation about the Sun. The UTC timescale is
  25 disciplined with respect to International Atomic Time (TAI) by
  26 inserting leap seconds at intervals of about 18 months. UTC time is
  27 disseminated by various means, including radio and satellite
  28 navigation systems, telephone modems and portable clocks.</p>
  29
  30 <p>Special purpose receivers are available for many
  31 time-dissemination services, including the Global Position System
  32 (GPS) and other services operated by various national governments.
  33 For reasons of cost and convenience, it is not possible to equip
  34 every computer with one of these receivers. However, it is possible
  35 to equip some number of computers acting as primary time servers to
  36 synchronize a much larger number of secondary servers and clients
  37 connected by a common network. In order to do this, a distributed
  38 network clock synchronization protocol is required which can read a
  39 server clock, transmit the reading to one or more clients and
  40 adjust each client clock as required. Protocols that do this
  41 include the Network Time Protocol (NTP), Digital Time
  42 Synchronization Protocol (DTSS) and others found in the literature
  43 (See "Further Reading" at the end of this article.)</p>
  44
  45 <h4>Protocol Design Issues</h4>
  46
  47 <p>The synchronization protocol determines the time offset of the
  48 server clock relative to the client clock. The various
  49 synchronization protocols in use today provide different means to
  50 do this, but they all follow the same general model. On request,
  51 the server sends a message including its current clock value or <i>
  52 timestamp</i> and the client records its own timestamp upon arrival
  53 of the message. For the best accuracy, the client needs to measure
  54 the server-client propagation delay to determine its clock offset
  55 relative to the server. Since it is not possible to determine the
  56 one-way delays, unless the actual clock offset is known, the
  57 protocol measures the total roundtrip delay and assumes the
  58 propagation times are statistically equal in each direction. In
  59 general, this is a useful approximation; however, in the Internet
  60 of today, network paths and the associated delays can differ
  61 significantly due to the individual service providers.</p>
  62
  63 <p>The community served by the synchronization protocol can be very
  64 large. For instance, the NTP community in the Internet of 1998
  65 includes over 230 primary time servers, synchronized by radio,
  66 satellite and modem, and well over 100,000 secondary servers and
  67 clients. In addition, there are many thousands of private
  68 communities in large government, corporate and institution
  69 networks. Each community is organized as a tree graph or <i>
  70 subnet</i>, with the primary servers at the root and secondary
  71 servers and clients at increasing hop count, or stratum level, in
  72 corporate, department and desktop networks. It is usually necessary
  73 at each stratum level to employ redundant servers and diverse
  74 network paths in order to protect against broken software, hardware
  75 and network links.</p>
  76
  77 <p>Synchronization protocols work in one or more association modes,
  78 depending on the protocol design. Client/server mode, also called
  79 master/slave mode, is supported in both DTSS and NTP. In this mode,
  80 a client synchronizes to a stateless server as in the conventional
  81 RPC model. NTP also supports symmetric mode, which allows either of
  82 two peer servers to synchronize to the other, in order to provide
  83 mutual backup. DTSS and NTP support a broadcast mode which allows
  84 many clients to synchronize to one or a few servers, reducing
  85 network traffic when large numbers of clients are involved. In NTP,
  86 IP multicast can be used when the subnet spans multiple
  87 networks.</p>
  88
  89 <p>Configuration management can be a serious problem in large
  90 subnets. Various schemes which index public databases and network
  91 directory services are used in DTSS and NTP to discover servers.
  92 Both protocols use broadcast modes to support large client
  93 populations; but, since listen-only clients cannot calibrate the
  94 delay, accuracy can suffer. In NTP, clients determine the delay at
  95 the time a server is first discovered by polling the server in
  96 client/server mode and then reverting to listen-only mode. In
  97 addition, NTP clients can broadcast a special "manycast" message to
  98 solicit responses from nearby servers and continue in client/server
  99 mode with the respondents.</p>
 100
 101 <h4>Security Issues</h4>
 102
 103 <p>A reliable network time service requires provisions to prevent
 104 accidental or malicious attacks on the servers and clients in the
 105 network. Reliability requires that clients can determine that
 106 received messages are authentic; that is, were actually sent by the
 107 intended server and not manufactured or modified by an intruder.
 108 Ubiquity requires that any client can verify the authenticity of
 109 any server using only public information. This is especially
 110 important in such ubiquitous network services as directory
 111 services, cryptographic key management and time
 112 synchronization.</p>
 113
 114 <p>NTP includes provisions to cryptographically authenticate
 115 individual servers using symmetric-key cryptography in which
 116 clients authenticate servers using shared secret keys. However, the
 117 secret keys must be distributed in advance using secure means
 118 beyond the scope of the protocol. This can be awkward and fragile
 119 with a large population of potential clients, possibly intruding
 120 hackers.</p>
 121
 122 <p>Modern public-key cryptography provides means to reliably bind
 123 the server identification credentials and related public values
 124 using public directory services. However, these means carry a high
 125 computing cost, especially when large numbers of time-critical
 126 clients are involved as often the case with NTP servers. In
 127 addition, there are problems unique to NTP in the interaction
 128 between the authentication and synchronization functions, since
 129 each requires the other for success.</p>
 130
 131 <p>The recent NTP Version 4 includes a revised security model and
 132 authentication scheme supporting both symmetric and public-key
 133 cryptography. The public-key variant is specially crafted to reduce
 134 the risk of intrusion, minimize the consumption of processor
 135 resources and minimize the vulnerability to hacker attack.</p>
 136
 137 <h4>Computer Clock Modelling and Error Analysis</h4>
 138
 139 Most computers include a quartz resonator-stabilized oscillator and
 140 hardware counter that interrupts the processor at intervals of a
 141 few milliseconds. At each interrupt, a quantity called <i>tick</i>
 142 is added to a system variable representing the clock time. The
 143 clock can be read by system and application programs and set on
 144 occasion to an external reference. Once set, the clock readings
 145 increment at a nominal rate, depending on the value of <i>tick</i>.
 146 Typical Unix system kernels provide a programmable mechanism to
 147 increase or decrease the value of <i>tick</i> by a small, fixed
 148 amount in order to amortize a given time adjustment smoothly over
 149 multiple <i>tick</i> intervals.
 150
 151 <p>Clock errors are due to variations in network delay and
 152 latencies in computer hardware and software (jitter), as well as
 153 clock oscillator instability (wander). The time of a client
 154 relative to its server can be expressed</p>
 155
 156 <center><i>T</i>(<i>t</i>) = <i>T</i>(<i>t</i><sub>0</sub>) + <i>
 157 R</i>(<i>t - t</i><sub>0</sub>) + 1/2 <i>D</i>(<i>t -
 158 t</i><sub>0</sub>)<sup>2</sup>,</center>
 159
 160 <p>where <i>t</i> is the current time, <i>T</i> is the time offset
 161 at the last measurement update <i>t</i><sub>0</sub>, <i>R</i> is
 162 the frequency offset and <i>D</i> is the drift due to resonator
 163 ageing. All three terms include systematic offsets that can be
 164 corrected and random variations that cannot. Some protocols,
 165 including DTSS, estimate only the first term in this expression,
 166 while others, including NTP, estimate the first two terms. Errors
 167 due to the third term, while important to model resonator aging in
 168 precision applications, are neglected, since they are usually
 169 dominated by errors in the first two terms.</p>
 170
 171 <p>The synchronization protocol estimates <i>
 172 T</i>(<i>t</i><sub>0</sub>) (and <i>R</i>(<i>t</i><sub>0</sub>),
 173 where relevant) at regular intervals <font face="symbol">t</font>
 174 and adjusts the clock to minimize <i>T</i>(<i>t</i>) in future. In
 175 common cases, <i>R</i> can have systematic offsets of several
 176 hundred parts-per-million (PPM) with random variations of several
 177 PPM due to ambient temperature changes. If not corrected, the
 178 resulting errors can accumulate to seconds per day. In order that
 179 these errors do not exceed a nominal specification, the protocol
 180 must periodically re-estimate <i>T</i> and <i>R</i> and compensate
 181 for variations by adjusting the clock at regular intervals. As a
 182 practical matter, for nominal accuracies of tens of milliseconds,
 183 this requires clients to exchange messages with servers at
 184 intervals in the order of tens of minutes.</p>
 185
 186 <p>Analysis of quartz-resonator stabilized oscillators show that
 187 errors are a function of the averaging time, which in turn depends
 188 on the interval between corrections. At correction intervals less
 189 than a few hundred seconds, errors are dominated by jitter, while,
 190 at intervals greater than this, errors are dominated by wander. As
 191 explained later, the characteristics of each regime determine the
 192 algorithm used to discipline the clock. These errors accumulate at
 193 each stratum level from the root to the leaves of the subnet tree.
 194 It is possible to quantify these errors by statistical means, as in
 195 NTP. This allows real-time applications to adjust audio or video
 196 playout delay, for example. However, the required statistics may be
 197 different for various classes of applications. Some applications
 198 need absolute error bounds guaranteed never to exceeded, as
 199 provided by the following correctness principles.</p>
 200
 201 <h4>Correctness Principles</h4>
 202
 203 <p>Applications requiring reliable time synchronization such as air
 204 traffic control must have confidence that the local clock is
 205 correct within some bound relative to a given timescale such as
 206 UTC. There is a considerable body of literature that studies these
 207 issues with respect to various failure models such as fail-stop and
 208 Byzantine disagreement. While these models inspire much confidence
 209 in a theoretical setting, most require multiple message rounds for
 210 each measurement and would be impractical in a large computer
 211 network such as the Internet. However, it can be shown that the
 212 worst-case error in reading a remote server clock cannot exceed
 213 one-half the roundtrip delay measured by the client. This is a
 214 valuable insight, since it permits strong statements about the
 215 correctness of the timekeeping system.</p>
 216
 217 <p>In the Probabilistic Clock Synchronization (PCS) scheme devised
 218 by Cristian, a maximum error tolerance is established in advance
 219 and time value samples associated with roundtrip delays that exceed
 220 twice this value are discarded. By the above argument, the
 221 remaining samples must represent time values within the specified
 222 tolerance. As the tolerance is decreased, more samples fail the
 223 test until a point where no samples survive. The tolerance can be
 224 adjusted for the best compromise between the highest accuracy
 225 consistent with acceptable sample survival rate.</p>
 226
 227 <p>In a scheme devised by Marzullo and exploited in NTP and DTSS,
 228 the worst-case error determined for each server determines a
 229 correctness interval. If each of a number of servers are in fact
 230 synchronized to a common timescale, the actual time must be
 231 contained in the intersection of their correctness intervals. If
 232 some intervals do not intersect, then the clique containing the
 233 maximum number of intersections is assumed correct <i>
 234 truechimers</i> and the others assumed incorrect <i>
 235 falsetickers</i>. Only the truechimers are used to adjust the
 236 system clock.</p>
 237
 238 <h4>Data Grooming Algorithms</h4>
 239
 240 By its very nature, clock synchronization is a continuous process,
 241 resulting in a sequence of measurements with each of possibly
 242 several servers and resulting in a clock adjustment. In some
 243 protocols, crafted algorithms are used to improve the time and
 244 frequency estimates and refine the clock adjustment. Algorithms
 245 described in the literature are based on trimmed-mean and median
 246 filter methods. The clock filter algorithm used in NTP is based on
 247 the above observation that the correctness interval depends on the
 248 roundtrip delay. The algorithm accumulates offset/delay samples in
 249 a window of several samples and selects the offset sample
 250 associated with the minimum delay. In general, larger window sizes
 251 provide better estimates; however, stability considerations limit
 252 the window size to about eight.
 253
 254 <p>The same principle could be used when selecting the best subset
 255 of servers and combining their offsets to determine the clock
 256 adjustment. However, different servers often show different
 257 systematic offsets, so the best statistic for the central tendency
 258 of the server population may not be obvious. Various kinds of
 259 clustering algorithms have been found useful for this purpose. The
 260 one used in NTP sorts the offsets by a quality metric, then
 261 calculates the variance of all servers relative to each server
 262 separately. The algorithm repeatedly discards the outlyer with the
 263 largest variance until further discards will not improve the
 264 residual variance or until a minimum number of servers remain. The
 265 final clock adjustment is computed as a weighted average of the
 266 survivors.</p>
 267
 268 <p>At the heart of the synchronization protocol is the algorithm
 269 used to adjust the system clock in accordance with the final
 270 adjustment determined by the above algorithms. This is called the
 271 clock discipline algorithm or simply the discipline. Such
 272 algorithms can be classed according to whether they minimize the
 273 time offset or frequency offset or both. For instance, the
 274 discipline used in DTSS minimizes only the time offset, while the
 275 one used in NTP minimizes both time and frequency offsets. While
 276 the DTSS algorithm cannot remove residual errors due to systematic
 277 frequency errors, the NTP algorithm is more complicated and less
 278 forgiving of design and implementation mistakes.</p>
 279
 280 <p>All clock disciplines function as a feedback loop, with measured
 281 offsets used to adjust the clock oscillator phase and frequency to
 282 match the external synchronization source. The behavior of feedback
 283 loops is well understood and modelled by mathematical analysis. The
 284 significant design parameter is the time constant, or
 285 responsiveness to external or internal variations in time or
 286 frequency. Optimum selection of time constant depends on the
 287 interval between update messages. In general, the longer these
 288 intervals, the larger the time constant and vice versa. In practice
 289 and with typical network configurations the optimal poll intervals
 290 vary between one and twenty minutes for network paths to some
 291 thousands of minutes for modem paths.</p>
 292
 293 <h4>Further Reading</h4>
 294
 295 <ol>
 296 <li>
 297 <p>Cristian, F. Probabilistic clock synchronization. In Distributed
 298 Computing 3, Springer Verlag, 1989, 146-158.</p>
 299 </li>
 300
 301 <li>
 302 <p>Digital Time Service Functional Specification Version T.1.0.5.
 303 DigitalEquipment Corporation, 1989.</p>
 304 </li>
 305
 306 <li>
 307 <p>Gusella, R., and S. Zatti. TEMPO - A network time controller for
 308 a distributed Berkeley UNIX system. IEEE Distributed Processing
 309 Technical Committee Newsletter 6, NoSI-2 (June 1984), 7-15. Also
 310 in: Proc. Summer 1984 USENIX (Salt Lake City, June 1984).</p>
 311 </li>
 312
 313 <li>
 314 <p>Kopetz, H., and W. Ochsenreiter. Clock synchronization in
 315 distributed real-time systems. IEEE Trans. Computers C-36, 8
 316 (August 1987), 933-939.</p>
 317 </li>
 318
 319 <li>
 320 <p>Lamport, L., and P.M. Melliar-Smith. Synchronizing clocks in the
 321 presence of faults. JACM 32, 1 (January 1985), 52-78.</p>
 322 </li>
 323
 324 <li>
 325 <p>Marzullo, K., and S. Owicki. Maintaining the time in a
 326 distributed system. ACM Operating Systems Review 19, 3 (July 1985),
 327 44-54.</p>
 328 </li>
 329
 330 <li>
 331 <p>Mills, D.L. Adaptive hybrid clock discipline algorithm for the
 332 Network Time Protocol. <i>IEEE/ACM Trans. Networking 6, 5</i>
 333 (October 1998), 505-514.</p>
 334 </li>
 335
 336 <li>
 337 <p>Mills, D.L. Improved algorithms for synchronizing computer
 338 network clocks. <i>IEEE/ACM Trans. Networks 3, 3</i> (June 1995),
 339 245-254.</p>
 340 </li>
 341
 342 <li>
 343 <p>Mills, D.L. Internet time synchronization: the Network Time
 344 Protocol. IEEE Trans. Communications COM-39, 10 (October 1991),
 345 1482-1493. Also in: Yang, Z., and T.A. Marsland (Eds.). Global
 346 States and Time in Distributed Systems, IEEE Press, Los Alamitos,
 347 CA, 91-102.</p>
 348 </li>
 349
 350 <li>
 351 <p>Mills, D.L. Modelling and analysis of computer network clocks.
 352 Electrical Engineering Department Report 92-5-2, University of
 353 Delaware, May 1992, 29 pp.</p>
 354 </li>
 355
 356 <li>
 357 <p>NIST Time and Frequency Dissemination Services. NBS Special
 358 Publication432 (Revised 1990), National Institute of Science and
 359 Technology, U.S. Department of Commerce, 1990.</p>
 360 </li>
 361
 362 <li>
 363 <p>Schneider, F.B. A paradigm for reliable clock synchronization.
 364 Department of Computer Science Technical Report TR 86-735, Cornell
 365 University, February 1986.</p>
 366 </li>
 367
 368 <li>
 369 <p>Srikanth, T.K., and S. Toueg. Optimal clock synchronization.
 370 JACM 34, 3 (July 1987), 626-645.</p>
 371 </li>
 372
 373 <li>
 374 <p>Stein, S.R. Frequency and time - their measurement and
 375 characterization (Chapter 12). In: E.A. Gerber and A. Ballato
 376 (Eds.). Precision Frequency Control, Vol. 2, Academic Press, New
 377 York 1985, 191-232, 399-416. Also in: Sullivan, D.B., D.W. Allan,
 378 D.A. Howe and F.L. Walls (Eds.). Characterization of Clocks and
 379 Oscillators. National Institute of Standards and Technology
 380 Technical Note 1337, U.S. Government Printing Office (January,
 381 1990), TN61-TN119.</p>
 382 </li>
 383 </ol>
 384
 385 <hr>
 386 <a href="index.htm"><img align="left" src="pic/home.gif" alt=
 387 "home"></a>
 388
 389 <address><a href="mailto:mills@udel.edu">David L. Mills
 390 &lt;mills@udel.edu&gt;</a></address>
 391 </body>
 392 </html>
 393