1 .\" Copyright (c) 1979 The Regents of the University of California.
2 .\" All rights reserved.
4 .\" Redistribution and use in source and binary forms, with or without
5 .\" modification, are permitted provided that the following conditions
7 .\" 1. Redistributions of source code must retain the above copyright
8 .\" notice, this list of conditions and the following disclaimer.
9 .\" 2. Redistributions in binary form must reproduce the above copyright
10 .\" notice, this list of conditions and the following disclaimer in the
11 .\" documentation and/or other materials provided with the distribution.
12 .\" 3. All advertising materials mentioning features or use of this software
13 .\" must display the following acknowledgement:
14 .\" This product includes software developed by the University of
15 .\" California, Berkeley and its contributors.
16 .\" 4. Neither the name of the University nor the names of its contributors
17 .\" may be used to endorse or promote products derived from this software
18 .\" without specific prior written permission.
20 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
21 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
23 .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
24 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
26 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
27 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
28 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
29 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
32 .\" @(#)pxin2.n 5.2 (Berkeley) 4/17/91
33 .\" $FreeBSD: src/share/doc/papers/px/pxin2.n,v 1.1.1.1.14.1 2000/11/30 17:13:59 ru Exp $
40 Naming conventions and operation summary
42 Table 2.1 outlines the opcode typing convention.
43 The expression ``a above b'' means that `a' is on top
44 of the stack with `b' below it.
45 Table 2.3 describes each of the opcodes.
46 The character `*' at the end of a name specifies that
47 all operations with the root prefix
49 are summarized by one entry.
50 Table 2.2 gives the codes used
51 to describe the type inline data expected by each instruction.
60 Basic control operations
65 Corresponds to the Pascal procedure
67 causes execution to end with a post-mortem backtrace as if a run-time
72 Causes the second part of the block mark to be created, and
74 bytes of local variable space to be allocated and cleared to zero.
75 Stack overflow is detected here.
77 is the first line of the body of this section for error traceback,
78 and the inline string (length s) the character representation of its name.
84 and used to begin the main program when the ``p''
85 option is disabled so that the post-mortem backtrace will be inhibited.
89 Complementary to the operators
93 exits the current block, calling the procedure
95 to flush buffers for and release any local files.
96 Restores the environment of the caller from the block mark.
97 If this is the end for the main program, all files are
99 and the interpreter is exited.
103 Saves the current line number, return address, and active display entry pointer
105 in the first part of the block mark, then transfers to the entry point
106 given by the relative address
108 that is the beginning of a
120 Used to make space for the return value of a
122 just before calling it.
133 returns to remove the arguments from the stack.
137 Transfer control to relative address
141 or part of a structured statement.
145 Transfer control to an absolute address as part of a non-local
147 or to branch over procedure bodies.
151 Set current line number to
153 For consistency, check that the expression stack is empty
154 as it should be (as this is the start of a statement.)
155 This consistency check will fail only if there is a bug in the
156 interpreter or the interpreter code has somehow been damaged.
157 Increment the statement count and if it exceeds the statement limit,
162 Transfer control to address
164 that is in the block at level
169 Causes each block to be exited as if with
171 flushing and freeing files with
173 until the current display entry is at level
178 Duplicate the word or long on the top of
180 This is used mostly for constructing sets.
183 If and relational operators
187 The interpreter conditional transfers all take place using this operator
188 that examines the Boolean value on the top of the stack.
191 the next code is executed,
192 otherwise control transfers to the specified address.
196 These take two arguments on the stack,
197 and the sub-operation code specifies the relational operation to
198 be done, coded as follows with `a' above `b' on the stack:
215 Each operation does a test to set the condition code
216 appropriately and then does an indexed branch based on the
217 sub-operation code to a test of the condition here specified,
218 pushing a Boolean value on the stack.
220 Consider the statement fragment:
223 \*bif\fR a = b \*bthen\fR
230 are integers this generates the following code:
237 IF \fIElse part offset\fR
241 \fI\&... Then part code ...\fR
247 The Boolean operators
252 manipulate values on the top of the stack.
253 All Boolean values are kept in single bytes in memory,
254 or in single words on the stack.
255 Zero represents a Boolean \fIfalse\fP, and one a Boolean \fItrue\fP.
257 Right value, constant, and assignment operators
263 The right value operators load values on the stack.
264 They take a block number as a sub-opcode and load the appropriate
265 number of bytes from that block at the offset specified
266 in the following word onto the stack. As an example, consider
271 \fBcvtbl\fR (lc)+,r0 #r0 has display index
272 \fBaddl3\fR _display(r0),(lc)+,r1 #r1 has variable address
273 \fBpushl\fR (r1) #put value on the stack
277 Here the interpreter places the display level in r0.
278 It then adds the appropriate display value to the inline offset and
279 pushes the value at this location onto the stack.
280 Control then returns to the main
284 operators have short inline data that
285 reduces the space required to address the first 32K of
286 stack space in each stack frame.
291 provide explicit conversion to long as the data
293 This saves the generation of
295 to align arguments to
301 The constant operators load a value onto the stack from inline code.
302 Small integer values are condensed and loaded by the
304 operator, that is given by
308 \fBcvtbw\fR (lc)+,\-(sp)
312 Here note that little work was required as the required constant
313 was available at (lc)+.
314 For longer constants,
316 must be incremented before moving the constant.
319 takes a length specification in the sub-opcode and can be used to load
320 strings and other variable length data onto the stack.
325 provide explicit conversion to long as the constant is pushed.
329 The assignment operators are similar to arithmetic and relational operators
330 in that they take two operands, both in the stack,
331 but the lengths given for them specify
332 first the length of the value on the stack and then the length
333 of the target in memory.
334 The target address in memory is under the value to be stored.
342 is a full-length, 4 byte, integer,
343 will generate the code sequence
355 will load the address of
357 that is really given as a block number in the sub-opcode and an
358 offset in the following word,
359 onto the stack, occupying a single word.
361 that is a single word instruction,
362 then loads the constant 1,
363 that is in its sub-opcode,
365 Since there are not one byte constants on the stack,
366 this becomes a 2 byte, single word integer.
367 The interpreter then assigns a length 2 integer to a length 4 integer using
369 The code sequence for
376 \fBcvtwl\fR (sp)+,*(sp)+
380 Thus the interpreter gets the single word off the stack,
381 extends it to be a 4 byte integer
382 gets the target address off the stack,
383 and finally stores the value in the target.
384 This is a typical use of the constant and assignment operators.
386 Addressing operations
392 The most common operation done by the interpreter
393 is the ``left value'' or ``address of'' operation.
398 \fBcvtbl\fR (lc)+,r0 #r0 has display index
399 \fBaddl3\fR _display(r0),(lc)+,\-(sp) #push address onto the stack
403 It calculates an address in the block specified in the sub-opcode
404 by adding the associated display entry to the
405 offset that appears in the following word.
408 operator has a short inline data that reduces the space
409 required to address the first 32K of stack space in each call frame.
413 The offset operator is used in field names.
414 Thus to get the address of
420 would generate the sequence
434 given its block in the sub-opcode and offset in the following word,
435 and the interpreter then adds the offset of the field
437 in its record to get the correct address.
439 takes its argument in the sub-opcode if it is small enough.
443 The example above is incomplete, lacking a check for a
446 The code generated would be
458 operation checks for a
460 pointer and generates the appropriate runtime error if it is.
464 A pointer to the specified length inline data is pushed
466 This is primarily used for
470 (see sections 3.6 and 3.8)
478 are used for subscripting.
479 For example, the statement
489 ``array [1..1000] of real''
504 operation takes the address of
506 and places it on the stack.
509 is then placed on top of this on the stack.
510 The array address is indexed by the
511 length 4 index (a length 2 index would use
513 where the individual elements have a size of 8 bytes.
522 \fBcvtwl\fR (lc)+,r0 #r0 has size of records
524 \fBcvtwl\fR (lc)+,r1 #r1 has lower bound
525 \fBmovzwl\fR (lc)+,r2 #r2 has upper-lower bound
526 \fBsubl3\fR r1,(sp)+,r3 #r3 has base subscript
527 \fBcmpl\fR r3,r2 #check for out of bounds
529 \fBmull2\fR r0,r3 #calculate byte offset
530 \fBaddl2\fR r3,(sp) #calculate actual address
533 \fBmovw\fR $ESUBSCR,_perrno
537 Here the lower bound is subtracted, and range checked against the
538 upper minus lower bound.
539 The offset is then scaled to a byte offset into the array
540 and added to the base address on the stack.
541 Multi-dimension subscripts are translated as a sequence of single subscriptings.
545 For indirect references through
547 parameters and pointers,
548 the interpreter has a set of indirection operators that convert a pointer
549 on the stack into a value on the stack from that address.
552 operators are necessary because of the possibility of different
558 operators do conversions to long
559 as they push their data.
563 The interpreter has many arithmetic operators.
564 All operators produce results long enough to prevent overflow
565 unless the bounds of the base type are exceeded.
566 The basic operators available are
568 Addition: ADD*, SUCC*
569 Subtraction: SUB*, PRED*
570 Multiplication: MUL*, SQR*
571 Division: DIV*, DVD*, MOD*
577 The interpreter has several range checking operators.
578 The important distinction among these operators is between values whose
579 legal range begins at zero and those that do not begin at zero,
581 a subrange variable whose values range from 45 to 70.
582 For those that begin at zero, a simpler ``logical'' comparison against
583 the upper bound suffices.
584 For others, both the low and upper bounds must be checked independently,
585 requiring two comparisons.
588 both checks are done using a single index instruction
589 so the only gain is in reducing the inline data.
593 The interpreter includes three operators for
595 statements that are used depending on the width of the
598 For each width, the structure of the case data is the same, and
599 is represented in figure 2.4.
605 case statement operators do a sequential search through the
607 If they find the label value, they take the corresponding entry
608 from the transfer table and cause the interpreter to branch to the
610 If the specified label is not found, an error results.
614 operators take the number of cases as a sub-opcode
616 Three different operators are needed to handle single byte,
617 word, and long case transfer table values.
620 operator has the following code sequence:
626 \fBcvtwl\fR (lc)+,r0 #r0 has length of case table
628 \fBmovaw\fR (lc)[r0],r2 #r2 has pointer to case labels
629 \fBmovzwl\fR (sp)+,r3 #r3 has the element to find
630 \fBlocc\fR r3,r0,(r2) #r0 has index of located element
631 \fBbeql\fR caserr #element not found
632 \fBmnegl\fR r0,r0 #calculate new lc
633 \fBcvtwl\fR (r2)[r0],r1 #r1 has lc offset
637 \fBmovw\fR $ECASE,_perrno
641 Here the interpreter first computes the address of the beginning
642 of the case label value area by adding twice the number of case label
643 values to the address of the transfer table, since the transfer
644 table entries are 2 byte address offsets.
645 It then searches through the label values, and generates an ECASE
646 error if the label is not found.
647 If the label is found, the index of the corresponding entry
648 in the transfer table is extracted and that offset is added
649 to the interpreter location counter.
651 Operations supporting pxp
653 The following operations are defined to do execution profiling.
657 Causes the interpreter to allocate a count buffer
661 and to clear them to zero.
662 The count buffer is placed within an image of the
664 file as described in the
665 .I "PXP Implementation Notes."
666 The contents of this buffer are written to the file
668 when the program ends.
672 Increments the counter specified by
677 Used at the entry point to procedures and functions,
678 combining a transfer to the entry point of the block with
679 an incrementing of its entry count.
690 and the set relationals
693 The following operations are more interesting.
697 Takes the cardinality of a set of size
699 bytes on top of the stack, leaving a 2 byte integer count.
703 opcode to successively count the number of set bits in the set.
708 This operation requires a non-trivial amount of work,
709 checking bounds and setting individual bits or ranges of bits.
710 This operation sequence is slow,
711 and motivates the presence of the operator
716 include the number of elements
718 in the constructed set,
719 the lower and upper bounds of the set,
723 and a pair of values on the stack for each range in the set, single
724 elements in constructed sets being duplicated with
726 to form degenerate ranges.
735 specifies the size of the set,
738 values the lower and upper bounds of the set.
739 The value on the stack is checked to be in the set on the stack,
740 and a Boolean value of
744 replaces the operands.
750 on a constructed set without constructing it.
753 is on top of the stack followed by the number of pairs in the
755 and then the pairs themselves, all as single word integers.
756 Pairs designate runs of values and single values are represented by
757 a degenerate pair with both value equal.
758 This operator is generated in grammatical constructs such as
760 \fBif\fR character \fBin\fR [`+', '\-', `*', `/']
765 \fBif\fR character \fBin\fR [`a'..`z', `$', `_']
768 These constructs are common in Pascal, and
770 makes them run much faster in the interpreter,
771 as if they were written as an efficient series of
777 Other miscellaneous operators that are present in the interpreter
780 that causes the program to end if the Boolean value on the stack is not
788 that convert between different length arithmetic operands for
789 use in aligning the arguments in
793 calls, and with some untyped built-ins, such as
798 Finally, if the program is run with the run-time testing disabled, there
799 are special operators for
802 and special indexing operators for arrays
803 that have individual element size that is a power of 2.
804 The code can run significantly faster using these operators.
806 Mathematical Functions
808 The transcendental functions
818 are taken from the standard UNIX
819 mathematical package.
820 These functions take double precision floating point
821 values and return the same.
828 take a double precision floating point number.
830 returns an integer representing the machine
831 representation of its argument's exponent,
833 returns the integer part of its argument, and
835 returns the rounded integer part of its argument.
837 System functions and procedures
841 A line limit and a file pointer are passed on the stack.
842 If the limit is non-negative the line limit is set to the
843 specified value, otherwise it is set to unlimited.
844 The default is unlimited.
848 A statement limit is passed on the stack. The statement limit
850 The default is 500,000.
851 No limit is enforced when the ``p'' option is disabled.
858 returns the number of milliseconds of user time used by the program;
860 returns the number of milliseconds of system time used by the program.
864 The number of seconds since some predefined time is
865 returned. Its primary usefulness is in determining
866 elapsed time and in providing a unique time stamp.
869 The other system time procedures are
873 that copy an appropriate text string into a pascal string array.
876 returns the number of command line arguments passed to the program.
879 takes an index on the stack and copies the specified
880 command line argument into a pascal string array.
882 Pascal procedures and functions
888 They function as a memory to memory move with several
890 They do no ``unpacking'' or ``packing'' in the true sense as the
891 interpreter supports no packed data types.
899 of a pointer is passed.
901 allocates a record of a specified size and puts a pointer
902 to it into the pointer variable.
904 deallocates the record pointed to by the pointer
905 and sets the pointer to
911 converts a suitably small integer into an ascii character.
912 Its primary purpose is to do a range check.
917 if its argument is odd and returns
919 if its argument is even.
922 always returns the value