1 .\" Copyright (c) 1979 The Regents of the University of California.
2 .\" All rights reserved.
4 .\" Redistribution and use in source and binary forms, with or without
5 .\" modification, are permitted provided that the following conditions
7 .\" 1. Redistributions of source code must retain the above copyright
8 .\" notice, this list of conditions and the following disclaimer.
9 .\" 2. Redistributions in binary form must reproduce the above copyright
10 .\" notice, this list of conditions and the following disclaimer in the
11 .\" documentation and/or other materials provided with the distribution.
12 .\" 3. All advertising materials mentioning features or use of this software
13 .\" must display the following acknowledgement:
14 .\" This product includes software developed by the University of
15 .\" California, Berkeley and its contributors.
16 .\" 4. Neither the name of the University nor the names of its contributors
17 .\" may be used to endorse or promote products derived from this software
18 .\" without specific prior written permission.
20 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
21 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
23 .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
24 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
26 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
27 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
28 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
29 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
32 .\" @(#)pxin2.n 5.2 (Berkeley) 4/17/91
33 .\" $FreeBSD: src/share/doc/papers/px/pxin2.n,v 1.1.1.1.14.1 2000/11/30 17:13:59 ru Exp $
34 .\" $DragonFly: src/share/doc/papers/px/pxin2.n,v 1.2 2003/06/17 04:36:56 dillon Exp $
41 Naming conventions and operation summary
43 Table 2.1 outlines the opcode typing convention.
44 The expression ``a above b'' means that `a' is on top
45 of the stack with `b' below it.
46 Table 2.3 describes each of the opcodes.
47 The character `*' at the end of a name specifies that
48 all operations with the root prefix
50 are summarized by one entry.
51 Table 2.2 gives the codes used
52 to describe the type inline data expected by each instruction.
61 Basic control operations
66 Corresponds to the Pascal procedure
68 causes execution to end with a post-mortem backtrace as if a run-time
73 Causes the second part of the block mark to be created, and
75 bytes of local variable space to be allocated and cleared to zero.
76 Stack overflow is detected here.
78 is the first line of the body of this section for error traceback,
79 and the inline string (length s) the character representation of its name.
85 and used to begin the main program when the ``p''
86 option is disabled so that the post-mortem backtrace will be inhibited.
90 Complementary to the operators
94 exits the current block, calling the procedure
96 to flush buffers for and release any local files.
97 Restores the environment of the caller from the block mark.
98 If this is the end for the main program, all files are
100 and the interpreter is exited.
104 Saves the current line number, return address, and active display entry pointer
106 in the first part of the block mark, then transfers to the entry point
107 given by the relative address
109 that is the beginning of a
121 Used to make space for the return value of a
123 just before calling it.
134 returns to remove the arguments from the stack.
138 Transfer control to relative address
142 or part of a structured statement.
146 Transfer control to an absolute address as part of a non-local
148 or to branch over procedure bodies.
152 Set current line number to
154 For consistency, check that the expression stack is empty
155 as it should be (as this is the start of a statement.)
156 This consistency check will fail only if there is a bug in the
157 interpreter or the interpreter code has somehow been damaged.
158 Increment the statement count and if it exceeds the statement limit,
163 Transfer control to address
165 that is in the block at level
170 Causes each block to be exited as if with
172 flushing and freeing files with
174 until the current display entry is at level
179 Duplicate the word or long on the top of
181 This is used mostly for constructing sets.
184 If and relational operators
188 The interpreter conditional transfers all take place using this operator
189 that examines the Boolean value on the top of the stack.
192 the next code is executed,
193 otherwise control transfers to the specified address.
197 These take two arguments on the stack,
198 and the sub-operation code specifies the relational operation to
199 be done, coded as follows with `a' above `b' on the stack:
216 Each operation does a test to set the condition code
217 appropriately and then does an indexed branch based on the
218 sub-operation code to a test of the condition here specified,
219 pushing a Boolean value on the stack.
221 Consider the statement fragment:
224 \*bif\fR a = b \*bthen\fR
231 are integers this generates the following code:
238 IF \fIElse part offset\fR
242 \fI\&... Then part code ...\fR
248 The Boolean operators
253 manipulate values on the top of the stack.
254 All Boolean values are kept in single bytes in memory,
255 or in single words on the stack.
256 Zero represents a Boolean \fIfalse\fP, and one a Boolean \fItrue\fP.
258 Right value, constant, and assignment operators
264 The right value operators load values on the stack.
265 They take a block number as a sub-opcode and load the appropriate
266 number of bytes from that block at the offset specified
267 in the following word onto the stack. As an example, consider
272 \fBcvtbl\fR (lc)+,r0 #r0 has display index
273 \fBaddl3\fR _display(r0),(lc)+,r1 #r1 has variable address
274 \fBpushl\fR (r1) #put value on the stack
278 Here the interpreter places the display level in r0.
279 It then adds the appropriate display value to the inline offset and
280 pushes the value at this location onto the stack.
281 Control then returns to the main
285 operators have short inline data that
286 reduces the space required to address the first 32K of
287 stack space in each stack frame.
292 provide explicit conversion to long as the data
294 This saves the generation of
296 to align arguments to
302 The constant operators load a value onto the stack from inline code.
303 Small integer values are condensed and loaded by the
305 operator, that is given by
309 \fBcvtbw\fR (lc)+,\-(sp)
313 Here note that little work was required as the required constant
314 was available at (lc)+.
315 For longer constants,
317 must be incremented before moving the constant.
320 takes a length specification in the sub-opcode and can be used to load
321 strings and other variable length data onto the stack.
326 provide explicit conversion to long as the constant is pushed.
330 The assignment operators are similar to arithmetic and relational operators
331 in that they take two operands, both in the stack,
332 but the lengths given for them specify
333 first the length of the value on the stack and then the length
334 of the target in memory.
335 The target address in memory is under the value to be stored.
343 is a full-length, 4 byte, integer,
344 will generate the code sequence
356 will load the address of
358 that is really given as a block number in the sub-opcode and an
359 offset in the following word,
360 onto the stack, occupying a single word.
362 that is a single word instruction,
363 then loads the constant 1,
364 that is in its sub-opcode,
366 Since there are not one byte constants on the stack,
367 this becomes a 2 byte, single word integer.
368 The interpreter then assigns a length 2 integer to a length 4 integer using
370 The code sequence for
377 \fBcvtwl\fR (sp)+,*(sp)+
381 Thus the interpreter gets the single word off the stack,
382 extends it to be a 4 byte integer
383 gets the target address off the stack,
384 and finally stores the value in the target.
385 This is a typical use of the constant and assignment operators.
387 Addressing operations
393 The most common operation done by the interpreter
394 is the ``left value'' or ``address of'' operation.
399 \fBcvtbl\fR (lc)+,r0 #r0 has display index
400 \fBaddl3\fR _display(r0),(lc)+,\-(sp) #push address onto the stack
404 It calculates an address in the block specified in the sub-opcode
405 by adding the associated display entry to the
406 offset that appears in the following word.
409 operator has a short inline data that reduces the space
410 required to address the first 32K of stack space in each call frame.
414 The offset operator is used in field names.
415 Thus to get the address of
421 would generate the sequence
435 given its block in the sub-opcode and offset in the following word,
436 and the interpreter then adds the offset of the field
438 in its record to get the correct address.
440 takes its argument in the sub-opcode if it is small enough.
444 The example above is incomplete, lacking a check for a
447 The code generated would be
459 operation checks for a
461 pointer and generates the appropriate runtime error if it is.
465 A pointer to the specified length inline data is pushed
467 This is primarily used for
471 (see sections 3.6 and 3.8)
479 are used for subscripting.
480 For example, the statement
490 ``array [1..1000] of real''
505 operation takes the address of
507 and places it on the stack.
510 is then placed on top of this on the stack.
511 The array address is indexed by the
512 length 4 index (a length 2 index would use
514 where the individual elements have a size of 8 bytes.
523 \fBcvtwl\fR (lc)+,r0 #r0 has size of records
525 \fBcvtwl\fR (lc)+,r1 #r1 has lower bound
526 \fBmovzwl\fR (lc)+,r2 #r2 has upper-lower bound
527 \fBsubl3\fR r1,(sp)+,r3 #r3 has base subscript
528 \fBcmpl\fR r3,r2 #check for out of bounds
530 \fBmull2\fR r0,r3 #calculate byte offset
531 \fBaddl2\fR r3,(sp) #calculate actual address
534 \fBmovw\fR $ESUBSCR,_perrno
538 Here the lower bound is subtracted, and range checked against the
539 upper minus lower bound.
540 The offset is then scaled to a byte offset into the array
541 and added to the base address on the stack.
542 Multi-dimension subscripts are translated as a sequence of single subscriptings.
546 For indirect references through
548 parameters and pointers,
549 the interpreter has a set of indirection operators that convert a pointer
550 on the stack into a value on the stack from that address.
553 operators are necessary because of the possibility of different
559 operators do conversions to long
560 as they push their data.
564 The interpreter has many arithmetic operators.
565 All operators produce results long enough to prevent overflow
566 unless the bounds of the base type are exceeded.
567 The basic operators available are
569 Addition: ADD*, SUCC*
570 Subtraction: SUB*, PRED*
571 Multiplication: MUL*, SQR*
572 Division: DIV*, DVD*, MOD*
578 The interpreter has several range checking operators.
579 The important distinction among these operators is between values whose
580 legal range begins at zero and those that do not begin at zero,
582 a subrange variable whose values range from 45 to 70.
583 For those that begin at zero, a simpler ``logical'' comparison against
584 the upper bound suffices.
585 For others, both the low and upper bounds must be checked independently,
586 requiring two comparisons.
589 both checks are done using a single index instruction
590 so the only gain is in reducing the inline data.
594 The interpreter includes three operators for
596 statements that are used depending on the width of the
599 For each width, the structure of the case data is the same, and
600 is represented in figure 2.4.
606 case statement operators do a sequential search through the
608 If they find the label value, they take the corresponding entry
609 from the transfer table and cause the interpreter to branch to the
611 If the specified label is not found, an error results.
615 operators take the number of cases as a sub-opcode
617 Three different operators are needed to handle single byte,
618 word, and long case transfer table values.
621 operator has the following code sequence:
627 \fBcvtwl\fR (lc)+,r0 #r0 has length of case table
629 \fBmovaw\fR (lc)[r0],r2 #r2 has pointer to case labels
630 \fBmovzwl\fR (sp)+,r3 #r3 has the element to find
631 \fBlocc\fR r3,r0,(r2) #r0 has index of located element
632 \fBbeql\fR caserr #element not found
633 \fBmnegl\fR r0,r0 #calculate new lc
634 \fBcvtwl\fR (r2)[r0],r1 #r1 has lc offset
638 \fBmovw\fR $ECASE,_perrno
642 Here the interpreter first computes the address of the beginning
643 of the case label value area by adding twice the number of case label
644 values to the address of the transfer table, since the transfer
645 table entries are 2 byte address offsets.
646 It then searches through the label values, and generates an ECASE
647 error if the label is not found.
648 If the label is found, the index of the corresponding entry
649 in the transfer table is extracted and that offset is added
650 to the interpreter location counter.
652 Operations supporting pxp
654 The following operations are defined to do execution profiling.
658 Causes the interpreter to allocate a count buffer
662 and to clear them to zero.
663 The count buffer is placed within an image of the
665 file as described in the
666 .I "PXP Implementation Notes."
667 The contents of this buffer are written to the file
669 when the program ends.
673 Increments the counter specified by
678 Used at the entry point to procedures and functions,
679 combining a transfer to the entry point of the block with
680 an incrementing of its entry count.
691 and the set relationals
694 The following operations are more interesting.
698 Takes the cardinality of a set of size
700 bytes on top of the stack, leaving a 2 byte integer count.
704 opcode to successively count the number of set bits in the set.
709 This operation requires a non-trivial amount of work,
710 checking bounds and setting individual bits or ranges of bits.
711 This operation sequence is slow,
712 and motivates the presence of the operator
717 include the number of elements
719 in the constructed set,
720 the lower and upper bounds of the set,
724 and a pair of values on the stack for each range in the set, single
725 elements in constructed sets being duplicated with
727 to form degenerate ranges.
736 specifies the size of the set,
739 values the lower and upper bounds of the set.
740 The value on the stack is checked to be in the set on the stack,
741 and a Boolean value of
745 replaces the operands.
751 on a constructed set without constructing it.
754 is on top of the stack followed by the number of pairs in the
756 and then the pairs themselves, all as single word integers.
757 Pairs designate runs of values and single values are represented by
758 a degenerate pair with both value equal.
759 This operator is generated in grammatical constructs such as
761 \fBif\fR character \fBin\fR [`+', '\-', `*', `/']
766 \fBif\fR character \fBin\fR [`a'..`z', `$', `_']
769 These constructs are common in Pascal, and
771 makes them run much faster in the interpreter,
772 as if they were written as an efficient series of
778 Other miscellaneous operators that are present in the interpreter
781 that causes the program to end if the Boolean value on the stack is not
789 that convert between different length arithmetic operands for
790 use in aligning the arguments in
794 calls, and with some untyped built-ins, such as
799 Finally, if the program is run with the run-time testing disabled, there
800 are special operators for
803 and special indexing operators for arrays
804 that have individual element size that is a power of 2.
805 The code can run significantly faster using these operators.
807 Mathematical Functions
809 The transcendental functions
819 are taken from the standard UNIX
820 mathematical package.
821 These functions take double precision floating point
822 values and return the same.
829 take a double precision floating point number.
831 returns an integer representing the machine
832 representation of its argument's exponent,
834 returns the integer part of its argument, and
836 returns the rounded integer part of its argument.
838 System functions and procedures
842 A line limit and a file pointer are passed on the stack.
843 If the limit is non-negative the line limit is set to the
844 specified value, otherwise it is set to unlimited.
845 The default is unlimited.
849 A statement limit is passed on the stack. The statement limit
851 The default is 500,000.
852 No limit is enforced when the ``p'' option is disabled.
859 returns the number of milliseconds of user time used by the program;
861 returns the number of milliseconds of system time used by the program.
865 The number of seconds since some predefined time is
866 returned. Its primary usefulness is in determining
867 elapsed time and in providing a unique time stamp.
870 The other system time procedures are
874 that copy an appropriate text string into a pascal string array.
877 returns the number of command line arguments passed to the program.
880 takes an index on the stack and copies the specified
881 command line argument into a pascal string array.
883 Pascal procedures and functions
889 They function as a memory to memory move with several
891 They do no ``unpacking'' or ``packing'' in the true sense as the
892 interpreter supports no packed data types.
900 of a pointer is passed.
902 allocates a record of a specified size and puts a pointer
903 to it into the pointer variable.
905 deallocates the record pointed to by the pointer
906 and sets the pointer to
912 converts a suitably small integer into an ascii character.
913 Its primary purpose is to do a range check.
918 if its argument is odd and returns
920 if its argument is even.
923 always returns the value