Section notes: Week 11

CS 164, Fall 2005

General

MT 2 follow-up?
We will probably use these notes next week, too; please bring them with you next time.
VM (source) and code generator (compiled) are in class directory.

Stack machine architecture

Our stack machine is based on a Scheme virtual machine from Peter Norvig's Paradigms in AI Programming. We've made some modifications, incorporating features of other virtual stack machines we know about (including the Java Virtual Machine). The main pieces of the stack machine are:

Code segment - A long array, with one instruction per entry.
Program counter - Points to the next instruction to be executed.
Current instruction
Run-time stack - A long array, with one data item per entry. Can be accessed by offsets from the top of the stack, or by offsets from the current frame. Since our machine has no general purpose registers, all computations are done on the stack.
Frame pointer - Points to the beginning of a stack frame. During a function call, the stack frame typically starts with the first function argument.
Stack pointer - Points to the next empty position in the stack.
Run-time heap - The heap is where arrays and objects live. Our virtual machine has instructions to allocate a block of memory from the heap, and to read and write into chunks of memory in the heap. If we were running directly on hardware, the heap management would be done by the system, and we would use a system call (instead of an instruction) to get a block of heap memory.

Our machine goes through the same basic loop as most other machines, virtual or physical. First an instruction is fetched, and the program counter incremented to the next instruction. Then the instruction is decoded, and then executed. Then we fetch the next instruction and start the process all over again.

A lot of the code in simple-machine.lisp is for setup or debugging. The routine run-vm, which implements the actual fetch/decode/execute loop, is pretty short. I recommend looking through run-vm, since the VM source code is probably the most unambiguous, concise documentation of the instruction set.

Example codes

If-then-else

START:
    PUSHI 1 
    PUSHI 2 
    <              ; Compute 2 < 1
    JUMPZ ELSE     ; If zero (false), go to else
    PUSHI 3 
    PRINT          ; Print 3
    JUMP ENDIF 
ELSE:
    PUSHI 4 
    PRINT          ; Print 4
ENDIF:
    EXIT 0

Factorial

START:
    PUSHI 0        ; Reserve space for caller FP and PC
    PUSHI 0 
    PUSHI 3        ; Argument = 3
    CALL FACT 1    ; Call fact(3)
    PRINT          ; Print result
    EXIT 0 
FACT:
    LVAR 0         ; Get argument n
    JUMPZ BASECASE ; If it's zero, go to base case
    PUSHI 0        ; Reserve space for caller FP and PC
    PUSHI 0 
    PUSHI 1 
    LVAR 0 
    -              ; Argument = n-1
    CALL FACT 1    ; Call fact(n-1)
    LVAR 0 
    *              ; Multiply result by n
    RETURN 
BASECASE:
    PUSHI 1        ; 0! = 1
    RETURN

Instruction set

Instructions in the virtual machine have stack operands and immediate operands. Immediate operands are encoded in the instruction; for example, in most of the jump instructions, the target address is an immediate operand. Stack operands are popped from the run-time stack; most of the arithmetic instructions take only stack operands.

Variable/stack manipulation

   lvar i  - Gets the ith variable from the stack frame
   lset i  - Sets the ith variable from the stack frame
   pop     - Pops the stack
   swap    - Swaps the top two elements
   dup i   - Duplicates the ith entry from top of stack
   addi i  - Adds an immediate value to the top of stack
   alloc   - Pops a size from the stack; allocates a new array of that size
   alen    - Pops an array, pushes the length
   mem     - Pops index and array; pushes array[index].
   smem    - Pops value, index, and array; sets array[index] = value.
   pushi i - Push an immediate value
   pusha i - Push an address

Branching instructions

   jump a  - Jumps to a
   jumpz a - Pops val, jumps to a if it's zero
   jumpn a - Pops val, jumps to a if it isn't zero
   jumpi   - Pops an address and jumps to it

Function calls

   call f n - Calls function f with n arguments.
   calli n  - Calls a function with n arguments.  Address popped from stack.
   frame m  - Push zeros onto the stack until there are m slots in the frame.
   return   - Returns from a call.  Stack should contain just the return val.

Arithmetic and logical operations

   +, *, -, <, and - Operations on two variables (e.g. pop a, b, push a-b)
   not             - Operates on one variable

System access

   print  - Pops an integer and prints it
   read   - Reads an integer and pushes it
   exit s - Exits with status code s

Debugging hooks

   debug  - Ignored by the machine; may be used for debugging hooks
   break  - Aborts execution; may be useful for debugging

Debugging facilities

Have you ever wondered how gdb works? Most modern machines provide some form of hardware support for debuggers, which allows you to set breakpoints and to step through code. When the debugging support is inadequate, people frequently resort to virtual machines (for example, the valgrind memory debugger implements a virtual x86 machine). Why not have debugging facilities for our machine, too?

Our machine has a debugging hook: the user can set the field debug-hook to a function which is called after each instruction is fetched, just before the decode/execute stage. We also provide a crash hook, which is called whenever there is an error during program execution (invalid instruction, arithmetic on an array type, stack over/underflow, out-of-bounds jumps, etc). If no crash handler is defined, the default handler prints out a diagnostic message and the current state information for the machine. Note that the machine is not re-set after a crash -- you can always fix the problem in the VM state and restart execution where you left off!

Using these debugging hooks, one can build a fairly complete debugger. So far, though, we have written only a primitive trace facility and a few examples that use the hooks for other things. Feel free to write your own debugging extensions and share them!

We have also provided a pseudo-instruction, debug, specifically for communicating information to the debugger. You can use this pseudo-instruction to instrument your code at compile time. For example, the following code sequence prints some debugging information at run-time.

(setf *debug-test*
      '(start
        (debug "~%Start of program")
        (pushi 1)
        (pushi 2)
        (<)
        (jumpz else)
        (debug "~%Took if")
        (pushi 3)
        (print)
        (jump endif)
        else
        (debug "~%Took else")
        (pushi 4)
        (print)
        endif
        (exit 0)))

(run-vm (make-vm-state :code (assemble *debug-test*)
                       :debug-hook #'(lambda (vm)
                                       (let ((i (vm-state-inst vm)))
                                         (when (eq (car i) 'debug)
                                           (apply #'format t (cdr i)))))))