Introduction

If you are behind on your project, stop reading this page, and come back next week. The use of the compiler is not required, and is provided as a fun toy to implement applications for your new computer platform.

A good compiler is a core part of any computer system. It is the compiler's job to derive performance from semi-portable code via detailed knowledge of the target architecture. For this reason, the best compilers are often built by the computer architects responsible for the system in question. Furthermore, compilers are often built before the architecture, and the two are co-optimized (in some sense).

The compiler we have made available for your cutting-edge MIPS150 processor is not quite as fancy (and not as feature-complete) as the ones used for x86 machines. Nevertheless, it creates a much more productive abstraction than writing MIPS150 assembly by hand.

Without further ado, please find the toolflow executables (built for the x86 Windows platform) here. Please do not hesitate to contact Ilia if you would like Linux x86 or x86_64 binaries, or even the source code. Be warned, though: the source tree is rather on the enormous side.

The Bird's-Eye View

In writing the compiler, we extended two excellent tools: the LLVM compiler infrastructure, and the MARS assembler and simulator. Our flow consists of four tools, which are briefly documented below:
| Tool | Description |
| --- | --- |
| clang | A front-end compiler for LLVM. We use this tool to compile C to LLVM bytecode (which directly encodes the application as a static single assignment DAG). This compiler is fully-featured, but no standard libraries are available (you will not be able to use things like malloc). |
| llvm-link | The LLVM linker, which we will use to stitch together several compiled files into one object file (expressed in LLVM bytecode assembly). |
| llc | The LLVM static compiler. We extended this tool to translate LLVM bytecode to MIPS150 assembly. This tool lacks a number of obvious features, such as support for byte, half, and double-word memory accesses, as well as floating point arithmetic. In other words, char pointers and floats will not work! |
| mars (coming later this week) | A Java-based assembler and simulator for MIPS. We stripped this tool down by removing features not present in MIPS150, and relaxed the parser to accept compiler output. |

To summarize, your C code takes the following path:
C files → LLVM bytecode files → One LLVM bytecode file → MIPS150 assembly → MIPS150 binary

Limitations

Your MIPS150 computer is not quite a complete architecture. Most notably, we lack support for an operating system (no user access control), a disk, a timer, and interrupts. As a result, you will probably not be able to compile your favorite games or Linux distributions to run on your processor.

| Missing | Therefore |
| --- | --- |
| Non-32b memory accesses | Although you can manipulate chars and other data types on the stack, you will not be able to dereference their pointers or create arrays. Our architecture lacks the load/store byte and half instructions, making byte access rather annoying (HW6, problem 2). If time permits, we will add this feature to the llc tool. |
| Floating point arithmetic | Since we obviously didn't ask you to build a floating point unit, we will rely on software for floating point arithmetic. If time permits (and it looks like it will), we will use the SoftFloat library (by John Hauser, who was once John Wawrzynek's PhD student). |
| Linker | The linker is now provided, and a single program can be expressed in multiple C files. |
| Standard libraries | Unfortunately, we are unable to provide implementations of the C standard library files (all of the things you #include to make life easier). These libraries are implemented on a per-system basis, and are quite large. The main consequence is the lack of support for strings and dynamic memory management. We will write a few routines and make them available, but you will have to write your own if you need them. To access physical memory, dereference integer constants. |
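
Because only 32-bit loads and stores are available, byte access has to be built from word accesses with shifts and masks. The following is a minimal sketch of that idea, not part of the provided toolflow: the names loadByte and storeByte are hypothetical, and the sketch assumes byte 0 is the least significant byte of the word, which you should check against your own memory layout.

```c
/* Hypothetical helpers: emulate byte loads and stores using only
   32-bit word accesses. Assumption: byte 0 is the least significant
   byte of the word; flip the index if your layout differs. */

/* Read byte number `index` (0..3) from the word that `word` points to. */
unsigned int loadByte(const unsigned int *word, unsigned int index) {
    unsigned int shift = (index & 3u) << 3;   /* byte index -> bit offset */
    return (*word >> shift) & 0xFFu;
}

/* Replace byte number `index` (0..3) of *word with the low 8 bits of `value`.
   The other three bytes of the word are left unchanged. */
void storeByte(unsigned int *word, unsigned int index, unsigned int value) {
    unsigned int shift = (index & 3u) << 3;
    unsigned int mask  = 0xFFu << shift;      /* selects the target byte */
    *word = (*word & ~mask) | ((value & 0xFFu) << shift);
}
```

Both helpers touch memory only through an unsigned int pointer, so they stay within the word-sized accesses that llc supports. Characters can then be packed four to a word in an int array rather than stored in a char array.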

Tutorial

Consider a C file, main.c:

void storeWordAlignedByte(int a, char c);

int main(){
   int a = 0x12345678;
   char c = 'X';
   storeWordAlignedByte(a,c);

   return 0;
}

void storeWordAlignedByte(int a, char c){
   *((int*)a) = (int)c;
}


To compile this file to LLVM bytecode, we use clang as follows:

clang.exe -O4 -S -emit-llvm -o out main.c

Feel free to look at the textual representation of the LLVM bytecode by examining the contents of "out".

The -O4 flag specifies the extent to which your code is optimized by the front end compiler (this is where most of the optimization happens). Use -O0 if you are worried that something is being pruned away by the compiler.
The -S -emit-llvm flags tell clang to produce LLVM bytecode assembly. If this is not specified, clang attempts to invoke an assembler for the current architecture (x86, which is excluded from this build).
The -o out flag specifies the filename ("out", in this case) for the output produced by clang. The -o flag will not be used once a linker is released, and we can build multiple object files.
main.c is the C file we are compiling.
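
At higher optimization levels, a C compiler is generally free to merge or drop repeated loads and stores to the same address, which matters when that address is a hardware register rather than ordinary memory. A minimal sketch of the usual remedy, the volatile qualifier, follows. It assumes the toolflow honors volatile the way standard C compilers do; the addresses and the names writeWord, readWord, and waitForBits are hypothetical and chosen only for illustration.

```c
/* Hypothetical device addresses, for illustration only;
   substitute the addresses from your own memory map. */
#define STATUS_ADDR 0x80000000u
#define DATA_ADDR   0x80000004u

/* Store one 32-bit word. The volatile qualifier asks the compiler to
   perform the access exactly as written, even when optimizing. */
void writeWord(volatile unsigned int *addr, unsigned int value) {
    *addr = value;
}

/* Load one 32-bit word. */
unsigned int readWord(volatile unsigned int *addr) {
    return *addr;
}

/* Spin until at least one bit of `mask` is set in the word at `addr`.
   Without volatile, an optimizer may load the word once and reuse it. */
void waitForBits(volatile unsigned int *addr, unsigned int mask) {
    while ((readWord(addr) & mask) == 0) {
        /* busy-wait */
    }
}

/* Intended use on the target (not run here):
     waitForBits((volatile unsigned int *)STATUS_ADDR, 1u);
     writeWord((volatile unsigned int *)DATA_ADDR, 'X');          */
```

If a store still seems to disappear, comparing the -O0 and -O4 output for the same file is a quick way to see what the front end removed.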


If we had multiple object files (the result of compiling multiple C files intended to live in one program), we would need to use a linker. A linker tutorial is coming soon. To map the static single assignment DAG (LLVM bytecode) to the desired machine architecture (MIPS150), we use the following:

llc.exe -O3 -march=mips150 -o out.s out

The -O3 flag tells llc to optimize the output, if applicable. Some architectures (not MIPS150) can take advantage of post-linker optimization better than others.
The -march=mips150 flag tells the toolchain that we actually want to run on a MIPS150 processor.
The -o out.s flag instructs llc to write its output to a file named "out.s".


If you used a main function (a good idea), insert the following as the first few lines of the assembly file:
jal main
ENDPROGRAM: nop
j ENDPROGRAM
to actually call the main function (and stop the execution once main returns). If this step is omitted, the binary will be executed from the beginning of the file, which is probably not a good thing. A python script will shortly be provided to automate this step.
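
Until that script is available, the step can be automated with a small program run on your workstation (not on the MIPS150, since it relies on the host's standard C library). The following is only a sketch under that assumption; the function name prependStub is hypothetical, and the three stub lines are copied from the text above.

```c
#include <stdio.h>

/* The three stub lines described above. */
static const char *STUB =
    "jal main\n"
    "ENDPROGRAM: nop\n"
    "j ENDPROGRAM\n";

/* Write the stub to `out`, then copy everything from `in` after it.
   Returns 0 on success and -1 on any read or write error. */
int prependStub(FILE *in, FILE *out) {
    int ch;

    if (fputs(STUB, out) == EOF) {
        return -1;
    }
    while ((ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF) {
            return -1;
        }
    }
    return ferror(in) ? -1 : 0;
}
```

A caller would open the llc output (for example "out.s") for reading and a new file for writing, pass both to prependStub, and then hand the new file to the assembler.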

Now it is time to use MARS to assemble the binary file.
A customized MIPS150 assembler is now provided, which allows the output of the compiler tool flow to be mapped to a loadable MIPS150 binary.
A tutorial is coming soon.

Sharing

While we do not permit any sharing of hardware designs, the sharing of software is allowed and encouraged. If you write software you would like to share with the class, please send it to us so we can post it here.