Andy Gee
11 February 2007
CS162 Lecture Notes #9
Speaker: Thomas Kho

Today: Nachos Tutorial
    Nachos and monitors
    Sign up your group on the course website for design reviews.
    If you have issues with your SVN account, send Thomas an email.

A walkthrough of Nachos:
What is Nachos? Today we'll discuss its capabilities, purpose and history. What does it do? Why use Nachos? And how do we get started?

What is Nachos?
An instructional OS. What else? It is also a hardware simulator: it simulates a MIPS processor, so one can write a program in C, compile it to MIPS, and run it in Nachos. The simulated machine has a console, a network interface and a timer.

Why?
Building an operating system from scratch is the best way to learn. Simulation is easier, especially in Java. The skeleton allows one to build one piece at a time.

Why Java?
It's portable, and it's type safe.

History of Nachos:
In 1992, it was written in C++ by Christopher, Procter and Anderson. It was rewritten in Java by Hettena.

How does it work?
It's all in Java, organized into a bunch of packages.

Nachos is a simulator:

    [User Program]  [User Program]
    [        Nachos Kernel       ]
    [           Sim HW           ]  <--- this is the Nachos portion
    [            JVM             ]
    [          Real OS           ]
    [             HW             ]

Booting Nachos:
It all starts in nachos.machine.Machine.main. Main initializes the devices, then passes control to the autograder. The autograder creates a kernel and starts it, which starts the OS.

The Machine:
It's in nachos.machine.Machine. This starts the system and provides access to the hardware devices through Machine.interrupt(), timer(), console(), and networkLink().

The Interrupt Controller:
Handles hardware interrupts. It also maintains an event queue and a clock.

The Clock:
Ticking conditions (the clock ticks if either of these conditions is met):
    1. One tick for executing a MIPS instruction
    2. Ten ticks for re-enabling interrupts
After any tick, Interrupt checks for pending interrupts and runs them. This calls the device event handler, not the software interrupt handler.
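The ticking rules above can be sketched in a few lines of Java. This is only an illustration of the idea, not the real nachos.machine.Interrupt class: the class name SimClock and the methods executeInstruction(), enableInterrupts() and advance() are made up for this sketch, and schedule() here takes a delay and a handler as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of the tick rules; NOT the real nachos.machine.Interrupt.
public class SimClock {
    // A pending device interrupt: it fires once the clock reaches 'when'.
    private static final class Pending implements Comparable<Pending> {
        final long when;
        final Runnable handler;

        Pending(long when, Runnable handler) {
            this.when = when;
            this.handler = handler;
        }

        @Override
        public int compareTo(Pending other) {
            return Long.compare(when, other.when);
        }
    }

    private final PriorityQueue<Pending> queue = new PriorityQueue<>();
    private long time = 0;

    public long time() {
        return time;
    }

    // Schedule a device event handler to run 'delay' ticks from now.
    public void schedule(long delay, Runnable handler) {
        queue.add(new Pending(time + delay, handler));
    }

    // Rule 1: executing one MIPS instruction costs one tick.
    public void executeInstruction() {
        advance(1);
    }

    // Rule 2: re-enabling interrupts costs ten ticks.
    public void enableInterrupts() {
        advance(10);
    }

    private void advance(long ticks) {
        time += ticks;
        // After any tick, run every pending interrupt that is now due.
        while (!queue.isEmpty() && queue.peek().when <= time) {
            queue.poll().handler.run();
        }
    }

    public static void main(String[] args) {
        SimClock clock = new SimClock();
        List<String> log = new ArrayList<>();
        clock.schedule(5, () -> log.add("timer fired at " + clock.time()));
        clock.executeInstruction(); // time = 1, nothing is due yet
        clock.enableInterrupts();   // time = 11, the handler runs now
        System.out.println(log);    // prints [timer fired at 11]
    }
}
```

Note that the handler scheduled for tick 5 only runs at tick 11: in this sketch, devices are serviced when the clock advances, which is why simulated devices need no threads of their own.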
The Interrupt class has important methods which are accessible to the other simulated HW: schedule(), tick(), checkIfDue(), enable(), and disable(). All HW devices depend on interrupts, meaning they don't get threads.

Timer:
From nachos.machine.Timer. This causes interrupts every ~500 ticks. It has some important methods: getTime(), which tells how many ticks have passed so far, and setInterruptHandler(), which tells the timer what to do when it goes off. The Timer provides preemption.

The Serial Console:
A Java interface. Its methods are:
    readByte(): interrupts if something needs to be read
    writeByte(): interrupts when it's ready for more
    setInterruptHandler(): tells the console what to call when it's finished
This uses stdin and stdout.

Other HW devices:
The disk isn't part of Nachos (in the Java rendition). The Network Link is similar to the console, but packet based. It's for phase 4.

The Kernel:
It's an abstract class with a few methods: initialize(), selfTest() (this runs your own tests, not the autograder ones), run() (which runs user code), and terminate().

Threading:
All Nachos threads are instances of KThread. A thread can be in one of the following states: new, ready, running, blocked, or finished. Each thread has a TCB (a task control block). A KThread is internally implemented with Java threads.

Running Threads:
First, one needs to create a Runnable object, wrap it in a KThread, and then call fork().

    class Sprinter implements Runnable {
        public void run() {
            // run real fast
        }
    }

    Sprinter s = new Sprinter();
    new KThread(s).fork();

Scheduler:
This makes a ThreadQueue to decide which thread to run next. The default is round robin. It's specified in the config file.

Nachos Config:
The nachos.conf file lets one specify options like which classes to use for the Kernel and the Scheduler, and whether user programs can be run. This should be different for each project.

Creating the First Thread:
In ThreadedKernel.initialize:

    public void initialize(String[] args) {
        ...
        // start threading
        new KThread(null);
        ...
    }

What does KThread perform? What does it create? This defaults to the idle thread with nothing.

Advanced Topics: simulated MIPS processor, address translation, user level analysis.

How are we using it?
There are four phases:
    1 - Threading
    2 - Multiprogramming
    3 - Caching and VM
    4 - Networks and distributed systems
Extend and embrace Nachos. The projects will add features and also implement stuff in C.

Phase 1: Threading
     5% join (when you fork a thread, get a process threadID, block the parent and wait for the child)
     5% condition variables
    10% alarm, which is probably the easier one to do
    20% communicator
    35% priority scheduler
    25% rowing Hawaiian kids

Row boat synchronization:

    [Molokai]<--------[Oahu with a crap load of people]

Get adults and children from Oahu to Molokai on a boat. Constraints: there's only 1 boat, and that boat can fit either 1 child, 2 children or 1 adult. A pilot is required. Also, each child and adult acts independently (no GPS).

Phase 2: Multiprogramming
    30% file system calls
    25% multiprogramming (allow many programs at once)
    30% system calls
    15% lottery scheduler

Phase 3: Caching and VM
    30% implement TLB, inverted page table
    40% paged VM
    30% lazy loading (don't load until needed)

Phase 4: Networking
    75% networking syscalls (connect, accept)
    25% chat program (like IRC)

Workload (grading):
The percentages are given above. Divide the work fairly (for the evals). The projects depend on each other, e.g. the lottery scheduler depends on the priority scheduler:
    phase 4 depends on 2, 3
    phase 3 depends on 2
    phase 2 depends on 1

How to start:
On the class webpage there's a tutorial. Download and install the Nachos package (only 1 person imports it to the SVN repository). Read the README, and check that one can make proj1. The initial design doc (6-10 pages) is due this Tuesday at midnight.

Advice:
Do one step at a time and then test. Use Eclipse and a debugger. For more information, check the README and the course webpage, and read the code.

Subversion and CVS:
It allows multiple people to work on code concurrently.
It merges changes, and supports update and commit.
    SVN information is at svnbook.red-bean.com
    SVN for Windows: tortoisesvn.tigris.org
    An Eclipse plugin is available (it's called Subclipse).
There's an issue with the assert function in Eclipse, and the makefile will fix it for you.

---End Nachos Tutorial---
---Break---

Condition Variables:
Process synchronization with condition variables. Processes or threads can cooperate using wait and signal along with condition variables.
    x.wait --> the process is suspended until some other process invokes signal.
    x.signal --> wakes up exactly 1 suspended process.
x.signal and x.wait are used for control within monitors (a special type of critical region). There's a binary semaphore with each monitor, and mutual exclusion is implicit: P on entry, V on exit.

Monitors have a couple of advantages: they are easier and safer. They also have 3 features: shared data, operations on the data, and synchronization and scheduling. But one needs a way to wait.

Condition variables: things to wait on.
    wait: releases the monitor lock and puts the process to sleep; when the process wakes up, it reacquires the lock immediately.
    signal: wakes up one process.
    broadcast: wakes up all the sleeping threads.

Mesa Semantics:
On signal, the signaller keeps the monitor lock. The awakened process waits for the monitor lock with no specified priority (so it needs to double check its condition once it gets the lock).

Readers and writers problem with monitors:
We need to implement checkRead, checkWrite, doneRead, and doneWrite. A couple of conditions are needed too: okToRead and okToWrite.

    // AR = active readers, WR = waiting readers
    // AW = active writers, WW = waiting writers

    checkRead() {
        // 'while', not 'if': under Mesa semantics the condition must be rechecked
        while ((AW + WW) > 0) {
            WR++;
            wait(okToRead);
            WR--;
        }
        AR++;
        // the caller now reads
    }

    doneRead() {
        AR--;
        if (AR == 0 && WW > 0)
            signal(okToWrite);
    }

    checkWrite() {
        while ((AW + AR) > 0) {
            WW++;
            wait(okToWrite);
            WW--;
        }
        AW++;
        // the caller now writes
    }

In this version of doneWrite, writers have priority:

    doneWrite() {
        AW--;
        if (WW > 0)
            signal(okToWrite);
        else
            broadcast(okToRead);
    }

In summary: monitors are not present in many languages, but they are useful.
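The readers and writers monitor above can be sketched in Java. This is a minimal sketch, assuming java.util.concurrent's ReentrantLock stands in for the monitor lock and its Condition objects stand in for okToRead and okToWrite. The class name ReadersWriters and the accessors activeReaders() and activeWriters() are made up for illustration. Java conditions follow Mesa-style semantics, so every wait sits inside a while loop.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the lecture's readers/writers monitor (writers have priority).
public class ReadersWriters {
    private final ReentrantLock lock = new ReentrantLock(); // the monitor lock
    private final Condition okToRead = lock.newCondition();
    private final Condition okToWrite = lock.newCondition();
    // AR/WR = active/waiting readers, AW/WW = active/waiting writers
    private int AR = 0, WR = 0, AW = 0, WW = 0;

    public void checkRead() {
        lock.lock();
        try {
            // Recheck after every wakeup: the signaller kept the lock (Mesa).
            while (AW + WW > 0) {
                WR++;
                okToRead.awaitUninterruptibly();
                WR--;
            }
            AR++;
        } finally {
            lock.unlock();
        }
    }

    public void doneRead() {
        lock.lock();
        try {
            AR--;
            if (AR == 0 && WW > 0) okToWrite.signal();
        } finally {
            lock.unlock();
        }
    }

    public void checkWrite() {
        lock.lock();
        try {
            while (AW + AR > 0) {
                WW++;
                okToWrite.awaitUninterruptibly();
                WW--;
            }
            AW++;
        } finally {
            lock.unlock();
        }
    }

    public void doneWrite() {
        lock.lock();
        try {
            AW--;
            if (WW > 0) okToWrite.signal(); // writers first
            else okToRead.signalAll();      // broadcast to all waiting readers
        } finally {
            lock.unlock();
        }
    }

    public int activeReaders() {
        lock.lock();
        try { return AR; } finally { lock.unlock(); }
    }

    public int activeWriters() {
        lock.lock();
        try { return AW; } finally { lock.unlock(); }
    }

    public static void main(String[] args) throws InterruptedException {
        ReadersWriters rw = new ReadersWriters();
        rw.checkRead(); // main thread becomes an active reader
        Thread writer = new Thread(() -> { rw.checkWrite(); rw.doneWrite(); });
        writer.start(); // the writer may have to wait for the reader
        rw.doneRead();  // the last reader leaves and signals okToWrite
        writer.join();
        System.out.println("active readers=" + rw.activeReaders()
                + ", active writers=" + rw.activeWriters());
    }
}
```

A caller brackets each access, e.g. checkRead(); ...read the shared data...; doneRead(). The explicit lock and condition objects are used here only so the code maps one-to-one onto the pseudocode.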
(Java and C# have built-in support for monitors, so this is changing.)
Semaphores use a single structure for exclusion and scheduling. Monitors use a different structure for each. Monitors enforce a style of programming where complex synchronization code doesn't get mixed in with other code. Existing implementations of monitors are embedded in programming languages, like the Mesa language from Xerox.

The monitor queue handler example (some source code)

In Unix, there are generalized semaphores created in sets, where several operations can be done simultaneously. Increments and decrements can be by values > 1. The kernel does all of these operations atomically. Associated with each semaphore is a queue of processes suspended on that semaphore. The semop system call takes a list of semaphore ops, each defined on a semaphore in that set. It processes them 1 at a time.

A sem op has 3 cases:
    > 0:  the kernel increments the value of the semaphore and awakens all processes waiting for the value of that semaphore to increase.
    == 0: the kernel checks the semaphore value. If the semaphore value == 0, it continues on down the list; else it blocks the process, which waits on the semaphore.
    < 0:  if abs(value) <= semaphore value, the kernel adds the semop value (which is negative) to the semaphore. If the result is 0, it then awakens all processes waiting for the value of the semaphore to be 0. Else (abs(value) > semaphore value), the kernel suspends the process until the semaphore's value increases.

Signals: they are software interrupts used by processes. There are 20 defined signals.

Storage Allocation; Linkers

Object code has three sections: code, data and stack.
Why distinguish?
    We want multiple processes to share the same code (usually read only).
    Data and stack segments must be private to each process (they get modified).
    It is also needed for dynamic linking and separate compilation.

Code is generated by the compiler from source code. It has addresses, but some of these are not known at assembly time or at compile time. This results from the compiler compiling the pieces at separate times. If it's done all at once, the computation is intense.
So the compiler does it separately to preserve modularity.

Division of responsibility between portions of the system:
    The compiler generates an object file, which is incomplete. This provides a symbol table and a relocation table.
    The linker takes many object files and merges them into 1 object file which is self-sufficient.
    The loader takes the object file and adjusts the addresses. (Linker and loader are terms which are used interchangeably.)
    The OS puts the object files in memory, and allows processes to share and access them.
    The run-time library works with the OS to provide dynamic allocation routines (alloc and free).

When code is compiled or assembled, the addresses are set as if the code were loaded at zero. When compiling, there are relative addresses and external addresses. The compiler creates:
    Segment table: name, size and base address of each segment
    Symbol table: global definitions and labels needed in other sections; no internals
    Relocation table: table of addresses that need to be fixed, both internal references (locations in the segment) and external references (not in the segment)
The compiler provides these tables and the object code to us.

3 steps in the linker/loader:
    1. Determine the location of each segment
    2. Calculate the value of each symbol and update the symbol table
    3. Scan the relocation table

Operation of a linker:
    Collect all the pieces of the program (search libraries).
    Assign each segment a final location, and then create a segment table.
    Resolve all addresses that need to be fixed; these are reflected in a new object file. That object file has either fixed or unresolved addresses in it.
    Compare the symbol table with it to get the new address for each symbol.
    Scan the relocation table, and replace each address with a new absolute value. If the address is an absolute one, calculate it. If the address is relative, calculate it with respect to zero.
    Output the new relocation table.
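The three linker/loader steps above can be illustrated with a toy example in Java. This is a sketch under simplifying assumptions that are not from the lecture: each object file has a single segment of integer "words", an internal reference stores an offset assembled as if the segment were loaded at zero, and an external reference is simply overwritten with the symbol's final address. The names TinyLinker, ObjectFile and Reloc are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy linker illustrating the three steps; not a real object file format.
public class TinyLinker {
    // Relocation entry: the word at 'offset' needs fixing. A null symbol means
    // an internal reference (an offset within this file's own segment);
    // otherwise the word refers to an external symbol.
    static final class Reloc {
        final int offset;
        final String symbol;

        Reloc(int offset, String symbol) {
            this.offset = offset;
            this.symbol = symbol;
        }
    }

    static final class ObjectFile {
        final int[] words; // assembled as if loaded at address zero
        final Map<String, Integer> symbols = new HashMap<>(); // global name -> offset
        final List<Reloc> relocs = new ArrayList<>();

        ObjectFile(int... words) {
            this.words = words;
        }
    }

    static int[] link(List<ObjectFile> files) {
        // Step 1: determine the location (base address) of each segment.
        int[] base = new int[files.size()];
        int total = 0;
        for (int i = 0; i < files.size(); i++) {
            base[i] = total;
            total += files.get(i).words.length;
        }

        // Step 2: calculate the final value of each symbol.
        Map<String, Integer> global = new HashMap<>();
        for (int i = 0; i < files.size(); i++) {
            for (Map.Entry<String, Integer> e : files.get(i).symbols.entrySet()) {
                global.put(e.getKey(), base[i] + e.getValue());
            }
        }

        // Step 3: copy the segments, then scan each relocation table and patch.
        int[] image = new int[total];
        for (int i = 0; i < files.size(); i++) {
            ObjectFile f = files.get(i);
            System.arraycopy(f.words, 0, image, base[i], f.words.length);
            for (Reloc r : f.relocs) {
                int at = base[i] + r.offset;
                if (r.symbol == null) {
                    image[at] += base[i]; // internal: shift by the segment base
                } else {
                    Integer value = global.get(r.symbol);
                    if (value == null) {
                        throw new IllegalStateException("undefined symbol: " + r.symbol);
                    }
                    image[at] = value;    // external: use the symbol's address
                }
            }
        }
        return image;
    }

    public static void main(String[] args) {
        ObjectFile prog = new ObjectFile(0, 2, 7);
        prog.relocs.add(new Reloc(0, "helper")); // external reference
        prog.relocs.add(new Reloc(1, null));     // internal reference to offset 2

        ObjectFile lib = new ObjectFile(1, 99);
        lib.symbols.put("helper", 1);            // defined at offset 1 in lib
        lib.relocs.add(new Reloc(0, null));      // internal reference to offset 1

        int[] image = link(Arrays.asList(prog, lib));
        System.out.println(Arrays.toString(image)); // prints [4, 2, 7, 4, 99]
    }
}
```

In the output, lib is placed at base address 3, so helper resolves to 3 + 1 = 4, and lib's internal reference to its own offset 1 also becomes 4. prog's internal reference is unchanged because its segment base is 0.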