--CS 162 Notes 2/12/07--

--Topics--
--Process Synchronization with Condition Variables--
--Linkers and Loaders--
--Dynamic Storage Allocation--
--Sharing Main Memory--

--Process Synchronization with Condition Variables--

//Condition Variables
Threads cooperate with wait/signal and condition variables.
Wait: x.wait waits until some other thread invokes x.signal.
Signal: x.signal resumes exactly one suspended process.
Broadcast: wake up all processes waiting on the condition variable.
If no threads are waiting, signal/broadcast have no effect.

//Monitors
-Monitors are a higher-level concept than semaphores. They combine three features:
 .Shared data
 .Operations on the data
 .Synchronization, scheduling
-They are built on semaphores but are safer and easier to use.

//Mesa versus Hoare Semantics
In Mesa semantics, the thread calling "wake" keeps the lock. Thus, by the time a woken thread acquires the lock, state may have changed, so the woken thread must re-check its condition.
In Hoare semantics, the woken thread gets the lock immediately, and the signaling thread waits to acquire the lock.

//Monitor Example: Producers and Consumers
-Two conditions: nonfull and nonempty. Both conditions describe the buffer.
-Append function: If the buffer is full, call nonfull.wait (that is, wait until someone signals that the buffer is nonfull). Then signal the condition that the send buffer is not empty (call nonempty.signal).
-Remove function: If the buffer is empty, call nonempty.wait (wait until someone signals that the buffer is nonempty). Then signal the condition that the receive buffer is nonfull.
Both of these assume that locks are acquired when entering the procedure and released when leaving the procedure.

//Monitor Example: Disk Head Scheduler
-Overview:
 .Two functions: request and release. Request is called to issue a command to move the head to a destination. Release is called after the cylinder is finished. Headpos/sweep is the state of the head. Busy is a flag indicating whether or not the disk is busy.
 Want to schedule the disk arm so that it moves a minimal distance.
 .Servicing requests in nearest-first order can end in starvation; the ends of the disk will be less likely to be read. So we will use the elevator algorithm.
-Exact code is in the notes; what follows is the general idea. The point is that monitors are used to queue requests.
-Request(dest): If the disk is busy, we have a choice of being scheduled on the next downsweep or the next upsweep. If we choose upsweep we call upsweep.wait(dest). If we choose downsweep we call downsweep.wait(dest). If the disk wasn't busy, then we set busy=true and headpos=dest.
-Release: busy=false. If sweeping up: if there is someone on upsweep.queue, do upsweep.signal(); otherwise set direction=down and do downsweep.signal(). If we were sweeping down to begin with, do the opposite (that is, if downsweep.queue is nonempty, call downsweep.signal(); otherwise direction=up and upsweep.signal()).
Picture:
----------4
----------3
----------2 <--Head is at 2, sweeping down, is busy
----------1
A request(4) would call upsweep.wait(4). A request(1) would call downsweep.wait(1). A request(2) would call downsweep.wait(2), but if the head had been sweeping up, it would call upsweep.wait(2).

--Linkers and Loaders--
-Object code is divided into three parts: code, data, and stack. We distinguish them because we may want to share code between processes, share data between threads, and keep the stack private.
-Code is generated by the compiler from source code. The code contains addresses, but some of these are not known at compile time. They may not be known because we want to rearrange code during linking, or we may not have all the addresses yet (addresses may refer to symbols in other source files, etc.).
-The linker combines object files together, while the loader takes the object file and fixes up the addresses.
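The monitor examples above can be sketched with Python's threading.Condition, which has Mesa semantics (hence the while loops that re-check the condition after waking). This is a minimal producer/consumer sketch, not code from the notes; the class name and buffer size are illustrative.

```python
import threading
from collections import deque

class BoundedBuffer:
    """Monitor-style bounded buffer (illustrative sketch)."""
    def __init__(self, size):
        self.buf = deque()
        self.size = size
        self.lock = threading.Lock()               # acquired on entry, released on exit
        self.nonfull = threading.Condition(self.lock)
        self.nonempty = threading.Condition(self.lock)

    def append(self, item):
        with self.lock:                            # enter the monitor
            while len(self.buf) == self.size:      # Mesa semantics: re-check after waking
                self.nonfull.wait()
            self.buf.append(item)
            self.nonempty.notify()                 # signal: buffer is no longer empty

    def remove(self):
        with self.lock:
            while not self.buf:                    # Mesa semantics: re-check after waking
                self.nonempty.wait()
            item = self.buf.popleft()
            self.nonfull.notify()                  # signal: buffer is no longer full
            return item
```

Note that under Hoare semantics an `if` would suffice instead of the `while`, because the woken thread runs immediately with the state that held at signal time.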
-When compiling, we create a segment table (name, size, and assumed base address of each segment), a symbol table (global definitions that may be needed by other segments), and a relocation table (table of addresses that will need to be fixed).
There are 3 steps in a linker/loader:
-Determine the location of each segment
-Calculate the value of each symbol and update the symbol table
-Scan the relocation table and relocate addresses. Relocated addresses may be absolute or relative to a new base address.
Linking/loading in detail:
-Collect all pieces of the program, which may include finding libraries (libc.a).
-Assign each segment a final location and build the segment table.
-Resolve all addresses that can be fixed.
 .For each symbol table, assign each symbol an address based on the absolute address (or its address relative to zero)
 .Scan the relocation table and replace values with the new absolute values.

--Dynamic Storage Allocation--
-Static allocation isn't sufficient because program behavior can be unpredictable. It may depend on recursive procedures, complex data structures, or user input.
-Two basic operations in dynamic storage management: allocate and free.
-Two general ways of doing dynamic storage:
 .Stack organization: restricted but simple. Allocation and freeing are predictable. Keeps free space together.
 .Heap organization: allocation and release are unpredictable. Used for arbitrary list structures. Memory consists of allocated areas and free areas (or holes). You will inevitably end up with holes, and the goal is to reuse the space in holes so that the number of holes is small and their size is large. Fragmentation results when inefficient use of memory is caused by holes that are too small to be useful.
-Algorithms differ in how they manage the free list.
 .Best fit: Keep a linked list of free blocks, search the list on each allocation, and choose the block that comes closest to matching the needs of the allocation. On release, merge adjacent free blocks.
 .First fit: Scan the list for the first hole that is large enough. Free the excess and merge on release.
 .Next fit: Same as first fit, but pick up where the last call left off.
-No scheme is necessarily best. Situations can be thought of where each performs better than the others.
 .Knuth claims that if storage is close to running out, it will run out with any scheme, so the easiest/most efficient algorithm should be used (usually first fit).
-Pools: Keep a separate allocation pool for each popular size. Allocation is fast and there is no fragmentation.
-Reclamation methods: Easy when data is only used in one place. Hard when information is shared, since it can't be reclaimed until all sharers are finished. Sharing is indicated by pointers.

//Two problems in reclamation
-Dangling pointers: freeing data that is still being used.
-Core leaks: better not lose storage by forgetting to free it.

-Reference counts: Keep track of the number of pointers to each chunk of memory. When the count goes to zero, free the memory. Works fine for hierarchical structures, but does not work for circular structures.
-Garbage collection: Storage is not freed explicitly, but rather implicitly; just delete pointers. When the system needs storage, it searches through pointers and collects things not being used. This is the only way to reclaim space with circular structures. But garbage collectors are incredibly difficult to program and debug, especially if compaction is done as well.
 .Must be able to find all objects
 .Must be able to find pointers to all objects
 .Pass 1 (mark): go through all the pointers, mark each object seen, and recurse.
 .Pass 2 (sweep): free those objects not marked
-Garbage collection is expensive: 20% or more of CPU time in systems that use it.
-Buddy system: Fixed-size chunks (such as powers of 2); memory is divided in 2 until the piece is just big enough to fit the request. Easy to reclaim memory by rejoining buddies.
-----------------
|D| | | |            Data fits into space D.
-----------------
When D is freed, the segments of memory are joined together into one block again.

--Sharing Main Memory--
How to allocate memory to processes.
-Uniprogramming with a single segment per process
 .One program in memory at a time
  +Highest (or lowest) memory holds the OS
  +The process is allocated memory starting at 0, up to the OS area
  +Old batch systems worked this way, and the worst that could happen was destroying the OS (manual reboot)
 .Advantages: low overhead, simple, no need for relocation
 .Disadvantages: no protection, which means a process can get complete control of the system
 .Multiprogramming requires swapping the entire process in and out, so there is overhead for swapping and idle time while swapping
 .Process limited to size of memory
 .No good way to share memory
----------
|        |
|        |
|  USER  |
|        |
|--------|
|   OS   |
----------
-Relocation: load the program anywhere in memory
 .Loader/linker loads the program at an arbitrary memory address.
 .The program can't be moved once loaded.
 .Essentially the same as the previous scheme, but the ability to load at any address will be used below
----------
|////////|
|--------|
|  USER  |
|--------|
|////////|
|--------|
|   OS   |
----------
-Simple multiprogramming: no protection and one segment per process
 .Highest/lowest memory holds the OS
 .Processes are allocated memory starting at 0, up to the OS area
 .Can have several programs in memory at once, since each is loaded at different non-overlapping addresses
 .Advantages
  +Allows multiple programs/users in memory
 .Disadvantages
  +No protection
  +External fragmentation
  +Overhead for variable-size memory allocation
  +Still limited to the size of physical memory
  +Hard to increase an allocation
  +Programs are statically loaded, so they can't be moved or expanded
----------
| USER3  |
|--------|
|////////|
|--------|
| USER2  |
|--------|
| USER1  |
|--------|
|////////|
|--------|
|   OS   |
----------
-Dynamic memory relocation: change the addresses of a program during every reference
-------   -------   ----------
| CPU |---| MMU |---| Memory |
-------   -------   ----------
 .Each program-generated address is translated by hardware to a physical address; this happens as part of each memory reference.
 .Leads to two views of memory: the virtual address space and the real address space. Each process has its own virtual address space.
-Base and bounds relocation
 .Two hardware registers: a base register, and a bounds register that indicates the last valid address the process may generate
  +Real address = base + virtual address (as long as virtual < bounds)
 .Advantages
  +Each process appears to have a completely private memory of size bounds+1
  +Protection between processes
  +No address relocation is necessary when a process is loaded (just update the base register)
  +Task switching is cheap: just reload the registers. (There is still high overhead to load from disk, since the entire process has to be loaded into memory)
  +Compaction is possible: the OS just changes base registers
 .Disadvantages
  +Still limited to the size of main memory
  +External fragmentation (between processes)
  +Overhead for variable-size spaces in memory
  +Sharing is difficult: only by overlapping
  +Only one segment (region of memory)
  +Need special hardware, and the time for the translation isn't free
 .The OS must be able to change the value of the relocation registers
 .Users must not be able to change their registers
 .How does the OS regain control?
  +Entered on a trap or interrupt
  +Disable base/bounds registers
 .Base and bounds is cheap and fast
-Three types of systems using base/bounds registers
 .Uniprogramming - single user region
 .Multiprogramming with fixed partitions: a user goes into a region of a given size. Not flexible.
 .Multiprogramming with variable-sized partitions
-Task switching
 .Don't have to reload memory, so it can be done quickly
 .Can run processes which are not in memory: put an active process onto disk, replace the space with the new process
-Multiple segments: segmentation
 .Divide the virtual address space into segments
 .Use a separate base and bound for each segment, as well as protection bits (read/write/execute/valid/dirty)
-Segments are typically associated with logical partitions: code, data, stack, etc. Or they can be per module, procedure, etc.
 .Need either a segment table or segment registers to hold the base/bounds for each segment
Segment Table
---------------------------
|Segment|Base|Bounds|Flags|
|-------------------------|
|      0|   n|     m|    x|
|      1| ...|   ...|  ...|
|      2| ...|   ...|  ...|
---------------------------
 .Memory mapping consists of a table lookup, an add, and a compare (the last two can be done in parallel)
-Address translation
 .Use segment table entries (STE), i.e. a row from the above picture
 .Need hardware to automatically map a virtual (segment number, word number) pair to a real address. Real = seg_table(seg#).base + word_number
  +Invalid if word_number > bounds
 .Valid and permission bits must be on
 .Have a segment table base register (STBR) for the hardware to use to point to the segment table.
  +Can switch processes by just changing the STBR.
-If you have a small number of segments, you can have one register per segment instead of a table.
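The translation rule above (table lookup, add, compare) can be sketched in a few lines of Python. This models what the MMU does in hardware; the segment table contents here are made-up illustrative values.

```python
# Sketch of segmented address translation (illustrative values).
# Each segment table entry (STE): (base, bounds, valid).
seg_table = {
    0: (0x1000, 0x0200, True),   # e.g. code segment
    1: (0x4000, 0x0100, True),   # e.g. data segment
    2: (0x8000, 0x0400, True),   # e.g. stack segment
}

def translate(seg, word):
    """Map a virtual (segment #, word #) pair to a real address."""
    base, bounds, valid = seg_table[seg]   # table lookup via the STE
    if not valid or word > bounds:         # compare (in hardware, parallel with the add)
        raise MemoryError("segmentation fault")
    return base + word                     # add
```

For example, translate(0, 0x10) yields 0x1010, while a word number past the segment's bounds faults, which is exactly the protection check the bounds field provides.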
-Can also multiplex a small number of segment registers among a larger number of segments
-Advantages
 .Each process has its own virtual address space
 .Protection between address spaces
 .Separate protection between segments
 .Virtual address space can be larger than physical memory
 .Unused segments don't need to be loaded
 .Can share one or more segments
 .A segment can be placed anywhere in memory
 .Memory compaction is easy
 .Segment sizes can be changed independently
-Disadvantages
 .Each segment must be allocated contiguously
 .Segment size < memory size
 .External fragmentation
 .Overhead of allocating memory
 .Need hardware for address translation
 .Space for the segment table
-Need one segment table per process. Sharing the segment table would be a protection problem.
 .When switching processes, we reload the STBR, which changes the address space
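The last point, that reloading the STBR switches the whole address space, can be illustrated with a small sketch: the same virtual address maps to different physical addresses depending on which per-process segment table the STBR points at. All table values below are made up for illustration.

```python
# Per-process segment tables (illustrative). Each entry: (base, bounds).
proc_a_segs = [(0x1000, 0x0FFF), (0x5000, 0x00FF)]
proc_b_segs = [(0x9000, 0x0FFF), (0xC000, 0x00FF)]

stbr = proc_a_segs              # STBR points at the running process's table

def translate(seg, word):
    base, bounds = stbr[seg]
    assert word <= bounds       # bounds check
    return base + word

addr_a = translate(0, 0x42)     # process A's view of virtual address (0, 0x42)
stbr = proc_b_segs              # "context switch": just reload the STBR
addr_b = translate(0, 0x42)     # same virtual address, different physical address
```

No per-segment registers need to be reloaded and no memory needs to be copied, which is why the switch is cheap.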