# Lecture 17 Instruction Representation III

2010-03-01

Hello to Sherif Kandel listening from Egypt!

#### Lecturer SOE Dan Garcia

www.cs.berkeley.edu/~ddgarcia

Handling a mountain of data! ⇒ Microsoft Live Labs has released a tool to "make it easier to interact with massive amounts of data in ways that are powerful, informative and fun." Imagine being able to look at all of Wikipedia or Flickr and filter/query very easily.

getpivot.com

Spring 2010 © UCB

#### **Review**

- MIPS Machine Language Instruction: 32 bits representing a single instruction

| Format | 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
|---|---|---|---|---|---|---|
| R | opcode | rs | rt | rd | shamt | funct |
| I | opcode | rs | rt | immediate (16 bits) | | |
| J | opcode | target address (26 bits) | | | | |

- Branches use PC-relative addressing, Jumps use absolute addressing.

#### **Outline**

- Disassembly
- Pseudoinstructions
- "True" Assembly Language (TAL) vs. "MIPS" Assembly Language (MAL)

# **Decoding Machine Language**

- How do we convert 1s and 0s to assembly language and to C code? Machine language ⇒ assembly ⇒ C?
- For each 32 bits:
  1. Look at opcode to distinguish between R-Format, J-Format, and I-Format.
  2. Use instruction format to determine which fields exist.
  3. Write out MIPS assembly code, converting each field to name, register number/name, or decimal/hex number.
  4. Logically convert this MIPS code into valid C code. Always possible? Unique?

### **Decoding Example (1/7)**

- Here are six machine language instructions in hexadecimal:

> 00001025<sub>hex</sub>
> 0005402A<sub>hex</sub>
> 11000003<sub>hex</sub>
> 00441020<sub>hex</sub>
> 20A5FFFF<sub>hex</sub>
> 08100001<sub>hex</sub>

- Let the first instruction be at address 4,194,304<sub>ten</sub> (0x00400000<sub>hex</sub>).
- Next step: convert hex to binary

### **Decoding Example (2/7)**

- The six machine language instructions in binary:
  - 00000000000000000001000000100101
  - 00000000000001010100000000101010
  - 00010001000000000000000000000011
  - 00000000010001000001000000100000
  - 00100000101001011111111111111111
  - 00001000000100000000000000000001
- Next step: identify opcode and format

| Format | opcode | Remaining fields |
|---|---|---|
| R | 0 | rs, rt, rd, shamt, funct |
| I | 1, 4-62 | rs, rt, immediate |
| J | 2 or 3 | target address |

# **Decoding Example (3/7)**

- Select the opcode (first 6 bits) to determine the format:

| Opcode (first 6 bits) | Remaining 26 bits |
|---|---|
| 000000 | 00000000000001000000100101 |
| 000000 | 00000001010100000000101010 |
| 000100 | 01000000000000000000000011 |
| 000000 | 00010001000001000000100000 |
| 001000 | 00101001011111111111111111 |
| 000010 | 00000100000000000000000001 |

- Look at opcode: 0 means R-Format, 2 or 3 mean J-Format, otherwise I-Format.

#### **Decoding Example (4/7)**

- Fields separated based on format/opcode:

| Format | opcode | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|---|
| R | 0 | 0 | 0 | 2 | 0 | 37 |
| R | 0 | 0 | 5 | 8 | 0 | 42 |
| I | 4 | 8 | 0 | immediate = +3 | | |
| R | 0 | 2 | 4 | 2 | 0 | 32 |
| I | 8 | 5 | 5 | immediate = -1 | | |
| J | 2 | target = 1,048,577 | | | | |

- Next step: translate ("disassemble") to MIPS assembly instructions

#### **Decoding Example (5/7)**

- MIPS Assembly (Part 1):

| Address | Assembly instruction |
|---|---|
| 0x00400000 | `or $2,$0,$0` |
| 0x00400004 | `slt $8,$0,$5` |
| 0x00400008 | `beq $8,$0,3` |
| 0x0040000c | `add $2,$2,$4` |
| 0x00400010 | `addi $5,$5,-1` |
| 0x00400014 | `j 0x100001` |

- Better solution: translate to more meaningful MIPS instructions (fix the branch/jump and add labels, registers)

#### **Decoding Example (6/7)**

- MIPS Assembly (Part 2):

```
      or   $v0,$0,$0
Loop: slt  $t0,$0,$a1
      beq  $t0,$0,Exit
      add  $v0,$v0,$a0
      addi $a1,$a1,-1
      j    Loop
Exit:
```

- Next step: translate to C code (must be creative!)
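The field-splitting steps of this example (read the opcode, pick the format, cut out the fields, then fix up the branch and jump) can be sketched in Python. This is a minimal sketch and not a full disassembler: the names `decode`, `branch_target`, and `jump_target` are illustrative and do not come from the lecture, and the sketch assumes every I-Format immediate is sign-extended, which matches the `beq` and `addi` in this example but not logical instructions such as `ori`.

```python
def decode(word):
    """Split one 32-bit MIPS instruction word into named fields."""
    opcode = (word >> 26) & 0x3F           # first 6 bits select the format
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if opcode == 0:                         # R-Format
        return {"format": "R", "opcode": opcode, "rs": rs, "rt": rt,
                "rd": (word >> 11) & 0x1F, "shamt": (word >> 6) & 0x1F,
                "funct": word & 0x3F}
    if opcode in (2, 3):                    # J-Format
        return {"format": "J", "opcode": opcode, "target": word & 0x03FFFFFF}
    immediate = word & 0xFFFF               # otherwise I-Format
    if immediate & 0x8000:                  # assumption: always sign-extend
        immediate -= 0x10000
    return {"format": "I", "opcode": opcode, "rs": rs, "rt": rt,
            "immediate": immediate}


def branch_target(address, immediate):
    """PC-relative: the offset counts words from the instruction after the branch."""
    return address + 4 + 4 * immediate


def jump_target(address, target):
    """Absolute: 26-bit word address combined with the top 4 bits of PC + 4."""
    return ((address + 4) & 0xF0000000) | (target << 2)


# The six words from the example, starting at address 0x00400000.
WORDS = [0x00001025, 0x0005402A, 0x11000003,
         0x00441020, 0x20A5FFFF, 0x08100001]
for i, word in enumerate(WORDS):
    print(hex(0x00400000 + 4 * i), decode(word))
```

With these helpers, `branch_target(0x00400008, 3)` gives 0x00400018, the address of `Exit`, and `jump_target(0x00400014, 0x100001)` gives 0x00400004, the address of `Loop`.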
# **Decoding Example (7/7)**

**Before: hex**

> 00001025<sub>hex</sub>
> 0005402A<sub>hex</sub>
> 11000003<sub>hex</sub>
> 00441020<sub>hex</sub>
> 20A5FFFF<sub>hex</sub>
> 08100001<sub>hex</sub>

```
      or   $v0,$0,$0
Loop: slt  $t0,$0,$a1
      beq  $t0,$0,Exit
      add  $v0,$v0,$a0
      addi $a1,$a1,-1
      j    Loop
Exit:
```

**After: C code** (mapping: `$v0`: product, `$a0`: multiplicand, `$a1`: multiplier)

```
product = 0;
while (multiplier > 0) {
    product += multiplicand;
    multiplier -= 1;
}
```

**Demonstrated Big 61C Idea:** Instructions are just numbers, code is treated like data

### Review from before: lui

- So how does lui help us?
- Example:

```
addi $t0,$t0,0xABABCDCD
```

becomes:

```
lui  $at,0xABAB
ori  $at,$at,0xCDCD
add  $t0,$t0,$at
```

- Now each I-format instruction has only a 16-bit immediate.
- Wouldn't it be nice if the assembler would do this for us automatically?
  - If number too big, then just automatically replace addi with lui, ori, add

# **True Assembly Language (1/3)**

- Pseudoinstruction: A MIPS instruction that doesn't turn directly into a machine language instruction, but into other MIPS instructions
- What happens with pseudo-instructions?
  - They're broken up by the assembler into several "real" MIPS instructions.
  - Some examples follow

#### **Example Pseudoinstructions**

- Register Move

```
move reg2,reg1
# Expands to:
add  reg2,$zero,reg1
```

- Load Immediate

```
li   reg,value
# If value fits in 16 bits:
addi reg,$zero,value
# else:
lui  reg,upper 16 bits of value
ori  reg,reg,lower 16 bits
```

# **Example Pseudoinstructions**

- Load Address: How do we get the address of an instruction or global variable into a register?

```
la   reg,label
# Again, if value fits in 16 bits:
addi reg,$zero,label_value
# else:
lui  reg,upper 16 bits of value
ori  reg,reg,lower 16 bits
```

### True Assembly Language (2/3)

- Problem:
  - When breaking up a pseudo-instruction, the assembler may need to use an extra register.
  - If it uses any regular register, it'll overwrite whatever the program has put into it.
- Solution:
  - Reserve a register (`$1`, called `$at` for "assembler temporary") that the assembler will use to break up pseudo-instructions.
  - Since the assembler may use this at any time, it's not safe to code with it.

CS61C L17 MIPS Instruction Format III

#### **Example Pseudoinstructions**

- Rotate Right Instruction

```
ror  reg,value
# Expands to:
srl  $at,reg,value
sll  reg,reg,32-value
or   reg,reg,$at
```

- "No OPeration" instruction
  - `nop` expands to instruction = 0<sub>ten</sub>: `sll $0,$0,0`

# **Example Pseudoinstructions**

- Wrong operation for operand

```
addu reg,reg,value   # should be addiu
```

- If value fits in 16 bits, addu is changed to:

```
addiu reg,reg,value
# else:
lui   $at,upper 16 bits of value
ori   $at,$at,lower 16 bits
addu  reg,reg,$at
```

- How do we avoid confusion about whether we are talking about MIPS assembler with or without pseudoinstructions?

# **True Assembly Language (3/3)**

- MAL (MIPS Assembly Language): the set of instructions that a programmer may use to code in MIPS; this includes pseudoinstructions
- TAL (True Assembly Language): the set of instructions that can actually get translated into a single machine language instruction (32-bit binary string)
- A program must be converted from MAL into TAL before translation into 1s & 0s.

#### **Questions on Pseudoinstructions**

- Question:
  - How does MIPS assembler / SPIM recognize pseudo-instructions?
- Answer:
  - It looks for officially defined pseudoinstructions, such as ror and move
  - It looks for special cases where the operand is incorrect for the operation and tries to handle it gracefully

#### **Rewrite TAL as MAL**

- TAL:

| Label | Instruction |
|---|---|
| | `or $v0,$0,$0` |
| `Loop:` | `slt $t0,$0,$a1` |
| | `beq $t0,$0,Exit` |
| | `add $v0,$v0,$a0` |
| | `addi $a1,$a1,-1` |
| | `j Loop` |
| `Exit:` | |

- This time convert to MAL
- It's OK for this exercise to make up MAL instructions

#### **Rewrite TAL as MAL (Answer)**

- TAL:

| Label | Instruction |
|---|---|
| | `or $v0,$0,$0` |
| `Loop:` | `slt $t0,$0,$a1` |
| | `beq $t0,$0,Exit` |
| | `add $v0,$v0,$a0` |
| | `addi $a1,$a1,-1` |
| | `j Loop` |
| `Exit:` | |

- MAL: [...] `Exit:`

#### In Conclusion

- Disassembly is simple and starts by decoding opcode field.
  - Be creative, efficient when authoring C
- Assembler expands real instruction set (TAL) with pseudoinstructions (MAL)
  - Only TAL can be converted to raw binary
  - Assembler's job to do conversion
  - Assembler uses reserved register `$at`
  - MAL makes it much easier to write MIPS
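The assembler's job of converting MAL into TAL, summarized above, can be sketched as a few string-rewriting functions in Python. This is only an illustration under stated assumptions: the function names (`expand_move`, `expand_li`, `expand_addu`) are hypothetical, real assemblers such as SPIM work on parsed instructions rather than strings, and "fits in 16 bits" is taken here to mean a signed 16-bit value. In the two-instruction `li` case the sketch ORs the lower half into `reg` itself, so the upper half loaded by `lui` is preserved.

```python
def fits_in_16_bits(value):
    """Assumption: the 16-bit immediate is treated as signed."""
    return -32768 <= value <= 32767


def split_halves(value):
    """Return (upper 16 bits, lower 16 bits) of a 32-bit value."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def expand_move(dst, src):
    """move dst,src becomes a single add with $zero."""
    return [f"add {dst},$zero,{src}"]


def expand_li(reg, value):
    """li reg,value becomes one addi, or a lui/ori pair for large values."""
    if fits_in_16_bits(value):
        return [f"addi {reg},$zero,{value}"]
    upper, lower = split_halves(value)
    return [f"lui {reg},0x{upper:04X}",
            f"ori {reg},{reg},0x{lower:04X}"]


def expand_addu(reg, value):
    """addu reg,reg,value (wrong operand kind) becomes addiu, or uses $at."""
    if fits_in_16_bits(value):
        return [f"addiu {reg},{reg},{value}"]
    upper, lower = split_halves(value)
    return [f"lui $at,0x{upper:04X}",       # $at is the reserved assembler temporary
            f"ori $at,$at,0x{lower:04X}",
            f"addu {reg},{reg},$at"]


for line in expand_addu("$t0", 0xABABCDCD):
    print(line)
```

Only the large-immediate `addu` case needs `$at`: the destination register still holds a live operand there, so the assembler cannot build the constant in it.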