## How Process

- How do we build a machine to perform these operations?
  + From Digital Samples → compressed digital data → Digital Samples
- With simple gates and registers
  + can build a machine to perform any digital computation
  + ...if we have enough of them.

## ECONOMY AND UNIVERSALITY

- What if we only have a small number of gates?
- OR ... how many physical gates do we really need?
  + How do we perform computation with minimal hardware?
- How do we change the computation performed by our hardware?

## LECTURE TOPICS

- Setup
- Where are we?
- Memory
- One-gate processor
- Wide-Word, Stored-Program Processor
- Contemporary Processors: ARM, Arduino
- Next Lab

## IDEA

- Store register and gate outputs in memory
- Compute one gate at a time
  + Using a single physical gate

## BASIC IDIOM

1. Read gate value from memory
2. Perform operation on gate
3. Write result back to memory

- Fill in Missing

| Statement | Description | Instruction | Encoding |
|-----------|-------------|-------------|----------|
| | | GATE AND 0 1 2 | 010001000001010 |
| t1=t1\|t2; | read value in slot 2 and slot 3, perform an OR on the values, and store into slot 3 | GATE OR 3 4 3 | 01011101110001 |
| t2=a&c | | GATE AND 0 2 4 | 01000100001010 |
| o1=t1\|t2; | read value in slot 2 and slot 3, perform an OR on the values, and store into slot 5 | GATE OR 3 4 5 | |

ESE150 Spring 2018

## STORED-PROGRAM PROCESSOR

## "STORED PROGRAM" COMPUTER

- Can build physical machines that perform any computation.
- Can be built with limited hardware that is reused in time.
- Historically: this was a key contribution of Penn's Moore School
  + ENIAC → EDVAC
- Computer Engineers: Eckert and Mauchly
  + (often credited to Von Neumann)

## BASIC IDEA

- Express computation in terms of a few primitives
  + E.g.
    Add, Multiply, OR, AND, NAND
- Provide one of each hardware primitive
- Store intermediates in memory
- Sequence operations on hardware to perform larger computation
- Store description of operation sequence in memory as well – hence "Stored Program"
- By filling in memory, can program to perform any computation

## BEYOND SINGLE GATE

- Single gate extreme to make the high-level point
  + Except in some particular cases, not practical
- Usually reuse larger blocks
  + Adders
  + Multipliers
- Get more done per cycle than one gate
- Now it's a matter of engineering the design point
  + Where do we want to be between one gate and full circuit extreme?
  + How many gate evaluations should we physically compute each cycle?

## WORD-WIDE PROCESSORS

- Common to compute on multibit words
  + Add two 16b numbers
  + Multiply two 16b numbers
  + Perform bitwise-XOR on two 32b numbers
- More hardware
  + 16 full adders, 32 XOR gates
- All programmable gates doing the same thing
  + So don't require more instruction bits

## ALU OPS (ON 8BIT WORDS)

- XOR 00011000 00010100 =
  + XOR 0x18 with 0x14 ...result is:
- ADD 00011000 00010100 =
  + Add 0x18 to 0x14 ...result is:
  + Add 24 to 20 ...result is:
- SUB 00011000 00010100 =
  + Subtract 0x14 from 0x18 ...result is:
- INV 00011000 XXXXXXXX =
  + Invert the bits in 0x18 ...gives us:
- SLL 00011000 XXXXXXXX =
  + Shift left 0x18 ...
    gives us:

## ALU-BASED WORD-WIDE PROCESSOR

[Figure: ALU-based word-wide processor]

## INSTRUCTIONS: TWO OPERAND

- Typically 2-operand, where one operand is both source and destination
- ADD R1, R2
  + Says: R1←R1+R2
- Use to make code more compact

## BRANCH INSTRUCTIONS

| Mnemonics | Operands | Description | Operation | Flags | #Clocks |
|-----------|----------|--------------------------|-----------------|-------|---------|
| RJMP | k | Relative Jump | PC ← PC + k + 1 | None | 2 |
| IJMP | | Indirect Jump to (Z) | PC ← Z | None | 2 |
| JMP(1) | k | Direct Jump | PC ← k | None | 3 |
| RCALL | k | Relative Subroutine Call | PC ← PC + k + 1 | None | 3 |
| ICALL | | Indirect Call to (Z) | PC ← Z | None | 3 |
| CALL(1) | k | Direct Subroutine Call | PC ← k | None | 4 |

## NEXT LAB

- Look at Instruction-Level code for Arduino
- Understand performance from instruction-level code

## BIG IDEAS

- Memory stores data compactly
- Can implement large computations on small hardware by reusing hardware in time
  + Storing computational state in memory
- Can store program control in instruction memory
  + Change program by reprogramming memory
  + Universal machine: Stored-Program Processor

## LEARN MORE

- CIS240 processor organization and assembly
- CIS371 implement and optimize processors
  + Including FPGA mapping in Verilog
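
## SKETCH: ONE-GATE PROCESSOR

The BASIC IDIOM from the one-gate processor slides (read gate inputs from memory, perform the operation, write the result back) can be sketched in Python. This is only an illustration under assumptions: the `(op, src_a, src_b, dst)` tuple loosely mirrors the `GATE AND 0 1 2` rows, and the slot layout, the `run` and `compute_o1` names, and the three-instruction program are hypothetical rather than taken from the course materials.

```python
# Sketch of a one-gate machine: every instruction reuses the same "physical gate".
# Assumed instruction format: (op, src_a, src_b, dst), all slots hold one bit.

GATE_OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "NAND": lambda x, y: 1 - (x & y),
}


def run(program, memory):
    """Execute the instructions in order on a copy of memory and return it."""
    memory = list(memory)
    for op, src_a, src_b, dst in program:
        x, y = memory[src_a], memory[src_b]  # 1. read gate inputs from memory
        result = GATE_OPS[op](x, y)          # 2. perform the operation on the gate
        memory[dst] = result                 # 3. write the result back to memory
    return memory


# Hypothetical slot layout: 0=a, 1=b, 2=c, 3=t1, 4=t2, 5=o1
PROGRAM = [
    ("AND", 0, 1, 3),  # t1 = a & b
    ("AND", 0, 2, 4),  # t2 = a & c
    ("OR", 3, 4, 5),   # o1 = t1 | t2
]


def compute_o1(a, b, c):
    """Load the inputs into slots 0..2, run the program, and read slot 5."""
    return run(PROGRAM, [a, b, c, 0, 0, 0])[5]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, "->", compute_o1(a, b, c))
```

Changing the contents of `PROGRAM` changes the function computed without touching `run`, which is the stored-program point: the hardware stays fixed and memory holds the computation.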
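
## SKETCH: ALU OPS ON 8-BIT WORDS

The operations on the ALU OPS slide (XOR, ADD, SUB, INV, SLL on 8-bit words) can be tried out with a short Python sketch. Assumptions, since the slides do not state them: results are truncated to 8 bits, SLL shifts by exactly one bit position, and the function name `alu` is hypothetical.

```python
# Sketch of a word-wide ALU on 8-bit values (assumed behaviour, not the course's ALU).

MASK = 0xFF  # keep every result within 8 bits


def alu(op, a, b=0):
    """Apply one ALU operation to 8-bit operands a and b; b is ignored by INV and SLL."""
    if op == "XOR":
        result = a ^ b
    elif op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "INV":
        result = ~a
    elif op == "SLL":
        result = a << 1  # assumed shift amount of one bit
    else:
        raise ValueError(f"unknown ALU op: {op}")
    return result & MASK


if __name__ == "__main__":
    for op in ("XOR", "ADD", "SUB", "INV", "SLL"):
        r = alu(op, 0x18, 0x14)
        print(f"{op}: {r:08b} (0x{r:02X}, {r})")
```

Running the script prints each result in binary, hexadecimal, and decimal, which is a convenient way to check hand-computed answers for the slide's operands 0x18 and 0x14.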
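
## SKETCH: TWO-OPERAND ADD AND RELATIVE JUMP

The two-operand form `ADD R1, R2` (R1←R1+R2) and the relative jump rule PC ← PC + k + 1 from the branch table can be combined in a toy stored-program loop. This is a sketch under assumptions: the tuple encoding, the 8-bit register width, the `max_steps` guard, and the name `run_program` are all hypothetical, and only these two instructions are modelled.

```python
# Toy stored-program machine: the program is data held in a list ("instruction memory").

def run_program(program, regs, max_steps=1000):
    """Run until the PC leaves the program or max_steps is reached; return the registers."""
    regs = dict(regs)
    pc = 0
    steps = 0
    while 0 <= pc < len(program) and steps < max_steps:
        op, *args = program[pc]
        if op == "ADD":
            # Two-operand form: the first register is both source and destination.
            rd, rs = args
            regs[rd] = (regs[rd] + regs[rs]) & 0xFF  # assumed 8-bit registers
            pc += 1
        elif op == "RJMP":
            # Relative jump: PC <- PC + k + 1
            (k,) = args
            pc = pc + k + 1
        else:
            raise ValueError(f"unknown instruction: {op}")
        steps += 1
    return regs


if __name__ == "__main__":
    program = [
        ("ADD", "R1", "R2"),  # R1 <- R1 + R2
        ("RJMP", 1),          # skip the next instruction
        ("ADD", "R1", "R2"),  # skipped
        ("ADD", "R1", "R3"),  # R1 <- R1 + R3
    ]
    print(run_program(program, {"R1": 1, "R2": 2, "R3": 10}))
```

Because the destination register doubles as a source, each instruction names only two registers, which is the compactness benefit the two-operand slide points to.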