Preliminary Demo by 7pm Friday, February 3rd.
Final Demo by 7pm Friday, February 10th.
Writeup due before class on Monday, February 13th.
This lab is to be done individually.
This lab is worth 25 points.
In this lab, you will construct the ALU (Arithmetic/Logical Unit) for a P37X ISA processor. Before you can build the ALU, you need to create a few building blocks (4-bit adder, 16-bit adder, 16-bit multiplier, 16-bit shifter) which you will then combine to form an ALU.
Before you begin, we have another tutorial for you to walk through: ModelSim simulation tutorial. This tutorial covers simulating designs to verify that they are correct and debugging them when they are not.
Design and test the following combinational logic structures.
Important
Before you write any Verilog code, first draw a hand-drawn schematic diagram of the circuit with all wires and input/outputs labeled. Why? When designing hardware, even when using Verilog, you need to be thinking explicitly about the structure and interconnectedness of the circuits. Only when the diagram is complete should you write the Verilog code that corresponds to the circuit. As described below, you need to turn in both the hand-drawn schematic and a printout of the Verilog code.
Before creating a 16-bit adder, first create a signed 4-bit ripple-carry adder as a basic building block. It has three inputs: two 4-bit signed values and a 1-bit carry-in signal. It has two outputs: the 4-bit sum and a 1-bit carry-out signal. You might want to use the 3-input, 2-output single-bit "full adder" you designed in Lab 0 (or an improved version of it) as a building block.
Testing: Test the adder both in simulation and on the board. To test the adder on the board, hook your 4-bit adder inputs to two sets of four input switches on the extension board; hook the outputs to five LEDs on the extension board.
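As a rough illustration of the structure described above, the sketch below chains four full adders into a ripple-carry adder and checks all 512 input combinations in simulation. The module names (`full_adder`, `adder4`, `adder4_tb`) and port names are assumptions, not names required by the lab, and your Lab 0 full adder may differ. The behavioral `+` appears only in the testbench as a reference model, never in the design itself.

```verilog
// One-bit full adder built only from bitwise operators.
module full_adder(input a, input b, input cin, output sum, output cout);
  assign sum  = a ^ b ^ cin;
  assign cout = (a & b) | (cin & (a ^ b));
endmodule

// 4-bit ripple-carry adder: each stage's carry-out feeds the next carry-in.
module adder4(input [3:0] a, input [3:0] b, input cin,
              output [3:0] sum, output cout);
  wire c1, c2, c3;  // internal carries between stages
  full_adder fa0(.a(a[0]), .b(b[0]), .cin(cin), .sum(sum[0]), .cout(c1));
  full_adder fa1(.a(a[1]), .b(b[1]), .cin(c1),  .sum(sum[1]), .cout(c2));
  full_adder fa2(.a(a[2]), .b(b[2]), .cin(c2),  .sum(sum[2]), .cout(c3));
  full_adder fa3(.a(a[3]), .b(b[3]), .cin(c3),  .sum(sum[3]), .cout(cout));
endmodule

// Simulation-only testbench (hypothetical). It is not synthesized, so the
// behavioral "+" below serves only as a reference to compare against.
module adder4_tb;
  reg  [3:0] a, b;
  reg        cin;
  reg  [4:0] expected;
  wire [3:0] sum;
  wire       cout;
  integer    i, errors;

  adder4 dut(.a(a), .b(b), .cin(cin), .sum(sum), .cout(cout));

  initial begin
    errors = 0;
    // 4 + 4 + 1 input bits give 512 combinations.
    for (i = 0; i < 512; i = i + 1) begin
      {a, b, cin} = i[8:0];
      #1;  // let the combinational logic settle
      expected = a + b + cin;  // 5-bit result keeps the carry
      if ({cout, sum} !== expected) begin
        errors = errors + 1;
        $display("MISMATCH a=%b b=%b cin=%b got=%b expected=%b",
                 a, b, cin, {cout, sum}, expected);
      end
    end
    $display("adder4 test finished with %0d error(s)", errors);
    $finish;
  end
endmodule
```

Because two's-complement addition produces the same bit pattern as unsigned addition, the same adder serves signed inputs without any extra logic.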
The 16-bit adder takes in two 16-bit signed values and a single-bit carry-in signal. It has a single 16-bit signed output.
Implementation: For comparison purposes, create three different adder implementations (using the 4-bit adder specified above):
In the lab writeup, compare the delay (in nanoseconds) and area (in terms of lookup tables or "LUTs") of these three different adder implementations.
See the CSE371 lecture notes for more information on carry-select adders.
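For orientation, here is a minimal sketch of one possible carry-select arrangement built from 4-bit adders. The uniform 4-bit segment size, the module names, and the compact `adder4` body are assumptions chosen for illustration; they are not necessarily one of the required implementations. Each upper segment computes its sum twice, once assuming a carry-in of 0 and once assuming 1, and a multiplexer picks the right result when the real carry arrives.

```verilog
// Compact 4-bit ripple-carry adder (stand-in for your own 4-bit adder).
module adder4(input [3:0] a, input [3:0] b, input cin,
              output [3:0] sum, output cout);
  wire [3:0] g = a & b;            // a carry is generated at this bit
  wire [3:0] p = a ^ b;            // an incoming carry is propagated
  wire c1 = g[0] | (p[0] & cin);
  wire c2 = g[1] | (p[1] & c1);
  wire c3 = g[2] | (p[2] & c2);
  assign sum  = p ^ {c3, c2, c1, cin};
  assign cout = g[3] | (p[3] & c3);
endmodule

// One carry-select segment: both possible results are computed in
// parallel, then selected by the actual carry-in.
module csel_block4(input [3:0] a, input [3:0] b, input cin,
                   output [3:0] sum, output cout);
  wire [3:0] s0, s1;
  wire       c0, c1;
  adder4 add0(.a(a), .b(b), .cin(1'b0), .sum(s0), .cout(c0));
  adder4 add1(.a(a), .b(b), .cin(1'b1), .sum(s1), .cout(c1));
  assign sum  = cin ? s1 : s0;
  assign cout = cin ? c1 : c0;
endmodule

// 16-bit adder: the lowest segment sees the real carry-in directly,
// the three upper segments are carry-select blocks.
module adder16_csel(input [15:0] a, input [15:0] b, input cin,
                    output [15:0] sum);
  wire c4, c8, c12;
  adder4      seg0(.a(a[3:0]),   .b(b[3:0]),   .cin(cin),
                   .sum(sum[3:0]),   .cout(c4));
  csel_block4 seg1(.a(a[7:4]),   .b(b[7:4]),   .cin(c4),
                   .sum(sum[7:4]),   .cout(c8));
  csel_block4 seg2(.a(a[11:8]),  .b(b[11:8]),  .cin(c8),
                   .sum(sum[11:8]),  .cout(c12));
  csel_block4 seg3(.a(a[15:12]), .b(b[15:12]), .cin(c12),
                   .sum(sum[15:12]), .cout());  // final carry unused
endmodule
```

The trade-off is roughly double the adder area in the upper segments in exchange for a shorter critical path, which is the kind of delay-versus-LUT comparison the writeup asks for.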
Testing: Test the adder both in simulation and on the board. Unfortunately, the extension boards do not have enough switches to represent two 16-bit inputs. As an incomplete workaround, test the adder on the board by sign extending the two sets of four input switches on the extension board; hook the eight low-order bits of the 16-bit output to the eight LEDs on the extension board. This setup will give you partial test coverage (enough to demonstrate the design is basically working).
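Sign extension needs no arithmetic operators: replicate the top bit with the concatenation and replication syntax. The helper below is a sketch with an assumed module name and ports.

```verilog
// Sign-extend a 4-bit two's-complement value to 16 bits by copying
// the sign bit (in[3]) into the 12 upper positions.
module sign_extend_4to16(input [3:0] in, output [15:0] out);
  assign out = {{12{in[3]}}, in};
endmodule
```

For example, the switch setting `4'b1010` (decimal -6) becomes `16'hFFFA`, while `4'b0101` (decimal 5) becomes `16'h0005`.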
The 16-bit multiplier takes in two 16-bit signed values. It has a single 16-bit signed output. The multiplier is single-cycle and fully combinational (in contrast, a sequential multiplier takes multiple cycles and latches intermediate values).
Implementation: The most straightforward implementation uses a chain or tree of 15 sixteen-bit adders you just created to add up the 16 partial values. You'll also need to use some multiplexors, ranged bit selection, and/or other combinational logic. For comparison purposes, create three different multiplier implementations:
Note: As you'll be including the 16-bit adder as a structural component, the textual differences between these multipliers should be minor.
In the lab writeup, explain your general multiplier design and compare its delay using these three adders.
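The chain-of-adders idea can be sketched as follows. This is one assumed arrangement, not the required design: the module names are hypothetical, the stand-in `adder16` is a plain ripple-carry adder, and the `generate` loop (whose `i + 1` is evaluated at elaboration time, not in hardware) may or may not be acceptable under the lab's structural-Verilog rules, so ask if unsure. Because only the low 16 bits of the product are kept, summing the shifted partial products gives the same bits for signed and unsigned operands.

```verilog
// Stand-in 16-bit ripple-carry adder; substitute your own adder16.
module adder16(input [15:0] a, input [15:0] b, input cin,
               output [15:0] sum);
  wire [16:0] c;
  assign c[0] = cin;
  genvar i;
  generate
    for (i = 0; i < 16; i = i + 1) begin : stage
      assign sum[i] = a[i] ^ b[i] ^ c[i];
      assign c[i+1] = (a[i] & b[i]) | (c[i] & (a[i] ^ b[i]));
    end
  endgenerate
endmodule

// Combinational 16-bit multiplier: 16 partial products summed by a
// chain of 15 adders.
module mult16(input [15:0] a, input [15:0] b, output [15:0] product);
  wire [15:0] pp  [0:15];  // partial products
  wire [15:0] acc [0:15];  // running sums along the chain

  // Partial product 0: a itself if b[0] is set, otherwise zero.
  assign pp[0]  = a & {16{b[0]}};
  assign acc[0] = pp[0];

  genvar i;
  generate
    for (i = 1; i < 16; i = i + 1) begin : row
      // a shifted left by i (bits above bit 15 are dropped), gated by b[i].
      assign pp[i] = {a[15-i:0], {i{1'b0}}} & {16{b[i]}};
      adder16 add(.a(acc[i-1]), .b(pp[i]), .cin(1'b0), .sum(acc[i]));
    end
  endgenerate

  assign product = acc[15];
endmodule
```

Swapping in a different adder only requires changing the module instantiated inside the loop, which is why the textual differences between the multiplier variants stay small.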
Testing: Test the multiplier much like you tested the 16-bit adder.
The shifter unit has three inputs: a 16-bit value, a 4-bit shift amount, and a 2-bit shift type (00 is left shift, 01 is logical right shift, 10 is arithmetic right shift, 11 is no shift). It has a single 16-bit output.
Implementation: Note, there are several ways to implement this shifter. You could create three different shifters using 2-to-1 MUXes at each level. You would then use a 4-to-1 mux to select among them at the end. An alternative implementation would use four copies of the 4-to-1 MUX to select between the three kinds of shifts and no shift at all at each stage.
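The first approach described above might look like the sketch below: each shifter is four levels of 2-to-1 multiplexers (shifting by 1, 2, 4, and 8), and a final 4-to-1 multiplexer selects by shift type. Module and port names are assumptions. The two right shifters share one module here and differ only in the fill bit, which is a simplification rather than a requirement.

```verilog
// Left shifter: level k shifts by 2^k when amt[k] is set, filling with 0.
module lshift16(input [15:0] in, input [3:0] amt, output [15:0] out);
  wire [15:0] s1 = amt[0] ? {in[14:0], 1'b0} : in;
  wire [15:0] s2 = amt[1] ? {s1[13:0], 2'b00} : s1;
  wire [15:0] s4 = amt[2] ? {s2[11:0], 4'b0000} : s2;
  assign out     = amt[3] ? {s4[7:0], 8'b00000000} : s4;
endmodule

// Right shifter: vacated upper bits are filled with `fill`
// (0 for a logical shift, the sign bit for an arithmetic shift).
module rshift16(input [15:0] in, input [3:0] amt, input fill,
                output [15:0] out);
  wire [15:0] s1 = amt[0] ? {fill,      in[15:1]} : in;
  wire [15:0] s2 = amt[1] ? {{2{fill}}, s1[15:2]} : s1;
  wire [15:0] s4 = amt[2] ? {{4{fill}}, s2[15:4]} : s2;
  assign out     = amt[3] ? {{8{fill}}, s4[15:8]} : s4;
endmodule

// Top-level shifter: 00 left, 01 logical right, 10 arithmetic right,
// 11 no shift.
module shifter16(input [15:0] in, input [3:0] amt, input [1:0] op,
                 output [15:0] out);
  wire [15:0] sll, srl, sra;
  lshift16 left (.in(in), .amt(amt), .out(sll));
  rshift16 logic_r(.in(in), .amt(amt), .fill(1'b0),   .out(srl));
  rshift16 arith_r(.in(in), .amt(amt), .fill(in[15]), .out(sra));

  // 4-to-1 multiplexer built from nested 2-to-1 selections.
  assign out = op[1] ? (op[0] ? in  : sra)
                     : (op[0] ? srl : sll);
endmodule
```

All shifting is expressed with concatenation and bit selection, so no `<<` or `>>` operator is needed.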
Testing: Test the shifters much like you tested the 16-bit adder. Use an additional two switches to specify the specific shift operation.
The ALU has three inputs: two 16-bit signed values and a 4-bit control signal that determines which operation the ALU should perform. The ALU has a single 16-bit signed output, which is the result of the operation. The ALU can perform ten operations:
Description | Insn | Control |
---|---|---|
Addition | ADD | 0 100 |
Subtraction | SUB | 0 101 |
Multiplication | MUL | 0 110 |
Bitwise or | OR | 1 000 |
Bitwise not | NOT | 1 001 |
Bitwise and | AND | 1 010 |
Bitwise xor | XOR | 1 011 |
Shift left logical | SLL | 1 100 |
Shift right logical | SRL | 1 101 |
Shift right arithmetic | SRA | 1 110 |
A few notes:
Implementation: The ALU should instantiate a single 16-bit adder (also used for subtract), a 16-bit multiplier, and a left/right shifter. Use the outputs from these modules and some combinational logic to generate all ten possible values. Finally, use a 16-to-1 multiplexer to select the correct signal.
Testing: Test the ALU much like you tested the 16-bit adder, but use an additional four switches (the small switches on the main FPGA board) as the 4-bit input select.
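The glue logic around the three units could be sketched as below. Several things here are assumptions made for illustration: the module and port names, treating NOT as operating on the first operand, returning zero for unused control codes, and writing the 16-to-1 selection as a chain of `?:` comparisons. The adder, multiplier, and shifter are assumed to be instantiated elsewhere, with their results wired in as inputs. Subtraction reuses the adder through the two's-complement identity A - B = A + ~B + 1.

```verilog
module alu_glue(
  input  [15:0] a,
  input  [15:0] b,
  input  [3:0]  ctrl,
  input  [15:0] add_out,    // result from the shared 16-bit adder
  input  [15:0] mul_out,    // result from the multiplier
  input  [15:0] shift_out,  // result from the shifter
  output [15:0] adder_b,    // second operand to feed the adder
  output        adder_cin,  // carry-in to feed the adder
  output [1:0]  shift_op,   // shift type to feed the shifter
  output [15:0] result);

  // SUB: invert B and inject a carry-in of 1 so the adder computes A - B.
  wire is_sub = (ctrl == 4'b0101);
  assign adder_b   = is_sub ? ~b : b;
  assign adder_cin = is_sub;

  // The low two control bits of SLL/SRL/SRA (00, 01, 10) line up with
  // the shifter's type encoding.
  assign shift_op = ctrl[1:0];

  // Result selection keyed on the control encodings from the table.
  assign result =
      (ctrl == 4'b0100) ? add_out   :  // ADD
      (ctrl == 4'b0101) ? add_out   :  // SUB (adder with ~B and cin = 1)
      (ctrl == 4'b0110) ? mul_out   :  // MUL
      (ctrl == 4'b1000) ? (a | b)   :  // OR
      (ctrl == 4'b1001) ? (~a)      :  // NOT (assumed to act on A)
      (ctrl == 4'b1010) ? (a & b)   :  // AND
      (ctrl == 4'b1011) ? (a ^ b)   :  // XOR
      (ctrl == 4'b1100) ? shift_out :  // SLL
      (ctrl == 4'b1101) ? shift_out :  // SRL
      (ctrl == 4'b1110) ? shift_out :  // SRA
      16'h0000;                        // unused codes (assumed zero)
endmodule
```

Computing every candidate value in parallel and selecting at the end keeps the design purely combinational, at the cost of all units being active regardless of the operation.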
Don't forget to walk through the ModelSim simulation tutorial before you begin.
This lab should be implemented using only low-level structural Verilog and the assign statement. You are not allowed to use the following Verilog operators: +, -, *, /, <<, >>, etc. However, you are allowed to use the following operators: ~, &, |, ^, ==, !=, ?:, {}, etc. If you're not sure if you're allowed to use a certain Verilog construct, just ask (post a message on the newsgroup, send an e-mail, etc.).
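To make the restriction concrete, the short sketch below expresses a few common operations using only the permitted operators. The module and signal names are hypothetical.

```verilog
module allowed_ops_demo(
  input  [15:0] x,
  input  [15:0] y,
  input         sel,
  output [15:0] shl1,    // same value as x << 1
  output [15:0] shr1,    // same value as x >> 1 (logical)
  output [15:0] picked,  // 2-to-1 multiplexer
  output        same);   // equality comparison

  assign shl1   = {x[14:0], 1'b0};  // concatenation instead of <<
  assign shr1   = {1'b0, x[15:1]};  // concatenation instead of >>
  assign picked = sel ? x : y;      // ?: as a multiplexer
  assign same   = (x == y);         // == is permitted
endmodule
```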
We'll be using an extension board that contains additional LEDs and switches. See lab1.v and lab1.ucf for a top-level Verilog module and mappings for the LED and switch pins.
Note
The switches on the extension boards are "active high", but (as described in Lab 0) the LEDs and the switches on the main board are "active low" signals.
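In practice this means inverting active-low signals at the boundary of your design so that everything inside works with ordinary active-high values. The sketch below uses hypothetical names and widths (they are not taken from lab1.v) and assumes the LEDs being driven are active low.

```verilog
module polarity_adapter(
  input  [7:0] ext_sw,     // extension-board switches, active high
  input  [3:0] main_sw_n,  // main-board switches, active low
  input  [7:0] value,      // internal active-high value to display
  output [7:0] data_in,    // active-high data for the design
  output [3:0] select,     // active-high view of the main-board switches
  output [7:0] led_n);     // active-low LED drive

  assign data_in = ext_sw;      // already active high, pass through
  assign select  = ~main_sw_n;  // invert so "on" reads as 1
  assign led_n   = ~value;      // invert so a 1 lights the LED
endmodule
```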
The delay and resource usage of your design can be found in various reports:
When reporting timing results, use the "Post Place and Route Timing" information.
For this lab, there is a preliminary and final demo.
For each of the designs, turn in:
Please interleave the schematics with the Verilog code for each module.
In addition, answer the following questions. When reporting timing results, use the information from the "Post Place and Route Timing" report.
Note
Because part of your grade is based on your lab writeups, they should be clear, concise, and neat (preferably typed). You could have the greatest design in the world, but if you cannot convey your idea clearly to the graders and convince them that it works, you will not get good marks. Your lab writeups should include a brief explanation of what the circuits are supposed to do and how they do it.
To earn honor points on this assignment, design a faster adder and multiplier. You can use any of the various techniques discussed in class (e.g., non-uniform segment carry-select adders, carry-lookahead adders, carry-save tree multipliers, modified multi-bit Booth multipliers, etc.). Feel free to search online for other ideas and techniques not discussed in CSE371 (although you're not allowed to directly copy any code found online). In addition, you can't use the Xilinx primitives.
Anyone who creates a faster adder and/or multiplier will earn 10 honor points for each. However, if you make a faster adder, your multiplier must be better than the original multiplier with the newer adder (that is, you have to actually improve the multiplier's design, not just the adder sub-component).
In addition, the designers of the five fastest adders and five fastest multipliers will be given an additional 5 honor points each.
As such, 30 honor points is the maximum for this assignment.
Note
To receive these points, you must describe your implementation in the lab writeup.