Part A: addi
In this part, you will design a skeleton CPU that can execute the addi
instruction.
Task 1: Arithmetic Logic Unit (ALU)
Fill in the ALU in alu.circ
so that it can perform the required arithmetic calculations.
Input Name | Bit Width | Description |
---|---|---|
A | 32 | Data to use for Input A in the ALU operation |
B | 32 | Data to use for Input B in the ALU operation |
ALUSel | 4 | Selects which operation the ALU should perform (see the list of operations with corresponding switch values below) |
Output Name | Bit Width | Description |
---|---|---|
Result | 32 | Result of the ALU operation |
Below is the list of ALU operations for you to implement, along with their associated ALUSel values. add
is already made for you. You are allowed and encouraged to use built-in Logisim components to implement the arithmetic operations.
ALUSel Value | Instruction |
---|---|
0 | add: Result = A + B |
1 | sll: Result = A << B |
2 | slt: Result = (A < B (signed)) ? 1 : 0 |
3 | Unused |
4 | xor: Result = A ^ B |
5 | srl: Result = (unsigned) A >> B |
6 | or: Result = A | B |
7 | and: Result = A & B |
8 | mul: Result = (signed) (A * B)[31:0] |
9 | mulh: Result = (signed) (A * B)[63:32] |
10 | Unused |
11 | mulhu: Result = (A * B)[63:32] |
12 | sub: Result = A - B |
13 | sra: Result = (signed) A >> B |
14 | Unused |
15 | bsel: Result = B |
Some additional tips:
- You might find bit splitters or extenders useful when implementing shift operations.
- The result of multiplying 2 32-bit numbers can be up to 64 bits of information, but we're limited to 32-bit data lines, so
mulh
andmulhu
are used to get the upper 32 bits of the product. TheMultiplier
component has aCarry Out
output, with the description: "the upper bits of the product". This might be particularly useful for certain multiply operations. - The comparator component might be useful for implementing instructions that involve comparing inputs.
- A multiplexer (MUX) might be useful when deciding between operation outputs. In other words, consider simply processing the input for all operations, and then outputting the one of your choice.
- The ALU tests for Part A only use ALUSel values for defined instructions, so your design doesn't need to worry about the unused values.
Testing
Here's a companion video that goes with this section.
We've provided some tests for each task, located in subdirectories under tests/
. For example, the ALU tests are in tests/part-a/alu/
. When these tests are run, the outputs from your circuits are saved in a tests/part-a/alu/student-output/
subdirectory.
For example, to run the ALU tests:
You can also specify a single test circuit, or a grandparent/great-grandparent directory:
After the tests finish running, your ALU circuit's outputs will be saved under tests/part-a/alu/student-output/
with a -student.out
suffix (e.g. alu-add-student.out
). The corresponding reference outputs can be found at tests/part-a/alu/reference-output/
with a -ref.out
suffix (e.g. alu-add-ref.out
).
We've also provided format-output.py
, which accepts a path to an output file and displays the output in a more readable format (left-aligned hexadecimal numbers). For example, to get the reference output of the alu-add
sanity test in readable format, you would do:
If you want to see the difference between your output and the reference solution, you can put the readable outputs into temporary files and diff
them. For example, for the alu-add
test, you would do:
Note: If the lines are wrapping, try resizing your terminal window (or try a slightly smaller font). Or see the following note.
Experimental Note
Here's a companion video that goes with this note.
Each output file is technically a valid CSV file, so you can also import the output in a spreadsheet app if you really want to crunch the numbers (or you really hate tables in terminal). If the app requires a .csv
extension, you can use
and import the resulting .csv
file.
Debugging
Here's a companion video that goes with this section.
Similar to how you can step through your C code in GDB, you can also step through the test circuits in Logisim! For this example we'll be using the alu-add
test.
Open tests/part-a/alu/alu-add.circ
in Logisim. Among other things, one ROM (read-only memory) each feeds into the Input_A
, Input_B
, and ALUSel
tunnels. These tunnels then feed into your ALU near the upper right. The ROMs contain corresponding values for the test being considered. Every clock cycle, the adder on the top left increments by one, which advances each ROM by one entry, thus feeding the next set of inputs to your ALU. If you tick the circuit a couple times (File -> Tick Full Cycle
or the corresponding keyboard shortcut), you can see the test circuit advance through each set of inputs and your ALU's corresponding outputs. If you want to start over, use Simulate -> Reset Simulation
(Command/Control + R
).
Now, let's see what your ALU is actually doing with the inputs. Right-click your ALU, and select View alu
. Your ALU circuit will appear, with the input values for the current test cycle already on the ALU input pins. With this, you can see exactly what your ALU is doing in every line from the output files! The Poke Tool
will be very useful here.
Note: Edits to the test circuit, including the ALU we just inspected, will not be saved. Avoid making edits in the test circuit, as they may be lost!
Task 2: Register File (RegFile)
Fill in regfile.circ
so that it contains 32 registers that can be written to and read from.
Input Name | Bit Width | Description |
---|---|---|
rs1 | 5 | Determines which register's value is sent to the Read_Data_1 output |
rs2 | 5 | Determines which register's value is sent to the Read_Data_2 output |
rd | 5 | The register to write to on the next rising edge of the clock (if RegWEn is 1) |
Write_Data | 32 | The data to write into rd on the next rising edge of the clock (if RegWEn is 1) |
RegWEn | 1 | Determines whether data is written to the register file on the next rising edge of the clock |
clk | 1 | Clock input |
Output Name | Bit Width | Description |
---|---|---|
Read_Data_1 | 32 | The value of the register identified by rs1 |
Read_Data_2 | 32 | The value of the register identified by rs2 |
ra | 32 | The value of ra (x1 ) |
sp | 32 | The value of sp (x2 ) |
t0 | 32 | The value of t0 (x5 ) |
t1 | 32 | The value of t1 (x6 ) |
t2 | 32 | The value of t2 (x7 ) |
s0 | 32 | The value of s0 (x8 ) |
s1 | 32 | The value of s1 (x9 ) |
a0 | 32 | The value of a0 (x10 ) |
- The 8 constant output registers are included in the output of the
regfile
circuit for testing and debugging purposes. Make sure to connect these 8 output pins to their corresponding registers. - The
x0
register should always contain the 0 value, even if an instruction tries writing to it.
Some additional tips:
- Take advantage of copy-paste! It might be a good idea to make one register completely and use it as a template for the others to avoid repetitive work. You can duplicate a selected component or group of components in Logisim using
Ctrl/Cmd + D
. - The
Enable
pin on the built-in register may come in handy.
Testing
We've provided the autograder RegFile tests in the tests/part-a/regfile/
directory. You can run these with:
Refer to the testing section of Task 1 for more information on test outputs.
Task 3: Immediate Generator
For the rest of Part A, we will be creating just enough of the CPU to execute the addi
instruction. In Part B, you will revisit these circuits and expand them to support more instructions.
Fill in the immediate generator in imm-gen.circ
(not the imm_gen
subcircuit in cpu.circ
) so that it can generate immediates for the addi
instruction. You can ignore other immediate types for now.
Input Name | Bit Width | Description |
---|---|---|
inst | 32 | The instruction being executed |
ImmSel | 3 | Value determining how to reconstruct the immediate (you can ignore this for now) |
imm | 32 | Value of the immediate in the instruction (assume the instruction is addi for now) |
Output Name | Bit Width | Description |
---|---|---|
imm | 32 | Value of the immediate in the instruction (assume the instruction is addi for now) |
Task 4: Datapath
Fill in cpu.circ
so that it contains a datapath for a single-cycle (not pipelined) processor that can execute the addi
instruction.
Here are the inputs and outputs to the processor. You can leave most of them unchanged in this task, since they are not needed for the addi
instruction.
Input Name | Bit Width | Description |
---|---|---|
READ_DATA | 32 | Data at WRITE_ADDRESS from memory |
INSTRUCTION | 32 | The instruction at memory address PROGRAM_COUNTER |
CLOCK | 1 | Clock input |
Output Name | Bit Width | Description |
---|---|---|
ra | 32 | The value of ra (x1 ) |
sp | 32 | The value of sp (x2 ) |
t0 | 32 | The value of t0 (x5 ) |
t1 | 32 | The value of t1 (x6 ) |
t2 | 32 | The value of t2 (x7 ) |
s0 | 32 | The value of s0 (x8 ) |
s1 | 32 | The value of s1 (x9 ) |
a0 | 32 | The value of a0 (x10 ) |
tohost | 32 | The contents of CSR |
WRITE_ADDRESS | 32 | The address in memory to read from or write to |
WRITE_DATA | 32 | Data to write to memory |
WRITE_ENABLE | 4 | The write enable mask for writing data to memory |
PROGRAM_COUNTER | 32 | Address of the INSTRUCTION input |
We know that trying to build a datapath from scratch might be intimidating, so the rest of this section offers more detailed guidance for creating your processor.
Recall the five stages for executing an instruction:
- Instruction Fetch (IF)
- Instruction Decode (ID)
- Execute (EX)
- Memory (MEM)
- Write Back (WB)
Task 4.1: Instruction Fetch
We have already provided a simple implementation of the program counter. It is a 32-bit register that increments by 4 on each clock cycle. The PROGRAM_COUNTER
is connected to IMEM (instruction memory), and the INSTRUCTION
is returned from IMEM.
Nothing for you to implement in this sub-task!
Task 4.2: Instruction Decode
In this step, we need to break down the INSTRUCTION
input and send the bits to the right subcircuits.
What type of instruction is addi
? What are the different fields in the instruction, and which bits correspond to each field?
addi
is an I-type instruction. The fields are:
imm [31-20]
rs1 [19-15]
funct3 [14-12]
rd [11-7]
opcode [6-0]
In Logisim, what tool would you use to split out different groups of bits?
Use the splitter to extract each of the 5 fields from the instruction.
Which fields should connect to the register file? Which inputs of the register file should they connect to?
The rs1
bits you split from the instruction should connect to rs1
on the regfile.
The rd
bits you split from the instruction should connect to rd
on the regfile.
I-type instructions don't have rs2
so we can ignore rs2
for now.
Remember to connect the clock to the register file!
What needs to be connected to the immediate generator?
Connect the INSTRUCTION
to the immediate generator. Your immediate generator from the previous task should take the instruction and output the correct immediate for you.
Task 4.3: Execute
In this step, we will use the decoded instruction fields to compute the actual instruction.
What two data values (A
and B
) should the addi
instruction input to the ALU?
Input A
should be the read_data_1
from the regfile.
Input B
should be the immediate from the immediate generator.
What ALUSel
value should the instruction input to the ALU?
ALUSel
selects which computation the ALU will perform. Since we only care about implementing addi
for now, we can hard-code ALU to always select the add
operation (ALUSel = 0000
).
Task 4.4: Memory
The addi
instruction doesn't use memory, so there's nothing for you to implement in this sub-task!
The memory stage is where the memory can be written to using store instructions and read from using load instructions. Because the addi
instruction does not use memory, we do not have to worry about it for Part A. Please ignore the DMEM and leave its I/O pins undriven.
Task 4.5: Write Back
In this step, we will write the result of our addi
instruction back into a register.
What data is the addi
instruction writing, and where is the instruction writing this data to?
addi
takes the result of the addition computation (from the ALU output) and writes it to the register rd
.
Connect the ALU Result
to Write_Data
on the regfile.
Since the addi
instruction always writes to a register, you can hard-wire RegWEn
to 1
for now so that register writes are always enabled.
Testing
Here's a companion video that goes with this section.
Each CPU test is a copy of the run.circ
file included with the starter code that has instructions loaded into its IMEM. When you run the test.py
script, it runs Logisim in the background. The clock ticks, the program counter is incremented, and the values in each of the outputs is printed to a .out
file in the student-outputs
directory.
Let's take the single-cycle cpu-addi-basic
sanity test as an example. It has 4 addi
instructions (see tests/part-a/addi/inputs/cpu-addi-basic.s
). Open tests/part-a/addi/cpu-addi-basic.circ
in Logisim, and take a closer look at the various parts of the test file. At the top, you'll see the place where your CPU is connected to the test outputs. With the starter code, you'll see lots of UUUU
s or XXXX
s; when your CPU is working, this should not be the case. Your CPU takes in one input (INSTRUCTION
), and along with the values in each of the registers, it has an additional output: PROGRAM_COUNTER
, or the address of the instruction to be fetched from IMEM to be executed the next clock cycle.
As you can see, there are many specifically-positioned wires connected to specific input/output pins on your CPU. Make sure that you do not edit the provided input/output pins or add new ones, as this will change the shape of the CPU circuit, and as a result the connections in the test files may no longer work properly.
Below the CPU, you'll see instruction memory. The hex for the addi
instructions has been loaded into instruction memory. Instruction memory takes in one input (called PROGRAM_COUNTER
) and outputs the instruction at that address. PROGRAM_COUNTER
is a 32-bit value, but because Logisim caps the size of ROM units at 2^16 bytes, we have to use a splitter to get only 14 bits from PROGRAM_COUNTER
(ignoring the bottommost two bits). Notice that PROGRAM_COUNTER
is a byte address, not a word address.
So what happens when the clock ticks? Each tick of the clock increments an input in the test file called Time_Step
. The clock will continue to tick until Time_Step
is equal to the halting constant for that test file (for this particular test file, the halting constant is 6). At that point, the test.py
script will print the values in each of the outputs to the respective .out
file. Our tests will compare this output to the expected; if your output is different, you will fail the test.
We've provided the autograder tests for addi
(Task 3) in the tests/part-a/addi/
directory. You can run these with:
Task 5: Part A README Update
Your last task for Part A is to fill in the README.md
. Write down how you implemented your circuits and components for this part (including ALU and RegFile, since you used them for addi!), and explain the reasoning behind your design choices. There's no specific format or length requirement here, so feel free to get creative!
Submission and Grading
Submit your repository to the Project 3A assignment on Gradescope. The autograder tests for Part A are the same as the tests you are running locally. Part A is worth 20% of your overall Project 3 grade.
- ALU (7)
- RegFile (8)
addi
(5)
Total: 20 points