Project 3: CAL-16 implemented in Logisim

Overview

This project involves creating a Logisim design of the CAL-16 processor for which you wrote an assembler in project 1. The functional behavior of the CAL-16 processor is described in Project 1. The format and the register transfer semantics of each instruction are the same as project 1 and given below. When you wrote your assembler you were only concerned with the format. Now you have learned the MIPS instruction set and used an instruction interpreter (MARS). You have also used a CAL-16 interpreter. A computer is just an instruction interpreter built out of gates and flip-flops.

When you review the CAL-16 you will notice that it is similar in spirit to the MIPS, but different in detail, and much simpler. You can use your assembler to write test cases for your processor when you get tired of writing in binary. If there are any differences from the earlier instruction set, use what is described here. In order to load your object file into a Logisim RAM, it needs to begin with a format line containing "v2.0 raw". You can use the little script in ~cs61cl/code/ltool to insert that line in your .o file.
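If you would rather not depend on the script, adding the format line is easy to do yourself. The following is a minimal sketch in C, not the ltool script itself; the function name prepend_logisim_header is hypothetical, and it assumes only that the header must be the first line and that the rest of the object file can be copied through unchanged.

```c
#include <stdio.h>

/* Copy `in` to `out`, preceded by the format line Logisim requires.
 * Hypothetical helper, not part of ltool.
 * Returns 0 on success, -1 on a read or write error. */
int prepend_logisim_header(FILE *in, FILE *out)
{
    int c;

    if (fputs("v2.0 raw\n", out) == EOF)
        return -1;
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Called with your .o file opened for reading and a new file opened for writing, this should produce an image that Logisim's "Load Image..." accepts, provided the object file already holds hex words.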

Administrative information

Submit a solution by 11:59pm on November 15 (** extended to 11/16 **). Your submission directory, called proj3, should include a file named cal16.circ and a file named either readme.txt, readme.doc, or readme.pdf.

This is not a partnership assignment. Hand in your own work.

Along the way you will encounter three checkpoints in which you'll display progress on the project to your t.a. We will use them to keep track of your progress, avoid problems and eliminate any uncertainties in the project specification.

You have three grace days (72 grace hours) to apply toward late project submissions over the entire term. Don't use them all at once. Start early. When you get confused, ask for clarification. Get easy pieces working and then move forward. The time that you submit a solution online and not the time you actually get it evaluated by a reader or t.a. is what determines how many grace hours you are charged.

CAL-16 information

The CAL-16 is a load-store architecture like the MIPS, but the word width is only 16 bits. It has 16 general-purpose registers. Register $0 always contains 0, just like in MIPS. Memory addresses for loads, stores, jumps, and branches are word-aligned, and each load or store transfers a full 16-bit word. If execution encounters an illegal address or an invalid op code it should halt. We will not be implementing the instructions with opcodes 9, D, and E; treat them as illegal. Immediates are sign-extended to 16 bits where indicated by the Sx function in the RTL below. Note that the jump-register instruction is really JALR. There is no overflow detection. The instructions are pretty easy to read in hex because each field is one or more hex digits. The instructions are arranged to make them easy to decode.

OP	Q2	Q1	Q0	MNE	Semantics
0	a	d	b	add	R[d] := R[a] + R[b]	PC := PC+2
1	a	d	b	or	R[d] := R[a] | R[b]	PC := PC+2
2	a	d	b	xor	R[d] := R[a] ^ R[b]	PC := PC+2
3	a	d	b	and	R[d] := R[a] & R[b]	PC := PC+2
4	a	d	off4	addi	R[d] := R[a] + Sx(off4)	PC := PC+2
5	a	d	off4	rotr	R[d] := R[a] rot off4	PC := PC+2
6	a	d	off4	st	Mem(R[a] + Sx(off4)) := R[d]	PC := PC+2
7	a	d	off4	ld	R[d] := Mem(R[a] + Sx(off4))	PC := PC+2
8	a	off8		ldi	R[a] := zeroExtend(off8)	PC := PC+2
9
A	a	off8		bneg		PC := (R[a] < 0) ? PC + (2 * Sx(off8)) : PC+2
B	a	off8		bz		PC := (R[a] == 0) ? PC + (2 * Sx(off8)) : PC+2
C	a	d	off4	jr	R[d] := PC	PC := R[a]+Sx(off4)
D
E
F	off12			jmp		PC := (PC & 0xE000) | (2 * off12)
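The PC column of the table can be checked against a small software model. The sketch below is one reading of the RTL, not a reference implementation: the names cal16_sx and cal16_next_pc are hypothetical, and it assumes that OP is the most significant hex digit of the instruction word with Q2, Q1, and Q0 following in order.

```c
#include <stdint.h>

/* Sx: sign-extend the low `bits` bits of `value` to 16 bits. */
uint16_t cal16_sx(uint16_t value, unsigned bits)
{
    uint16_t sign = (uint16_t)(1u << (bits - 1));
    uint16_t mask = (uint16_t)((1u << bits) - 1u);

    value &= mask;
    return (uint16_t)((value ^ sign) - sign);
}

/* Next PC for one instruction, given the current PC and the value of R[a].
 * Assumption: OP is bits 15:12, Q2 is 11:8, Q1 is 7:4, Q0 is 3:0. */
uint16_t cal16_next_pc(uint16_t pc, uint16_t insn, uint16_t ra)
{
    unsigned op = insn >> 12;
    uint16_t off4 = cal16_sx(insn & 0x000F, 4);
    uint16_t off8 = cal16_sx(insn & 0x00FF, 8);
    uint16_t off12 = insn & 0x0FFF;

    switch (op) {
    case 0xA: /* bneg: taken when the sign bit of R[a] is set */
        return (ra & 0x8000) ? (uint16_t)(pc + 2 * off8) : (uint16_t)(pc + 2);
    case 0xB: /* bz: taken when R[a] is zero */
        return (ra == 0) ? (uint16_t)(pc + 2 * off8) : (uint16_t)(pc + 2);
    case 0xC: /* jr: PC := R[a] + Sx(off4) */
        return (uint16_t)(ra + off4);
    case 0xF: /* jmp: keep the top three PC bits, replace the rest */
        return (uint16_t)((pc & 0xE000) | (2 * off12));
    default:  /* everything else falls through to the next word */
        return (uint16_t)(pc + 2);
    }
}
```

The arithmetic is done modulo 2^16, so a negative branch offset such as Sx(0xFE) moves the PC backward as expected.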

In addition to the registers and PC, your machine contains a single memory consisting of 1024 16-bit words, i.e., two kilobytes. This will be implemented using the Logisim built-in RAM module.

This machine has very few instructions. Many of the ones you are used to need to be implemented in software using combinations of these simple ones. There is no subtract; that is implemented in software using XOR, ADDI, and ADD. There is no LUI; use LDI and ROTR instead. Your assembler implemented llo and lhi to pull out the low and high part of the immediate value, but both generate an LDI instruction.
There is no store byte. You need to load, merge and store one aligned 16-bit word at a time. There are no pseudoinstructions (unless your assembler provides them).
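To make the software substitutes concrete, here is a sketch in C that mirrors the instruction sequences one step at a time. The function names are hypothetical, the mnemonics in the comments are informal (operand order in your assembler may differ), and the sequences shown are one possible choice rather than the required one.

```c
#include <stdint.h>

/* a - b using only XOR, ADDI, and ADD: a + (~b + 1). */
uint16_t cal16_sub(uint16_t a, uint16_t b)
{
    uint16_t t = 0xFFFF;          /* addi: t = $0 + Sx(0xF), i.e. all ones */
    t = (uint16_t)(b ^ t);        /* xor:  t = one's complement of b       */
    t = (uint16_t)(t + 1);        /* addi: t = two's complement of b       */
    return (uint16_t)(a + t);     /* add:  result = a + (-b)               */
}

/* Build a full 16-bit constant with LDI, ROTR, and OR in place of LUI. */
uint16_t cal16_load_const(uint16_t value)
{
    uint16_t hi = (value >> 8) & 0xFF;          /* the part lhi extracts   */
    uint16_t lo = value & 0xFF;                 /* the part llo extracts   */
    uint16_t r = hi;                            /* ldi:  zero-extended     */
    uint16_t t;

    r = (uint16_t)((r >> 8) | (r << 8));        /* rotr by 8: hi to 15:8   */
    t = lo;                                     /* ldi:  zero-extended     */
    return (uint16_t)(r | t);                   /* or:   merge the halves  */
}
```

Because LDI zero-extends its immediate, the OR in the last step cannot disturb the high byte.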

An interpreter for the CAL-16 instructions is available in the directory ~cs61cl/code/CAL16.sim. You may find it useful in checking the behavior of your implementation.

An assembler is available in ~cs61cl/bin/arch/<PLATFORM>/asm16, where <PLATFORM> is i386 on the Macs and sun4v on the sparc servers.

To aid you in tackling this project we are breaking it into checkpoints. You can think of this as being similar to any development effort in industry. You are working on it over a period of time, but along the way you are generating clearly documented portions of the solution for feedback. You have actually already started the first checkpoint.

The first thing to do

Copy ~cs61cl/code/CAL-16-template.circ to a file called CAL16.circ. It already includes built-in libraries:

Memory
Arithmetic
Plexers
Input/output

Things to implement in the first checkpoint

The first checkpoint involves implementing several basic components that you will need for your datapath. They are listed below. You are given subcircuit templates for each of these. Do not change the orientation or relative placement of the input and output pins. Otherwise it will modify the symbol that appears in the main schematic.

Register File

Reg16: 16-bit register with asynchronous clr and selective load. (This is implemented for you. It has the clock connected internally.)

Inputs

D[15:0]
ld
clr

Outputs

Q[15:0]

RegFile: 16x16-bit register file, 2 read ports and 1 write port. R0 always reads as 0. If no register write is desired, Dsel should be set to 0. (This should be implemented with reg16s.)

Inputs

D[15:0]
Asel[3:0]
Bsel[3:0]
Dsel[3:0]
clr

Outputs

A[15:0]
B[15:0]
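A behavioral model can be handy as a reference while testing the RegFile subcircuit. The sketch below is written in C with hypothetical names; it treats clr as simply taking priority over a write, whereas the real Reg16 clears asynchronously, and it models only one read port function that you would call twice (once for Asel, once for Bsel).

```c
#include <stdint.h>

typedef struct {
    uint16_t r[16];
} RegFile;

/* Read port: register 0 always reads as 0, whatever is stored there. */
uint16_t regfile_read(const RegFile *rf, unsigned sel)
{
    sel &= 0xF;
    return sel == 0 ? 0 : rf->r[sel];
}

/* One clock edge. clr wins over a write; Dsel == 0 means "no write". */
void regfile_clock(RegFile *rf, unsigned dsel, uint16_t d, int clr)
{
    dsel &= 0xF;
    if (clr) {
        for (int i = 0; i < 16; i++)
            rf->r[i] = 0;
        return;
    }
    if (dsel != 0)
        rf->r[dsel] = d;
}
```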

ALU

The specification of the ALU is derived from the instruction set, but is also fairly general purpose.

Inputs

A[15:0]
B[15:0]
op[2:0]
Cin

Outputs

R[15:0]
Cout

The functional behavior is given by the following table.

op	operation
0 0 0	and	R = A & B
0 0 1	or	R = A | B
0 1 0	xor	R = A ^ B
0 1 1	add	Cout, R = A + B + Cin
1 0 0	passB	R = B
1 0 1	passA	R = A
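The table translates directly into a software model that you can use to predict what your ALU subcircuit should output. This is a sketch with hypothetical names. The table does not say what Cout is for the non-add operations or what op values 6 and 7 do, so the model returns 0 in those cases as an assumption.

```c
#include <stdint.h>

typedef struct {
    uint16_t r;      /* R[15:0] */
    unsigned cout;   /* Cout    */
} AluResult;

AluResult cal16_alu(uint16_t a, uint16_t b, unsigned op, unsigned cin)
{
    AluResult out = {0, 0};

    switch (op & 7u) {
    case 0: out.r = a & b; break;                 /* and   */
    case 1: out.r = a | b; break;                 /* or    */
    case 2: out.r = a ^ b; break;                 /* xor   */
    case 3: {                                     /* add   */
        uint32_t sum = (uint32_t)a + b + (cin & 1u);
        out.r = (uint16_t)sum;
        out.cout = (sum >> 16) & 1u;
        break;
    }
    case 4: out.r = b; break;                     /* passB */
    case 5: out.r = a; break;                     /* passA */
    default: break;   /* op 6 and 7: unspecified, assumed to yield 0 here */
    }
    return out;
}
```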

Rotate-Right Unit

The rotate-right subcircuit rotates the input right to the output. In C, assuming 16-bit ints, this would be

    out = (in >> rot) | (in << (16 - rot));

Inputs

in[15:0]
rot[3:0]

Outputs

out[15:0]

Here's an example. Suppose register 5 contains 0xb38f = 1011001110001111 (base 2). Then rotating right by 1 gives 0xd9c7 = 1101100111000111 (base 2). Rotating that by 5 gives 0x3ece = 0011111011001110 (base 2), and rotating that by 10 gives the original value of 0xb38f.
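On a machine where int is wider than 16 bits, the C expression needs a mask or a cast to behave like the hardware. The sketch below (rotr16 is a hypothetical name) does that and can be used to reproduce the chain of rotations in the example, starting from the bit pattern 1011001110001111.

```c
#include <stdint.h>

/* Rotate a 16-bit value right by rot positions (0..15). */
uint16_t rotr16(uint16_t in, unsigned rot)
{
    rot &= 15u;
    if (rot == 0)
        return in;   /* avoids a shift by the full width */
    return (uint16_t)((in >> rot) | ((uint32_t)in << (16 - rot)));
}
```

Rotating by 1, then 5, then 10 adds up to a full 16 positions, which is why the example ends where it started.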

Instruction Register and Basic Decode

It will help you to build a simple subcircuit that splits the instruction into its six possible fields: op, ra, rb, rd, im8, im12. This will simplify your wiring a lot later. Keep these signals in mind as you build up your datapath.
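In Logisim this subcircuit is nothing more than splitters, and the same slicing can be written out in C to check your wiring against hex instruction words. The sketch below uses hypothetical names and assumes the field positions implied by the instruction table: OP in the top hex digit, then Q2 (ra), Q1 (rd), and Q0 (rb, which doubles as off4).

```c
#include <stdint.h>

typedef struct {
    unsigned op;    /* bits 15:12                    */
    unsigned ra;    /* bits 11:8  (Q2)               */
    unsigned rd;    /* bits 7:4   (Q1)               */
    unsigned rb;    /* bits 3:0   (Q0, also off4)    */
    unsigned im8;   /* bits 7:0   (Q1:Q0, off8)      */
    unsigned im12;  /* bits 11:0  (Q2:Q1:Q0, off12)  */
} Cal16Fields;

/* Split one instruction word into every field it might contain. */
Cal16Fields cal16_split(uint16_t insn)
{
    Cal16Fields f;

    f.op   = (insn >> 12) & 0xF;
    f.ra   = (insn >> 8) & 0xF;
    f.rd   = (insn >> 4) & 0xF;
    f.rb   = insn & 0xF;
    f.im8  = insn & 0xFF;
    f.im12 = insn & 0xFFF;
    return f;
}
```

All six fields are produced for every instruction; it is the control logic's job to decide which of them matter for a given opcode.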

Checkpoint 1 (up to 6 points, awarded in lab on 11/3-4)

Present to your lab t.a. or reader two things. First, three diagrams and tables:
- a datapath diagram analogous to P&H Figure 4.1 (fourth edition) or 5.1 (third edition), but for CAL-16;
- a diagram analogous to P&H Figure 4.2 (fourth edition) or 5.2 (third edition) that shows control signals as well as the datapath; and
- a truth table relating each op code (input) to the corresponding control signal values (outputs).
These will end up as part of your readme file.
Second, tested implementations of three of the four components:
- the register file;
- the ALU;
- the rotate-right unit; or
- the basic decoder.

Suggested approach for checkpoint 1

We encourage you to plan before "coding". One part of the plan is the two diagrams; the paper versions should guide your Logisim implementation. The other is the truth table, which relates op codes to control signals: not only those that drive the various multiplexors, but also the inputs that specify ALU operations. P&H Figures 4.12 and 4.13 (fourth edition) or 5.12 and 5.13 (third edition) are good models for these. A spreadsheet is a good tool for this. Instruction opcodes define the rows and control points define the columns. Each cell gives the value of that control signal as a function of the opcode and, where relevant, the machine state.

As you finish each of the main components (register, register file, ALU, rotate-right, and decoder), drop it into your main circuit as a starting point for your design and build around it. Make sure each component has its pins labeled. Add text in the middle of each component so that it is clear what it is.

Logically the rotate-right component sits in parallel with the ALU. You probably want to put one above the other and wire the outputs into a 2-1 16-bit Mux.

Checkpoint 2 (up to 6 points, awarded in lab on 11/5-6)

First, guided by your truth table, wire together the register file, ALU, rotate-right, and any supporting subcircuits or MUXes. Arrange them so that all the control signals run nicely down to the bottom. You now have most of what you need to test this on arithmetic and logical instructions. Drop an input on the control signals and drop in an instruction decoder with an input. Wire up your register file selectors. You can poke the decoder input to set things up. Poke the control inputs as appropriate and see that the contents of the register file update properly.

Then, include a memory and an instruction register. Refer to the Register and Memory lab to see how to tri-state the drivers on the data bus.

Finally, implement your datapath control logic. In a few cases this can be done by just wiring certain outputs of the instruction decoder. Where it is more complex, it is usually best to create a subcircuit that decodes fields from the instruction and produces the control signals for a portion of the datapath as outputs. Then you can move gates around in the decode logic without messing with your top-level schematic. Or you can use a MUX to provide "microprogrammable control". For example, it can have the opcode as the select input, with each data input being the control setting for that opcode. MUXes can have multiple, multi-bit data inputs.
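The MUX-based approach amounts to a lookup table indexed by opcode. The sketch below shows the idea in C. Every signal name and every table entry is hypothetical and reflects one possible datapath, not the required one; PC-selection signals are left out, and your own truth table should be the authority for your design.

```c
#include <stdint.h>

enum { B_REG, B_SX4, B_ZX8 };   /* hypothetical choices for the ALU B input */

typedef struct {
    unsigned alu_op    : 3;   /* op[2:0] of the ALU                     */
    unsigned b_src     : 2;   /* R[b], Sx(off4), or zeroExtend(off8)    */
    unsigned use_rot   : 1;   /* take the rotate unit instead of ALU    */
    unsigned reg_write : 1;
    unsigned mem_read  : 1;
    unsigned mem_write : 1;
    unsigned halt      : 1;   /* illegal opcode                         */
} Control;

/* One control word per opcode: the "data inputs" of the control MUX. */
static const Control control_rom[16] = {
    [0x0] = { .alu_op = 3, .b_src = B_REG, .reg_write = 1 },   /* add   */
    [0x1] = { .alu_op = 1, .b_src = B_REG, .reg_write = 1 },   /* or    */
    [0x2] = { .alu_op = 2, .b_src = B_REG, .reg_write = 1 },   /* xor   */
    [0x3] = { .alu_op = 0, .b_src = B_REG, .reg_write = 1 },   /* and   */
    [0x4] = { .alu_op = 3, .b_src = B_SX4, .reg_write = 1 },   /* addi  */
    [0x5] = { .use_rot = 1, .reg_write = 1 },                  /* rotr  */
    [0x6] = { .alu_op = 3, .b_src = B_SX4, .mem_write = 1 },   /* st    */
    [0x7] = { .alu_op = 3, .b_src = B_SX4, .mem_read = 1,
              .reg_write = 1 },                                /* ld    */
    [0x8] = { .alu_op = 4, .b_src = B_ZX8, .reg_write = 1 },   /* ldi   */
    [0x9] = { .halt = 1 },
    [0xA] = { .alu_op = 5 },                                   /* bneg  */
    [0xB] = { .alu_op = 5 },                                   /* bz    */
    [0xC] = { .alu_op = 3, .b_src = B_SX4, .reg_write = 1 },   /* jr    */
    [0xD] = { .halt = 1 },
    [0xE] = { .halt = 1 },
    [0xF] = { .alu_op = 0 },                                   /* jmp   */
};

/* The opcode is the select input; the result is the chosen control word. */
Control cal16_control(unsigned opcode)
{
    return control_rom[opcode & 0xF];
}
```

Changing the behavior of an opcode then means editing one row of the table rather than rearranging gates, which is the appeal of this style.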

If you need extra register transfer language to represent particular details of your implementation, include it in your readme file.

At this point you should be able to perform any single non-control-flow instruction by poking the instruction into the decoder and cycling the clock. You can poke at the RAM module to initialize words. You can also save and load it to or from a file.

Suggested approach for checkpoint 2

You will have to decide whether you want to drop MUXes and such into main or create subcircuits to do the work of routing data from place to place. Document this choice in your readme file.

The "probe" becomes your best friend here. You can attach it to a wire and use it to label the wire. You can set the radix of the display (binary, octal, hex, decimal, signed decimal) to make it easy to read. Arrange your circuit so that you can put probes on the inputs and outputs of the register file and ALU so that you can see what is going on. To test this, you will want to first perform the LDI or ADDI instruction so that you can put values into the register file. Then you can test the arithmetic and logical instructions.

Add a reset pin that clears the register file and registers so that you can get your design easily to a known state.

RAM Modules

Logisim RAM modules can be found in the built-in memory library. To add the library to your project, select Project/Load Library/Built-in Library... and select the Memory module.

The best way to learn how these work is simply to play with them, since the Logisim help page on RAM modules is not terribly helpful. Here's a little bit of info to help you get started. A chooses which address will be accessed (if any). sel determines whether or not the RAM module is active (if sel is low, D is undefined). The clock input provides synchronization for memory writes. out selects whether memory is being read or written: if out is high, D will be driven with the contents of memory at address A. clr, if high, will instantly set all contents of memory to 0. D acts as both data in and data out for this module. This means that you must use a controlled buffer on the input of D to prevent conflicts between data being driven in and the contents of memory. The "poke" tool can be used to modify the contents of the module. RAM modules can also be loaded from files using "right-click/Load Image..."

Checkpoint 3 (up to 6 points, awarded in lab on 11/12-13)

Augment your datapath with a sequencer, including the PC and an instruction register (IR). Provide the interconnections required to update the PC for sequencing, jumps, indirect jumps, and branches. Add the interconnections required to fetch instructions, i.e., IR = MEM[PC]. (You won't need an nPC.) Add the control logic and verify that you can execute sequences of instructions. Then add support for jumps and branches. You may find that you need to modify your datapath to support indirect jumps (JR) if you didn't pick up that case in your analysis for checkpoint 1.

Each instruction must take two cycles: one to fetch the instruction into the IR and one to decode and execute the instruction. The design must use a single RAM for instructions and data. This is true in the final submission as well.
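The two-cycle rhythm can be pictured as a one-bit state machine that alternates between fetch and execute. The following C sketch models only that sequencing. All names are hypothetical, only add and ldi are executed (every other opcode halts, purely to keep the sketch short), and it assumes that an "illegal address" means an odd byte address or one beyond the 1024-word RAM.

```c
#include <stdint.h>

#define MEM_WORDS 1024

typedef struct {
    uint16_t mem[MEM_WORDS];   /* single RAM for instructions and data   */
    uint16_t reg[16];
    uint16_t pc, ir;
    int executing;             /* 0: next cycle fetches, 1: it executes  */
    int halted;
} Cpu;

/* Advance the machine by exactly one clock cycle. */
void cpu_clock(Cpu *c)
{
    unsigned op, a, d, b;

    if (c->halted)
        return;

    if (!c->executing) {
        /* Fetch cycle: IR = MEM[PC]. PC is a byte address, RAM is
         * indexed by word. Legality check is an assumption (see above). */
        if ((c->pc & 1u) || (c->pc >> 1) >= MEM_WORDS) {
            c->halted = 1;
            return;
        }
        c->ir = c->mem[c->pc >> 1];
        c->executing = 1;
        return;
    }

    /* Execute cycle: decode the IR and update registers and PC. */
    op = c->ir >> 12;
    a = (c->ir >> 8) & 0xF;
    d = (c->ir >> 4) & 0xF;
    b = c->ir & 0xF;
    switch (op) {
    case 0x0:   /* add: R[d] := R[a] + R[b]; register 0 stays 0 */
        if (d != 0)
            c->reg[d] = (uint16_t)(c->reg[a] + c->reg[b]);
        break;
    case 0x8:   /* ldi: R[a] := zeroExtend(off8) */
        if (a != 0)
            c->reg[a] = c->ir & 0xFF;
        break;
    default:    /* not modeled in this sketch: stop */
        c->halted = 1;
        return;
    }
    c->pc = (uint16_t)(c->pc + 2);
    c->executing = 0;
}
```

Three instructions therefore take six calls to cpu_clock, which matches the two-cycles-per-instruction requirement.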

Grading

Your grade will be based partly on correct behavior on test cases, partly on "beauty" (organization of your solution into subcircuits, neat layout, etc.), and partly on the readme file that explains various aspects of your solution.

Your submitted solution will be graded in lab the week of 11/17. T.a.s and readers will run your simulation on a sequence of test programs, each of which starts with given contents in RAM and leaves other contents in RAM. Testing will continue until you fail one of these tests, at which point you get a correctness score based on the number of tests you passed. The t.a. or reader will then give you a beauty score. Writeups will be graded separately.

As noted earlier, your readme file should include the datapath and control diagrams implemented by your simulation, as well as the truth table that relates op codes to control signals. Any register transfers that aren't already documented should appear in the readme. It should also explain and answer design questions such as

what design decisions you encountered, and how you resolved them;
what's good and what's not so good about your design;
what bugs you encountered; and
what you would have done differently.

Miscellaneous overall requirements

You may use any built-in Logisim library circuit components in implementing your CPU. They are already included in your template and we encourage you to make use of them.

You may use the Logisim tools to generate circuits from truth tables.

You must have your instruction/data RAM module visible from the main circuit of your Logisim project. The design uses a single RAM for instructions and data. It is 16 bits wide and byte addressed. All operations access 16-bit words, using the byte address of the word.

Logisim allows you to send a clock through a gate. Do not do that. In real life, this is rarely done, and there are always better ways to get the effect of gating a clock, such as the ld input on Reg16. Any solution that uses a gated clock will be penalized heavily. We encourage you to connect the clock element directly to flip-flops and registers. Thus all parts of the design are clocked together. The register in the template does this properly.

Use the label tool and lay out wires and components so that it is easy to see the organization. It will make debugging a lot simpler.

You must use subcircuits. They make it easier for you, they make it easier for us. Give your subcircuits appropriate labels as well. Keep in mind that this won't be autograded, and humans will have to look at your schematics (and grade them!). Excessively cluttered implementations will lose points!

Use multi-bit buses in Logisim! You will have to use splitters to get individual bits from the wire or to combine the bits to get a multi-bit bus! If you don't know how, this is the time to learn! If you're not sure how to use these, look back at the labs. If you're still confused, ask us or your peers!

Use multi-bit inputs and outputs. It is difficult to debug if you have 16+ individual inputs scattered throughout your circuit.

Hints and comments

Logisim offers some functionality for automating circuit implementation given a truth table, or vice versa. You may use this, but make sure that you could do the transformation yourself. You will be expected to be able to generate Boolean logic and the circuits without such aids.

You are welcome to add input and output features to your design.

You will want to generate little test cases as you go along. When you get to checkpoint 3, these will be short sequences of instructions. We encourage you to publish your test cases in CAL-16 assembly or binary on the forum and to participate in the testing discussion. The RAM can be loaded with files containing words as hex values. This was the format we used for your assembler. You will be able to load the fully resolved instruction output of your assembler.

We will be providing a set of test files. You will load them into the RAM and run them till the machine stops. A correct solution will produce valid contents in the RAM. For example, a certain array of values will end up sorted or a value will contain the sum of an array.