UC Berkeley CS150

Checkpoint 1: Pipelined MIPS150

Introduction

The first step of the project is to implement a 3-stage pipelined MIPS processor. This processor will be used to coordinate the functionality of the entire project. A high-level overview of the final system can be seen below.

[Figure: MIPS150Top, high-level overview of the final system]

Each of the four checkpoints is indicated as follows.

The ISA

The MIPS150 processor implements a useful subset of the MIPS instruction set architecture. Examples of what is not included are divide and multiply instructions, trap and exception instructions, coprocessor instructions, and floating point instructions. The complete MIPS150 instruction encoding is documented in the table below.

[Table: MIPS150ISA, instruction encodings]

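The functionality of each instruction is shown below, where R[x] denotes register x, SEXT and ZEXT denote sign extension and zero extension to 32 bits, and BMEM, HMEM, and WMEM denote memory accessed as bytes, halfwords, and words.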

Mnemonic  RTL Description                                         Notes
LB        R[$rt] = SEXT(BMEM[(R[$rs]+SEXT(imm))[31:0]])           delayed
LH        R[$rt] = SEXT(HMEM[(R[$rs]+SEXT(imm))[31:1]])           delayed
LW        R[$rt] = WMEM[(R[$rs]+SEXT(imm))[31:2]]                 delayed
LBU       R[$rt] = ZEXT(BMEM[(R[$rs]+SEXT(imm))[31:0]])           delayed
LHU       R[$rt] = ZEXT(HMEM[(R[$rs]+SEXT(imm))[31:1]])           delayed
SB        BMEM[(R[$rs]+SEXT(imm))[31:0]] = R[$rt][7:0]
SH        HMEM[(R[$rs]+SEXT(imm))[31:1]] = R[$rt][15:0]
SW        WMEM[(R[$rs]+SEXT(imm))[31:2]] = R[$rt]
ADDIU     R[$rt] = R[$rs] + SEXT(imm)
SLTI      R[$rt] = R[$rs] < SEXT(imm)
SLTIU     R[$rt] = R[$rs] < SEXT(imm)                             unsigned compare
ANDI      R[$rt] = R[$rs] & ZEXT(imm)
ORI       R[$rt] = R[$rs] | ZEXT(imm)
XORI      R[$rt] = R[$rs] ^ ZEXT(imm)
LUI       R[$rt] = {imm, 16'b0}
SLL       R[$rd] = R[$rt] << shamt
SRL       R[$rd] = R[$rt] >> shamt
SRA       R[$rd] = R[$rt] >>> shamt
SLLV      R[$rd] = R[$rt] << R[$rs]
SRLV      R[$rd] = R[$rt] >> R[$rs]
SRAV      R[$rd] = R[$rt] >>> R[$rs]
ADDU      R[$rd] = R[$rs] + R[$rt]
SUBU      R[$rd] = R[$rs] - R[$rt]
AND       R[$rd] = R[$rs] & R[$rt]
OR        R[$rd] = R[$rs] | R[$rt]
XOR       R[$rd] = R[$rs] ^ R[$rt]
NOR       R[$rd] = ~R[$rs] & ~R[$rt]
SLT       R[$rd] = R[$rs] < R[$rt]
SLTU      R[$rd] = R[$rs] < R[$rt]                                unsigned compare
J         PC = {PC[31:28], target, 2'b0}                          delayed
JAL       R[31] = PC + 8; PC = {PC[31:28], target, 2'b0}          delayed
JR        PC = R[$rs]                                             delayed
JALR      R[$rd] = PC + 8; PC = R[$rs]                            delayed
BEQ       PC = PC + 4 + (R[$rs] == R[$rt] ? SEXT(imm) << 2 : 0)   delayed
BNE       PC = PC + 4 + (R[$rs] != R[$rt] ? SEXT(imm) << 2 : 0)   delayed
BLEZ      PC = PC + 4 + (R[$rs] <= 0 ? SEXT(imm) << 2 : 0)        delayed
BGTZ      PC = PC + 4 + (R[$rs] > 0 ? SEXT(imm) << 2 : 0)         delayed
BLTZ      PC = PC + 4 + (R[$rs] < 0 ? SEXT(imm) << 2 : 0)         delayed
BGEZ      PC = PC + 4 + (R[$rs] >= 0 ? SEXT(imm) << 2 : 0)        delayed
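
As a rough cross-check for the RTL notation, the following C sketch models a few of the table's expressions in software. The helper names (sext, branch_target, jump_target, slt, sltu) are hypothetical and are not part of the skeleton files; only the formulas come from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of `value` to 32 bits
 * (the table's SEXT). */
static uint32_t sext(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return value;
    uint32_t sign = 1u << (bits - 1);
    value &= (sign << 1) - 1u;     /* keep only the low `bits` bits */
    return (value ^ sign) - sign;  /* propagate the sign bit upward */
}

/* Branch rows: PC = PC + 4 + (cond ? SEXT(imm) << 2 : 0). */
static uint32_t branch_target(uint32_t pc, uint16_t imm, int taken)
{
    return pc + 4u + (taken ? (sext(imm, 16) << 2) : 0u);
}

/* J and JAL rows: PC = {PC[31:28], target, 2'b0}. */
static uint32_t jump_target(uint32_t pc, uint32_t target26)
{
    return (pc & 0xF0000000u) | ((target26 & 0x03FFFFFFu) << 2);
}

/* SLT versus SLTU: same operand bits, signed versus unsigned comparison. */
static uint32_t slt(uint32_t rs, uint32_t rt)  { return (int32_t)rs < (int32_t)rt; }
static uint32_t sltu(uint32_t rs, uint32_t rt) { return rs < rt; }
```

For example, branch_target(0x1000, 0xFFFF, 1) evaluates to 0x1000: an offset of -1 words relative to PC + 4 branches back to the branch itself.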

Pipelining

As stated earlier, the MIPS150 you design must have 3 pipeline stages. Although the stages have been arbitrarily labeled I, X, and M (for Instruction, Execute, and Memory), what each stage does is entirely up to you. A straightforward way to think about this pipeline is to start from the classic 5-stage MIPS studied in class and consider removing two stages.

Delay Slots

The instructions marked as delayed in the Notes column of the table above have architectural delay slots. This means that the instruction directly following a delayed instruction is always executed, and furthermore has no dependencies on the execution of the delayed instruction. Fortunately, our compiler handles all of this behind the scenes: the C compiler generates code with the delay slots correctly filled, and the assembler rearranges instructions when it can to fill delay slots. When writing assembly, therefore, do not explicitly fill delay slots; let the assembler do it.
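
For the control-transfer instructions, one common way to picture a delay slot is to track two program counters: the address being executed and the address to execute next. The following C sketch, whose names (pc_state_t, step) are hypothetical, shows that a taken branch or jump only changes the instruction after the delay slot. It is a software illustration only; it does not model the delayed loads or any particular pipeline organization.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t pc;   /* address of the instruction executing now */
    uint32_t npc;  /* address of the instruction that executes next */
} pc_state_t;

/* Advance by one instruction. `redirect` is nonzero when the instruction at
 * s.pc is a taken branch or a jump with destination `target`. */
static pc_state_t step(pc_state_t s, int redirect, uint32_t target)
{
    assert(!redirect || (target & 3u) == 0);   /* assumes word-aligned targets */
    pc_state_t next;
    next.pc  = s.npc;                          /* the delay slot always runs */
    next.npc = redirect ? target : s.npc + 4u; /* the redirect lands after it */
    return next;
}
```

Starting from pc = 0x100 and npc = 0x104 with a taken branch to 0x200 at 0x100, the executed addresses are 0x100, 0x104 (the delay slot), and then 0x200.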

Memory Interface

Your processor will use memory-mapped IO to communicate with the different components of your system. Eventually these components will include a data cache, a serial interface, an Ethernet interface, and a graphics interface. For this checkpoint you are responsible for memory mapping a data memory (not a cache) and a serial interface. For the data and instruction memories you will use block RAM generated with Coregen. These block RAMs will store the entirety of the data and instruction memory, not just the working set as would be the case with caches. In checkpoint 3 these block RAMs will be replaced with caches that use block RAMs for the underlying cache storage.

Addresses                Read/Write  Function
0xFFFF0000 - 0xFFFF0000  Read        Receiver control
0xFFFF0004 - 0xFFFF0004  Read        Receiver data
0xFFFF0008 - 0xFFFF0008  Read        Transmitter control
0xFFFF000C - 0xFFFF000C  Write       Transmitter data

This means that when the processor executes an lw from address 0xFFFF0000, it should not read a word from memory; instead it should read a word whose lowest bit indicates whether the UART has received a byte. One way of encoding this would be {31'bx, UARTDataOutValid}. Likewise, when the CPU reads from address 0xFFFF0008, it should read a word whose lowest bit indicates the readiness of the UART for transmission; the Verilog {31'bx, UARTDataInReady} might encode this. The other two addresses are for actually transferring data between the UART and the CPU.
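
From the software side, the memory map above could be used roughly as in the following C sketch. The function names, the UART_BASE macro name, and the choice to pass the UART base as a pointer are assumptions made for illustration (the pointer lets the same code run against an ordinary array on a host machine); only the addresses and the meaning of bit 0 come from the table. The sketch also assumes the received byte sits in the low 8 bits of the data word. Because the upper 31 bits of the control words may be undefined, the code masks everything except bit 0.

```c
#include <stdint.h>

#define UART_BASE 0xFFFF0000u  /* from the memory map; meaningful only on the target */

/* Word indices relative to UART_BASE, in the order given by the memory map. */
enum { UART_RX_CTRL = 0, UART_RX_DATA = 1, UART_TX_CTRL = 2, UART_TX_DATA = 3 };

/* Hypothetical helper: wait until the transmitter is ready, then send one byte. */
static void uart_putc(volatile uint32_t *uart, uint8_t c)
{
    while ((uart[UART_TX_CTRL] & 1u) == 0)
        ;  /* spin: bit 0 clear means the transmitter is busy */
    uart[UART_TX_DATA] = c;
}

/* Hypothetical helper: wait until a byte has been received, then return it. */
static uint8_t uart_getc(volatile uint32_t *uart)
{
    while ((uart[UART_RX_CTRL] & 1u) == 0)
        ;  /* spin: bit 0 clear means nothing has arrived yet */
    return (uint8_t)(uart[UART_RX_DATA] & 0xFFu);
}
```

On the target, the base pointer would be (volatile uint32_t *)UART_BASE, for example uart_putc((volatile uint32_t *)UART_BASE, 'A').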

Before you proceed with acquiring the skeleton files, you must set up an SSH key to use with GitHub. If you have already done this, you may skip the following subsection.

Setting Up SSH Keys

GitHub authenticates you for access to your repository using SSH keys. You can generate an SSH key using the ssh-keygen command. Accepting the defaults should be fine.

Once you have generated a key, go to your GitHub account settings, then go to the SSH Public Keys menu. Here, click Add another public key. In this dialog, insert the contents of your public key file into the Key box. By default the public key file is ~/.ssh/id_rsa.pub. Make sure you don't accidentally copy the contents of the ~/.ssh/id_rsa file. This is your private key, which is equivalent to your password. After you paste your key in, give it a name and you are done.

Acquiring Skeleton Files, Setting Up Repository

The skeleton files for the project will be available through a git repository provided by the staff. The suggested way for initializing your repository with the skeleton files is as follows. First, set up your ssh keys as described above. Then,

% git clone git@github.com:CS150/skeleton.git
% cd skeleton
% git remote add my-repo git@github.com:CS150/teamX.git
% git push my-repo master

This will make a single commit to your repository with the base files. We suggest you then do the following:

% cd ..
% rm -rf skeleton
% git clone git@github.com:CS150/teamX.git
% cd teamX
% git remote add staff git@github.com:CS150/skeleton.git

This will delete the skeleton repository you cloned, clone your repository that now has a single commit, and add a remote repository named staff that points to the skeleton files to allow easy future merges of staff updates.

Toolchain

A GCC MIPS toolchain has been built and installed in the cs150 home directory. These binaries will run on any of the p380 machines in the 125 Cory lab. The most relevant pieces of the toolchain are given below,

The easiest way to use this toolchain will be to copy the example project in the software directory of the skeleton files. That might look something like the following,

% cd software
% cp -r example helloworld
% cd helloworld
% mv example.c helloworld.c
% mv example.ld helloworld.ld
% gedit Makefile

The only editing required in the Makefile is to change the TARGET variable to helloworld. You can then edit helloworld.c and run make to compile. C or assembly files added to the directory with the proper extension will be compiled automatically. There are a few things to be aware of in the example project,

Block RAM Generation

At some point you will need memory to complete checkpoint 1. While the memory inference discussed in lecture does work, it is not perfect: one thing it does not allow is the generation of byte-masked memories. To achieve this we use the Xilinx Core Generator (coregen). Most of this has already been done and is included in the skeleton files. To view the file that specifies the block RAM parameters, do the following:

% cd hardware/src/blk_mem_gen_v4_3
% gedit blk_mem_gen_v4_3.xco

Feel free to tweak whatever parameters you like, but the ones of interest are,

To actually build the block RAM, execute the following:

% ./build

Likewise, to clean up the generated files:

% ./clean