TA: Kevin Liston
Ida 2 is a 32-bit instruction set using a 24-bit address space. Ida 2 has a few key features. First, every instruction contains a slot for conditionally executing that instruction. Second, every instruction has a field that holds either a register or an immediate value; a single bit within the instruction selects between the two.
In Ida 2, all types of data are 32-bit words. Memory is word-addressed. Memory addresses are 24 bits each. Ida 2 features separated read-only instruction memory and read-write data memory. Only data memory can be loaded from or stored to, specifically using the MLD and MST instructions.
Ida 2 uses three primary instruction types. The type of an instruction is simply defined by the number of arguments it requires. Each type of instruction additionally encodes a secondary format: whether it has a register or an immediate as its final argument.
When the immediate-indicating bit (bit 24) is off, the final argument is a register: $RI. Note that all fields are in identical places for each of these three types. Furthermore, note that these formats contain unused slots.
TYPE | 31:28 | 27:25 | 24 | 23:20 | 19:16 | 15:4 | 3:0 |
1-ARG | OPCODE | ?CQ | 0 | - | - | - | $RI |
2-ARG | OPCODE | ?CQ | 0 | $RD | - | - | $RI |
3-ARG | OPCODE | ?CQ | 0 | $RD | $RS | - | $RI |
When the immediate-indicating bit (bit 24) is on, the final argument is an immediate value. Note that all other aspects of the format exactly match the register formats described previously. However, these formats now contain no unused slots.
TYPE | 31:28 | 27:25 | 24 | 23:20 | 19:16 | 15:0 |
1-ARG | OPCODE | ?CQ | 1 | IMM24[23:20] | IMM24[19:16] | IMM24[15:0] |
2-ARG | OPCODE | ?CQ | 1 | $RD | IMM20[19:16] | IMM20[15:0] |
3-ARG | OPCODE | ?CQ | 1 | $RD | $RS | IMM16 |
The reason for having multiple primary types is to use as many bits as possible for the immediate fields. In particular, any 1-ARG instruction can reference an entire 24-bit address using just its immediate field.
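As a rough illustration of the two formats above, the following Python sketch packs instruction fields into a 32-bit word. The function name `encode`, its parameters, and the choice to leave unused slots as zero are assumptions made for this example and are not specified by Ida 2. Opcodes and register numbers are passed as plain integers.

```python
# Immediate width for each primary instruction type (1-ARG, 2-ARG, 3-ARG).
IMM_WIDTH = {1: 24, 2: 20, 3: 16}

def encode(n_args, opcode, cq, rd=0, rs=0, ri=0, imm=None):
    """Pack one instruction word (hypothetical helper).

    Pass imm=None for the register format; otherwise imm is truncated
    to the immediate width of the given instruction type.
    """
    word = (opcode & 0xF) << 28 | (cq & 0x7) << 25
    if n_args >= 2:
        word |= (rd & 0xF) << 20          # $RD in bits 23:20
    if n_args == 3:
        word |= (rs & 0xF) << 16          # $RS in bits 19:16
    if imm is None:
        word |= ri & 0xF                  # bit 24 off, $RI in bits 3:0
    else:
        word |= 1 << 24                   # bit 24 on
        word |= imm & ((1 << IMM_WIDTH[n_args]) - 1)
    return word

# A 1-ARG instruction with opcode 15, query 7 and immediate 3.
print(hex(encode(1, 15, 7, imm=3)))                 # 0xff000003
# A 3-ARG instruction with opcode 9, registers 7 and 6, immediate -1.
print(hex(encode(3, 9, 7, rd=7, rs=6, imm=-1)))     # 0x9f76ffff
```

Because every field sits in the same place in all three types, the only per-type decision in this sketch is how many bits the immediate may occupy.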
The purpose of the ?CQ field in each instruction will be explained later in this manual.
In the rest of this document, the following terms refer either to an instruction's $RI register or to its sign-extended immediate value, depending on whether the immediate bit (described previously) is off or on.
TERM | REGISTER | IMMEDIATE |
RI16 | $RI | SignExt32←16(IMM16) |
RI20 | $RI | SignExt32←20(IMM20) |
RI24 | $RI | SignExt32←24(IMM24) |
It is worth underscoring that all immediate values are always sign-extended into 32-bit values, for every instruction.
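The sign-extension step can be sketched in a few lines of Python. The helper name `sign_extend` is an assumption for illustration; it models SignExt32←N for N = 16, 20 or 24 and returns the result as an unsigned 32-bit pattern.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 32-bit word."""
    value &= (1 << bits) - 1              # keep only the immediate field
    if value & (1 << (bits - 1)):         # top bit of the field is set
        value -= 1 << bits                # interpret the field as negative
    return value & 0xFFFFFFFF             # wrap back into 32 bits

print(hex(sign_extend(0x7D76, 16)))       # 0x7d76
print(hex(sign_extend(0xFFFF, 16)))       # 0xffffffff
print(hex(sign_extend(0x9BEEF, 20)))      # 0xfff9beef
```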
INSTRUCTION | OPCODE | TYPE | SYNTAX | RTL OPERATION |
Shift Left Logical | 00 | 3-ARG | SHL ?CQ $RD $RS RI16 | $RD ← $RS << RI16[4:0] |
Shift Right Logical | 01 | 3-ARG | SHR ?CQ $RD $RS RI16 | $RD ← $RS >> RI16[4:0] |
And | 02 | 3-ARG | AND ?CQ $RD $RS RI16 | $RD ← $RS & RI16 |
Inclusive Or | 03 | 3-ARG | IOR ?CQ $RD $RS RI16 | $RD ← $RS | RI16 |
Exclusive Or | 04 | 3-ARG | XOR ?CQ $RD $RS RI16 | $RD ← $RS ^ RI16 |
Set Upper | 05 | 3-ARG | STU ?CQ $RD $RS RI16 | $RD ← {RI16[15:0], $RS[15:0]} |
Multiplication | 06 | 3-ARG | MUL ?CQ $RD $RS RI16 | $RD ← $RS * RI16 |
Signed Division | 07 | 3-ARG | DIV ?CQ $RD $RS RI16 | $RD ← $RS / RI16 |
Subtraction | 08 | 3-ARG | SUB ?CQ $RD RI16 $RS | $RD ← RI16 - $RS |
Addition | 09 | 3-ARG | ADD ?CQ $RD $RS RI16 | $RD ← $RS + RI16 |
Memory Load | 10 | 3-ARG | MLD ?CQ $RD RI16 ($RS) | $RD ← MEM(($RS + RI16)[23:0]) |
Memory Store | 11 | 3-ARG | MST ?CQ $RD RI16 ($RS) | MEM(($RS + RI16)[23:0]) ← $RD |
Set Lower | 12 | 2-ARG | STL ?CQ $RD RI20 | $RD ← RI20 |
Signed Comparison | 13 | 2-ARG | CMP ?CQ $RD RI20 | $CR ← $RD ? RI20 |
Link To Offset From $PC | 14 | 2-ARG | LNK ?CQ $RD RI20 | $RD ← SignExt32←24($PC) + RI20 |
Jump To Address | 15 | 1-ARG | JMP ?CQ RI24 | $PC ← RI24[23:0] |
After any instruction executes, the program counter $PC advances by one (recall that the 24-bit memory addresses are word addresses). The only exception to this rule is JMP, in which the program counter is instead set manually.
Every instruction that performs arithmetic operations (ignoring the bitwise operations) performs a signed operation. This primarily affects DIV and CMP. Addition, subtraction, and multiplication are identical for signed and unsigned numbers in this 32-bit format. In addition to all operations being signed, recall that all immediates are sign-extended before being used.
Notice that SUB is computed in an unusual backwards way. This is mainly due to the absence of an always-zero register (described in a later section). By subtracting a register from a zero immediate, negation of a register is still possible. Subtracting two registers is still possible. Finally, subtracting an immediate from a register is still technically possible, by adding a negative immediate with ADD. In other words, this backwards format maintains all the functionality of a regular subtraction instruction, without needing an always-zero register.
    # SUBTRACTION FORMS
    SUB $t1 2 $t0      # T1 == 2 - T0
    SUB $t1 0 $t0      # T1 == - T0
    SUB $t2 $t3 $t0    # T2 == T3 - T0
    ADD $t3 $t1 -1     # T3 == T1 - 1
In the case that a DIV instruction is attempted with a zero divisor, the instruction avoids errors by instead using a divisor of 1 (in other words, $RD is set to $RS without performing any division). Division by zero within the ALU signals an error.
    # DIVISION BY ZERO
    MOV $t0 2          # T0 == 2
    DIV $t1 $t0 0      # T1 == T0
In the case that the most negative 32-bit number is divided by -1, the result is the most negative 32-bit number. In general, divisions should round towards 0.
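The division rules above (zero divisor replaced by 1, rounding towards 0, and the most negative number divided by -1) can be modeled with the following Python sketch. The names `to_signed` and `ida_div` are hypothetical, and registers are represented as unsigned 32-bit integers.

```python
def to_signed(word):
    """Interpret a 32-bit pattern as a signed two's-complement value."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

def ida_div(rs, ri):
    """Model of $RD <- $RS / RI16 following the rules described above."""
    a, b = to_signed(rs), to_signed(ri)
    if b == 0:
        b = 1                              # zero divisor behaves like 1
    q = abs(a) // abs(b)                   # magnitude, rounded towards 0
    if (a < 0) != (b < 0):
        q = -q                             # apply the sign of the result
    return q & 0xFFFFFFFF                  # wrap into 32 bits

print(ida_div(2, 0))                           # 2
print(hex(ida_div(-7 & 0xFFFFFFFF, 2)))        # 0xfffffffd  (-3)
print(hex(ida_div(0x80000000, 0xFFFFFFFF)))    # 0x80000000
```

Python's `//` rounds towards negative infinity, which is why the sketch divides magnitudes and restores the sign afterwards.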
All cases of overflow are ignored. Overflow in addition or subtraction operations within the ALU signals an error.
It should be noted that STL and STU can be used together in order to set the entire contents of a 32-bit register. However, STL (the more useful of the two instructions) can set more than half the value of a register on its own.
    # SET T8 TO 0x82347D76
    STL $t8 0x7D76         # T8 == 0x00007D76
    STU $t8 $t8 0x8234     # T8 == 0x82347D76

    # SET T8 TO 0xDEADBEEF
    STL $t8 0x9BEEF        # T8 == 0xFFF9BEEF (remember sign-extension)
    STU $t8 $t8 0xDEAD     # T8 == 0xDEADBEEF (upper bits were cleared, then replaced)
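To check the register values quoted in the comments above, here is a small Python model of the two instructions in their immediate forms. The function names `stl` and `stu` are assumptions for this sketch; they follow the RTL given in the instruction table.

```python
def stl(imm20):
    """$RD <- sign-extended 20-bit immediate."""
    value = imm20 & 0xFFFFF
    if value & 0x80000:                    # bit 19 set: extend with ones
        value |= 0xFFF00000
    return value

def stu(rs, imm16):
    """$RD <- {imm16[15:0], $RS[15:0]}."""
    return ((imm16 & 0xFFFF) << 16) | (rs & 0xFFFF)

t8 = stl(0x9BEEF)
print(hex(t8))                             # 0xfff9beef
t8 = stu(t8, 0xDEAD)
print(hex(t8))                             # 0xdeadbeef
```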
Every instruction can be conditionally executed depending on the current comparison query. This is done by checking the most recent signed comparison stored in the comparison register $CR. Queries are tested by using bit masks. The following table lists every bit mask combination.
NUMBER | BITS | NAME | COMPARISON |
?0 | 000 | ?NO | FALSE |
?1 | 001 | ?GT | A > B |
?2 | 010 | ?EQ | A == B |
?3 | 011 | ?GE | A >= B |
?4 | 100 | ?LT | A < B |
?5 | 101 | ?NE | A != B |
?6 | 110 | ?LE | A <= B |
?7 | 111 | ?OK | TRUE |
Using this scheme, each of the three bits has a specific meaning. The most significant bit signifies less than, the middle bit signifies equal to, and the least significant bit signifies greater than.
When a signed comparison occurs, the value of $CR is updated to one of: ?LT, ?EQ, or ?GT. This register then remembers this result value until a new comparison occurs, at which point it is overwritten.
Each instruction contains a comparison query from this table. If not specified, an instruction will assume the unconditional query of ?OK, meaning that an instruction will execute regardless of any previous comparisons. Before executing an instruction, the field ?CQ of the instruction is checked against the contents of $CR to determine if the instruction should execute. For example, in the following code $t0 is set to -1 if it is greater than or equal to $t1, otherwise it is set to 0.
    CMP $t0 $t1
    MOV ?GE $t0 -1
    MOV ?LT $t0 0
Before any comparisons have occurred (such as at the start of a program) the value of $CR is ?OK. This means that every comparison query except for ?NO will automatically pass before the first comparison occurs. Notably, after any comparison occurs, $CR will never take on the values: ?OK, ?NO, ?GE, ?LE, or ?NE. Phrased another way, $CR starts at a special value of ?OK, and after any comparison, it will only ever change to one of: ?LT, ?EQ, or ?GT.
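One way to read the bit-mask test is that an instruction executes when its ?CQ field and $CR share at least one set bit. The manual does not spell out the exact test, so the bitwise AND below is an assumption, although it agrees with the table and with the ?OK start value. The names `compare` and `should_execute` are hypothetical.

```python
# Query encodings from the table: ?NO=0, ?GT=1, ?EQ=2, ..., ?OK=7.
NO, GT, EQ, GE, LT, NE, LE, OK = range(8)

def compare(a, b):
    """Value a signed comparison of a with b would store in $CR."""
    if a < b:
        return LT
    if a == b:
        return EQ
    return GT

def should_execute(cq, cr):
    """Assumed test: the query passes if it shares a set bit with $CR."""
    return (cq & cr) != 0

cr = compare(5, 3)                         # 5 > 3, so $CR becomes ?GT
print(should_execute(GE, cr))              # True
print(should_execute(LT, cr))              # False
print(should_execute(NE, OK))              # True (before any comparison)
```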
Ida 2 supports 16 general-purpose registers, usable in any register field of a given instruction, and uses 2 special-purpose registers that are indirectly accessible. Ida 2 has been designed to function without the need for an always-zero register.
NUMBER | NAME | BITS | PURPOSE |
$00 | $rv | 32 | Return Value |
$01 | $ra | 32 | Return Address |
$02 | $a0 | 32 | Argument 0 |
$03 | $a1 | 32 | Argument 1 |
$04 | $a2 | 32 | Argument 2 |
$05 | $a3 | 32 | Argument 3 |
$06 | $t0 | 32 | Temporary 0 |
$07 | $t1 | 32 | Temporary 1 |
$08 | $t2 | 32 | Temporary 2 |
$09 | $t3 | 32 | Temporary 3 |
$10 | $t4 | 32 | Temporary 4 |
$11 | $t5 | 32 | Temporary 5 |
$12 | $t6 | 32 | Temporary 6 |
$13 | $t7 | 32 | Temporary 7 |
$14 | $t8 | 32 | Temporary 8 |
$15 | $sp | 32 | Stack Pointer |
- | $PC | 24 | Program Counter |
- | $CR | 3 | Comparison Result |
Ida 2 is entirely case-insensitive. Spaces, tabs, commas, and parentheses can be used to delimit instruction arguments. Comments begin with the number sign (#) and continue until the end of their line. Number constants can be in binary (0b...), hex (0x...), and decimal. All constants can be positive or negative. Numeric comparison queries (?4) and registers ($06) are supported as well.
The final feature of the syntax is labels. Labels directly reference 24-bit addresses. They can be used to refer to any line, and they can be referenced inside the immediate field of some of the instructions, where they must begin with an at symbol (@). In general, labels can include any character that is not an argument delimiter or comment character.
In most cases, labels evaluate to their address value. For example, consider the following piece of code, in which the label is swapped out for its absolute address by the assembler. Recall that this is possible because JMP has a 24-bit immediate field. All 1-ARG instructions treat labels as being their absolute address values.
    0x000001:       JMP @GOTO      # == 0x000003
    0x000002:       MOV $t0 0
    0x000003: GOTO: MOV $t1 0
Now consider the case of LNK. The assembler replaces the label with a relative address. All 2-ARG instructions treat labels as being their relative address values.
    0x000001:       LNK $ra @GOTO  # == 2
    0x000002:       MOV $t0 0
    0x000003: GOTO: MOV $t1 0
Note that 3-ARG instructions cannot reference labels at all.
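The label rules for the three instruction types can be summarized in a short Python sketch. The function `resolve_label` is hypothetical. The relative value is taken to be the label address minus the address of the referencing instruction, which matches the LNK example (0x000003 - 0x000001 = 2) but is otherwise an assumption.

```python
def resolve_label(n_args, instr_addr, label_addr):
    """Value an assembler would substitute for a label reference."""
    if n_args == 1:
        return label_addr & 0xFFFFFF       # absolute 24-bit address
    if n_args == 2:
        return label_addr - instr_addr     # offset from this instruction
    raise ValueError("3-ARG instructions cannot reference labels")

print(resolve_label(1, 0x000001, 0x000003))   # 3
print(resolve_label(2, 0x000001, 0x000003))   # 2
```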
The standard behavior for the Ida 2 assembler is to abort immediately if any value is too wide to fit into its slot within its instruction.
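The manual does not define precisely when a value is "too wide". The examples in this document use both negative constants (such as -1 in a 16-bit slot) and unsigned-looking constants (such as 0x9BEEF in a 20-bit slot), so the Python sketch below assumes a value fits if it lies in either the signed or the unsigned range of the slot. The function name `fits` and this acceptance rule are assumptions, not the assembler's documented behavior.

```python
def fits(value, bits):
    """Assumed width check: accept the signed or the unsigned range."""
    return -(1 << (bits - 1)) <= value < (1 << bits)

print(fits(0x9BEEF, 20))                   # True
print(fits(-1, 16))                        # True
print(fits(0x100000, 20))                  # False
```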
A few pseudo-instructions are supported, most importantly the jump-and-link instruction JAL. Every pseudo-instruction works by directly mapping to a set of true instructions. Thus every pseudo-instruction can be replaced with real instructions (the only exceptions involve using SLI or SUI with labels). In the following table, the new term IMM32 has been introduced, representing any 32-bit immediate value. Labels can be used for this argument, in all cases being treated as their absolute address values.
INSTRUCTION | GENERATES | NOTE |
NOP | | No-op |
END ?CQ | SELF: | Halt program (using an infinite loop) |
JAL ?CQ RI24 | | Jump and link |
RTN ?CQ RI20 | | Return value |
NIL ?CQ $RI | | Clear register |
MOV ?CQ $RD RI20 | | Copy value (alias for STL) |
SLI ?CQ $RD IMM32 | STL ?CQ $RD IMM32[15:0] | Set low from immediate slice |
SUI ?CQ $RD $RS IMM32 | STU ?CQ $RD $RS IMM32[31:16] | Set high from immediate slice |
STI ?CQ $RD IMM32 | | Set immediate |
PSH ?CQ $RD | | Push to stack |
TOP ?CQ $RD | MLD ?CQ $RD 0 ($sp) | Top of stack |
POP ?CQ $RD | | Pop from stack |
The following are some small sample programs to demonstrate the usage of Ida 2. The first sample program computes the nth Fibonacci number.
         MOV $a0 9           # compute the nth fibonacci number
         JAL @FIB
         END

    # COMPUTE THE A0th FIBONACCI NUMBER,
    # ASSUMING A0 IS >= 0
    FIB: CMP $a0 1           # is A0 in a base case?
         JMP ?GT @REC        # if not, then skip
         MOV ?LE $rv $a0     # if A0 <= 1, return A0
         JMP $ra
    REC: ADD $sp $sp -2      # allocate 2 slots on stack
         MST $ra 0 ($sp)     # backup RA
         MST $a0 1 ($sp)     # backup A0
         ADD $a0 $a0 -1      # setup recursive call
         JAL @FIB
         MLD $a0 1 ($sp)     # restore A0
         MST $rv 1 ($sp)     # backup RV
         ADD $a0 $a0 -2      # setup recursive call
         JAL @FIB
         MLD $t0 1 ($sp)     # get 1st RV
         ADD $rv $rv $t0     # add together the 2 RVs
         MLD $ra 0 ($sp)     # restore RA
         ADD $sp $sp 2       # clean up the stack
         JMP $ra             # return
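For cross-checking a simulator run, the same recursion can be written in Python. This is only a reference model of what the program's comments describe, not a translation of the stack handling; the name `fib` mirrors the FIB label.

```python
def fib(n):
    """Same recursion as the FIB routine: base case for n <= 1."""
    if n <= 1:
        return n                           # "if A0 <= 1, return A0"
    return fib(n - 1) + fib(n - 2)         # the two recursive calls

print(fib(9))                              # 34
```

If the assembly behaves as its comments state, $rv should hold this value when the program reaches END with $a0 initially set to 9.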
The second sample program performs a binary search on the first 1000 words of data memory, which are assumed to be sorted.
         MOV $a0 0           # array is here
         MOV $a2 0           # from index 0
         MOV $a3 1000        # to 1000
         MOV $a1 0xab5a      # search for this number
         JAL @BIN
         END

    # PERFORM BINARY SEARCH ON ARRAY A0,
    # TO FIND INDEX OF KEY A1, LOCATED
    # BETWEEN INDEX A2 AND A3
    BIN: MOV $rv -1
         CMP $a3 $a2         # max ? min
         JMP ?LT $ra         # if max < min: return -1
         SUB $t8 $a3 $a2     # mid = min+((max-min)/2)
         SHR $t8 $t8 1
         ADD $t8 $a2 $t8
         MLD $t0 $t8 ($a0)   # tmp = array[mid]
         CMP $t0 $a1         # tmp ? key
         ADD ?LT $a2 $t8 1   # if tmp < key: min = mid+1
         ADD ?GT $a3 $t8 -1  # if tmp > key: max = mid-1
         JMP ?NE @BIN        # if tmp != key: next iteration
         MOV $rv $t8         # return mid
         JMP $ra
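The loop in BIN can be mirrored step by step in Python, which may help when checking a simulator against expected results. The function name `bin_search` and the use of a Python list in place of data memory are assumptions for this sketch.

```python
def bin_search(mem, key, lo, hi):
    """Mirror of the BIN routine: returns the index of key, or -1."""
    while True:
        if hi < lo:                        # CMP $a3 $a2 / JMP ?LT $ra
            return -1
        mid = lo + ((hi - lo) >> 1)        # SUB, SHR, ADD
        tmp = mem[mid]                     # MLD $t0 $t8 ($a0)
        if tmp < key:
            lo = mid + 1                   # ADD ?LT $a2 $t8 1
        elif tmp > key:
            hi = mid - 1                   # ADD ?GT $a3 $t8 -1
        else:
            return mid                     # MOV $rv $t8

data = list(range(0, 2002, 2))             # sorted values at indices 0..1000
print(bin_search(data, 10, 0, 1000))       # 5
print(bin_search(data, 11, 0, 1000))       # -1
```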