Ida Assembly is a hypothetical instruction set with a few key features. First, every instruction has a field that holds either a register or an immediate value, and a single bit within the instruction selects which of the two is used. Second, all jump instructions use truly absolute addressing (there are no relative branches).
There are a few special notations outlined here for reference.
NOTATION | MEANING | EXAMPLE |
{X, Y} | Concatenate X and Y together. | {10, 11, 011} == 1011011 |
(X)[B:A] | Slice bits A through B (inclusive) out of X. | (1100110111)[4:0] == 10111 |
SignExtendM←N (X) | Sign-extend X from N bits to M bits. | SignExtend8←4 (1001) == 11111001 |
MEM ((N)[24:0]) | The half-word quantity from data memory at 25-bit address N. |
Further note that bits are zero-indexed from least significant bit to most significant bit. For example, given a 32-bit value X, X[31:0] == X == {X[31:20], X[19:5], X[4:0]}.
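The notations above can be sketched in a few lines of Python. This is only an illustration, assuming values are written as bit strings with the most significant bit first; the helper names `concat`, `bit_slice`, and `sign_extend` are hypothetical and not part of Ida Assembly.

```python
def concat(*parts):
    """{X, Y, ...}: join bit strings, most significant part first."""
    return "".join(parts)


def bit_slice(x, b, a):
    """(X)[B:A]: bits A through B (inclusive) of X, where bit 0 is the LSB."""
    n = len(x)
    return x[n - 1 - b : n - a]


def sign_extend(x, m):
    """SignExtendM<-N (X): widen the N-bit string X to M bits by copying its top bit."""
    return x[0] * (m - len(x)) + x


# The examples from the notation table.
print(concat("10", "11", "011"))        # 1011011
print(bit_slice("1100110111", 4, 0))    # 10111
print(sign_extend("1001", 8))           # 11111001
```

The identity X == {X[31:20], X[19:5], X[4:0]} can be checked the same way on any 32-bit string.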
In Ida Assembly, instruction memory and data memory are separate and have different attributes.
Instruction memory is composed of 32-bit word values, where each word is addressed by a 24-bit address. This means that in order to go from the current instruction to the next instruction, the instruction address is incremented by 1. Here is an example diagram of instruction memory:
ADDRESS | INSTRUCTION |
0x000000 | 0x39c00009 |
0x000001 | 0x68840004 |
... | ... |
0xfffffe | 0x784e0001 |
0xffffff | 0xc9c40002 |
Data memory is composed of 16-bit half-word values, where each half word is addressed by a 25-bit address. This means that in order to allocate space on the stack for a word value, the stack pointer must be decremented by 2. Here is an example diagram of data memory:
ADDRESS | DATA |
0x0000000 | 0x48fc |
0x0000001 | 0x5514 |
... | ... |
0x1fffffe | 0x67f9 |
0x1ffffff | 0x6a9e |
Ida Assembly supports 32 general-purpose registers, usable in any register field of a given instruction. Additionally, there is the program counter PC, which contains the address of the instruction being fetched and executed.
NUMBER | NAME | BITS | PURPOSE |
$0 | $zero | 32 | Always Zero |
$1 | $jc | 32 | Jump Condition |
$2 | $sp | 32 | Stack Pointer |
$3 | $fp | 32 | Frame Pointer |
$4 - $5 | $v0 - $v1 | 32 | Return Values |
$6 | $ra | 32 | Return Address |
$7 - $11 | $a0 - $a4 | 32 | Arguments |
$12 - $31 | $t0 - $t19 | 32 | Temporaries |
- | PC | 24 | Program Counter |
The PC, $sp, $fp, and $zero registers always start at an initial value of zero. For all other registers, the initial values are undefined.
The R-Type format is the primary instruction format. Note that the field I (bit 27) is a toggle bit that indicates whether to use $RI (I=0) or IMM (I=1) in executing the instruction.
TYPE | 31:28 | 27 | 26:22 | 21:17 | 16:5 | 4:0 |
R-Type | OP | I=0 | $RD | $RS | - | $RI |
R-Type | OP | I=1 | $RD | $RS | IMM (spans bits 16:0) |
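As a rough sketch of how the R-Type fields could be pulled out of a 32-bit instruction word, the following Python function follows the bit positions in the table above. The function name `decode_r_type` and the dictionary layout are assumptions made for illustration.

```python
def decode_r_type(word):
    """Split a 32-bit R-Type word into its fields (bit positions from the table)."""
    return {
        "op": (word >> 28) & 0xF,    # bits 31:28
        "i": (word >> 27) & 0x1,     # bit 27: 0 selects $RI, 1 selects IMM
        "rd": (word >> 22) & 0x1F,   # bits 26:22
        "rs": (word >> 17) & 0x1F,   # bits 21:17
        "ri": word & 0x1F,           # bits 4:0, meaningful when i == 0
        "imm": word & 0x1FFFF,       # bits 16:0, meaningful when i == 1
    }


fields = decode_r_type(0x39C00009)
print(fields["op"], fields["i"], fields["rd"], fields["rs"], fields["imm"])  # 3 1 7 0 9
```

Read this way, the sample word 0x39c00009 from the instruction memory diagram would correspond to OR $a0 $zero 9.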
The J-Type format is the instruction format for jumps. Note again that the field I (bit 27) indicates whether to use $RI (I=0) or ADDR (I=1) in executing the jump. Additionally, there is a jump code field JC, which indicates which type of jump is performed. The different types are listed later.
TYPE | 31:28 | 27 | 26:24 | 23:5 | 4:0 |
J-Type | OP | I=0 | JC | - | $RI |
J-Type | OP | I=1 | JC | ADDR (spans bits 23:0) |
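A J-Type word could be assembled from its fields along the following lines. This is a minimal sketch: the name `encode_j_type` is hypothetical, the opcode value 15 is taken from the jump instruction listings, and writing zeros into the unused bits (23:5 when I=0) is an assumption.

```python
def encode_j_type(jc, target, use_addr=True):
    """Build a 32-bit J-Type word; target is ADDR (I=1) or a register number (I=0)."""
    if not 0 <= jc <= 7:
        raise ValueError("JC must fit in 3 bits")
    if use_addr:
        if not 0 <= target < (1 << 24):
            raise ValueError("ADDR must fit in 24 bits")
        return (15 << 28) | (1 << 27) | (jc << 24) | target
    if not 0 <= target < 32:
        raise ValueError("register number must fit in 5 bits")
    # I=0: unused bits 23:5 are left as zero (assumption).
    return (15 << 28) | (jc << 24) | target


print(hex(encode_j_type(6, 0x000223)))           # 0xfe000223, a JGE to address 0x000223
print(hex(encode_j_type(0, 6, use_addr=False)))  # 0xf0000006, a J through register $6
```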
OP | SYNTAXES | RTL OPERATIONS |
00 | SL $RD $RS $RI | $RD ← $RS << $RI[4:0] |
00 | SL $RD $RS IMM | $RD ← $RS << IMM[4:0] |
This shift should smear the least significant bit of the value being shifted. For example, here are some shifts (using 16-bit values instead of 32-bit values):
0b0000000000001101 << 3 = 0b0000000001101111
0b0000000000011010 << 2 = 0b0000000001101000
Note that this is not a logical left shift, and is instead the left-shifting equivalent of an arithmetic right shift.
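The smearing left shift can be modeled as an ordinary left shift whose vacated low bits are filled with copies of the original least significant bit. The sketch below assumes this reading; the function name and the `bits` parameter (used to reproduce the 16-bit examples) are illustrative.

```python
def shift_left_smear(value, amount, bits=32):
    """Left shift that fills the vacated low bits with the original LSB."""
    mask = (1 << bits) - 1
    amount &= 0x1F  # only the low 5 bits of the shift amount are used
    fill = (1 << amount) - 1 if value & 1 else 0
    return ((value << amount) | fill) & mask


print(format(shift_left_smear(0b0000000000001101, 3, bits=16), "016b"))  # 0000000001101111
print(format(shift_left_smear(0b0000000000011010, 2, bits=16), "016b"))  # 0000000001101000
```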
OP | SYNTAXES | RTL OPERATIONS |
01 | SR $RD $RS $RI | $RD ← $RS >> $RI[4:0] |
01 | SR $RD $RS IMM | $RD ← $RS >> IMM[4:0] |
This shift should smear the most significant bit of the value being shifted. For example, here are some shifts (using 16-bit values instead of 32-bit values):
0b1111111100011101 >> 4 = 0b1111111111110001
0b0000000000011010 >> 2 = 0b0000000000000110
Note that this is an arithmetic right shift.
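For comparison, an arithmetic right shift on fixed-width values might be sketched as follows. The function name and the `bits` parameter are assumptions for illustration; the value is first reinterpreted as a signed integer so that Python's `>>` replicates the sign bit.

```python
def shift_right_arithmetic(value, amount, bits=32):
    """Right shift that fills the vacated high bits with the original MSB."""
    mask = (1 << bits) - 1
    amount &= 0x1F  # only the low 5 bits of the shift amount are used
    value &= mask
    if value >> (bits - 1):      # sign bit set: reinterpret as a negative number
        value -= 1 << bits
    return (value >> amount) & mask


print(format(shift_right_arithmetic(0b1111111100011101, 4, bits=16), "016b"))  # 1111111111110001
print(format(shift_right_arithmetic(0b0000000000011010, 2, bits=16), "016b"))  # 0000000000000110
```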
OP | SYNTAXES | RTL OPERATIONS |
02 | AND $RD $RS $RI | $RD ← $RS & $RI |
02 | AND $RD $RS IMM | $RD ← $RS & SignExtend32←17 (IMM) |
OP | SYNTAXES | RTL OPERATIONS |
03 | OR $RD $RS $RI | $RD ← $RS | $RI |
03 | OR $RD $RS IMM | $RD ← $RS | SignExtend32←17 (IMM) |
OP | SYNTAXES | RTL OPERATIONS |
04 | REV $RD $RS $RI | $RD ← $RS <> $RI[4:0] |
04 | REV $RD $RS IMM | $RD ← $RS <> IMM[4:0] |
Reverse swaps pairs of bits, given a 5-bit reverse pattern. Specifically, if the nth bit of the pattern is 1, then bit pairs of size n will be swapped. Here is an example of what this means, given an 8-bit number (with bits as ABCDEFGH) and a 3-bit reverse pattern:
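One plausible reading of the reverse operation is a generalized bit reversal: when bit k of the pattern is set, every pair of adjacent 2^k-bit blocks trades places. The sketch below implements that interpretation only; it is an assumption about the intended semantics, and the function name and `bits` parameter are hypothetical.

```python
def reverse_bits(value, pattern, bits=32):
    """Generalized bit reversal (assumed semantics): pattern bit k swaps
    adjacent blocks of 2**k bits."""
    stage = 0
    block = 1
    while block < bits:
        if (pattern >> stage) & 1:
            # Mask selecting the lower block of every pair of adjacent blocks.
            low = sum(1 << i for i in range(bits) if (i // block) % 2 == 0)
            value = ((value & low) << block) | ((value >> block) & low)
        stage += 1
        block <<= 1
    return value


# 8-bit value with a 3-bit pattern.
print(format(reverse_bits(0b10110000, 0b001, bits=8), "08b"))  # 01110000 (neighbouring bits swapped)
print(format(reverse_bits(0b10110000, 0b100, bits=8), "08b"))  # 00001011 (nibbles swapped)
print(format(reverse_bits(0b10110000, 0b111, bits=8), "08b"))  # 00001101 (full reversal)
```

Under this reading, a pattern with all bits set reverses the whole value, and a pattern of zero leaves it unchanged.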
OP | SYNTAXES | RTL OPERATIONS |
05 | XOR $RD $RS $RI | $RD ← $RS ^ $RI |
05 | XOR $RD $RS IMM | $RD ← $RS ^ SignExtend32←17 (IMM) |
OP | SYNTAXES | RTL OPERATIONS |
06 | ADD $RD $RS $RI | $RD ← $RS + $RI |
06 | ADD $RD $RS IMM | $RD ← $RS + SignExtend32←17 (IMM) |
Overflow should be ignored, and should require no special consideration.
OP | SYNTAXES | RTL OPERATIONS |
07 | SUB $RD $RS $RI | $RD ← $RS - $RI |
07 | SUB $RD $RS IMM | $RD ← $RS - SignExtend32←17 (IMM) |
Overflow should be ignored, and should require no special consideration.
OP | SYNTAXES | RTL OPERATIONS |
08 | MUL $RD $RS $RI | $RD ← $RS * $RI |
08 | MUL $RD $RS IMM | $RD ← $RS * SignExtend32←17 (IMM) |
Note that this only keeps the lower bits of the multiply result, meaning overflow should be ignored, and should require no special consideration.
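Ignoring overflow in ADD, SUB, and MUL amounts to keeping only the low 32 bits of the exact result. A minimal sketch, assuming register values are held as unsigned 32-bit integers (the helper names are illustrative):

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """ADD: sum truncated to 32 bits."""
    return (a + b) & MASK32


def sub32(a, b):
    """SUB: difference truncated to 32 bits."""
    return (a - b) & MASK32


def mul32(a, b):
    """MUL: product truncated to its lower 32 bits."""
    return (a * b) & MASK32


print(hex(add32(0xFFFFFFFF, 1)))         # 0x0
print(hex(sub32(0, 1)))                  # 0xffffffff
print(hex(mul32(0x00010000, 0x00010000)))  # 0x0
```

Because the truncation is the same for signed and unsigned interpretations, no separate signed variants are needed for these three operations.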
OP | SYNTAXES | RTL OPERATIONS |
09 | DIV $RD $RS $RI | if ($RI != 0) $RD ← $RS / $RI; else if ($RS > 0) $RD ← MAX_VALUE; else if ($RS < 0) $RD ← MIN_VALUE; else $RD ← 0 |
09 | DIV $RD $RS IMM | if (SignExtend32←17 (IMM) != 0) $RD ← $RS / SignExtend32←17 (IMM); else if ($RS > 0) $RD ← MAX_VALUE; else if ($RS < 0) $RD ← MIN_VALUE; else $RD ← 0 |
Division is performed as a regular signed division. If the divisor is zero, then the result should be the maximum positive 32-bit 2's complement value if the dividend is positive and the minimum negative 32-bit 2's complement value if the dividend is negative. This mimics rounding to positive or negative infinity when dividing by zero. Finally, zero divided by zero should be zero.
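The division rules can be sketched as below. Two points are assumptions rather than stated behavior: the quotient is truncated toward zero (as in C), and the result is wrapped to 32 bits in the one overflowing case (MIN_VALUE divided by -1). The helper names are illustrative.

```python
MASK32 = 0xFFFFFFFF
MAX_VALUE = 0x7FFFFFFF  # largest positive 32-bit two's complement value
MIN_VALUE = 0x80000000  # most negative 32-bit two's complement value


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def div32(rs, ri):
    """Signed division with the divide-by-zero rules described above."""
    a, b = to_signed(rs), to_signed(ri)
    if b == 0:
        if a > 0:
            return MAX_VALUE
        if a < 0:
            return MIN_VALUE
        return 0
    quotient = abs(a) // abs(b)      # truncate toward zero (assumption)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return quotient & MASK32


print(hex(div32(7, 0)))            # 0x7fffffff
print(hex(div32(0xFFFFFFF9, 0)))   # 0x80000000 (dividend is -7)
print(hex(div32(0, 0)))            # 0x0
```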
OP | SYNTAXES | RTL OPERATIONS |
10 | LH $RD $RI ($RS) | $RD ← SignExtend32←16 (MEM (($RS + $RI)[24:0])) |
10 | LH $RD IMM ($RS) | $RD ← SignExtend32←16 (MEM (($RS + SignExtend32←17 (IMM))[24:0])) |
OP | SYNTAXES | RTL OPERATIONS |
11 | SH $RD $RI ($RS) | MEM (($RS + $RI)[24:0]) ← $RD[15:0] |
11 | SH $RD IMM ($RS) | MEM (($RS + SignExtend32←17 (IMM))[24:0]) ← $RD[15:0] |
OP | SYNTAXES | RTL OPERATIONS |
12 | LW $RD $RI ($RS) | $RD[15:0] ← MEM (($RS + $RI)[24:0]); $RD[31:16] ← MEM (($RS + $RI + 1)[24:0]) |
12 | LW $RD IMM ($RS) | $RD[15:0] ← MEM (($RS + SignExtend32←17 (IMM))[24:0]); $RD[31:16] ← MEM (($RS + SignExtend32←17 (IMM) + 1)[24:0]) |
Note that load addresses do not have to be word-aligned. In fact, a word can be loaded from any valid half-word address, including the maximum half-word address (in this case, the second half of the word is loaded from the minimum half-word address).
OP | SYNTAXES | RTL OPERATIONS |
13 | SW $RD $RI ($RS) | MEM (($RS + $RI)[24:0]) ← $RD[15:0]; MEM (($RS + $RI + 1)[24:0]) ← $RD[31:16] |
13 | SW $RD IMM ($RS) | MEM (($RS + SignExtend32←17 (IMM))[24:0]) ← $RD[15:0]; MEM (($RS + SignExtend32←17 (IMM) + 1)[24:0]) ← $RD[31:16] |
Note that just like load addresses, store addresses do not have to be word-aligned. Again, even the maximum half-word address can be stored to (in this case, the second half of the word is stored to the minimum half-word address).
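The half-word and word accesses, including the wraparound at the top of the address space, might be modeled as follows. This is a sketch under stated assumptions: the class and method names are hypothetical, and unwritten cells read back as zero purely for convenience (the initial contents of data memory are not specified here).

```python
ADDR_MASK = 0x1FFFFFF  # 25-bit half-word addresses


class DataMemory:
    """Sparse model of the half-word-addressed data memory."""

    def __init__(self):
        self.cells = {}

    def load_half(self, addr):
        # Unwritten cells read as 0 (assumption).
        return self.cells.get(addr & ADDR_MASK, 0)

    def load_half_signed(self, addr):
        """LH: half-word sign-extended from 16 to 32 bits."""
        half = self.load_half(addr)
        return half | 0xFFFF0000 if half & 0x8000 else half

    def store_half(self, addr, value):
        """SH: store the low 16 bits of value."""
        self.cells[addr & ADDR_MASK] = value & 0xFFFF

    def load_word(self, addr):
        """LW: low half from addr, high half from addr + 1 (address wraps)."""
        return (self.load_half(addr + 1) << 16) | self.load_half(addr)

    def store_word(self, addr, value):
        """SW: low half to addr, high half to addr + 1 (address wraps)."""
        self.store_half(addr, value)
        self.store_half(addr + 1, value >> 16)


mem = DataMemory()
mem.store_word(0x1FFFFFF, 0x12345678)   # straddles the end of memory
print(hex(mem.load_half(0x1FFFFFF)))    # 0x5678
print(hex(mem.load_half(0x0000000)))    # 0x1234
print(hex(mem.load_word(0x1FFFFFF)))    # 0x12345678
```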
OP | SYNTAXES | RTL OPERATIONS |
14 | STU $RD $RS $RI | $RD ← {$RI[16:0], $RS[14:0]} |
14 | STU $RD $RS IMM | $RD ← {IMM, $RS[14:0]} |
Note that this operation combines the lower 15 bits of its first argument with the lower 17 bits of its second argument.
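In integer terms, STU places 17 bits above the low 15 bits of $RS. The sketch below uses a hypothetical helper `stu` and then shows one possible use: splitting an arbitrary 32-bit constant into a 15-bit lower part and a 17-bit upper part. Whether the instruction is intended for building constants this way is an assumption.

```python
def stu(rs, upper):
    """{upper[16:0], rs[14:0]}: 17 upper bits joined to the low 15 bits of rs."""
    return ((upper & 0x1FFFF) << 15) | (rs & 0x7FFF)


print(hex(stu(0x1234, 0x1FFFF)))  # 0xffff9234

# Splitting a 32-bit constant into the two pieces STU would recombine.
value = 0xDEADBEEF
low15 = value & 0x7FFF
high17 = value >> 15
print(stu(low15, high17) == value)  # True
```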
Conditional jump instructions depend on the jump condition register $jc. These jumps are intended to follow a subtraction instruction, but they do not have to.
# jump if t0 >= t19
SUB $jc $t0 $t19    # jc = t0 - t19
JGE @GOTO           # (jc >= 0) <==> (t0 >= t19)
# jump if t0 != 0
OR $jc $t0 $zero    # jc = t0
JNE @GOTO           # (jc != 0) <==> (t0 != 0)
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 0 | J $RI | PC ← $RI[23:0] |
15 | 0 | J ADDR | PC ← ADDR |
Note that this is an unconditional jump.
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 1 | JGT $RI | if ($jc > 0) PC ← $RI[23:0] |
15 | 1 | JGT ADDR | if ($jc > 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 2 | JEQ $RI | if ($jc == 0) PC ← $RI[23:0] |
15 | 2 | JEQ ADDR | if ($jc == 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 3 | JLT $RI | if ($jc < 0) PC ← $RI[23:0] |
15 | 3 | JLT ADDR | if ($jc < 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 4 | JLE $RI | if ($jc <= 0) PC ← $RI[23:0] |
15 | 4 | JLE ADDR | if ($jc <= 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 5 | JNE $RI | if ($jc != 0) PC ← $RI[23:0] |
15 | 5 | JNE ADDR | if ($jc != 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 6 | JGE $RI | if ($jc >= 0) PC ← $RI[23:0] |
15 | 6 | JGE ADDR | if ($jc >= 0) PC ← ADDR |
OP | JC | SYNTAXES | RTL OPERATIONS |
15 | 7 | JAL $RI | tmp ← $RI[23:0]; $ra ← {00000000, PC + 1}; PC ← tmp |
15 | 7 | JAL ADDR | tmp ← ADDR; $ra ← {00000000, PC + 1}; PC ← tmp |
Note that this is an unconditional jump. Also, notice that the return address saved in $ra is the next program counter value, PC + 1, because instruction memory is word-addressed.
After any instruction executes, the program counter PC advances by 1 (recall that the 24-bit instruction memory addresses are word addresses). The only exceptions to this rule are the jump instructions, in which the program counter is instead set explicitly. If the program counter overflows, it should wrap around silently.
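The program counter rules and the jump codes can be combined into one small function. This is a sketch, not a definitive model: the names `next_pc` and `CONDITIONS` are hypothetical, it assumes that a conditional jump whose condition fails simply falls through to PC + 1, and it assumes the return address saved by JAL wraps to 24 bits like the PC itself.

```python
PC_MASK = 0xFFFFFF  # 24-bit instruction addresses

# Jump code -> condition on the signed value of $jc.
CONDITIONS = {
    0: lambda jc: True,      # J
    1: lambda jc: jc > 0,    # JGT
    2: lambda jc: jc == 0,   # JEQ
    3: lambda jc: jc < 0,    # JLT
    4: lambda jc: jc <= 0,   # JLE
    5: lambda jc: jc != 0,   # JNE
    6: lambda jc: jc >= 0,   # JGE
    7: lambda jc: True,      # JAL
}


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def next_pc(pc, jump_code=None, jc=0, target=0):
    """Return (new_pc, new_ra); new_ra is None unless the instruction is JAL."""
    fallthrough = (pc + 1) & PC_MASK          # wraps silently on overflow
    if jump_code is None:                     # not a jump instruction
        return fallthrough, None
    ra = fallthrough if jump_code == 7 else None
    if CONDITIONS[jump_code](to_signed(jc)):
        return target & PC_MASK, ra
    return fallthrough, ra                    # untaken jump falls through (assumption)


print(next_pc(0xFFFFFF))                        # (0, None)
print(next_pc(5, jump_code=6, jc=0, target=0x223))  # (547, None)
print(next_pc(5, jump_code=7, target=0x100))    # (256, 6)
```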
Ida Assembly is entirely case-insensitive. Spaces, tabs, commas, and parentheses can be used to delimit instruction arguments. Comments begin with a number sign (#) and continue until the end of their line. Numeric constants can be written in unsigned hex (such as 0xD12F), unsigned binary (such as 0b1010), signed decimal, or unsigned decimal.
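A line of source could be split into tokens and its constants parsed roughly as follows. The function names are hypothetical, and folding everything to upper case is just one way to honor case-insensitivity.

```python
import re


def tokenize(line):
    """Strip the comment, then split on spaces, tabs, commas, and parentheses."""
    code = line.split("#", 1)[0]
    return [tok.upper() for tok in re.split(r"[ \t,()]+", code) if tok]


def parse_constant(text):
    """Parse unsigned hex (0x...), unsigned binary (0b...), or decimal constants."""
    t = text.lower()
    if t.startswith("0x"):
        return int(t[2:], 16)
    if t.startswith("0b"):
        return int(t[2:], 2)
    return int(t, 10)   # accepts an optional leading sign


print(tokenize("lw $t0, 4($sp)  # load a word"))  # ['LW', '$T0', '4', '$SP']
print(parse_constant("0xD12F"))                   # 53551
print(parse_constant("0b1010"))                   # 10
print(parse_constant("-12"))                      # -12
```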
The final feature of the syntax is labels. Labels directly reference 24-bit instruction addresses. They can be used to refer to any line, and they can be referenced inside the ADDR field of jump instructions, where they must begin with an at symbol (@). In general, labels can include any character that is not an argument delimiter or comment character.
0x000001: J @GOTO # uses: 0x000223
...
0x000223: GOTO: ...
The standard behavior for the Ida Assembly assembler is to abort if any value is too wide to fit into its slot within its instruction, rather than to fit only part of the value into the slot.
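A width check of this kind might look like the sketch below. The exact acceptance range is an assumption: here a negative value is accepted if it fits as a two's complement number of the field width, and a non-negative value is accepted if it fits as an unsigned number of that width. The function name `check_fits` is hypothetical.

```python
def check_fits(value, width, name="field"):
    """Return value encoded in `width` bits, or raise if it is too wide."""
    lowest = -(1 << (width - 1))   # most negative two's complement value
    highest = (1 << width) - 1     # largest unsigned value
    if not lowest <= value <= highest:
        raise ValueError(f"{name}: {value} does not fit in {width} bits")
    return value & ((1 << width) - 1)


print(hex(check_fits(-1, 17, "IMM")))       # 0x1ffff
print(hex(check_fits(0x000223, 24, "ADDR")))  # 0x223
try:
    check_fits(1 << 17, 17, "IMM")
except ValueError as err:
    print("abort:", err)                    # abort: IMM: 131072 does not fit in 17 bits
```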