Ida Assembly

Ida Assembly is a hypothetical instruction set with a few key features. First, every instruction has a field that holds either a register or an immediate value; a single bit within the instruction selects which of the two is used. Second, all jump instructions use truly absolute addressing (there are no relative branches).

Notation

There are a few special notations outlined here for reference.

NOTATION          MEANING                                                       EXAMPLE
{X, Y}            Concatenate X and Y together.                                 {10, 11, 011} == 1011011
(X)[B:A]          Slice bits A through B (inclusive) out of X.                  (1100110111)[4:0] == 10111
SignExtendM←N(X)  Sign-extend X from N bits to M bits.                          SignExtend8←4(1001) == 11111001
MEM((N)[24:0])    The half-word quantity from data memory at 25-bit address N.

Further note that bits are zero-indexed from least significant bit to most significant bit. For example, given a 32-bit value X, X[31:0] == X == {X[31:20], X[19:5], X[4:0]}.
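The notations above can be sketched as small Python helpers, which may be handy when checking the RTL in later sections by hand. The function names `bits`, `concat`, and `sign_extend` are illustrative choices and not part of Ida Assembly; `concat` takes explicit (value, width) pairs because a Python integer does not carry its own bit width.

```python
def bits(x, hi, lo):
    """(X)[hi:lo]: slice bits lo through hi (inclusive); bit 0 is the LSB."""
    width = hi - lo + 1
    return (x >> lo) & ((1 << width) - 1)


def concat(*fields):
    """{X, Y, ...}: concatenate (value, width) pairs, most significant first."""
    result = 0
    for value, width in fields:
        result = (result << width) | (value & ((1 << width) - 1))
    return result


def sign_extend(x, n, m):
    """SignExtend m←n (X): widen an n-bit value to m bits by copying its sign bit."""
    x &= (1 << n) - 1
    if x >> (n - 1):  # sign bit of the n-bit value is set
        x |= ((1 << m) - 1) & ~((1 << n) - 1)
    return x


# The three examples from the notation table:
print(format(concat((0b10, 2), (0b11, 2), (0b011, 3)), "07b"))  # 1011011
print(format(bits(0b1100110111, 4, 0), "05b"))                  # 10111
print(format(sign_extend(0b1001, 4, 8), "08b"))                 # 11111001
```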

Memory

In Ida Assembly, instruction memory and data memory are separate and have different attributes.

Instruction Memory

Instruction memory is composed of 32-bit word values, where each word is addressed by a 24-bit address. This means that in order to go from the current instruction to the next instruction, the instruction address is incremented by 1. Here is an example diagram of instruction memory:

ADDRESS   INSTRUCTION
0x000000  0x39c00009
0x000001  0x68840004
...       ...
0xfffffe  0x784e0001
0xffffff  0xc9c40002

Data Memory

Data memory is composed of 16-bit half-word values, where each half word is addressed by a 25-bit address. This means that in order to allocate space on the stack for a word value, the stack pointer must be decremented by 2. Here is an example diagram of data memory:

ADDRESS    DATA
0x0000000  0x48fc
0x0000001  0x5514
...        ...
0x1fffffe  0x67f9
0x1ffffff  0x6a9e

Registers

Ida Assembly supports 32 general-purpose registers, usable in any register field of a given instruction. Additionally, there is the program counter PC, which contains the address of the instruction being fetched and executed.

NUMBER     NAME        BITS  PURPOSE
$0         $zero       32    Always Zero
$1         $jc         32    Jump Condition
$2         $sp         32    Stack Pointer
$3         $fp         32    Frame Pointer
$4 - $5    $v0 - $v1   32    Return Values
$6         $ra         32    Return Address
$7 - $11   $a0 - $a4   32    Arguments
$12 - $31  $t0 - $t19  32    Temporaries
           PC          24    Program Counter

The PC, $sp, $fp, and $zero registers always start at an initial value of zero. For all other registers, the initial values are undefined.

Encodings

R-Type Format

The R-Type format is the primary instruction format. Note that the field I (bit 27) is a toggle bit that indicates whether to use $RI or IMM in executing the instruction.

TYPE    31:28  27   26:22  21:17  16:5  4:0
R-Type  OP     I=0  $RD    $RS    -     $RI
R-Type  OP     I=1  $RD    $RS    IMM (bits 16:0)
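As a rough illustration of the R-Type layout, the following sketch splits a 32-bit word into its fields. The function name `decode_r` and the dictionary it returns are hypothetical. The sample word is taken from the instruction memory diagram; treating it as an R-Type instruction is an assumption made only for the sake of the example.

```python
def decode_r(word):
    """Split a 32-bit R-Type word into its fields (bit 0 is the LSB)."""
    fields = {
        "op": (word >> 28) & 0xF,    # bits 31:28
        "i": (word >> 27) & 0x1,     # bit 27: 0 selects $RI, 1 selects IMM
        "rd": (word >> 22) & 0x1F,   # bits 26:22
        "rs": (word >> 17) & 0x1F,   # bits 21:17
    }
    if fields["i"]:
        fields["imm"] = word & 0x1FFFF  # bits 16:0, raw (not yet sign-extended)
    else:
        fields["ri"] = word & 0x1F      # bits 4:0; bits 16:5 are unused
    return fields


print(decode_r(0x39C00009))
# {'op': 3, 'i': 1, 'rd': 7, 'rs': 0, 'imm': 9}
```

Read against the register and opcode tables, that result would correspond to OR $a0 $zero 9.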

J-Type Format

The J-Type format is the instruction format for jumps. Note again that the field I (bit 27) tells whether to use $RI or ADDR in executing the jump. Additionally, there is a jump code JC field, which indicates which type of jump is performed. The different types are listed later.

TYPE    31:28  27   26:24  23:5  4:0
J-Type  OP     I=0  JC     -     $RI
J-Type  OP     I=1  JC     ADDR (bits 23:0)
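The J-Type layout can be sketched the same way. The helpers `encode_j_addr` and `decode_j` are hypothetical names; the opcode value 15 comes from the J-Type instruction tables later in this document.

```python
def encode_j_addr(jc, addr):
    """Build a J-Type word that uses the ADDR form (I = 1)."""
    return (15 << 28) | (1 << 27) | ((jc & 0x7) << 24) | (addr & 0xFFFFFF)


def decode_j(word):
    """Split a 32-bit J-Type word into its fields (bit 0 is the LSB)."""
    fields = {
        "op": (word >> 28) & 0xF,   # bits 31:28
        "i": (word >> 27) & 0x1,    # bit 27: 0 selects $RI, 1 selects ADDR
        "jc": (word >> 24) & 0x7,   # bits 26:24
    }
    if fields["i"]:
        fields["addr"] = word & 0xFFFFFF  # bits 23:0
    else:
        fields["ri"] = word & 0x1F        # bits 4:0; bits 23:5 are unused
    return fields


word = encode_j_addr(0, 0x000223)  # an unconditional J to address 0x000223
print(hex(word))                   # 0xf8000223
print(decode_j(word))
```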

Instruction Set

R-Type Instructions

Shift Left

OP  SYNTAXES         RTL OPERATIONS
00  SL $RD $RS $RI   $RD ← $RS << $RI[4:0]
    SL $RD $RS IMM   $RD ← $RS << IMM[4:0]

This shift should smear the least significant bit of the value being shifted. For example, here are some shifts (using 16-bit values instead of 32-bit values):

0b0000000000001101 << 3 = 0b0000000001101111
0b0000000000011010 << 2 = 0b0000000001101000

Note that this is not a logical left shift, and is instead the left-shifting equivalent of an arithmetic right shift.
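A minimal sketch of this smearing left shift follows. The function name `shift_left_smear` and its `width` parameter are illustrative; `width` exists only so that the 16-bit examples above can be reproduced, and the real instruction operates on 32 bits.

```python
def shift_left_smear(value, amount, width=32):
    """Shift left, filling the vacated low bits with copies of the original LSB."""
    mask = (1 << width) - 1
    amount &= 0x1F                      # only the low 5 bits of the amount are used
    shifted = (value << amount) & mask
    if value & 1:                       # LSB was 1: fill the low bits with ones
        shifted |= ((1 << amount) - 1) & mask
    return shifted


print(format(shift_left_smear(0b0000000000001101, 3, width=16), "016b"))  # 0000000001101111
print(format(shift_left_smear(0b0000000000011010, 2, width=16), "016b"))  # 0000000001101000
```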

Shift Right

OP  SYNTAXES         RTL OPERATIONS
01  SR $RD $RS $RI   $RD ← $RS >> $RI[4:0]
    SR $RD $RS IMM   $RD ← $RS >> IMM[4:0]

This shift should smear the most significant bit of the value being shifted. For example, here are some shifts (using 16-bit values instead of 32-bit values):

0b1111111100011101 >> 4 = 0b1111111111110001
0b0000000000011010 >> 2 = 0b0000000000000110

Note that this is an arithmetic right shift.
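The arithmetic right shift can be sketched in the same style. The name `shift_right_arith` and the `width` parameter are again illustrative, with `width` present only to reproduce the 16-bit examples.

```python
def shift_right_arith(value, amount, width=32):
    """Shift right, filling the vacated high bits with copies of the original MSB."""
    mask = (1 << width) - 1
    amount &= 0x1F                     # only the low 5 bits of the amount are used
    value &= mask
    if value >> (width - 1):           # MSB set: reinterpret as a negative number
        value -= 1 << width
    return (value >> amount) & mask    # >> on Python ints is an arithmetic shift


print(format(shift_right_arith(0b1111111100011101, 4, width=16), "016b"))  # 1111111111110001
print(format(shift_right_arith(0b0000000000011010, 2, width=16), "016b"))  # 0000000000000110
```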

And

OP  SYNTAXES          RTL OPERATIONS
02  AND $RD $RS $RI   $RD ← $RS & $RI
    AND $RD $RS IMM   $RD ← $RS & SignExtend32←17(IMM)

Inclusive Or

OP  SYNTAXES         RTL OPERATIONS
03  OR $RD $RS $RI   $RD ← $RS | $RI
    OR $RD $RS IMM   $RD ← $RS | SignExtend32←17(IMM)

Reverse

OP  SYNTAXES          RTL OPERATIONS
04  REV $RD $RS $RI   $RD ← $RS <> $RI[4:0]
    REV $RD $RS IMM   $RD ← $RS <> IMM[4:0]

Reverse swaps pairs of adjacent bit groups, given a 5-bit reverse pattern. Specifically, if bit n of the pattern (counting from zero) is 1, then each pair of adjacent groups of 2^n bits is swapped. For example, given an 8-bit number (with bits as ABCDEFGH) and a 3-bit reverse pattern, the pattern 001 yields BADCFEHG, 010 yields CDABGHEF, 100 yields EFGHABCD, and 111 yields HGFEDCBA.
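The sketch below is one reading of Reverse, and it rests on an assumption: pattern bit n (counting from zero) swaps adjacent groups of 2^n bits, as in a generalized bit reverse. Under that reading, the bit at position p ends up at position p XOR pattern, which is what the code uses. The names `rev` and `rev_letters` are hypothetical, and `width` must be a power of two.

```python
def rev(value, pattern, width=32):
    """Generalized reverse: result bit p is taken from source bit (p XOR pattern)."""
    pattern &= width - 1               # 5 pattern bits for a 32-bit value
    result = 0
    for p in range(width):
        if (value >> (p ^ pattern)) & 1:
            result |= 1 << p
    return result


def rev_letters(letters, pattern):
    """Same permutation on a string of bit names, most significant bit first."""
    return "".join(letters[j ^ pattern] for j in range(len(letters)))


print(rev_letters("ABCDEFGH", 0b011))             # DCBAHGFE
print(format(rev(0b11010010, 0b111, width=8), "08b"))  # 01001011
```

With every pattern bit set, the result is a full bit reversal, which is presumably where the instruction gets its name.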

Exclusive Or

OP  SYNTAXES          RTL OPERATIONS
05  XOR $RD $RS $RI   $RD ← $RS ^ $RI
    XOR $RD $RS IMM   $RD ← $RS ^ SignExtend32←17(IMM)

Add

OP  SYNTAXES          RTL OPERATIONS
06  ADD $RD $RS $RI   $RD ← $RS + $RI
    ADD $RD $RS IMM   $RD ← $RS + SignExtend32←17(IMM)

Overflow should be ignored, and should require no special consideration.

Subtract

OP  SYNTAXES          RTL OPERATIONS
07  SUB $RD $RS $RI   $RD ← $RS - $RI
    SUB $RD $RS IMM   $RD ← $RS - SignExtend32←17(IMM)

Overflow should be ignored, and should require no special consideration.

Multiply

OP  SYNTAXES          RTL OPERATIONS
08  MUL $RD $RS $RI   $RD ← $RS * $RI
    MUL $RD $RS IMM   $RD ← $RS * SignExtend32←17(IMM)

Note that this only keeps the lower 32 bits of the multiply result, meaning overflow should be ignored, and should require no special consideration.
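Ignoring overflow in Add, Subtract, and Multiply amounts to keeping only the low 32 bits of the mathematical result. A minimal sketch, with register values modeled as unsigned 32-bit Python integers and with hypothetical function names:

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    return (a + b) & MASK32   # carry out of bit 31 is discarded


def sub32(a, b):
    return (a - b) & MASK32   # borrow wraps around to the top of the range


def mul32(a, b):
    return (a * b) & MASK32   # only the low 32 bits of the product are kept


print(hex(add32(0xFFFFFFFF, 1)))      # 0x0
print(hex(sub32(0, 1)))               # 0xffffffff
print(hex(mul32(0x10000, 0x10000)))   # 0x0
```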

Divide

OP  SYNTAXES          RTL OPERATIONS
09  DIV $RD $RS $RI   if($RI != 0) $RD ← $RS / $RI
                      else if($RS > 0) $RD ← MAX_VALUE
                      else if($RS < 0) $RD ← MIN_VALUE
                      else $RD ← 0
    DIV $RD $RS IMM   if(SignExtend32←17(IMM) != 0) $RD ← $RS / SignExtend32←17(IMM)
                      else if($RS > 0) $RD ← MAX_VALUE
                      else if($RS < 0) $RD ← MIN_VALUE
                      else $RD ← 0

Division is performed as a regular signed division. If the divisor is zero, then the result should be the maximum positive 32-bit 2's complement value if the dividend is positive and the minimum negative 32-bit 2's complement value if the dividend is negative. This mimics rounding to positive or negative infinity when dividing by zero. Finally, zero divided by zero should be zero.
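The division rules can be sketched as follows. Two details are assumptions rather than statements from this document: the quotient is truncated toward zero (as in C), and MIN_VALUE divided by -1 wraps to the low 32 bits of the result. The names `to_signed32` and `div32` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MAX_VALUE = 0x7FFFFFFF   # largest positive 32-bit two's complement value
MIN_VALUE = 0x80000000   # bit pattern of the most negative 32-bit value


def to_signed32(x):
    """Interpret a 32-bit pattern as a signed two's complement number."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def div32(rs, divisor_bits):
    dividend, divisor = to_signed32(rs), to_signed32(divisor_bits)
    if divisor == 0:
        if dividend > 0:
            return MAX_VALUE
        if dividend < 0:
            return MIN_VALUE
        return 0
    # Assumption: truncate toward zero; Python's // would floor instead.
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient & MASK32


print(hex(div32(5, 0)))              # 0x7fffffff
print(hex(div32(-5 & MASK32, 0)))    # 0x80000000
print(to_signed32(div32(-7 & MASK32, 2)))  # -3
```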

Load Half Word

OP  SYNTAXES          RTL OPERATIONS
10  LH $RD $RI($RS)   $RD ← SignExtend32←16(MEM(($RS + $RI)[24:0]))
    LH $RD IMM($RS)   $RD ← SignExtend32←16(MEM(($RS + SignExtend32←17(IMM))[24:0]))

Store Half Word

OP  SYNTAXES          RTL OPERATIONS
11  SH $RD $RI($RS)   MEM(($RS + $RI)[24:0]) ← $RD[15:0]
    SH $RD IMM($RS)   MEM(($RS + SignExtend32←17(IMM))[24:0]) ← $RD[15:0]

Load Word

OP  SYNTAXES          RTL OPERATIONS
12  LW $RD $RI($RS)   $RD[15:0] ← MEM(($RS + $RI)[24:0])
                      $RD[31:16] ← MEM(($RS + $RI + 1)[24:0])
    LW $RD IMM($RS)   $RD[15:0] ← MEM(($RS + SignExtend32←17(IMM))[24:0])
                      $RD[31:16] ← MEM(($RS + SignExtend32←17(IMM) + 1)[24:0])

Note that load addresses do not have to be word-aligned addresses, and in fact, a word can be loaded from any valid half-word address, including the maximum half-word address (in this case, the second half of the word is loaded from the minimum half-word address).

Store Word

OP  SYNTAXES          RTL OPERATIONS
13  SW $RD $RI($RS)   MEM(($RS + $RI)[24:0]) ← $RD[15:0]
                      MEM(($RS + $RI + 1)[24:0]) ← $RD[31:16]
    SW $RD IMM($RS)   MEM(($RS + SignExtend32←17(IMM))[24:0]) ← $RD[15:0]
                      MEM(($RS + SignExtend32←17(IMM) + 1)[24:0]) ← $RD[31:16]

Note that just like load addresses, store addresses do not have to be word-aligned. Again, even the maximum half-word address can be stored to (in this case, the second half of the word is stored to the minimum half-word address).
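The wraparound behavior of word loads and stores can be sketched with a small data memory model. The class `DataMemory` and its method names are hypothetical, and reading an unwritten cell as zero is an assumption made for convenience; this document does not define the initial contents of data memory.

```python
ADDR_MASK = 0x1FFFFFF   # 25-bit half-word addresses


class DataMemory:
    def __init__(self):
        self.cells = {}  # sparse storage: address -> 16-bit half word

    def load_half(self, addr):
        # Assumption: unwritten cells read as 0.
        return self.cells.get(addr & ADDR_MASK, 0)

    def store_half(self, addr, value):
        self.cells[addr & ADDR_MASK] = value & 0xFFFF

    def load_word(self, addr):
        low = self.load_half(addr)        # $RD[15:0]
        high = self.load_half(addr + 1)   # $RD[31:16]; the address wraps at 25 bits
        return (high << 16) | low

    def store_word(self, addr, value):
        self.store_half(addr, value & 0xFFFF)
        self.store_half(addr + 1, (value >> 16) & 0xFFFF)


mem = DataMemory()
mem.store_word(0x1FFFFFF, 0x12345678)     # straddles the end of memory
print(hex(mem.load_half(0x1FFFFFF)))      # 0x5678
print(hex(mem.load_half(0x0000000)))      # 0x1234
print(hex(mem.load_word(0x1FFFFFF)))      # 0x12345678
```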

Set Upper

OP  SYNTAXES          RTL OPERATIONS
14  STU $RD $RS $RI   $RD ← {$RI[16:0], $RS[14:0]}
    STU $RD $RS IMM   $RD ← {IMM, $RS[14:0]}

Note that this operation combines the lower 15 bits of its first argument with the lower 17 bits of its second argument.
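The concatenation performed by Set Upper can be sketched in one line. The function name `stu` is hypothetical. One plausible use, shown in the example, is assembling a full 32-bit constant from a 15-bit lower part and a 17-bit upper part; this document does not prescribe that usage.

```python
def stu(rs, upper):
    """{upper[16:0], rs[14:0]}: 17 upper bits followed by 15 lower bits."""
    return ((upper & 0x1FFFF) << 15) | (rs & 0x7FFF)


constant = 0xDEADBEEF
lower = constant & 0x7FFF   # the 15 bits that would already sit in $RS
upper = constant >> 15      # the 17 bits supplied through $RI or IMM
print(hex(stu(lower, upper)))  # 0xdeadbeef
```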

J-Type Instructions

Conditional jump instructions depend on the jump condition register $jc. These jumps are intended to follow a subtraction instruction, but they do not have to.

# jump if t0 >= t19
SUB $jc $t0 $t19 # jc = t0 - t19
JGE @GOTO        # (jc >= 0) <==> (t0 >= t19)
 
# jump if t0 != 0
OR  $jc $t0 $zero # jc = t0
JNE @GOTO         # (jc != 0) <==> (t0 != 0)

Jump

OP  JC  SYNTAXES   RTL OPERATIONS
15  0   J $RI      PC ← $RI[23:0]
        J ADDR     PC ← ADDR

Note that this is an unconditional jump.

Jump If Greater Than Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  1   JGT $RI    if($jc > 0) PC ← $RI[23:0]
        JGT ADDR   if($jc > 0) PC ← ADDR

Jump If Equal To Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  2   JEQ $RI    if($jc == 0) PC ← $RI[23:0]
        JEQ ADDR   if($jc == 0) PC ← ADDR

Jump If Less Than Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  3   JLT $RI    if($jc < 0) PC ← $RI[23:0]
        JLT ADDR   if($jc < 0) PC ← ADDR

Jump If Less Than Or Equal To Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  4   JLE $RI    if($jc <= 0) PC ← $RI[23:0]
        JLE ADDR   if($jc <= 0) PC ← ADDR

Jump If Not Equal To Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  5   JNE $RI    if($jc != 0) PC ← $RI[23:0]
        JNE ADDR   if($jc != 0) PC ← ADDR

Jump If Greater Than Or Equal To Zero

OP  JC  SYNTAXES   RTL OPERATIONS
15  6   JGE $RI    if($jc >= 0) PC ← $RI[23:0]
        JGE ADDR   if($jc >= 0) PC ← ADDR

Jump And Link

OP  JC  SYNTAXES   RTL OPERATIONS
15  7   JAL $RI    tmp ← $RI[23:0]
                   $ra ← {00000000, PC + 1}
                   PC ← tmp
        JAL ADDR   tmp ← ADDR
                   $ra ← {00000000, PC + 1}
                   PC ← tmp

Note that this is an unconditional jump. Also, notice that the next program counter value is PC+1, because instruction memory is word-addressed.

After Each Instruction

After any instruction executes, the program counter PC advances by 1 (recall that the 24-bit instruction memory addresses are word addresses). The only exception to this rule is a jump instruction, in which the program counter is instead set explicitly. If the program counter overflows, it should wrap around silently.
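The program counter rules for jumps can be sketched as a single function. It makes three assumptions not stated outright in this document: $jc is compared as a signed 32-bit value, a conditional jump whose condition is false falls through to PC + 1, and the PC + 1 stored in $ra wraps at 24 bits. The names `JUMP_TESTS`, `to_signed32`, and `execute_jump` are hypothetical.

```python
PC_MASK = 0xFFFFFF   # 24-bit instruction addresses

# Jump code (JC field) -> condition on the signed value of $jc.
JUMP_TESTS = {
    0: lambda jc: True,      # J
    1: lambda jc: jc > 0,    # JGT
    2: lambda jc: jc == 0,   # JEQ
    3: lambda jc: jc < 0,    # JLT
    4: lambda jc: jc <= 0,   # JLE
    5: lambda jc: jc != 0,   # JNE
    6: lambda jc: jc >= 0,   # JGE
    7: lambda jc: True,      # JAL
}


def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def execute_jump(pc, code, target, jc):
    """Return (new_pc, new_ra); new_ra is None unless the jump is JAL."""
    taken = JUMP_TESTS[code](to_signed32(jc))
    new_ra = (pc + 1) & PC_MASK if code == 7 else None
    # Assumption: an untaken conditional jump advances to the next instruction.
    new_pc = (target & PC_MASK) if taken else (pc + 1) & PC_MASK
    return new_pc, new_ra


print(execute_jump(0x000001, 0, 0x000223, 0))      # J: always taken
print(execute_jump(0x000005, 1, 0x000010, 0))      # JGT with $jc == 0: falls through
print(execute_jump(0xFFFFFF, 7, 0x000040, 0))      # JAL from the last address
```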

Assembly Details

Ida Assembly is entirely case-insensitive. Spaces, tabs, commas, and parentheses can be used to delimit instruction arguments. Comments begin with a number sign (#) and continue until the end of their line. Number constants can be in unsigned hex (such as 0xD12F), unsigned binary (such as 0b1010), signed decimal, or unsigned decimal.
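The number constant forms listed above could be parsed along these lines. This is a sketch of one possible approach, not the behavior of any actual Ida assembler; the name `parse_constant` is hypothetical, and no validation is done beyond what Python's built-in `int` already performs.

```python
def parse_constant(text):
    """Parse a hex (0x...), binary (0b...), or decimal constant into an int."""
    text = text.strip().lower()       # the language is case-insensitive
    if text.startswith("0x"):
        return int(text[2:], 16)      # unsigned hex
    if text.startswith("0b"):
        return int(text[2:], 2)       # unsigned binary
    return int(text, 10)              # decimal, with an optional leading sign


print(parse_constant("0xD12F"))  # 53551
print(parse_constant("0b1010"))  # 10
print(parse_constant("-42"))     # -42
```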

The final feature of the syntax is labels. Labels directly reference 24-bit instruction addresses. They can be used to refer to any line, and they can be referenced inside the ADDR field of jump instructions, where they must begin with an at symbol. In general, labels can include any character that is not an argument delimiter or comment character.

0x000001:       J @GOTO # uses: 0x000223
                ...
0x000223: GOTO: ...

The standard behavior for the Ida Assembly assembler is to abort if any value is too wide to fit into its slot within its instruction, rather than to fit only part of the value into the slot.
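The abort-on-overflow rule might be implemented roughly as below. The document does not say whether a slot's range is judged as signed or unsigned, so this sketch assumes a value fits if it lies in either the signed or the unsigned range for the slot width; that choice, the name `fit_field`, and the use of `ValueError` to model the abort are all assumptions.

```python
def fit_field(value, width):
    """Return value encoded in `width` bits, or raise if it is too wide."""
    lowest = -(1 << (width - 1))   # most negative signed value of this width
    highest = (1 << width) - 1     # largest unsigned value of this width
    if not lowest <= value <= highest:
        # Abort rather than silently keeping only part of the value.
        raise ValueError(f"{value} does not fit in {width} bits")
    return value & ((1 << width) - 1)


print(hex(fit_field(-1, 17)))       # 0x1ffff
print(hex(fit_field(0x223, 24)))    # 0x223
try:
    fit_field(0x20000, 17)          # needs 18 bits
except ValueError as err:
    print("abort:", err)
```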