Ida 2 Assembly

TA: Kevin Liston

Introduction

Ida 2 is a 32-bit instruction set using a 24-bit address space. Ida 2 has a few key features. First, every instruction contains a slot for conditionally executing that instruction. Second, the final field of every instruction holds either a register or an immediate value, and a single bit within the instruction selects between the two.

In Ida 2, all types of data are 32-bit words. Memory is word-addressed. Memory addresses are 24 bits each. Ida 2 features separated read-only instruction memory and read-write data memory. Only data memory can be loaded from or stored to, specifically using the MLD and MST instructions.

Encodings

Ida 2 uses three primary instruction types. The type of an instruction is simply defined by the number of arguments it requires. Each type of instruction additionally encodes a secondary format: whether it has a register or an immediate as its final argument.

Register Formats

When the immediate-indicating bit (bit 24) is off, the final argument is a register: $RI. Note that all fields are in identical places for each of these three types. Note also that these formats contain unused slots, marked with a dash.

TYPE   31-28   27-25  24  23-20  19-16  15-4  3-0
1-ARG  OPCODE  ?CQ    0   -      -      -     $RI
2-ARG  OPCODE  ?CQ    0   $RD    -      -     $RI
3-ARG  OPCODE  ?CQ    0   $RD    $RS    -     $RI

Immediate Formats

When the immediate-indicating bit (bit 24) is on, the final argument is an immediate value. Note that all other aspects of the format exactly match the register formats described previously. However, these formats contain no unused slots.

TYPE   31-28   27-25  24  23-20  19-16  15-0
1-ARG  OPCODE  ?CQ    1   IMM24 (bits 23-0)
2-ARG  OPCODE  ?CQ    1   $RD    IMM20 (bits 19-0)
3-ARG  OPCODE  ?CQ    1   $RD    $RS    IMM16

Note

The reason for having multiple primary types is to use as many bits as possible for the immediate fields. In particular, any 1-ARG instruction can reference an entire 24-bit address using just its immediate field.

The purpose of the ?CQ field in each instruction will be explained later in this manual.
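As a minimal sketch of the two layouts above, the following Python packs instruction fields into a 32-bit word. The function names `encode_reg` and `encode_imm` and their argument order are illustrative assumptions, not part of any Ida 2 toolchain. Immediates are simply masked to their slot width here, with no range checking.

```python
def encode_reg(opcode, cq, rd=0, rs=0, ri=0):
    """Pack a register-format instruction (bit 24 clear).

    Unused fields are left as zero; 1-ARG instructions pass only ri,
    2-ARG instructions pass rd and ri, 3-ARG pass rd, rs, and ri.
    """
    return (opcode << 28) | (cq << 25) | (rd << 20) | (rs << 16) | ri


def encode_imm(opcode, cq, nargs, imm, rd=0, rs=0):
    """Pack an immediate-format instruction (bit 24 set).

    nargs selects the immediate width: 1 -> IMM24, 2 -> IMM20, 3 -> IMM16.
    """
    word = (opcode << 28) | (cq << 25) | (1 << 24)
    if nargs == 1:
        return word | (imm & 0xFFFFFF)
    if nargs == 2:
        return word | (rd << 20) | (imm & 0xFFFFF)
    if nargs == 3:
        return word | (rd << 20) | (rs << 16) | (imm & 0xFFFF)
    raise ValueError("nargs must be 1, 2, or 3")


# ADD ?OK $sp $sp -1  (opcode 9, query 7, $sp is register 15)
add_word = encode_imm(9, 7, 3, -1, rd=15, rs=15)
# JMP ?OK $ra         (opcode 15, query 7, $ra is register 1)
jmp_word = encode_reg(15, 7, ri=1)
```

The opcode, query, and register numbers used in the two sample calls are taken from the tables later in this manual.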

Instruction Set

Register-Immediates

In the rest of this document, the following terms refer to the final argument of an instruction: the $RI register when the immediate bit (described previously) is off, or the sign-extended immediate value when it is on.

TERM  REGISTER  IMMEDIATE
RI16  $RI       SignExt32←16(IMM16)
RI20  $RI       SignExt32←20(IMM20)
RI24  $RI       SignExt32←24(IMM24)

It is worth underscoring that all immediate values are always sign-extended into 32-bit values, for every instruction.
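Sign extension can be sketched in a few lines of Python. The helper name `sign_extend` is an assumption made for illustration; it models SignExt32←N for any field width N.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of `value` into a 32-bit word."""
    value &= (1 << bits) - 1          # keep only the field
    if value & (1 << (bits - 1)):     # top bit of the field is set
        value -= 1 << bits            # interpret the field as negative
    return value & 0xFFFFFFFF         # two's-complement 32-bit result
```

For example, the 20-bit field 0x9BEEF has its top bit set, so it extends to 0xFFF9BEEF, while 0x7D76 is unchanged.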

Instructions

INSTRUCTION              OPCODE  TYPE   SYNTAX                 RTL OPERATION
Shift Left Logical       00      3-ARG  SHL ?CQ $RD $RS RI16   $RD ← $RS << RI16[4:0]
Shift Right Logical      01      3-ARG  SHR ?CQ $RD $RS RI16   $RD ← $RS >> RI16[4:0]
And                      02      3-ARG  AND ?CQ $RD $RS RI16   $RD ← $RS & RI16
Inclusive Or             03      3-ARG  IOR ?CQ $RD $RS RI16   $RD ← $RS | RI16
Exclusive Or             04      3-ARG  XOR ?CQ $RD $RS RI16   $RD ← $RS ^ RI16
Set Upper                05      3-ARG  STU ?CQ $RD $RS RI16   $RD ← {RI16[15:0], $RS[15:0]}
Multiplication           06      3-ARG  MUL ?CQ $RD $RS RI16   $RD ← $RS * RI16
Signed Division          07      3-ARG  DIV ?CQ $RD $RS RI16   $RD ← $RS / RI16
Subtraction              08      3-ARG  SUB ?CQ $RD RI16 $RS   $RD ← RI16 - $RS
Addition                 09      3-ARG  ADD ?CQ $RD $RS RI16   $RD ← $RS + RI16
Memory Load              10      3-ARG  MLD ?CQ $RD RI16($RS)  $RD ← MEM(($RS + RI16)[23:0])
Memory Store             11      3-ARG  MST ?CQ $RD RI16($RS)  MEM(($RS + RI16)[23:0]) ← $RD
Set Lower                12      2-ARG  STL ?CQ $RD RI20       $RD ← RI20
Signed Comparison        13      2-ARG  CMP ?CQ $RD RI20       $CR ← $RD ? RI20
Link To Offset From $PC  14      2-ARG  LNK ?CQ $RD RI20       $RD ← SignExt32←24($PC) + RI20
Jump To Address          15      1-ARG  JMP ?CQ RI24           $PC ← RI24[23:0]

After Each Instruction

After any instruction executes, the program counter $PC advances by one (recall that the 24-bit memory addresses are word addresses). The only exception to this rule is JMP, in which the program counter is instead set manually.

Signed-ness

Every instruction that performs arithmetic operations (ignoring the bitwise operations) performs a signed operation. This primarily affects DIV and CMP. Addition, subtraction, and multiplication are identical for signed and unsigned numbers in this 32-bit format. In addition to all operations being signed, recall that all immediates are sign-extended before being used.

Subtraction Format

Notice that SUB is computed in an unusual backwards way. This is mainly due to the absence of an always-zero register (described in a later section). By subtracting a register from a zero immediate, negation of a register is still possible. Subtracting two registers is still possible. Finally, subtracting an immediate from a register is still technically possible, by adding a negative immediate with ADD. In other words, this backwards format maintains all the functionality of a regular subtraction instruction, without needing an always-zero register.

# SUBTRACTION FORMS
SUB $t1 2 $t0   # T1 == 2 - T0
SUB $t1 0 $t0   # T1 == - T0
SUB $t2 $t3 $t0 # T2 == T3 - T0
ADD $t3 $t1 -1  # T3 == T1 - 1

Division Fringe Cases

Division by zero within the ALU signals an error. To avoid this, when a DIV instruction is attempted with a zero divisor, the instruction instead uses a divisor of 1 (in other words, $RD is set to $RS without performing any division).

# DIVISION BY ZERO
MOV $t0 2     # T0 == 2
DIV $t1 $t0 0 # T1 == T0

In the case that the most negative 32-bit number is divided by -1, the result is the most negative 32-bit number. In general, divisions should round towards 0.
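The division rules above can be sketched as a small Python model. The names `to_signed` and `ida_div` are illustrative assumptions; registers are modelled as unsigned 32-bit integers.

```python
def to_signed(x):
    """Interpret a 32-bit word as a signed two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def ida_div(rs, ri):
    """Model of DIV: signed, rounds toward zero, zero divisor treated as 1."""
    a, b = to_signed(rs), to_signed(ri)
    if b == 0:
        b = 1                      # zero divisor: result is simply $RS
    q = abs(a) // abs(b)           # divide magnitudes so rounding is toward 0
    if (a < 0) != (b < 0):
        q = -q
    return q & 0xFFFFFFFF          # wrap to 32 bits (covers INT_MIN / -1)
```

Python's own `//` operator rounds toward negative infinity, which is why the sketch divides magnitudes and restores the sign afterwards.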

Overflow

All cases of overflow are ignored. Overflow in addition or subtraction operations within the ALU signals an error.

Set Low And Set High

It should be noted that STL and STU can be used together in order to set the entire contents of a 32-bit register. However, STL (the more useful of the two instructions) can set more than half the value of a register on its own.

# SET T8 TO 0x82347D76
STL $t8 0x7D76     # T8 == 0x00007D76
STU $t8 $t8 0x8234 # T8 == 0x82347D76

# SET T8 TO 0xDEADBEEF
STL $t8 0x9BEEF    # T8 == 0xFFF9BEEF (remember sign-extension)
STU $t8 $t8 0xDEAD # T8 == 0xDEADBEEF (upper bits were cleared, then replaced)
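The two sequences above can be checked with a short Python model of STL and STU. The function names `stl` and `stu` are assumptions for illustration; each returns the new value of $RD as an unsigned 32-bit integer.

```python
def stl(imm20):
    """Model of STL with an immediate: $RD <- SignExt32<-20(IMM20)."""
    imm20 &= 0xFFFFF
    if imm20 & 0x80000:            # bit 19 set: extend with ones
        imm20 |= 0xFFF00000
    return imm20


def stu(rs, ri16):
    """Model of STU: $RD <- {RI16[15:0], $RS[15:0]}."""
    return ((ri16 & 0xFFFF) << 16) | (rs & 0xFFFF)


t8 = stl(0x9BEEF)        # 0xFFF9BEEF after sign extension
t8 = stu(t8, 0xDEAD)     # upper half replaced, lower half kept
```

Because STU keeps only the lower 16 bits of $RS, the extra bits written by STL (including the sign extension) do not survive into the final value.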

Comparisons

Every instruction can be conditionally executed depending on the current comparison query. This is done by checking the most recent signed comparison stored in the comparison register $CR. Queries are tested by using bit masks. The following table lists every bit mask combination.

NUMBER  BITS  NAME  COMPARISON
?0      000   ?NO   FALSE
?1      001   ?GT   A > B
?2      010   ?EQ   A == B
?3      011   ?GE   A >= B
?4      100   ?LT   A < B
?5      101   ?NE   A != B
?6      110   ?LE   A <= B
?7      111   ?OK   TRUE

Using this scheme, each of the three bits has a specific meaning. The most significant bit signifies less than, the middle bit signifies equal to, and the least significant bit signifies greater than.

When a signed comparison occurs, the value of $CR is updated to one of: ?LT, ?EQ, or ?GT. This register then remembers this result value until a new comparison occurs, at which point it is overwritten.

Each instruction contains a comparison query from this table. If not specified, an instruction will assume the unconditional query of ?OK, meaning that an instruction will execute regardless of any previous comparisons. Before executing an instruction, the field ?CQ of the instruction is checked against the contents of $CR to determine if the instruction should execute. For example, in the following code $t0 is set to -1 if it is greater than or equal to $t1, otherwise it is set to 0.

CMP $t0 $t1
MOV ?GE $t0 -1
MOV ?LT $t0 0

Before any comparisons have occurred (such as at the start of a program) the value of $CR is ?OK. This means that every comparison query except for ?NO will automatically pass before the first comparison occurs. Notably, after any comparison occurs, $CR will never take on the values: ?OK, ?NO, ?GE, ?LE, or ?NE. Phrased another way, $CR starts at a special value of ?OK, and after any comparison, it will only ever change to one of: ?LT, ?EQ, or ?GT.
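One way to read the bit-mask scheme is that an instruction executes when its ?CQ field shares at least one set bit with $CR. The Python sketch below assumes that bitwise-AND test, which is consistent with the table above but is not spelled out in this manual; the names `compare` and `passes` are likewise illustrative.

```python
# Query encodings, in table order: ?0 through ?7.
NO, GT, EQ, GE, LT, NE, LE, OK = range(8)


def compare(a, b):
    """Model of CMP: return the new $CR value for signed a ? b."""
    if a < b:
        return LT
    if a == b:
        return EQ
    return GT


def passes(cq, cr):
    """Assumed query test: execute if ?CQ and $CR share a set bit."""
    return (cq & cr) != 0


cr = OK                              # value of $CR before any comparison
before = [passes(q, cr) for q in range(8)]
cr = compare(5, 5)                   # $CR becomes ?EQ
```

Under this reading, ?NO never passes because it has no bits set, and every other query passes while $CR still holds its initial ?OK.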

Registers

Ida 2 supports 16 general-purpose registers, usable in any register field of a given instruction, and uses 2 special-purpose registers that are indirectly accessible. Ida 2 has been designed to function without the need for an always-zero register.

NUMBER  NAME  BITS  PURPOSE
$00     $rv   32    Return Value
$01     $ra   32    Return Address
$02     $a0   32    Argument 0
$03     $a1   32    Argument 1
$04     $a2   32    Argument 2
$05     $a3   32    Argument 3
$06     $t0   32    Temporary 0
$07     $t1   32    Temporary 1
$08     $t2   32    Temporary 2
$09     $t3   32    Temporary 3
$10     $t4   32    Temporary 4
$11     $t5   32    Temporary 5
$12     $t6   32    Temporary 6
$13     $t7   32    Temporary 7
$14     $t8   32    Temporary 8
$15     $sp   32    Stack Pointer
-       $PC   24    Program Counter
-       $CR   3     Comparison Result

Assembly Details

Ida 2 assembly is entirely case-insensitive. Spaces, tabs, commas, and parentheses can be used to delimit instruction arguments. Comments begin with the number sign (#) and continue until the end of their line. Number constants can be written in binary (0b...), hex (0x...), or decimal. All constants can be positive or negative. Numeric comparison queries (?4) and registers ($06) are supported as well.

The final feature of the syntax is labels. Labels directly reference 24-bit addresses. They can be attached to any line, and they can be referenced inside the immediate field of some instructions, where they must begin with an at symbol (@). In general, labels can include any character that is not an argument delimiter or comment character.

In most cases, labels evaluate to their address value. For example, consider the following piece of code, in which the label is swapped out for its absolute address by the assembler. Recall that this is possible because JMP has a 24-bit immediate field. All 1-ARG instructions treat labels as being their absolute address values.

0x000001:       JMP @GOTO # == 0x000003
0x000002:       MOV $t0 0
0x000003: GOTO: MOV $t1 0

Now consider the case of LNK. The assembler replaces the label with a relative address. All 2-ARG instructions treat labels as being their relative address values.

0x000001:       LNK $ra @GOTO # == 2
0x000002:       MOV $t0 0
0x000003: GOTO: MOV $t1 0

Note that 3-ARG instructions cannot reference labels at all.

The standard behavior for the Ida 2 assembler is to abort immediately if any value is too wide to fit into its slot within its instruction.
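The label rules and the width check can be sketched in Python as follows. Both helper names are hypothetical, and the accepted range in `fits` (anything representable as either a signed or an unsigned value of the slot width) is an assumption chosen so that constants such as -1 and 0xDEAD both fit a 16-bit slot, as they do in the examples in this manual.

```python
def resolve_label(nargs, label_addr, pc):
    """Return the immediate an assembler would substitute for a label.

    nargs is the instruction type (1, 2, or 3 arguments), label_addr is
    the labelled line's address, and pc is the referencing line's address.
    """
    if nargs == 1:
        return label_addr            # absolute address
    if nargs == 2:
        return label_addr - pc       # offset relative to this instruction
    raise ValueError("3-ARG instructions cannot reference labels")


def fits(value, bits):
    """Assumed width check: signed or unsigned value of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << bits)


jmp_imm = resolve_label(1, 0x000003, 0x000001)   # JMP @GOTO
lnk_imm = resolve_label(2, 0x000003, 0x000001)   # LNK $ra @GOTO
```

An assembler following the stated behavior would abort whenever `fits` returns false for the slot in question.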

Pseudo-Instructions

A few pseudo-instructions are supported, most importantly the jump-and-link instruction JAL. Every pseudo-instruction works by directly mapping to a set of true instructions. Thus every pseudo-instruction can be replaced with real instructions (the only exceptions involve using SLI or SUI with labels).

In the following table, the new term IMM32 has been introduced, representing any 32-bit immediate value. Labels can be used for this argument, in all cases being treated as their absolute address values.

INSTRUCTION            GENERATES                     NOTE
NOP                    SHL ?NO $rv $rv 0             No-op
END ?CQ                SELF: JMP ?CQ @SELF           Halt program (using an infinite loop)
JAL ?CQ RI24           LNK ?CQ $ra 2                 Jump and link
                       JMP ?CQ RI24
RTN ?CQ RI20           STL ?CQ $rv RI20              Return value
                       JMP ?CQ $ra
NIL ?CQ $RI            STL ?CQ $RI 0                 Clear register
MOV ?CQ $RD RI20       STL ?CQ $RD RI20              Copy value (alias for STL)
SLI ?CQ $RD IMM32      STL ?CQ $RD IMM32[15:0]       Set low from immediate slice
SUI ?CQ $RD $RS IMM32  STU ?CQ $RD $RS IMM32[31:16]  Set high from immediate slice
STI ?CQ $RD IMM32      SLI ?CQ $RD IMM32             Set immediate
                       SUI ?CQ $RD $RD IMM32
PSH ?CQ $RD            ADD ?CQ $sp $sp -1            Push to stack
                       MST ?CQ $RD 0($sp)
TOP ?CQ $RD            MLD ?CQ $RD 0($sp)            Top of stack
POP ?CQ $RD            MLD ?CQ $RD 0($sp)            Pop from stack
                       ADD ?CQ $sp $sp 1
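To illustrate how such a mapping might be implemented, the Python sketch below expands two of the pseudo-instructions into lines of true instructions. The function names and the string-based representation are assumptions for illustration only; STI is expanded all the way through SLI and SUI to the underlying STL and STU.

```python
def expand_sti(rd, imm32, cq="?OK"):
    """Expand STI into STL (low 16 bits) followed by STU (high 16 bits)."""
    imm32 &= 0xFFFFFFFF
    lo = imm32 & 0xFFFF            # IMM32[15:0]
    hi = imm32 >> 16               # IMM32[31:16]
    return [
        f"STL {cq} {rd} 0x{lo:04X}",
        f"STU {cq} {rd} {rd} 0x{hi:04X}",
    ]


def expand_psh(rd, cq="?OK"):
    """Expand PSH into a stack-pointer decrement and a store."""
    return [
        f"ADD {cq} $sp $sp -1",
        f"MST {cq} {rd} 0($sp)",
    ]


lines = expand_sti("$t8", 0xDEADBEEF) + expand_psh("$t8", "?NE")
```

Note that the same query is copied onto every generated instruction, so a conditional pseudo-instruction executes either all of its expansion or none of it, provided nothing in the expansion changes $CR.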

Examples

The following are some small sample programs to demonstrate the usage of Ida 2. The first sample program computes the nth Fibonacci number.

     MOV $a0 9 # compute the nth fibonacci number
     JAL @FIB
     END  
  
     # COMPUTE THE A0th FIBONACCI NUMBER,
     # ASSUMING A0 IS >= 0
FIB: CMP $a0 1       # is A0 in a base case?
     JMP ?GT @REC    # if not, then skip
     MOV ?LE $rv $a0 # if A0 <= 1, return A0
     JMP $ra
REC: ADD $sp $sp -2  # allocate 2 slots on stack
     MST $ra 0($sp)  # backup RA
     MST $a0 1($sp)  # backup A0
     ADD $a0 $a0 -1  # setup recursive call
     JAL @FIB
     MLD $a0 1($sp)  # restore A0 
     MST $rv 1($sp)  # backup RV
     ADD $a0 $a0 -2  # setup recursive call
     JAL @FIB
     MLD $t0 1($sp)  # get 1st RV
     ADD $rv $rv $t0 # add together the 2 RVs
     MLD $ra 0($sp)  # restore RA
     ADD $sp $sp 2   # clean up the stack
     JMP $ra         # return
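For comparison, the same recursion can be written as a short Python reference model. This is only a sketch of the algorithm the assembly implements, not a simulation of the machine; the parameter name `a0` mirrors the argument register.

```python
def fib(a0):
    """Reference model of FIB: returns the a0-th Fibonacci number, a0 >= 0."""
    if a0 <= 1:                    # base case, as in the ?LE branch
        return a0
    # Two recursive calls, as in the REC block: fib(a0-1) + fib(a0-2).
    return fib(a0 - 1) + fib(a0 - 2)


rv = fib(9)                        # the sample program's argument
```

With an argument of 9, as in the sample program, the model yields 34, which is the value the assembly version should leave in $rv.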

The second sample program performs a binary search on the first 1000 words of data memory, which are assumed to be sorted.

     MOV $a0 0      # array is here
     MOV $a2 0      # from index 0
     MOV $a3 999    # to index 999 (inclusive)
     MOV $a1 0xab5a # search for this number
     JAL @BIN
     END  
  
     # PERFORM BINARY SEARCH ON ARRAY A0,
     # TO FIND INDEX OF KEY A1, LOCATED
     # BETWEEN INDEX A2 AND A3
BIN: MOV $rv -1
     CMP $a3 $a2        # max ? min
     JMP ?LT $ra        # if max < min: return -1
     SUB $t8 $a3 $a2    # mid = min+((max-min)/2)
     SHR $t8 $t8 1
     ADD $t8 $a2 $t8
     MLD $t0 $t8($a0)   # tmp = array[mid]
     CMP $t0 $a1        # tmp ? key
     ADD ?LT $a2 $t8 1  # if tmp < key: min = mid+1
     ADD ?GT $a3 $t8 -1 # if tmp > key: max = mid-1
     JMP ?NE @BIN       # if tmp != key: next iteration
     MOV $rv $t8        # return mid
     JMP $ra
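The search loop can also be expressed as a Python reference model. This is a sketch of the algorithm only; the function name `bin_search` is an assumption, `mem` stands in for data memory as a Python list, and both bounds are inclusive as in the assembly.

```python
def bin_search(mem, base, key, lo, hi):
    """Reference model of BIN: index of key in mem[base+lo .. base+hi], or -1."""
    while hi >= lo:                        # CMP $a3 $a2 / JMP ?LT $ra
        mid = lo + ((hi - lo) >> 1)        # SUB, SHR, ADD
        tmp = mem[base + mid]              # MLD $t0 $t8($a0)
        if tmp < key:
            lo = mid + 1                   # ADD ?LT $a2 $t8 1
        elif tmp > key:
            hi = mid - 1                   # ADD ?GT $a3 $t8 -1
        else:
            return mid                     # MOV $rv $t8
    return -1                              # MOV $rv -1


data = [1, 3, 5, 7, 9]
found = bin_search(data, 0, 7, 0, len(data) - 1)
missing = bin_search(data, 0, 4, 0, len(data) - 1)
```

Because the upper bound is inclusive, searching the first n words means passing n - 1 as the upper index.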