# Ida Assembly

Ida Assembly is a hypothetical instruction set with a few key features. First, every instruction has a field that holds either a register or an immediate value; a single bit within the instruction selects which one is used. Second, all jump instructions use truly absolute addressing (there are no relative branches).

## Notation

There are a few special notations outlined here for reference.

| NOTATION | MEANING | EXAMPLE |
| --- | --- | --- |
| {X, Y} | Concatenate X and Y together. | {10, 11, 011} == 1011011 |
| (X)[B:A] | Slice bits A through B (inclusive) out of X. | (1100110111)[4:0] == 10111 |
| `SignExtendM←N`(X) | Sign-extend X from N bits to M bits. | `SignExtend8←4`(1001) == 11111001 |
| `MEM`((N)[24:0]) | The half-word quantity from data memory at 25-bit address N. | |

Further note that bits are zero-indexed from least significant bit to most significant bit. For example, given a 32-bit value X, X[31:0] == X == {X[31:20], X[19:5], X[4:0]}.
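As an illustration, the notations above can be modeled with a few small Python helpers. This is only a sketch: the names `concat`, `bit_slice`, and `sign_extend` are hypothetical and not part of Ida Assembly, and values are represented as non-negative integers with explicit bit widths.

```python
def concat(*fields):
    """{X, Y, ...}: join (value, width) pairs, most significant field first."""
    value, width = 0, 0
    for field_value, field_width in fields:
        value = (value << field_width) | (field_value & ((1 << field_width) - 1))
        width += field_width
    return value, width


def bit_slice(x, hi, lo):
    """(X)[hi:lo]: bits lo through hi (inclusive), zero-indexed from the LSB."""
    return (x >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(x, from_bits, to_bits):
    """Sign-extend X from from_bits to to_bits by copying bit from_bits - 1 upward."""
    x &= (1 << from_bits) - 1
    if x >> (from_bits - 1):
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x


# The examples from the notation table:
print(bin(concat((0b10, 2), (0b11, 2), (0b011, 3))[0]))
print(bin(bit_slice(0b1100110111, 4, 0)))
print(bin(sign_extend(0b1001, 4, 8)))
```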

## Memory

In Ida Assembly, instruction memory and data memory are separate and have different attributes.

### Instruction Memory

Instruction memory is composed of 32-bit word values, where each word is addressed by a 24-bit address. This means that in order to go from the current instruction to the next instruction, the instruction address is incremented by 1. Here is an example diagram of instruction memory:

| ADDRESS | INSTRUCTION |
| --- | --- |
| 0x000000 | 0x39c00009 |
| 0x000001 | 0x68840004 |
| ... | ... |
| 0xfffffe | 0x784e0001 |
| 0xffffff | 0xc9c40002 |

### Data Memory

Data memory is composed of 16-bit half-word values, where each half word is addressed by a 25-bit address. This means that in order to allocate space on the stack for a word value, the stack pointer must be decremented by 2. Here is an example diagram of data memory:

| ADDRESS | DATA |
| --- | --- |
| 0x0000000 | 0x48fc |
| 0x0000001 | 0x5514 |
| ... | ... |
| 0x1fffffe | 0x67f9 |
| 0x1ffffff | 0x6a9e |

## Registers

Ida Assembly supports 32 general-purpose registers, usable in any register field of a given instruction. Additionally, there is the program counter `PC`, which contains the address of the instruction being fetched and executed.

| NUMBER | NAME | BITS | PURPOSE |
| --- | --- | --- | --- |
| `$0` | `$zero` | 32 | Always Zero |
| `$1` | `$jc` | 32 | Jump Condition |
| `$2` | `$sp` | 32 | Stack Pointer |
| `$3` | `$fp` | 32 | Frame Pointer |
| `$4` - `$5` | `$v0` - `$v1` | 32 | Return Values |
| `$6` | `$ra` | 32 | Return Address |
| `$7` - `$11` | `$a0` - `$a4` | 32 | Arguments |
| `$12` - `$31` | `$t0` - `$t19` | 32 | Temporaries |
| | `PC` | 24 | Program Counter |

The `PC`, `$sp`, `$fp`, and `$zero` registers always start at an initial value of zero. For all other registers, the initial values are undefined.

## Encodings

### R-Type Format

The R-Type format is the primary instruction format. Note that the field I (bit 27) is a toggle bit that indicates whether the instruction uses `$RI` or `IMM`.

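| TYPE | 31:28 | 27 | 26:22 | 21:17 | 16:5 | 4:0 |
| --- | --- | --- | --- | --- | --- | --- |
| R-Type | `OP` | I=0 | `$RD` | `$RS` | - | `$RI` |
| R-Type | `OP` | I=1 | `$RD` | `$RS` | `IMM`[16:5] | `IMM`[4:0] |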
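The field layout can be sketched as a pair of Python functions. The names `encode_r_type` and `decode_r_type` are hypothetical, and writing zeros into the unused bits 16:5 when I=0 is an assumption, since the format only marks them as unused. Under this layout, the word 0x39c00009 from the instruction memory diagram would decode as OP 3 with I=1, RD 7, RS 0, and IMM 9.

```python
def encode_r_type(op, rd, rs, ri=None, imm=None):
    """Pack R-Type fields into a 32-bit word; pass exactly one of ri or imm."""
    if (ri is None) == (imm is None):
        raise ValueError("pass exactly one of ri or imm")
    word = ((op & 0xF) << 28) | ((rd & 0x1F) << 22) | ((rs & 0x1F) << 17)
    if imm is not None:
        # I=1: the 17-bit immediate occupies bits 16:0.
        return word | (1 << 27) | (imm & 0x1FFFF)
    # I=0: the register number occupies bits 4:0 (bits 16:5 left as zero, an assumption).
    return word | (ri & 0x1F)


def decode_r_type(word):
    """Split a 32-bit word back into its R-Type fields."""
    fields = {
        "op": (word >> 28) & 0xF,
        "i": (word >> 27) & 0x1,
        "rd": (word >> 22) & 0x1F,
        "rs": (word >> 17) & 0x1F,
    }
    if fields["i"]:
        fields["imm"] = word & 0x1FFFF
    else:
        fields["ri"] = word & 0x1F
    return fields


print(hex(encode_r_type(3, 7, 0, imm=9)))
print(decode_r_type(0x68840004))
```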

### J-Type Format

The J-Type format is the instruction format for jumps. As in the R-Type format, the field I (bit 27) indicates whether the jump uses `$RI` or `ADDR`. Additionally, there is a jump code `JC` field, which indicates which type of jump is performed. The different types are listed later.

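| TYPE | 31:28 | 27 | 26:24 | 23:5 | 4:0 |
| --- | --- | --- | --- | --- | --- |
| J-Type | `OP` | I=0 | `JC` | - | `$RI` |
| J-Type | `OP` | I=1 | `JC` | `ADDR`[23:5] | `ADDR`[4:0] |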
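A matching sketch for the J-Type layout follows. The name `encode_j_type` is hypothetical, the OP value 15 is taken from the jump instruction listings later in this document, and zero-filling the unused bits 23:5 when I=0 is an assumption.

```python
JUMP_OP = 15  # OP value shared by all jump instructions


def encode_j_type(jc, ri=None, addr=None):
    """Pack J-Type fields into a 32-bit word; pass exactly one of ri or addr."""
    if (ri is None) == (addr is None):
        raise ValueError("pass exactly one of ri or addr")
    word = (JUMP_OP << 28) | ((jc & 0x7) << 24)
    if addr is not None:
        # I=1: the 24-bit absolute address occupies bits 23:0.
        return word | (1 << 27) | (addr & 0xFFFFFF)
    # I=0: the register number occupies bits 4:0.
    return word | (ri & 0x1F)


print(hex(encode_j_type(0, addr=0x000223)))  # unconditional jump to 0x000223
```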

## Instruction Set

### R-Type Instructions

#### Shift Left

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 00 | `SL` `$RD` `$RS` `$RI` | `$RD` ← `$RS` << `$RI`[4:0] |
| 00 | `SL` `$RD` `$RS` `IMM` | `$RD` ← `$RS` << `IMM`[4:0] |

This shift smears the least significant bit of the value being shifted: the vacated low-order bits are filled with copies of the original least significant bit. For example, here are some shifts (using 16-bit values instead of 32-bit values):

`0b0000000000001101 << 3 = 0b0000000001101111`
`0b0000000000011010 << 2 = 0b0000000001101000`

Note that this is not a logical left shift, and is instead the left-shifting equivalent of an arithmetic right shift.
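A minimal Python sketch of this smearing left shift is shown below. The function name `shift_left_smear` and the `width` parameter are hypothetical; `width` exists only so that the 16-bit examples above can be reproduced.

```python
def shift_left_smear(value, amount, width=32):
    """Shift left, filling vacated low bits with copies of the original LSB."""
    mask = (1 << width) - 1
    amount &= 0x1F                      # only bits [4:0] of the shift amount are used
    shifted = (value << amount) & mask
    if value & 1:
        shifted |= ((1 << amount) - 1) & mask   # smear a 1 into the low bits
    return shifted


print(format(shift_left_smear(0b0000000000001101, 3, width=16), "#018b"))
print(format(shift_left_smear(0b0000000000011010, 2, width=16), "#018b"))
```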

#### Shift Right

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 01 | `SR` `$RD` `$RS` `$RI` | `$RD` ← `$RS` >> `$RI`[4:0] |
| 01 | `SR` `$RD` `$RS` `IMM` | `$RD` ← `$RS` >> `IMM`[4:0] |

This shift smears the most significant bit of the value being shifted: the vacated high-order bits are filled with copies of the original most significant (sign) bit. For example, here are some shifts (using 16-bit values instead of 32-bit values):

`0b1111111100011101 >> 4 = 0b1111111111110001`
`0b0000000000011010 >> 2 = 0b0000000000000110`

Note that this is an arithmetic right shift.
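The same behavior can be sketched in Python on unsigned bit patterns. The function name `shift_right_arithmetic` and the `width` parameter are hypothetical, added only to reproduce the 16-bit examples above.

```python
def shift_right_arithmetic(value, amount, width=32):
    """Shift right, filling vacated high bits with copies of the original MSB."""
    mask = (1 << width) - 1
    amount &= 0x1F                      # only bits [4:0] of the shift amount are used
    value &= mask
    shifted = value >> amount
    if value >> (width - 1):
        shifted |= mask & ~(mask >> amount)     # smear a 1 into the high bits
    return shifted


print(format(shift_right_arithmetic(0b1111111100011101, 4, width=16), "#018b"))
print(format(shift_right_arithmetic(0b0000000000011010, 2, width=16), "#018b"))
```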

#### And

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 02 | `AND` `$RD` `$RS` `$RI` | `$RD` ← `$RS` & `$RI` |
| 02 | `AND` `$RD` `$RS` `IMM` | `$RD` ← `$RS` & `SignExtend32←17`(`IMM`) |

#### Inclusive Or

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 03 | `OR` `$RD` `$RS` `$RI` | `$RD` ← `$RS` \| `$RI` |
| 03 | `OR` `$RD` `$RS` `IMM` | `$RD` ← `$RS` \| `SignExtend32←17`(`IMM`) |

#### Reverse

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 04 | `REV` `$RD` `$RS` `$RI` | `$RD` ← `$RS` <> `$RI`[4:0] |
| 04 | `REV` `$RD` `$RS` `IMM` | `$RD` ← `$RS` <> `IMM`[4:0] |

Reverse swaps adjacent groups of bits, given a 5-bit reverse pattern. Specifically, if bit n of the pattern is 1, then adjacent groups of 2ⁿ bits are swapped. Here is an example of what this means, given an 8-bit number (with bits as ABCDEFGH) and a 3-bit reverse pattern:

• 0b000: A B C D E F G H (nothing changed)
• 0b001: B A D C F E H G (groups of 2⁰ = 1 bit were swapped)
• 0b010: C D A B G H E F (groups of 2¹ = 2 bits were swapped)
• 0b011: D C B A H G F E (groups of 1 bit, then groups of 2 bits were swapped)
• 0b100: E F G H A B C D (groups of 2² = 4 bits were swapped)
• 0b101: F E H G B A D C (groups of 1 bit, then groups of 4 bits were swapped)
• 0b110: G H E F C D A B (groups of 2 bits, then groups of 4 bits were swapped)
• 0b111: H G F E D C B A (groups of 1 bit, groups of 2 bits, then groups of 4 bits were swapped)
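The group swapping listed above can be sketched in Python as follows. The function name `reverse_bits` and the `width` parameter are hypothetical; `width` defaults to 32 and can be set to 8 to follow the ABCDEFGH example.

```python
def reverse_bits(value, pattern, width=32):
    """Swap adjacent groups of 2**n bits for every set bit n of the pattern."""
    value &= (1 << width) - 1
    group = 1                               # group size 2**n, starting at n = 0
    while group < width:
        if pattern & group:                 # bit n of the pattern has weight 2**n
            # Mask selecting the lower group of every adjacent pair of groups.
            low_mask = 0
            for start in range(0, width, 2 * group):
                low_mask |= ((1 << group) - 1) << start
            value = ((value & low_mask) << group) | ((value >> group) & low_mask)
        group <<= 1
    return value


# With width=8 and pattern 0b111 the bit order is fully reversed.
print(format(reverse_bits(0b11010010, 0b111, width=8), "#010b"))
```

With the full 32-bit width, a pattern of 0b11000 swaps bytes and then half words, which amounts to a byte-order reversal of the word.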

#### Exclusive Or

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 05 | `XOR` `$RD` `$RS` `$RI` | `$RD` ← `$RS` ^ `$RI` |
| 05 | `XOR` `$RD` `$RS` `IMM` | `$RD` ← `$RS` ^ `SignExtend32←17`(`IMM`) |

#### Add

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 06 | `ADD` `$RD` `$RS` `$RI` | `$RD` ← `$RS` + `$RI` |
| 06 | `ADD` `$RD` `$RS` `IMM` | `$RD` ← `$RS` + `SignExtend32←17`(`IMM`) |

Overflow should be ignored, and should require no special consideration.

#### Subtract

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 07 | `SUB` `$RD` `$RS` `$RI` | `$RD` ← `$RS` - `$RI` |
| 07 | `SUB` `$RD` `$RS` `IMM` | `$RD` ← `$RS` - `SignExtend32←17`(`IMM`) |

Overflow should be ignored, and should require no special consideration.

#### Multiply

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 08 | `MUL` `$RD` `$RS` `$RI` | `$RD` ← `$RS` * `$RI` |
| 08 | `MUL` `$RD` `$RS` `IMM` | `$RD` ← `$RS` * `SignExtend32←17`(`IMM`) |

Note that this keeps only the lower 32 bits of the multiply result, so overflow is ignored and requires no special consideration.

#### Divide

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 09 | `DIV` `$RD` `$RS` `$RI` | if(`$RI` != 0) `$RD` ← `$RS` / `$RI` |
| | | else if(`$RS` > 0) `$RD` ← MAX_VALUE |
| | | else if(`$RS` < 0) `$RD` ← MIN_VALUE |
| | | else `$RD` ← 0 |
| 09 | `DIV` `$RD` `$RS` `IMM` | if(`SignExtend32←17`(`IMM`) != 0) `$RD` ← `$RS` / `SignExtend32←17`(`IMM`) |
| | | else if(`$RS` > 0) `$RD` ← MAX_VALUE |
| | | else if(`$RS` < 0) `$RD` ← MIN_VALUE |
| | | else `$RD` ← 0 |

Division is performed as a regular signed division. If the divisor is zero, then the result should be the maximum positive 32-bit 2's complement value if the dividend is positive and the minimum negative 32-bit 2's complement value if the dividend is negative. This mimics rounding to positive or negative infinity when dividing by zero. Finally, zero divided by zero should be zero.
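The divide-by-zero rules can be sketched in Python as below. The function name `ida_div` is hypothetical, and two details are assumptions because the text does not state them: quotients are truncated toward zero, and a quotient that does not fit in 32 bits (the minimum value divided by -1) wraps around.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def ida_div(dividend, divisor):
    """Signed 32-bit division with saturating divide-by-zero results."""
    rs, ri = to_signed32(dividend), to_signed32(divisor)
    if ri != 0:
        quotient = abs(rs) // abs(ri)       # truncate toward zero (assumption)
        if (rs < 0) != (ri < 0):
            quotient = -quotient
        return quotient & 0xFFFFFFFF        # wrap to 32 bits (assumption)
    if rs > 0:
        return INT32_MAX
    if rs < 0:
        return INT32_MIN & 0xFFFFFFFF
    return 0


print(hex(ida_div(5, 0)))
print(hex(ida_div(-5 & 0xFFFFFFFF, 0)))
```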

#### Load Half Word

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 10 | `LH` `$RD` `$RI`(`$RS`) | `$RD` ← `SignExtend32←16`(`MEM`((`$RS` + `$RI`)[24:0])) |
| 10 | `LH` `$RD` `IMM`(`$RS`) | `$RD` ← `SignExtend32←16`(`MEM`((`$RS` + `SignExtend32←17`(`IMM`))[24:0])) |

#### Store Half Word

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 11 | `SH` `$RD` `$RI`(`$RS`) | `MEM`((`$RS` + `$RI`)[24:0]) ← `$RD`[15:0] |
| 11 | `SH` `$RD` `IMM`(`$RS`) | `MEM`((`$RS` + `SignExtend32←17`(`IMM`))[24:0]) ← `$RD`[15:0] |

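#### Load Word

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 12 | `LW` `$RD` `$RI`(`$RS`) | `$RD`[15:0] ← `MEM`((`$RS` + `$RI`)[24:0]) |
| | | `$RD`[31:16] ← `MEM`((`$RS` + `$RI` + 1)[24:0]) |
| 12 | `LW` `$RD` `IMM`(`$RS`) | `$RD`[15:0] ← `MEM`((`$RS` + `SignExtend32←17`(`IMM`))[24:0]) |
| | | `$RD`[31:16] ← `MEM`((`$RS` + `SignExtend32←17`(`IMM`) + 1)[24:0]) |

Note that load addresses do not have to be word-aligned. Even the maximum half-word address can be loaded from (in this case, the second half of the word is loaded from the minimum half-word address).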

#### Store Word

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 13 | `SW` `$RD` `$RI`(`$RS`) | `MEM`((`$RS` + `$RI`)[24:0]) ← `$RD`[15:0] |
| | | `MEM`((`$RS` + `$RI` + 1)[24:0]) ← `$RD`[31:16] |
| 13 | `SW` `$RD` `IMM`(`$RS`) | `MEM`((`$RS` + `SignExtend32←17`(`IMM`))[24:0]) ← `$RD`[15:0] |
| | | `MEM`((`$RS` + `SignExtend32←17`(`IMM`) + 1)[24:0]) ← `$RD`[31:16] |

Note that just like load addresses, store addresses do not have to be word-aligned. Again, even the maximum half-word address can be stored to (in this case, the second half of the word is stored to the minimum half-word address).
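The address wrap-around for word accesses can be sketched with a small Python model of data memory. The class name `DataMemory` and its methods are hypothetical, and reading never-written locations as zero is an assumption made only to keep the sketch short.

```python
ADDRESS_MASK = 0x1FFFFFF  # data memory uses 25-bit half-word addresses


class DataMemory:
    """Sparse model of half-word addressed data memory."""

    def __init__(self):
        self.cells = {}

    def load_half(self, address):
        # Unwritten locations read as zero here (an assumption).
        return self.cells.get(address & ADDRESS_MASK, 0)

    def store_half(self, address, value):
        self.cells[address & ADDRESS_MASK] = value & 0xFFFF

    def load_word(self, address):
        # Low half at the given address, high half at the next address.
        return self.load_half(address) | (self.load_half(address + 1) << 16)

    def store_word(self, address, value):
        self.store_half(address, value)            # bits [15:0]
        self.store_half(address + 1, value >> 16)  # bits [31:16]


memory = DataMemory()
memory.store_word(0x1FFFFFF, 0xDEADBEEF)  # second half wraps to address 0
print(hex(memory.load_half(0x1FFFFFF)), hex(memory.load_half(0)))
```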

#### Set Upper

| OP | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- |
| 14 | `STU` `$RD` `$RS` `$RI` | `$RD` ← {`$RI`[16:0], `$RS`[14:0]} |
| 14 | `STU` `$RD` `$RS` `IMM` | `$RD` ← {`IMM`, `$RS`[14:0]} |

Note that this operation combines the lower 15 bits of its first argument with the lower 17 bits of its second argument.
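The concatenation can be sketched in Python as follows. The function names `set_upper` and `split_constant` are hypothetical; `split_constant` shows one plausible way a 32-bit constant could be divided into the 15-bit and 17-bit pieces that this instruction recombines.

```python
def set_upper(rs, upper):
    """Return {upper[16:0], rs[14:0]} as a 32-bit value."""
    return ((upper & 0x1FFFF) << 15) | (rs & 0x7FFF)


def split_constant(constant):
    """Split a 32-bit constant into (lower 15 bits, upper 17 bits)."""
    constant &= 0xFFFFFFFF
    return constant & 0x7FFF, constant >> 15


low, high = split_constant(0xDEADBEEF)
print(hex(set_upper(low, high)))
```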

### J-Type Instructions

Conditional jump instructions depend on the jump condition register `$jc`. These jumps are intended to follow a subtraction instruction, but they do not have to.

`# jump if t0 >= t19`
`SUB $jc $t0 $t19 # jc = t0 - t19`
`JGE @GOTO        # (jc >= 0) <==> (t0 >= t19)`
` `
`# jump if t0 != 0`
`OR  $jc $t0 $zero # jc = t0`
`JNE @GOTO         # (jc != 0) <==> (t0 != 0)`

#### Jump

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 0 | `J` `$RI` | `PC` ← `$RI`[23:0] |
| 15 | 0 | `J` `ADDR` | `PC` ← `ADDR` |

Note that this is an unconditional jump.

#### Jump If Greater Than Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 1 | `JGT` `$RI` | if(`$jc` > 0) `PC` ← `$RI`[23:0] |
| 15 | 1 | `JGT` `ADDR` | if(`$jc` > 0) `PC` ← `ADDR` |

#### Jump If Equal To Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 2 | `JEQ` `$RI` | if(`$jc` == 0) `PC` ← `$RI`[23:0] |
| 15 | 2 | `JEQ` `ADDR` | if(`$jc` == 0) `PC` ← `ADDR` |

#### Jump If Less Than Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 3 | `JLT` `$RI` | if(`$jc` < 0) `PC` ← `$RI`[23:0] |
| 15 | 3 | `JLT` `ADDR` | if(`$jc` < 0) `PC` ← `ADDR` |

#### Jump If Less Than Or Equal To Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 4 | `JLE` `$RI` | if(`$jc` <= 0) `PC` ← `$RI`[23:0] |
| 15 | 4 | `JLE` `ADDR` | if(`$jc` <= 0) `PC` ← `ADDR` |

#### Jump If Not Equal To Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 5 | `JNE` `$RI` | if(`$jc` != 0) `PC` ← `$RI`[23:0] |
| 15 | 5 | `JNE` `ADDR` | if(`$jc` != 0) `PC` ← `ADDR` |

#### Jump If Greater Than Or Equal To Zero

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 6 | `JGE` `$RI` | if(`$jc` >= 0) `PC` ← `$RI`[23:0] |
| 15 | 6 | `JGE` `ADDR` | if(`$jc` >= 0) `PC` ← `ADDR` |

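#### Jump And Link

| OP | JC | SYNTAXES | RTL OPERATIONS |
| --- | --- | --- | --- |
| 15 | 7 | `JAL` `$RI` | tmp ← `$RI`[23:0] |
| | | | `$ra` ← {00000000, `PC` + 1} |
| | | | `PC` ← tmp |
| 15 | 7 | `JAL` `ADDR` | tmp ← `ADDR` |
| | | | `$ra` ← {00000000, `PC` + 1} |
| | | | `PC` ← tmp |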

Note that this is an unconditional jump. Also, notice that the next program counter value is `PC`+1, because instruction memory is word-addressed.

### After Each Instruction

After any instruction executes, the program counter `PC` advances by 1 (recall that the 24-bit instruction memory addresses are word addresses). The only exceptions to this rule are the jump instructions, which set the program counter explicitly. If the program counter overflows, it should wrap around silently.
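The program counter update can be sketched in Python as below. The names `step_pc` and `JUMP_TAKEN` are hypothetical. Two points are assumptions rather than statements from the text: a conditional jump whose condition is false falls through to `PC` + 1, and the return address written by jump-and-link wraps to 24 bits in the same way as the program counter.

```python
PC_MASK = 0xFFFFFF  # the program counter is 24 bits wide

# Jump codes (JC) mapped to conditions on the signed jump condition value.
JUMP_TAKEN = {
    0: lambda jc: True,     # J
    1: lambda jc: jc > 0,   # JGT
    2: lambda jc: jc == 0,  # JEQ
    3: lambda jc: jc < 0,   # JLT
    4: lambda jc: jc <= 0,  # JLE
    5: lambda jc: jc != 0,  # JNE
    6: lambda jc: jc >= 0,  # JGE
    7: lambda jc: True,     # JAL
}


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def step_pc(pc, jump_code=None, jc_value=0, target=0):
    """Return (next PC, new return address or None) after one instruction."""
    fallthrough = (pc + 1) & PC_MASK        # wraps silently on overflow
    if jump_code is None:                   # not a jump instruction
        return fallthrough, None
    taken = JUMP_TAKEN[jump_code](to_signed32(jc_value))
    link = fallthrough if jump_code == 7 else None
    return (target & PC_MASK if taken else fallthrough), link


print(step_pc(0xFFFFFF))            # non-jump at the last address wraps to 0
print(step_pc(0x10, 7, 0, 0x40))    # jump and link
```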

## Assembly Details

Ida Assembly is entirely case-insensitive. Spaces, tabs, commas, and parentheses can be used to delimit instruction arguments. Comments begin with a number sign (#) and continue until the end of their line. Number constants can be in unsigned hex (such as 0xD12F), unsigned binary (such as 0b1010), signed decimal, or unsigned decimal.
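These lexical rules can be sketched in Python as follows. The function names `tokenize` and `parse_number` are hypothetical, and lowercasing the whole line is just one way to model case-insensitivity.

```python
import re


def tokenize(line):
    """Strip the comment, fold case, and split on spaces, tabs, commas, and parentheses."""
    code = line.split("#", 1)[0]
    return [token for token in re.split(r"[ \t,()]+", code.lower()) if token]


def parse_number(token):
    """Parse a hex (0x...), binary (0b...), or decimal number constant."""
    token = token.lower()
    if token.startswith("0x"):
        return int(token[2:], 16)
    if token.startswith("0b"):
        return int(token[2:], 2)
    return int(token, 10)   # signed or unsigned decimal


print(tokenize("LH $t0, 4($sp)  # load a half word"))
print(parse_number("0xD12F"), parse_number("0b1010"), parse_number("-12"))
```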

The final feature of the syntax is labels. Labels directly reference 24-bit instruction addresses. They can be used to refer to any line, and they can be referenced inside the `ADDR` field of jump instructions, where they must begin with an at symbol. In general, labels can include any character that is not an argument delimiter or comment character.

`0x000001:       J @GOTO # uses: 0x000223`
`                ...`
`0x000223: GOTO: ...`

The standard behavior for the Ida Assembly assembler is to abort if any value is too wide to fit into its slot within its instruction, rather than to fit only part of the value into the slot.
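A range check of this kind might look like the Python sketch below. The function name `pack_field` is hypothetical, and the accepted range is an assumption: because constants may be written as signed or unsigned, the sketch accepts any value that fits the slot under either interpretation.

```python
def pack_field(value, bits):
    """Return value as a bits-wide field, or raise if it cannot fit."""
    lowest = -(1 << (bits - 1))     # most negative signed value
    highest = (1 << bits) - 1       # largest unsigned value
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in a {bits}-bit field")
    return value & highest          # two's complement encoding of negatives


print(hex(pack_field(-1, 17)))      # a 17-bit IMM field
print(hex(pack_field(0x223, 24)))   # a 24-bit ADDR field
```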