Venus Reference

This page (WIP) covers usage of the Venus CLI and web interfaces.

Venus Web Interface

The Venus web interface is available at https://venus.cs61c.org.

The "Editor" Tab

  • Enter your code in the "Editor" tab
  • Programs start at the first line of assembly code regardless of the label, unless the main function is marked with .globl (see below). That means that the main function must be put first if it is not declared .globl.
    • Note: Sometimes, we want to pre-allocate some items in memory before the program starts executing. Since this allocation isn't actual code, we can place it before the main function.
  • Programs end with an ecall whose argument register a0 holds the value 10, which signals the program to exit. The ecall instruction is analogous to a system call and lets us do things such as print to the console or request chunks of memory from the heap.
    • The exception to the ecall exit rule is if your main is marked global, in which case you can either exit with an ecall or return the program's exit code in a0 (like the return value of the main function in C).
  • Labels end with a colon (:).
  • Comments start with a pound sign (#).
  • You CANNOT put more than one instruction per line.
  • You can save with the usual keyboard shortcut (Cmd + s or Ctrl + s, depending on your platform). The name of the active file is shown at the bottom of the editor. If the file is unnamed, saving will prompt you to download it; you will also get this prompt if you hit save several times in quick succession.
  • When you are done editing, click the "Simulator" tab to prepare for execution.
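
As a minimal sketch of the rules above (the label name value and the constant 42 are illustrative, not part of Venus), the following program pre-allocates a word in memory before main, prints it, and exits with an ecall:

.data
value: .word 42     # pre-allocated in memory before execution starts

.text
main:
    la t0, value    # t0 = address of the pre-allocated word
    lw a1, 0(t0)    # a1 = 42, the integer to print
    li a0, 1        # syscall number for printing an integer
    ecall
    li a0, 10       # syscall number for exit
    ecall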

The "Simulator" Tab

  • When you first reach the "Simulator" tab from the "Editor" tab, you'll see an "Assemble and Simulate from Editor" button that assembles (or reassembles) your code and reports any errors in it.
  • Breakpoints can be set by clicking on the desired line of code before you hit "Run". You can also insert an ebreak instruction in your code to automatically force Venus to pause.
    • You can simulate conditional breakpoints by placing an ebreak within a branch, like so:
      # ...code
      
      li t0, 3 # Stops execution if a0 == 3
      bne a0, t0, after_breakpoint
      ebreak
      
      after_breakpoint:
      # ...code
      
  • The "Run" button runs your program until you reach a breakpoint, much like GDB's "run" command.
  • The "Step" button will go to the next assembly instruction.
  • The "Prev" button reverts a single assembly instruction.
  • The "Reset" button ends the current run, clears all breakpoints, and returns to the beginning of execution.
  • "Dump" gives you a dump of the hexadecimal representation of every instruction in your program.
  • You can see the contents of registers, memory, and caches through the sidebar on the right.
    • You can manually poke values in registers to affect the execution of your program.
    • In the "Memory" tab, use the "Jump to" or "Address" boxes to quickly navigate to a certain address in your program.
    • For all menus, use "Display Settings" at the bottom of the pane to toggle between hex, ASCII, two's complement decimal, and unsigned decimal interpretations of your values.
  • The "Trace" button will dump values in registers and memory based on a specified format, described in more detail in the traces section.

The "Chocopy" Tab

You can ignore this tab. Chocopy is a variant of Python used in CS 164 (Programming Languages and Compilers), so it's not really relevant to this course. The long and short of it is that you can actually compile a subset of Python into RISC-V assembly that you can then execute.

The "Venus" Tab

Most of the basic functionality you need for writing assembly programs exists in the "Editor" and "Simulator" tabs. The "Venus" tab exposes a terminal-like interface that lets us work with multiple files to build more complex programs.

Terminal Commands

  • Typing help in the terminal will show the list of commands (you may need to scroll the terminal to the right to see all of them), and help COMMAND will give you information about COMMAND.
  • A few common commands are listed here:
    • File manipulation:
      • ls: Lists the contents of the specified directory (or the current directory if no arguments are provided).
      • cd: Changes the directory your terminal is in. Like with a normal terminal, you can use Tab to autocomplete the names of subfolders.
      • touch: Creates a new empty file that we can modify later.
      • xxd: Prints the contents of a file in hexadecimal.
      • Other useful commands (similar to what you'd find in a standard shell): cat, clear, cp, mkdir, mv, rm, pwd, zip, unzip
    • Venus editor and simulator:
      • edit: Opens up the specified file in the "Editor" tab (for example, edit file.s). The Cmd + s and Ctrl + s shortcuts save the file to the virtual filesystem, so your changes are kept if you later edit a different file. To make sure your work is not lost, we still recommend that you periodically save local copies of every file. If you've mounted your local filesystem, saving will update the files in your local filesystem automatically.
      • run: Runs the provided assembly file(s) to completion in the terminal. Output will be displayed in the terminal as well.
      • vdb: Links and assembles the provided assembly file(s) and opens the result in the "Simulator" tab for debugging.
    • Connecting to a local file system:
      • upload: Opens a prompt to upload files from our local computer to the virtual filesystem. You can hold down the Cmd or Ctrl keys to select multiple files to upload. Once a file is uploaded, we can edit it in the Venus editor using the edit command.
      • download: Downloads the specified files locally. You can specify multiple files to download, e.g. download file1.s file2.s, and each one will have a corresponding prompt to ask you where to download the files.
      • mount: "Mounts" your local file system to the Venus virtual file system. This command takes in the URL at which the Venus jar webserver is running, followed by the name you want to assign to the mounted folder in the virtual file system. For example, if you ran the Venus jar with java -jar tools/venus.jar . -dm --port 6162 on your local machine, and wanted the mounted folder to be accessible at vmfs in the virtual file system, you would run mount http://localhost:6162 vmfs in the Venus terminal.
        • If you omit the --port flag, the port defaults to 6161. Running mount local will default to this port.
        • If you omit the path argument to the mount command in the Venus terminal, the folder is mounted at drive by default.
        • You may be prompted to provide an encryption key in the browser. The key should be shown in the last line of the Java server's terminal output, which you should then copy/paste. You can also provide the key within the Venus web terminal, with something like mount local vmfs <encryption key>.
      • umount: "Unmounts" a folder that was mounted via the mount command.

Passing arguments to your program

Use the "Simulator Default Args" field to set the arguments to your program. These will then be reflected when you run your program in the "Simulator" tab: a0 will be initialized to the equivalent of argc and a1 will be initialized to the equivalent of argv.
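
The following sketch shows one way to read these values, assuming "Default Reg States" is on so that a0 holds argc when execution begins. It prints argc and exits. Note that argc is moved out of a0 first, because a0 is then reused for the syscall number.

main:
    mv a1, a0       # a1 = argc, the integer to print (this overwrites argv)
    li a0, 1        # syscall number for printing an integer
    ecall
    li a0, 10       # syscall number for exit
    ecall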

Settings

Configuration options for Venus can be found in the "Settings" pane under the "Venus" tab.

General

  • "Simulator Default Args" - passes arguments to your program
  • "Text Start" - identifies the start of the text segment
  • "Max History" - specifies the maximum number of execution steps that can be reverted; leave negative for no limit
  • "Aligned Addressing" - if on, forces all memory accesses to be aligned to the size of the data type, and raises a runtime error if a memory access is not aligned
  • "Mutable Text" - if on, allows the text segment (where instructions are stored) to be overwritten during the execution of your program
  • "Only Ecall Exit" - if on, requires main to terminate the program via ecall, as opposed to terminating when the PC exceeds the end of the text section
  • "Default Reg States" - if on, initializes a0 to argc, a1 to argv, gp to the start of the static section of memory, sp to the top of the stack, and ra to the return address of the main function if a global main label is found; if off, all registers are initialized to 0
  • "Allow Access" - if on, allows memory accesses to occur at addresses between the stack and heap
  • "Max number of steps" - sets the limit for the number of steps your program can execute; leave negative for no limit
  • "Dark Mode" - activates dark mode
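
As an illustration of "Aligned Addressing", the sketch below performs one aligned and one unaligned word load (the label buf and its contents are hypothetical). Going by the description above, the second load would be expected to raise a runtime error when the setting is on.

.data
buf: .word 0x11223344, 0x55667788

.text
main:
    la t0, buf
    lw t1, 0(t0)    # aligned: the address is a multiple of 4
    lw t2, 1(t0)    # unaligned: the address is not a multiple of 4
    li a0, 10       # syscall number for exit
    ecall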

Calling Convention

Toggles the Calling Convention Checker (see Tools for explanation). The checker's output will be dumped into the simulator console when enabled.

Tracer

Allows register and memory values to be printed during program execution. See traces for more details.

The Venus CLI

The latest version of the Venus CLI can be downloaded at https://venus.cs61c.org/jvm/venus-jvm-latest.jar. It requires a working Java install to run.

Passing arguments to your program

Everything passed after the file name will be passed to your program as command line arguments. For example, if you wanted to pass arguments arg1, arg2, and arg3 to test.s, you would run the following:

java -jar venus.jar test.s arg1 arg2 arg3

a0 will be initialized to the equivalent of argc and a1 will be initialized to the equivalent of argv.

To provide simulator or assembler options to Venus, place your flags between the JAR and the name of the assembly file. For example, if you wanted to run test.s with immutable text (-it) and the calling convention checker (-cc), you would run this command:

java -jar venus.jar -it -cc test.s arg1 arg2 arg3

Note: Flags to configure Venus assembly or simulation behavior can also be passed after the name of the file. Such flags are consumed by the simulator and not passed to the program; this behavior may change in the future with improvements to Venus's argument parser. If you wish to pass an argument to your program that shares a name with a Venus flag, add -- before your program arguments. For example, to pass the string "-it" as the first argument to test.s, you would run:

java -jar venus.jar test.s -- -it

Common Venus Errors

  • Ran for more than max allowed steps!: Venus will automatically terminate if your program runs for too many steps. This is expected for large MNIST-sized inputs, and you can work around it with the -ms flag. If you're getting this for small inputs, you might have an infinite loop.
  • Attempting to access uninitialized memory between the stack and heap.: Your code is trying to read or write to a memory address between the stack and heap pointers, which is causing a segmentation fault. Check that you are allocating enough memory, and that you are accessing the correct addresses.
  • The magic value for this malloc node is incorrect! This means you are overriding malloc metadata OR have specified the address of an incorrect malloc node!: Your code is modifying the metadata of a malloc node. This error can come from any of the alloc or free commands, because Venus implements malloc as a linked list. The metadata sits right below (at a lower address than) the pointer to the allocated block, so writing to the wrong location may corrupt it. Venus can detect some of these corruptions, which is why you get this error. Check that you are correctly indexing any malloced data and that you are not writing outside the bounds of what you allocated.
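
One plausible way to hit the second error is to write below the stack pointer without reserving space first. The sketch below is an assumption about a typical cause, not the only one, and func is a hypothetical function name. With "Allow Access" off, the store would be expected to land in the region between the stack and the heap.

func:
    sw ra, -4(sp)   # Bug: this address lies below sp and was never reserved
    ret

Reserving the space before using it avoids the problem:

func:
    addi sp, sp, -4 # reserve one word on the stack
    sw ra, 0(sp)
    # ...code
    lw ra, 0(sp)
    addi sp, sp, 4  # release the word
    ret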

Tools

Calling Convention Checker

The RISC-V calling convention specifies which registers are saved by the caller of a function, and which are saved by the callee. Bugs resulting from a failure to comply with the calling convention can be difficult to track down on their own, so Venus provides the -cc (or --callingConvention) flag to automatically detect certain kinds of calling convention errors.

Before you use the calling convention checker, you should be aware of two things:

  1. Venus cannot detect all calling convention violations - the fact that function calls in assembly are essentially just jumps to labels means Venus has to be very conservative in choosing what to detect as a function call. The CC checker serves as an excellent sanity check (much like Valgrind does for C memory bugs), but there will be some bugs that you ultimately need to figure out manually.
  2. Related to the above, the CC checker will only reliably examine calling convention violations within the body of functions exported with the .globl directive. You may sometimes see errors reported within functions that aren't exported, but this is usually because they are being called by a function that's declared .globl.

Violation Messages

There are currently three types of error messages that the CC checker can produce, each explained below.

In each example, func refers to the function where the error message was reported. You can run these examples yourself in Venus by modifying the following code snippet:

.globl func
main:
    jal func
    li a0, 10 # Code for exit ecall
    ecall

func:
    # Paste the example definition of func

"Setting of a saved register (s0) which has not been saved!"

func overwrote a callee-saved register (any of s0 through s11). To fix this, make sure func saves every callee-saved register it overwrites: in the prologue, make space on the stack and store the registers in memory; in the epilogue, restore their values and then restore the stack pointer.

Example with error:

func:
    li s0, 100 # === Error reported here ===
    li s1, 128 # === Error reported here ===
    ret

Fixed example:

func:
    # BEGIN PROLOGUE
    # Each clobbered register (4 bytes each) needs to be stored
    addi sp, sp, -8
    sw s0, 0(sp)
    sw s1, 4(sp)
    # END PROLOGUE
    li s0, 100
    li s1, 128
    # BEGIN EPILOGUE
    lw s1, 4(sp)
    lw s0, 0(sp)
    addi sp, sp, 8
    # END EPILOGUE
    ret

"Save register s0 not correctly restored before return! Expected <hex value>, Actual <hex value>"

The value of s0 in the function that called func was different before and after func was called. This error message complements the previous one, but will instead be reported on the line of the ret (or jalr ra) instruction.

Example with error:

func:
    # BEGIN PROLOGUE
    addi sp, sp, -8
    sw s0, 0(sp)
    sw s1, 4(sp)
    # END PROLOGUE
    li s0, 100
    li s1, 128
    # BEGIN EPILOGUE
    # Forgot to restore the values of s0 and s1!
    addi sp, sp, 8
    # END EPILOGUE
    ret # === Error reported twice here ===

Fixed example: Same as for the previous error. Just make sure to include both the prologue and epilogue.

"Usage of unset register t0"

func attempted to read from a register that wasn't set by the caller, and wasn't yet initialized within the body of func. Even if the function that calls func set values in a register like t0 or s0, attempting to use their values within func would violate an abstraction barrier: func shouldn't need to know anything about the values of registers in its caller, with the exception of the argument registers, stack pointer, and return address.

Example with error:

func:
    mv a0, t0 # === Error reported here ===
    ret

Fixed example:

This sort of error usually originates due to a typo (the instruction is reading from the wrong register, or the order of arguments to the instruction was flipped accidentally), or a violation of abstraction barriers as described above. Fixes will vary depending on what the function was intended to do.
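
As one possible fix, assuming func was simply meant to return a constant through a temporary (an assumption made for illustration), t0 can be set inside func before it is read:

func:
    li t0, 5        # t0 is now initialized within func
    mv a0, t0       # reading t0 is fine because func set it
    ret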

Traces

You can set "traces" in both the web and CLI versions of Venus to print values in registers and memory during the execution of your program. In the web interface, you can enable traces in the "Tracer" part of the Settings pane; on the command line, you can enable and customize traces with the --trace and --tracepattern flags.

The trace patterns support a few special symbols, somewhat like a C format string. They are as follows:

  • \t: tab indent
  • \n: newline
  • %0% through %31%: the value in the corresponding register
  • %line%: the line number of the code being executed
  • %pc%: the program counter of the current instruction
  • %inst%: the code of the current instruction
  • %output%: the output of an ecall message
  • %decode%: the decoded machine code of the current instruction

Writing larger RISC-V programs

Once we begin to write more complex RISC-V programs, it will become necessary to split our code into multiple files and debug/test them individually. The "Venus" tab of the web editor provides access to a terminal (and corresponding virtual filesystem) that lets us edit and test programs built from multiple files.

Note: To ensure that you don't lose any of your work, make sure to save local copies of your files frequently. To be doubly sure, also save your local work with git, and push those saves frequently!

Working with Multiple Files

The .globl directive identifies functions that we want to export to other files, similar to including a function in a header file in C.

Venus supports the .import directive (not technically part of the RISC-V spec) to access other assembly files, similar to the C #include directive. It will only make labels marked .globl available.
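
Below is a minimal sketch of the two directives working together, using two hypothetical files, utils.s and main.s, in the same directory. The file names, the twice function, and the unquoted relative path passed to .import are assumptions made for illustration.

utils.s:

.globl twice
twice:
    add a0, a0, a0  # return 2 * a0
    ret

main.s:

.import utils.s

.globl main
main:
    li a0, 21
    jal twice       # a0 = 42 on return
    mv a1, a0       # the integer to print
    li a0, 1        # syscall number for printing an integer
    ecall
    li a0, 10       # syscall number for exit
    ecall

Here main is marked .globl so that execution starts at main regardless of where the imported code is placed.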

Behavior not defined in the RISC-V spec

Null Pointers

Unlike most real-world programs (and the RISC-V spec), dereferencing the null pointer DOES NOT segfault. This is because by default, the start of the text segment is set to 0x0000_0000, whereas a real system would likely set it to something like 0x10000_0000. This value can be configured in the settings pane of the web interface.
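
To see this behavior, the sketch below loads a word from address 0. With the default "Text Start" of 0x0000_0000, the load would be expected to read from the text segment instead of faulting.

main:
    lw t0, 0(x0)    # load from address 0 (the null pointer)
    li a0, 10       # syscall number for exit
    ecall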

System Calls

The ecall instruction is used to perform system calls or request other privileged operations such as accessing the file system or writing output to console.

To perform a syscall, load the appropriate syscall number into the a0 register, then place the syscall's arguments in order in the remaining argument registers (a1, a2, etc.) as needed. For example, the following assembly snippet would print the number 1024 to stdout:

li a0, 1    # syscall number for printing integer
li a1, 1024 # the integer we're printing
ecall       # issue system call

Syscalls that return values will place the value in a0, as with any other function.

Note: The calling convention for RISC-V on Linux (described in the Linux manpages under "Architecture calling conventions") actually specifies that the syscall number be passed through the a7 register. Venus uses a0 for simplicity, and follows in the tradition of the SPIM simulator.

List of System Calls

See the Venus wiki for the full list of available calls.