Lab 9: Parallelism III - Open MPI

Deadline: Thursday, August 3, 11:59:59 PM PT

Setup

You must complete this lab on the hive machines (not your local machine). See Lab 0 if you need to set up the hive machines again.

In your labs directory on the hive machine, pull any changes you may have made in past labs:

git pull origin main

Still in your labs directory on the hive machine, pull the files for this lab with:

git pull starter main

If you run into any git errors, please check out the common errors page.


Open MPI

The Open MPI project provides C libraries for writing programs that run across multiple processes: when we launch such a program, Open MPI creates the requested number of processes and runs a copy of the code on each one. Here is a list of the most important functions for this class (a short sketch using these functions follows the list):

  • int MPI_Init(int* argc, char*** argv) should be called at the start of the program, passing in the addresses of argc and argv.
  • int MPI_Finalize() should be called at the end of the program.
  • int MPI_Comm_size(MPI_Comm comm, int *size) gets the total number of processes running the program, and puts it in size. For this class, pass MPI_COMM_WORLD as comm.
  • int MPI_Comm_rank(MPI_Comm comm, int *rank) gets the ID of the current process (from 0 to the total number of processes - 1) and puts it in rank. Again, pass MPI_COMM_WORLD as comm.
  • int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm) (Man page)
    • buf is where the message to send should be stored
    • count is the number of elements within the message to send
    • datatype is the datatype of each element within the message
    • dest is the process ID of the recipient of the message
    • tag is the tag of the message
      • For the scope of this lab, feel free to use 0.
    • comm is the communication context of the message
      • For the scope of this class, use MPI_COMM_WORLD
  • int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status) (Man page)
    • buf is used to store the received message(s)
    • count is the maximum number of elements to receive
    • datatype is the datatype of each element within the received message
    • source is the process ID that the message must come from
      • If you want to receive a message from any source, set the source to be MPI_ANY_SOURCE.
      • The actual source of the message can be found in the MPI_SOURCE field of the returned status struct.
    • tag is the tag that the message must be sent with
      • If you want to receive a message with any tag, set the tag to be MPI_ANY_TAG.
    • comm is the communication context that the message must be sent within
      • For the scope of this class, use MPI_COMM_WORLD
    • status will store a status struct with additional information
      • If you don’t need the information in the status struct (e.g. because you already know the source of the message), set the status address to MPI_STATUS_IGNORE.
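
To see how these functions fit together, here is a minimal sketch (not part of the lab starter files) in which every process other than process 0 sends its rank to process 0, and process 0 receives the messages using MPI_ANY_SOURCE and reads the actual sender out of the status struct:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    // Start up Open MPI; every process runs its own copy of this program.
    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's ID

    if (rank != 0) {
        // Every non-zero process sends its rank (a single int) to process 0,
        // with tag 0 as suggested above.
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        // Process 0 collects one message from each other process. The messages
        // can arrive in any order, so receive from MPI_ANY_SOURCE and recover
        // the sender from the MPI_SOURCE field of the status struct.
        for (int i = 1; i < size; i++) {
            int received;
            MPI_Status status;
            MPI_Recv(&received, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            printf("Process 0 received %d from process %d\n",
                   received, status.MPI_SOURCE);
        }
    }

    // Shut down Open MPI before exiting.
    MPI_Finalize();
    return 0;
}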

For Open MPI, make sure you use mpicc (instead of gcc) to compile and mpirun to run your programs.
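
For example, if the sketch above were saved as sketch.c (a hypothetical file name, not part of the starter files), you could compile and run it on two processes with:

mpicc sketch.c -o sketch
mpirun -np 2 ./sketch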

Exercise 1: Hello Open MPI

  1. In ex1.c, use the functions above to write a program that prints Hello world from multiple processes. For this exercise, since there's only one type of message, feel free to use 0 as the tag and MPI_COMM_WORLD as comm.
  2. Compile the program with mpicc ex1.c -o ex1
  3. Run the program with mpirun ./ex1 [num tasks], where [num tasks] is the number of tasks to run.
    • You can set the number of processes to use with mpirun -np [num processes] ./ex1 [num tasks]
    • If you choose to run with more processes than physical cores (4 for the hive machines), you must pass the --oversubscribe flag (e.g. mpirun --oversubscribe -np 100 ./ex1 100)
  4. Play around with different numbers of processes and tasks and see how the output changes!

Feel free to refer to discussion 10 for an example of writing Open MPI code! The code structure will be extremely similar.


Submission

Save, commit, and push your work, then submit to the Lab 9 assignment on Gradescope.