Worker registration

Table of contents

  1. Registration
  2. Heartbeats

In this part, you will implement worker registration: a worker connects to the coordinator gRPC server and requests a unique ID, which the coordinator uses to identify that worker from then on. You will also implement heartbeat messages that let a registered worker signal to the coordinator that it is still alive and working.

Registration

Implement the Register RPC for registering workers and assigning them a unique ID (this can be done by using a simple counter). Keep your IDs as small as possible and nonnegative since these IDs are used to determine the port (an unsigned 16-bit integer) that the workers listen on. You will need to modify src/coordinator/mod.rs. For an example of how to implement an RPC call, take a look at the Example RPC stub.

Keep in mind that your RPC stub needs to be asynchronous and thread safe, so it may help to take a look at the documentation for std::sync::Arc and tokio::sync::Mutex. We do not expect you to use fine-grained synchronization, so feel free to protect all of your mutable state with a single mutex. The suggested way to do this is shown below:

use std::sync::Arc;
use tokio::sync::Mutex;

pub struct CoordinatorState {
  // Put mutable state here
}

pub struct Coordinator {
  // Put immutable state here
  inner: Arc<Mutex<CoordinatorState>>,
}

Note: If you choose to use multiple mutexes to improve performance, you will need to be incredibly careful that there are no race conditions in your code. We will not help debug logic with multiple mutexes and will instead suggest you switch over to using a single mutex if you run into issues.
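As a rough illustration of the counter approach with a single mutex, here is a minimal sketch. It makes several assumptions that are not part of the starter code: the ID type is taken to be u32, and the names next_worker_id, new, and assign_worker_id are hypothetical. To stay runnable without external crates it uses std::sync::Mutex and a plain synchronous method; in the actual async RPC stub you would likely use tokio::sync::Mutex and acquire the lock with .lock().await instead.

```rust
use std::sync::{Arc, Mutex};

/// Mutable coordinator state, all guarded by one mutex.
pub struct CoordinatorState {
    /// Next ID to hand out (hypothetical field name).
    next_worker_id: u32,
}

/// Cheap to clone: every clone shares the same underlying state.
#[derive(Clone)]
pub struct Coordinator {
    inner: Arc<Mutex<CoordinatorState>>,
}

impl Coordinator {
    pub fn new() -> Self {
        Coordinator {
            inner: Arc::new(Mutex::new(CoordinatorState { next_worker_id: 0 })),
        }
    }

    /// Hypothetical helper that a Register handler could call.
    /// IDs start at 0 and grow by 1, so they stay small and nonnegative.
    pub fn assign_worker_id(&self) -> u32 {
        // Reading and incrementing happen under the same lock,
        // so two concurrent callers can never receive the same ID.
        let mut state = self.inner.lock().unwrap();
        let id = state.next_worker_id;
        state.next_worker_id += 1;
        id
    }
}
```

Because the read and the increment happen while the lock is held, concurrent registrations each observe a distinct counter value.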

To test your implementation, we recommend adding log::info! statements to your coordinator stub to check that the coordinator receives a registration request whenever a new worker is started and that each new worker receives a unique ID.

You do not need to persist worker IDs. Crashed workers are restarted by starting a new worker process, which should receive a new ID when it registers with the coordinator.

Heartbeats

Implement the Heartbeat RPC. The coordinator should keep track of the time of the most recent heartbeat from each registered worker. While we will not be testing this explicitly, it will be necessary for implementing fault tolerance in later parts of the homework.
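One possible way to track the most recent heartbeat per worker is a map from worker ID to timestamp, kept inside the mutex-protected state. The sketch below is only an illustration under assumptions: the u32 ID type, the last_heartbeat field, and the record_heartbeat and last_heartbeat methods are hypothetical names, and std::sync::Mutex stands in for tokio::sync::Mutex so that the example compiles without external crates.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::Instant;

/// Mutable coordinator state, all guarded by one mutex.
pub struct CoordinatorState {
    /// Time of the most recent heartbeat per worker ID (hypothetical field).
    last_heartbeat: HashMap<u32, Instant>,
}

pub struct Coordinator {
    inner: Arc<Mutex<CoordinatorState>>,
}

impl Coordinator {
    pub fn new() -> Self {
        Coordinator {
            inner: Arc::new(Mutex::new(CoordinatorState {
                last_heartbeat: HashMap::new(),
            })),
        }
    }

    /// Hypothetical helper that a Heartbeat handler could call.
    /// Overwrites any earlier timestamp stored for this worker.
    pub fn record_heartbeat(&self, worker_id: u32) {
        let mut state = self.inner.lock().unwrap();
        state.last_heartbeat.insert(worker_id, Instant::now());
    }

    /// Returns the last time this worker was heard from, if ever.
    pub fn last_heartbeat(&self, worker_id: u32) -> Option<Instant> {
        let state = self.inner.lock().unwrap();
        state.last_heartbeat.get(&worker_id).copied()
    }
}
```

Storing an Instant rather than a wall-clock time means later code can compare timestamps without being affected by system clock adjustments.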

To test if your implementation works, we recommend adding log::info! statements in your heartbeat RPC to verify that the coordinator receives heartbeats periodically after starting a new worker and that these heartbeats stop after the worker has been terminated.