Rust Potpourri

class: center, middle

# Rust Potpourri

Rahul Kumar, Edward Zeng

---

# Agenda

1. Stack/heap allocations and boxing
2. Traits
3. Generics
4. From/Into
5. Dereferencing
6. Reference counting
7. Destructors
8. Traits and bounds
9. Option/Result composition
10. Iterators

---

# Goals

* To explain things you may have used in HW 3 without fully understanding them
* To better prepare you for HW 5

???

We also want to introduce terminology to you, so you can
Google for things more easily.

---

# Stack and Heap Allocation

How to allocate something on the stack?

- Just make a local variable!

How to allocate something on the heap?

- Can't call `malloc`. Not allowed in __safe Rust__ (the type of Rust you've been
learning).

---

# Heap Allocation

Method 1: use unsafe Rust!

```
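use std::alloc::{alloc, dealloc, Layout};
unsafe {
    let layout = Layout::new::<i32>();   // size and alignment of one i32
    let ptr = alloc(layout) as *mut i32; // raw heap allocation
    assert!(!ptr.is_null());             // allocation can fail
    ptr.write(10);
    dealloc(ptr as *mut u8, layout);     // we must free it ourselves
}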
```

Probably not a good idea to write too much unsafe Rust code. With unsafe Rust,
lots of our compiler guarantees are gone, so it is easy to make mistakes.

(HW 5 will not allow unsafe code.)

---

# Heap Allocation

Method 2: use `Box` (or `Arc`/`Rc`)

```
let a = Box::new(10);
assert_eq!(*a, 10); // use * to get the inner value
```
--

`Box` uses some unsafe code to allocate space but wraps everything under a safe
interface.

(Aside: any library type that allocates data on the heap, like `Vec` and
`String`, must run some unsafe code internally.)

---
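
# Heap Allocation

One place where `Box` is needed in practice is a recursive type, whose size would otherwise be unbounded. The sketch below is only an illustration: `List`, `sum`, and `one_two_three` are hypothetical names, not part of any homework.

```rust
// A singly linked list of integers. Each `Cons` node owns the rest of
// the list through a `Box`, so the tail lives on the heap.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Walks the list and adds up every element.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

// Builds the list 1 -> 2 -> 3.
fn one_two_three() -> List {
    List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    )
}
```

Without the `Box`, the compiler rejects `List` because it cannot compute a finite size for it.

---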

# Traits

**Traits** define shared behavior. They are similar to interfaces in other languages.

Define a trait:

```
trait Shape {
  fn area(&self) -> i64;
  fn perimeter(&mut self) -> i64;
}
```

---

# Traits

Implement a trait on a type:

```
struct Rect {
  x0: i64,
  y0: i64,
  x1: i64,
  y1: i64,
}

// Assumes that x1 >= x0 and y1 >= y0
impl Shape for Rect {
  fn area(&self) -> i64 {
    (self.x1 - self.x0) * (self.y1 - self.y0)
  }

  fn perimeter(&mut self) -> i64 {
    2 * (self.x1 - self.x0 + self.y1 - self.y0)
  }
}
```

---

# Traits

We can clean up the previous code a bit:

```
impl Rect {
  fn width(&self) -> i64 {
    self.x1 - self.x0
  }
  fn height(&self) -> i64 {
    self.y1 - self.y0
  }
}

impl Shape for Rect {
  fn area(&self) -> i64 {
    self.width() * self.height()
  }

  fn perimeter(&mut self) -> i64 {
    2 * (self.width() + self.height())
  }
}
```

Traits are very useful in combination with generics and boxing, which we'll discuss later.

---

# Generics

Like many other languages, Rust has generics.
Using generics lets you write code that works for many types.

We'll first show you the syntax for generics,
and then we'll describe how to use them in practice.

Generic functions look like this:

```
fn example1<T1>(arg1: T1) {
  // arg1 has type T1
}

fn example2<T1, T2>(arg1: T1, arg2: T2) {
  // arg1 has type T1; arg2 has type T2
}
```

---

# Generics

Generic structs look like this:

```
struct Rect<T> {
  x0: T,
  y0: T,
  x1: T,
  y1: T,
}

// The generic type parameter T is inferred.
let r = Rect { x0: 1, y0: 2, x1: 3, y1: 5 };
let r = Rect { x0: 1.5, y0: 2.5, x1: 3.5, y1: 5.5 };
```

This won't work:
```
let r = Rect { x0: 1.5, y0: 2, x1: 3, y1: 5 };
```

The problem is that all the fields of `Rect` must have the same type `T`.
But here, we're trying to use a float for `x0` and integers for the other fields.

---

# Generics

If you want to explicitly specify the type parameter(s), use **turbofish syntax**:
```
let r = Rect::<i64> {
  x0: 1,
  y0: 2,
  x1: 3,
  y1: 5,
};
```

Why "turbofish"? Because `::<_>` sort of looks like a fish.

Alternatively, add a type annotation:
```
let r: Rect<i64> = Rect {
  x0: 1,
  y0: 2,
  x1: 3,
  y1: 5,
};
```

---

# Generics

You can have multiple generic parameters. Here's an example:

```
struct Rect<X, Y> {
  x0: X,
  y0: Y,
  x1: X,
  y1: Y,
}
```

Now you can use integers for the x-coordinates and floats for the y-coordinates:

```
let r = Rect { x0: 1, x1: 5, y0: 0.12, y1: 1.64 };
```

---

# Generics

`impl` blocks for generic types look like this:

```
impl<X, Y> Rect<X, Y> {
  fn copy(&mut self, other: Rect<X, Y>) {
    self.x0 = other.x0;
    self.x1 = other.x1;
    self.y0 = other.y0;
    self.y1 = other.y1;
  }

  fn transpose(self) -> Rect<Y, X> {
    Rect { x0: self.y0, x1: self.y1, y0: self.x0, y1: self.x1 }
  }
}
```

Functions in `impl` blocks can have their own, distinct generic types:
```
impl<X, Y> Rect<X, Y> {
  fn set_x<Z>(self, x0: Z, x1: Z) -> Rect<Z, Y> {
    Rect { x0, x1, y0: self.y0, y1: self.y1 }
  }
}
```

---
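
# Generics

To see how the type parameters change as these methods are called, here is a small self-contained sketch. The `demo` function is a hypothetical name added for illustration; the struct and methods mirror the `Rect<X, Y>` examples.

```rust
struct Rect<X, Y> {
    x0: X,
    y0: Y,
    x1: X,
    y1: Y,
}

impl<X, Y> Rect<X, Y> {
    // Swaps the roles of the x- and y-coordinates.
    fn transpose(self) -> Rect<Y, X> {
        Rect { x0: self.y0, x1: self.y1, y0: self.x0, y1: self.x1 }
    }

    // Replaces the x-coordinates, possibly changing their type to `Z`.
    fn set_x<Z>(self, x0: Z, x1: Z) -> Rect<Z, Y> {
        Rect { x0, x1, y0: self.y0, y1: self.y1 }
    }
}

// Hypothetical demo: returns (x0, x1, y0, y1) of the final rectangle.
fn demo() -> (f64, f64, &'static str, &'static str) {
    let r = Rect { x0: 1, x1: 5, y0: 0.12, y1: 1.64 }; // Rect<i32, f64>
    let r = r.set_x("left", "right");                  // Rect<&str, f64>
    let r = r.transpose();                             // Rect<f64, &str>
    (r.x0, r.x1, r.y0, r.y1)
}
```

Each call consumes `self` and returns a `Rect` with different type parameters, so the compiler tracks the coordinate types at every step.

---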

# Generics

You can also have generic enums!

You've seen this already, in the form of `Option` and `Result`:

```
pub enum Option<T> {
    None,
    Some(T),
}

pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

Traits can also have generics, as we'll see next.

---

# From/Into

`From` and `Into` are generic traits.
They are useful when you want to convert between types.

This is the definition of the `From` trait:

```
pub trait From<T> {
    fn from(value: T) -> Self;
}
```

This is the definition of `Into`:

```
pub trait Into<T> {
    fn into(self) -> T;
}
```

---

# From/Into

Let's look at an example.

```
enum Apple {
  Gala,
  Fuji,
  // ...
}

enum Fruit {
  Apple(Apple),
  // ...
}
```

It should be easy to convert an `Apple` into a `Fruit`.
(But not the other way around.)

Here's how you might do that:
```
impl Fruit {
  fn from_apple(apple: Apple) -> Self {
    Self::Apple(apple)
  }
}
```

---

# From/Into

But the more idiomatic way to do this in Rust is to use the `From` trait:
```
impl From<Apple> for Fruit {
  fn from(apple: Apple) -> Self {
    Self::Apple(apple)
  }
}
```

This makes it clear to people reading/using your code that an `Apple` can be converted into a `Fruit`.

Because of a blanket implementation in the Rust standard library, implementing `From<Apple>` for `Fruit` also _automatically_ gives you `Into<Fruit>` for `Apple`.
So you should generally prefer to implement `From` rather than `Into`.
The automatic `Into` implementation allows the following code:

```
let my_apple = Apple::Gala;
let my_fruit: Fruit = my_apple.into();
```

Note that the compiler usually won't be able to figure out what type you want to convert into,
so you may need to add type annotations, as we did for `my_fruit`.

---
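
# From/Into

A common use of the automatic `Into` implementation is a function that accepts anything convertible into the target type. The sketch below assumes a hypothetical `describe` helper and a `Banana` variant added only for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Apple {
    Gala,
    Fuji,
}

#[derive(Debug, PartialEq)]
enum Fruit {
    Apple(Apple),
    Banana, // hypothetical extra variant
}

impl From<Apple> for Fruit {
    fn from(apple: Apple) -> Self {
        Self::Apple(apple)
    }
}

// Hypothetical helper: accepts anything that can be converted into a `Fruit`.
fn describe(fruit: impl Into<Fruit>) -> String {
    match fruit.into() {
        Fruit::Apple(kind) => format!("an apple ({:?})", kind),
        Fruit::Banana => String::from("a banana"),
    }
}
```

Callers can pass either an `Apple` or a `Fruit`; the latter works because every type trivially implements `From` for itself.

---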

# The Deref Trait

Many types in Rust act like "smart pointers".

Like a regular pointer, they can be dereferenced via `*` (the dereference operator),
but they also have some extra logic.

For example, an `Arc<T>` is an atomically reference counted smart pointer. It can be dereferenced to
get some underlying data (of type `T`), but it has extra functionality: the underlying data is freed when the
reference count reaches 0.

The "pointer-like" behavior is usually provided by implementing the **`Deref` trait**.

---

# The Deref Trait

This is the definition of the `std::ops::Deref` trait:
```
pub trait Deref {
  type Target: ?Sized;

  fn deref(&self) -> &Self::Target;
}
```

* `Target` is an **associated type** of the `Deref` trait.
--

* Types that implement `Deref` must be able to provide a reference to something of type `Target`.
--

* Don't worry too much about `?Sized` – it just means that `Target` need not have a size known at compile-time.
--

* Associated types are just a different way of writing generic code; `Deref` could conceivably have been written
with a generic parameter, e.g. `Deref<T>`.
Instead, it is written as `Deref<Target = T>`.

---

# The Deref Trait

The Rust compiler only knows how to dereference `&T` and `&mut T`.

If `ptr` is some other type that implements `Deref`, then `*ptr` gets implicitly converted to `*Deref::deref(&ptr)`.
The `deref()` method returns an `&Target`, which the compiler _does_ know how to dereference.

The Rust compiler can insert repeated calls to `deref` to try to coerce one type into another. This is known as **deref coercion**.

For example, an `&Box<String>` can be deref coerced into an `&str`:
1. Dereference the `Box` to get an `&String`.
2. Dereference the `String` to get an `&str`.

This is why helper methods generally take in an `&str` rather than a `String`!
You can convert a `String` to `&str` inexpensively (but not the other way around).

---
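
# The Deref Trait

Here is a minimal sketch of implementing `Deref` for your own type and relying on deref coercion. `Wrapper`, `shout`, and `demo` are hypothetical names used only for this illustration.

```rust
use std::ops::Deref;

// Hypothetical wrapper type that behaves like a pointer to its contents.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// Takes `&str`, so it accepts anything that deref-coerces to `&str`.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn demo() -> (String, String, usize) {
    let boxed: Box<String> = Box::new(String::from("hello"));
    let wrapped = Wrapper(String::from("world"));
    (
        shout(&boxed),   // &Box<String> -> &String -> &str
        shout(&wrapped), // &Wrapper<String> -> &String -> &str
        wrapped.len(),   // method calls also deref through `Wrapper`
    )
}
```

---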

# Deref Mut

This is the definition of the `DerefMut` trait:

```
pub trait DerefMut: Deref {
    fn deref_mut(&mut self) -> &mut Self::Target;
}
```

`DerefMut` is a **subtrait** of `Deref`, which means anything that implements
`DerefMut` must also implement `Deref`.

(This is about the closest you'll get to inheritance in Rust.)

Everything we said about `Deref` also applies to `DerefMut`;
the only difference is that you get a mutable reference instead of an immutable one.

So if `ptr` is a value of a type that implements `DerefMut`,
`*ptr = ...` is implicitly `*DerefMut::deref_mut(&mut ptr) = ...`.

---

# Method Resolution

Rust will automatically `deref` things for you when resolving methods.

This is why you can call `.lock()` on a value of type `Arc<Mutex<T>>`.
The `Arc` is dereferenced to get a `Mutex`.

The precise rules for method resolution are somewhat complex; you can read about them [here](https://doc.rust-lang.org/reference/expressions/method-call-expr.html).

---

# Reference Counting

An `Arc<T>` is an atomically reference counted pointer to some data of type `T`.

An `Rc<T>` is also a reference counted pointer, but it uses non-atomic operations and is not thread-safe.
If performance matters and you don't need thread-safety, use an `Rc<T>` instead of an `Arc<T>`.

An `Arc` allows you to share data between threads while guaranteeing that the inner data is not freed
until all threads are done accessing it.

`Arc`s are cheaply cloneable. Calling `Arc::clone(&arc)` only increments a reference count;
it does not copy the underlying data. You can then send the new `Arc` to another thread.

---
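
# Reference Counting

The sketch below shows the usual `Arc<Mutex<T>>` pattern: clone the `Arc`, move the clone into a thread, and lock the `Mutex` to touch the data. The function name `count_with_threads` is a hypothetical choice, and `unwrap` is used only to keep the sketch short.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `n` threads that each add 1 to a shared counter.
fn count_with_threads(n: usize) -> i32 {
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Cheap: bumps the reference count, does not copy the `Mutex`.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // `lock` is a `Mutex` method; the `Arc` is dereferenced
                // automatically during method resolution.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Every thread has finished and dropped its clone.
    assert_eq!(Arc::strong_count(&counter), 1);
    let total = *counter.lock().unwrap();
    total
}
```

Replacing `Arc` with `Rc` here would be rejected at compile time, because `Rc` is not `Send`.

---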

# Destructors via Drop

Remember that "dropping" in Rust is somewhat analogous to "freeing" in C.

The `std::ops::Drop` trait allows you to run some code when a value goes out of scope or is no longer in use.

This is typically used if you need to implement custom logic for cleaning up resources.

This is the definition of the `Drop` trait:

```
pub trait Drop {
    fn drop(&mut self);
}
```

Occasional annoyance: `drop` is not `async`.

---

# Destructors via Drop

Note that `drop` takes in `&mut self`, not `self`.

You might think you could do this:
```
let mut x = ...;
x.drop();
x.drop();
```

That could be a double-free error!

To prevent you from misusing `Drop`, you cannot call `drop` manually.

You can only call `std::mem::drop`, which takes ownership
of the value; the value then goes out of scope inside that function, so `std::ops::Drop::drop` runs on it exactly once.

---
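
# Destructors via Drop

To watch destructors run, the following sketch records each drop in a shared log. `Noisy` and `demo` are hypothetical names; the log uses `Rc<RefCell<...>>` only so that several values can write to it.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn demo() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let a = Noisy { name: "a", log: Rc::clone(&log) };
    {
        let _b = Noisy { name: "b", log: Rc::clone(&log) };
        // `_b` goes out of scope here, so its destructor runs first.
    }
    // `std::mem::drop` is in the prelude as `drop`; it takes ownership of `a`.
    drop(a);
    // a.drop(); // error[E0040]: explicit use of destructor method
    let entries = log.borrow().clone();
    entries
}
```

After `drop(a)`, any further use of `a` is a compile error, which is what rules out a double free.

---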

# Destructors via Drop

Things a destructor might do:
* Freeing memory
* Closing a file
* Decrementing a reference count
* Killing a child process

After a destructor is run, Rust will recursively call `drop` on all struct/enum fields.

---

# Drop Example

Suppose you want to force things to be dropped in a certain order. Here's how you could do it:

```
struct DropOrder<T, U>(Option<T>, Option<U>);

impl<T, U> Drop for DropOrder<T, U> {
  fn drop(&mut self) {
    drop(self.0.take());
    drop(self.1.take());
  }
}
```

This drops things in the following order:
1. The value of type `T`
2. The value of type `U`
3. The `DropOrder` struct itself

Note: `take` sets an `Option` to `None` and returns the value that was there previously.

---

# Trait bounds

Traits and generics are very powerful when used together!

The generics we've described so far are useful for specifying the types of fields in structs/enums and of function arguments.
But they don't let us really _do_ anything with those fields.

Here's an example:

```
// This will NOT compile
fn say_hello<T>(x: T) -> String {
  format!("Hello, {}!", x)
}
```

```text
error[E0277]: `T` doesn't implement `std::fmt::Display`
```

The problem is that not all types can be printed out ("displayed"),
and we haven't told the compiler that we want `say_hello`
to only apply to types that can be displayed.

---

# Trait bounds

Let's fix this by adding a **bound**. A **bound** puts a restriction on
what types can be used in a generic function/struct/enum/etc.

```
// This WILL compile
fn say_hello<T: std::fmt::Display>(x: T) -> String {
  format!("Hello, {}!", x)
}
```

This says that the type parameter `T` must implement
the `std::fmt::Display` trait. We can now print out `x`!

You can also put bounds in a **where clause**.

This usually looks better when you have multiple bounds (or a few complicated ones).

```
// This is completely equivalent to the example above
fn say_hello<T>(x: T) -> String
where
    T: std::fmt::Display,
{
    format!("Hello, {}!", x)
}
```

---

# Trait bounds

You can write fairly sophisticated bounds:

```
use std::ops::Deref;
use std::fmt::Display;

// This compiles
fn say_hello<X, T>(x: X) -> String
where
  X: Deref<Target = T>,
  T: Display,
{
    format!("Hello, {}!", *x)
}
```

---
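
# Trait bounds

As a sketch of what the `Deref`-based bound accepts, the calls below pass three different pointer-like types to the same function. The `demo` function is a hypothetical name added for illustration.

```rust
use std::fmt::Display;
use std::ops::Deref;
use std::rc::Rc;

fn say_hello<X, T>(x: X) -> String
where
    X: Deref<Target = T>,
    T: Display,
{
    format!("Hello, {}!", *x)
}

// Hypothetical demo: every argument dereferences to something displayable.
fn demo() -> Vec<String> {
    vec![
        say_hello(Box::new(42)),                // Box<i32>    -> i32
        say_hello(Rc::new(String::from("Rc"))), // Rc<String>  -> String
        say_hello(&3.5),                        // &f64        -> f64
    ]
}
```

Passing a plain `String` is rejected: it dereferences to `str`, which is unsized, and `T` is implicitly required to be `Sized`.

---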

# impl Trait

Let's go back to our example of trait bounds:

```
fn say_hello<T: std::fmt::Display>(x: T) -> String {
  format!("Hello, {}!", x)
}
```

If we don't care about explicitly naming the type `T`, we can use `impl Display` as the argument type:
```
fn say_hello(x: impl std::fmt::Display) -> String {
  format!("Hello, {}!", x)
}
```

---

# impl Trait

`impl Trait` can also be used as the return type of a function.

Here's a silly example; there's no reason to do this. Just return a `String`.

```
use std::fmt::Display;

fn say_hello(x: impl Display) -> impl Display {
  format!("Hello, {}!", x)
}
```

Returning an `impl Trait` type makes more sense
when dealing with iterators (where types can sometimes
be very long) and closures (where it is usually impossible
to write out a specific type).

---
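
# impl Trait

Here is a sketch of the two cases where returning `impl Trait` helps most. Both function names (`even_squares`, `make_adder`) are hypothetical.

```rust
// The concrete return type is a nested `Filter<Map<Range<u32>, ...>, ...>`
// that mentions closure types, so it cannot be written out by hand.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (1..limit).map(|n| n * n).filter(|sq| sq % 2 == 0)
}

// Closures have anonymous types, so `impl Fn` is a common way to return one.
fn make_adder(amount: i32) -> impl Fn(i32) -> i32 {
    move |x| x + amount
}
```

Callers only know that they get "some iterator of `u32`" or "some function from `i32` to `i32`"; the concrete type stays hidden.

---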

# Multiple bounds

Sometimes you need a type to implement multiple traits:

```
// This will NOT compile
fn compare_formats<T>(x: T) -> String {
  format!("Display: {}\nDebug: {:?}\n", &x, &x)
}
```

We need `T` to implement two traits: `Debug` (which formats a value
for debugging) and `Display` (which formats a value
in a way that should look nice).

---

# Multiple bounds

You can specify multiple trait bounds by separating each bound by a `+` sign.

Here are a few ways to do this (`use` statements omitted):

```
fn compare_formats<T: Display + Debug>(x: T) -> String {
  format!("Display: {}\nDebug: {:?}\n", &x, &x)
}
```

```
fn compare_formats<T>(x: T) -> String
where
  T: Display + Debug
{
  format!("Display: {}\nDebug: {:?}\n", &x, &x)
}
```

```
fn compare_formats(x: impl Display + Debug) -> String {
  format!("Display: {}\nDebug: {:?}\n", &x, &x)
}
```

---

# Marker Traits

**Marker traits** have no associated functions. Instead, the compiler gives them special meaning.

One example is the `Copy` trait: it tells the compiler that all assignments
of a `Copy` type copy the value, rather than moving it.

Other important marker traits:
* `Send`: indicates that a type is safe to _send_ to another thread.
* `Sync`: indicates that a type is safe to _share_ with another thread.
* `Sized`: indicates that a type has a size known at compile time.

---

# Trait objects

We've discussed generics, which implement **static dispatch**.

The compiler looks over your code at compile time, figures out which types
are used in generic structs/functions/enums/etc, then generates a separate
copy of the code specialized for each of those types.

So if you have something like this:
```
struct Container<T>(T);
```

And you use a `Container<i32>` and `Container<u64>`, the compiler would emit code like this:
```
struct Container_i32(i32);
struct Container_u64(u64);
```
(This is a bit of an oversimplification, but this is the general idea.)

This process is called **monomorphization**.

---

# Trait objects

Sometimes, you don't want monomorphization. Or maybe generics are cluttering your code too much.
Or maybe you can't use generics, because you are interfacing with someone else's code, or
because you need a collection of values of possibly different types.

(The third option is the case in HW 5).

You can instead use **trait objects**:

```
use std::fmt::Display;
fn say_hello(x: Box<dyn Display>) -> String {
  format!("Hello, {}!", x)
}
```

The compiler will insert code to perform **dynamic dispatch**.

That is, it will insert code that looks up, at _runtime_,
the implementation of `Display` for whatever
type `x` happens to be, and calls it.

Dynamic dispatch incurs a small performance penalty,
as each call goes through a function pointer stored
in a structure called a `vtable`.

---
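
# Trait objects

The "collection of values of possibly different types" case can be sketched as follows. This uses a simplified, hypothetical `Shape` trait (only `area`) with two made-up implementors; `total_area` and `demo` are illustrative names.

```rust
// Simplified trait for this sketch.
trait Shape {
    fn area(&self) -> i64;
}

struct Rect {
    width: i64,
    height: i64,
}

struct Square {
    side: i64,
}

impl Shape for Rect {
    fn area(&self) -> i64 {
        self.width * self.height
    }
}

impl Shape for Square {
    fn area(&self) -> i64 {
        self.side * self.side
    }
}

// Each `area()` call is dispatched dynamically through the vtable.
fn total_area(shapes: &[Box<dyn Shape>]) -> i64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn demo() -> i64 {
    // A `Vec<T>` holds a single type `T`; `Box<dyn Shape>` is one type
    // that can wrap any implementor of `Shape`.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Rect { width: 2, height: 3 }),
        Box::new(Square { side: 4 }),
    ];
    total_area(&shapes)
}
```

---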

# Trait objects

Values of type `Box<dyn Trait>` are called trait objects.
You can also have things like `Arc<dyn Trait>` or `Rc<dyn Trait>`.

The `Trait` in `Box<dyn Trait>` is called the **base trait**.
Not all traits can be used as a base trait. The traits that _can_
are called **object safe traits**.

You can read the rules about what makes a trait object safe [here](https://doc.rust-lang.org/reference/items/traits.html#object-safety).

You can require that a type also implement marker traits:
```
Box<dyn Trait + Send + Sync>
```

The additional traits can only be **auto traits** (marker traits such as `Send` and `Sync` that the compiler implements automatically). This won't work:
```
// This will NOT compile.
Box<dyn Debug + Display>
```

```text
error[E0225]: only auto traits can be used as additional traits in a trait object
```

---

# Trait objects

If you really need a trait object that implements multiple traits, you can define a subtrait with no methods:
```
trait DebugAndDisplay: Debug + Display {}
type TraitObject = Box<dyn DebugAndDisplay>;
```

Your trait object must then implement both `Debug` and `Display`.

But you will have to add an empty `impl DebugAndDisplay` block:

```
impl DebugAndDisplay for MyType {}
```

---
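
# Trait objects

If writing an empty `impl` block for every type gets tedious, one possible alternative is a single blanket implementation. This is a sketch rather than a recommendation; `compare_formats` here is a hypothetical function taking the combined trait object.

```rust
use std::fmt::{Debug, Display};

trait DebugAndDisplay: Debug + Display {}

// Blanket implementation: every type that is both `Debug` and `Display`
// automatically implements `DebugAndDisplay`.
impl<T: Debug + Display> DebugAndDisplay for T {}

fn compare_formats(x: Box<dyn DebugAndDisplay>) -> String {
    format!("Display: {}, Debug: {:?}", x, x)
}
```

With the blanket implementation in place, `compare_formats(Box::new("hi"))` and `compare_formats(Box::new(7))` both compile without any per-type `impl`.

---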

# Options and Results

You've seen options and results in the HTTP homework. Here's more about them.

---

# `?` Operator

`?` works for options as well!

```
fn a() -> Option<i32> {
    let x = b()?;
    let y = c()?;
    println!("{}", &x);
    Some(y)
}

fn b() -> Option<String> {
    Some("Done".to_string())
}

fn c() -> Option<i32> {
    None
}
```

---

# Some Advice

* Avoid calling `unwrap` or `expect` unless you are _absolutely sure_ you want to panic on failure.
* If you find yourself using `match` statements everywhere, try using the methods defined on `Option`/`Result`.

On HW 5, you will fail tests if you panic on certain errors.

---

# Options and Results

Some useful methods (a rough description):

- `ok_or`: transforms `Option<T>` to `Result<T, E>`
- `unwrap_or`: returns the contained `Some` value or a provided default
- `map`: transforms `Option<T>` to `Option<U>` by applying a function to the
    underlying value
- `take`: takes a value out of an option, leaving `None` in its place
- `as_ref`: converts from `&Option<T>` to `Option<&T>`
- `as_mut`: converts from `&mut Option<T>` to `Option<&mut T>`
- `and_then`: similar to `map`, but expects the function to return an option

---
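
# Options and Results

The methods listed above can be tried out on small values. The following is a quick tour with made-up inputs; `demo` is a hypothetical function name.

```rust
fn demo() {
    let some: Option<i32> = Some(2);
    let none: Option<i32> = None;

    // ok_or: Option<T> -> Result<T, E>
    assert_eq!(some.ok_or("missing"), Ok(2));
    assert_eq!(none.ok_or("missing"), Err("missing"));

    // unwrap_or: fall back to a default
    assert_eq!(none.unwrap_or(0), 0);

    // map: apply a function to the contained value
    assert_eq!(some.map(|x| x * 10), Some(20));

    // and_then: like map, but the function itself returns an Option
    assert_eq!(Some(7u8).and_then(|x| x.checked_sub(5)), Some(2));
    assert_eq!(Some(2u8).and_then(|x| x.checked_sub(5)), None);

    // take: move the value out, leaving None behind
    let mut slot = Some(String::from("hi"));
    assert_eq!(slot.take(), Some(String::from("hi")));
    assert_eq!(slot, None);

    // as_ref: borrow the contents without consuming the Option
    let name = Some(String::from("Alice"));
    assert_eq!(name.as_ref().map(|s| s.len()), Some(5));
    assert!(name.is_some()); // `name` is still usable

    // as_mut: mutate the contents in place
    let mut count = Some(1);
    if let Some(c) = count.as_mut() {
        *c += 1;
    }
    assert_eq!(count, Some(2));
}
```

---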

# Examples

(This example is taken from [here](https://www.ameyalokare.com/).)

```
struct FullName {
    first: String,
    middle: Option<String>,
    last: String,
}
let alice = FullName {...}
```

Let's try to print Alice's middle initial, if it exists.

---

# Try 1

```
struct FullName {
    first: String,
    middle: Option<String>,
    last: String,
}
let alice = FullName {...}

let a = match alice.middle {
    None => "",
    Some(m) => &m[0..1]
};
println!("{}", a);
```
--

There's a subtle bug; can you spot it?

---

# Try 2

```
struct FullName {
    first: String,
    middle: Option<String>,
    last: String,
}
let alice = FullName {...}

let a = match alice.middle {
    None => "",
    Some(ref m) => &m[0..1]
};
println!("{}", a);
```
--

Instead of a `match` statement, you can write:

```
let a = alice.middle.as_ref().map(|m| &m[0..1]).unwrap_or("");
```

---

# Examples

A more complicated example:

```
let mut bt = BTreeMap::new();
bt.insert(20u8, "foo");
bt.insert(42u8, "bar");
let res = [0u8, 1, 11, 200, 22]
```

For each element of the array:
1. Subtract 1 (`checked_sub(1)`).
2. Then multiply by 2 (`checked_mul(2)`).
3. Then use the result as a key to look up a value in `bt`.
4. Output "error!" on underflow/overflow or if the key does not exist.

---

# Solution:

```

let mut bt = BTreeMap::new();
bt.insert(20u8, "foo");
bt.insert(42u8, "bar");
let res = [0u8, 1, 11, 200, 22]
    .into_iter()
    .map(|x| {
        // `checked_sub()` returns `None` on error
        x.checked_sub(1)
            // same with `checked_mul()`
            .and_then(|x| x.checked_mul(2))
            // `BTreeMap::get` returns `None` on error
            .and_then(|x| bt.get(&x).copied())
            // Substitute an error message if we have `None` so far
            .or(Some("error!"))
            // Won't panic because we unconditionally used `Some` above
            .unwrap()
    })
    .collect::<Vec<_>>();
assert_eq!(res, ["error!", "error!", "foo", "error!", "bar"]);
```

- Note that the outer `map` here is the one for iterators, not options.
- Don't worry about iterators for now.

---

# `anyhow::Result`

- `anyhow::Result<T>` = `std::result::Result<T, anyhow::Error>`

--
- `anyhow::Error` is a popular error type.
--

- Can be created from any error type that implements `std::error::Error` (and is `Send + Sync + 'static`).
- (It's roughly equivalent to `Box<dyn std::error::Error>`)
--

- Can capture a backtrace.

---
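
# `anyhow::Result`

Since `anyhow` is an external crate, the sketch below uses the roughly equivalent `Box<dyn std::error::Error>` from the standard library to show the idea of one error type that absorbs many others. `NegativeError` and `parse_non_negative` are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical custom error type.
#[derive(Debug)]
struct NegativeError(i64);

impl fmt::Display for NegativeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} is negative", self.0)
    }
}

impl Error for NegativeError {}

fn parse_non_negative(input: &str) -> Result<i64, Box<dyn Error>> {
    // `?` converts the `ParseIntError` into `Box<dyn Error>`.
    let n: i64 = input.trim().parse()?;
    if n < 0 {
        // `.into()` performs the same conversion for our own error type.
        return Err(NegativeError(n).into());
    }
    Ok(n)
}
```

With `anyhow`, the return type would be written `anyhow::Result<i64>` instead, and the `?` and `.into()` conversions would work the same way.

---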

# Helper functions

Use helper functions (frequently)!

```
fn a() -> Result<i32, i32> {
    b().ok_or(0)?;
    c().ok_or(0)?;
    d().ok_or(0)?;
    Ok(1)
}

fn b() -> Option<i32> {...}
fn c() -> Option<i32> {...}
fn d() -> Option<i32> {...}
```
can be replaced with

```
fn a() -> Result<i32, i32> {
    helper().ok_or(0)?;
    Ok(1)
}

fn helper() -> Option<i32> {
    b()?;
    c()?;
    d()
}
...
```

---

# Iterators

An iterator is a thing that produces a sequence of values, usually for a loop
to operate on.

```
fn triangle(n: i32) -> i32 {
    let mut sum = 0;
    for i in 1..=n {
        sum += i;
    }
    sum
}
```

- `1..=n` is a `RangeInclusive<i32>` value.
- `RangeInclusive<i32>` is an iterator that produces the integers from its
start value to its end value.

---

# Iterators

Formal definition: an iterator is any value that implements the
`std::iter::Iterator` trait.

```
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    ... // many other methods
}
```

---
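
# Iterators

Implementing `next` is all it takes to make your own iterator; the other methods come for free. A minimal sketch, using a hypothetical `Countdown` type:

```rust
// Hypothetical iterator that counts down from its starting value to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None // the sequence is finished
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}
```

Because `Countdown` implements `Iterator`, it works in `for` loops and with provided methods such as `collect` and `sum`.

---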

# Iterables

- Many types, like `Vec` and `HashMap`, are __iterables.__
--

- This means you can get an iterator from them using the `.into_iter()` method.
--

- Under the hood, `for` loops make calls to `.into_iter()` and `Iterator`
methods.

---

# Iterables

The code snippet
```
let v = vec!["Hello", "There", "General", "Kenobi"];

for word in &v {
    println!("{}", word);
}
```
is really
```
let v = vec!["Hello", "There", "General", "Kenobi"];

let mut iterator = (&v).into_iter();
while let Some(word) = iterator.next() {
    println!("{}", word);
}
```
under the hood.

(Aside: in addition to `.into_iter()`, some types also implement `.iter()` and
`.iter_mut()`, which do [similar
things](https://stackoverflow.com/q/34733811).)

---

# Iterator trait

The `Iterator` trait provides many useful functions that you should look into!

Adaptor methods (methods that consume an iterator and produce another) are
especially useful:
- `map`
- `filter`
- `flatten`
- `take`
- and many more!

---
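
# Iterator trait

The adaptors listed above can be chained. The sketch below uses each of them once; the function name and the input data are made up for illustration.

```rust
// Returns the lengths of the first `n` non-empty words across all lines.
fn first_word_lengths(lines: &[&str], n: usize) -> Vec<usize> {
    lines
        .iter()
        .map(|line| line.split(' '))     // map: each line -> iterator of words
        .flatten()                       // flatten: one stream of words
        .filter(|word| !word.is_empty()) // filter: skip empty strings
        .take(n)                         // take: stop after `n` items
        .map(|word| word.len())
        .collect()
}
```

Nothing is computed until `collect` pulls items through the chain, so `take(n)` stops the work early.

---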

# References

Some examples taken from

- Programming Rust (O'Reilly)
- Rust official documentation
- https://www.ameyalokare.com/