class: center, middle # Rust Potpourri Rahul Kumar, Edward Zeng --- # Agenda 1. Stack/heap allocations and boxing 2. Traits 3. Generics 4. From/Into 5. Dereferencing 6. Reference counting 7. Destructors 8. Traits and bounds 9. Option/Result composition 10. Iterators --- # Goals * To explain things you may have used in HW 3 without fully understanding them * To better prepare you for HW 5 ??? We also want to introduce terminology to you, so you can Google for things more easily. --- # Stack and Heap Allocation How to allocate something on the stack? -- - Just make a local variable! -- How to allocate something on the heap? -- - Can't call `malloc`. Not allowed in __safe Rust__ (the type of Rust you've been learning). --- # Heap Allocation Method 1: use unsafe Rust! ``` use std::alloc::{alloc, dealloc, Layout}; unsafe { let layout = Layout::new::<i32>(); let ptr = alloc(layout) as *mut i32; assert!(!ptr.is_null()); ptr.write(10); dealloc(ptr as *mut u8, layout); } ``` -- Probably not a good idea to write too much unsafe Rust code. With unsafe Rust, lots of our compiler guarantees are gone, so it is easy to make mistakes. -- (HW 5 will not allow unsafe code.) --- # Heap Allocation Method 2: use `Box` (or `Arc`/`Rc`) ``` let a = Box::new(10); assert_eq!(*a, 10); // use * to get the inner value ``` -- `Box` uses some unsafe code to allocate space but wraps everything under a safe interface. (Aside: any library type that allocates data on the heap, like `Vec` and `String`, must run some unsafe code internally.) --- # Traits **Traits** define shared behavior. They are similar to interfaces in other languages. 
Define a trait: ``` trait Shape { fn area(&self) -> i64; fn perimeter(&mut self) -> i64; } ``` --- # Traits Implement a trait on a type: ``` struct Rect { x0: i64, y0: i64, x1: i64, y1: i64, } // Assumes that x1 >= x0 and y1 >= y0 impl Shape for Rect { fn area(&self) -> i64 { (self.x1 - self.x0) * (self.y1 - self.y0) } fn perimeter(&mut self) -> i64 { 2 * (self.x1 - self.x0 + self.y1 - self.y0) } } ``` --- # Traits We can clean up the previous code a bit: ``` impl Rect { fn width(&self) -> i64 { self.x1 - self.x0 } fn height(&self) -> i64 { self.y1 - self.y0 } } impl Shape for Rect { fn area(&self) -> i64 { self.width() * self.height() } fn perimeter(&mut self) -> i64 { 2 * (self.width() + self.height()) } } ``` Traits are very useful in combination with generics and boxing, which we'll discuss later. --- # Generics Like many other languages, Rust has generics. Using generics lets you write code that works for many types. -- We'll first show you the syntax for generics, and then we'll describe how to use them in practice. -- Generic functions look like this: ``` fn example1
<T1>(arg1: T1) { // arg1 has type T1 } fn example2<T1, T2>
(arg1: T1, arg2: T2) { // arg1 has type T1; arg2 has type T2 } ``` --- # Generics Generic structs look like this: ``` struct Rect
<T> { x0: T, y0: T, x1: T, y1: T, } // The generic type parameter T is inferred. let r = Rect { x0: 1, y0: 2, x1: 3, y1: 5 }; let r = Rect { x0: 1.5, y0: 2.5, x1: 3.5, y1: 5.5 }; ``` -- This won't work: ``` let r = Rect { x0: 1.5, y0: 2, x1: 3, y1: 5 }; ``` The problem is that all the fields of `Rect` must have the same type `T`. But here, we're trying to use a float for `x0` and integers for the other fields. --- # Generics If you want to explicitly specify the type parameter(s), use **turbofish syntax**: ``` let r = Rect::<i32>
{ x0: 1, y0: 2, x1: 3, y1: 5, }; ``` -- Why "turbofish"? Because `::<_>` sort of looks like a fish. -- Alternatively, add a type annotation: ``` let r: Rect
<i32> = Rect { x0: 1, y0: 2, x1: 3, y1: 5, }; ``` --- # Generics You can have multiple generic parameters. Here's an example: ``` struct Rect<X, Y>
{ x0: X, y0: Y, x1: X, y1: Y, } ``` -- Now you can use integers for the x-coordinates and floats for the y-coordinates: ``` let r = Rect { x0: 1, x1: 5, y0: 0.12, y1: 1.64 }; ``` --- # Generics `impl` blocks for generic types look like this: ``` impl
<X, Y> Rect<X, Y> { fn copy(&mut self, other: Rect<X, Y>) { self.x0 = other.x0; self.x1 = other.x1; self.y0 = other.y0; self.y1 = other.y1; } fn transpose(self) -> Rect<Y, X> { Rect { x0: self.y0, x1: self.y1, y0: self.x0, y1: self.x1 } } } ``` -- Functions in `impl` blocks can have their own, distinct generic types: ``` impl<X, Y> Rect<X, Y> { fn set_x<Z>(self, x0: Z, x1: Z) -> Rect<Z, Y> { Rect { x0, x1, y0: self.y0, y1: self.y1 } } } ``` --- # Generics You can also have generic enums! You've seen this already, in the form of `Option` and `Result`: ``` pub enum Option<T> { None, Some(T), } pub enum Result<T, E>
{ Ok(T), Err(E), } ``` -- Traits can also have generics, as we'll see next. --- # From/Into `From` and `Into` are generic traits. They are useful when you want to convert between types. -- This is the definition of the `From` trait: ``` pub trait From
<T> { fn from(value: T) -> Self; } ``` -- This is the definition of `Into`: ``` pub trait Into<T>
{ fn into(self) -> T; } ``` --- # From/Into Let's look at an example. ``` enum Apple { Gala, Fuji, // ... } enum Fruit { Apple(Apple), // ... } ``` -- It should be easy to convert an `Apple` into a `Fruit`. (But not the other way around). -- Here's how you might do that: ``` impl Fruit { fn from_apple(apple: Apple) -> Self { Self::Apple(apple) } } ``` --- # From/Into But the more idiomatic way to do this in Rust is to use the `From` trait: ``` impl From
<Apple> for Fruit { fn from(apple: Apple) -> Self { Self::Apple(apple) } } ``` This makes it clear to people reading/using your code that an `Apple` can be converted into a `Fruit`. -- The Rust standard library also _automatically_ implements the `Into` trait for you. Specifically, it implements `Into<Fruit>
` for `Apple`. So generally you should prefer to implement `From` rather than `Into`. This allows the following code: ``` let my_apple = Apple::Gala; let my_fruit: Fruit = my_apple.into(); ``` -- Note that the compiler usually won't be able to figure out what type you want to convert into, so you may need to add type annotations, as we did for `my_fruit`. --- # The Deref Trait Many types in Rust act like "smart pointers". Like a regular pointer, they can be dereferenced via `*` (the dereference operator), but they also have some extra logic. -- For example, an `Arc
<T>` is an atomically reference counted smart pointer. It can be dereferenced to get some underlying data (of type `T`), but it has extra functionality: the underlying data is freed when the reference count reaches 0. -- The "pointer-like" behavior is usually provided by implementing the **`Deref` trait**. --- # The Deref Trait This is the definition of the `std::ops::Deref` trait: ``` pub trait Deref { type Target: ?Sized; fn deref(&self) -> &Self::Target; } ``` -- * `Target` is an **associated type** of the `Deref` trait. -- * Types that implement `Deref` must be able to provide a reference to something of type `Target`. -- * Don't worry too much about `?Sized` – it just means that `Target` need not have a size known at compile-time. -- * Associated types are just a different way of writing generic code; `Deref` could conceivably have been written with a generic parameter, e.g. `Deref<T>`. Instead, it is written as `Deref<Target = T>
`. --- # The Deref Trait The Rust compiler only knows how to dereference `&T` and `&mut T`. If `ptr` is some other type that implements `Deref`, then `*ptr` gets implicitly converted to `*Deref::deref(&ptr)`. The `deref()` method returns an `&Target`, which the compiler _does_ know how to dereference. -- The Rust compiler can insert repeated calls to `deref` to try to coerce one type into another. This is known as **deref coercion**. -- For example, an `&Box<String>
` can be deref coerced into an `&str`: 1. Dereference the `Box` to get an `&String`. 2. Dereference the `String` to get an `&str`. -- This is why helper methods generally take in an `&str` rather than a `String`! You can convert a `String` to `&str` inexpensively (but not the other way around). --- # Deref Mut This is the definition of the `DerefMut` trait: ``` pub trait DerefMut: Deref { fn deref_mut(&mut self) -> &mut Self::Target; } ``` -- `DerefMut` is a **subtrait** of `Deref`, which means anything that implements `DerefMut` must also implement `Deref`. (This is about the closest you'll get to inheritance in Rust.) -- Everything we said about `Deref` also applies to `DerefMut`; the only difference is that you get a mutable reference instead of an immutable one. -- So if `ptr` is a value of a type that implements `DerefMut`, `*ptr = ...` is implicitly `*DerefMut::deref_mut(&mut ptr) = ...`. --- # Method Resolution Rust will automatically `deref` things for you when resolving methods. This is why you can call `.lock()` on a value of type `Arc<Mutex<T>>`. The `Arc` is dereferenced to get a `Mutex`. -- The precise rules for method resolution are somewhat complex; you can read about them [here](https://doc.rust-lang.org/reference/expressions/method-call-expr.html). --- # Reference Counting An `Arc<T>` is an atomically reference counted pointer to some data of type `T`. -- An `Rc<T>` is also a reference counted pointer, but it uses non-atomic operations and is not thread-safe. If performance matters and you don't need thread-safety, use an `Rc<T>` instead of an `Arc<T>
`. -- An `Arc` allows you to share data between threads while guaranteeing that the inner data is not freed until all threads are done accessing it. -- `Arc`s are cheaply cloneable. Calling `Arc::clone(&arc)` only increments a reference count; it does not copy the underlying data. You can then send the new `Arc` to another thread. --- # Destructors via Drop Remember that "dropping" in Rust is somewhat analogous to "freeing" in C. The `std::ops::Drop` trait allows you to run some code when a value goes out of scope or is no longer in use. -- This is typically used if you need to implement custom logic for cleaning up resources. -- This is the definition of the `Drop` trait: ``` pub trait Drop { fn drop(&mut self); } ``` -- Occasional annoyance: `drop` is not `async`. --- # Destructors via Drop Note that `drop` takes in `&mut self`, not `self`. -- You might think you could do this: ``` let mut x = ...; x.drop(); x.drop(); ``` That could be a double-free error! -- To prevent you from misusing `Drop`, you cannot call `drop` manually. You can only call `std::mem::drop`, which takes ownership of the value being dropped and then calls `std::ops::Drop::drop` on it. --- # Destructors via Drop Things a destructor might do: * Freeing memory * Closing a file * Decrementing a reference count * Killing a child process After a destructor is run, Rust will recursively call `drop` on all struct/enum fields. --- # Drop Example Suppose you want to force things to be dropped in a certain order. Here's how you could do it: ``` struct DropOrder
<T, U>(Option<T>, Option<U>); impl<T, U> Drop for DropOrder<T, U>
{ fn drop(&mut self) { drop(self.0.take()); drop(self.1.take()); } } ``` -- This drops things in the following order: 1. The value of type `T` 2. The value of type `U` 3. The `DropOrder` struct itself Note: `take` sets an `Option` to `None` and returns the value that was there previously. --- # Trait bounds Traits and generics are very powerful when used together! -- The generics we've described so far are useful for specifying the types of struct/enum fields and function arguments. But they don't let us really _do_ anything with those values. -- Here's an example: ``` // This will NOT compile fn say_hello<T>
(x: T) -> String { format!("Hello, {}!", x) } ``` -- ```text error[E0277]: `T` doesn't implement `std::fmt::Display` ``` -- The problem is that not all types can be printed out ("displayed"), and we haven't told the compiler that we want `say_hello` to only apply to types that can be displayed. --- # Trait bounds Let's fix this by adding a **bound**. A **bound** puts a restriction on what types can be used in a generic function/struct/enum/etc. -- ``` // This WILL compile fn say_hello
<T: std::fmt::Display>(x: T) -> String { format!("Hello, {}!", x) } ``` -- This says that the type parameter `T` must implement the `std::fmt::Display` trait. We can now print out `x`! -- You can also put bounds in a **where clause**. This usually looks better when you have multiple bounds (or a few complicated ones). -- ``` // This is completely equivalent to the example above fn say_hello<T>
(x: T) -> String where T: std::fmt::Display, { format!("Hello, {}!", x) } ``` --- # Trait bounds You can write fairly sophisticated bounds: ``` use std::ops::Deref; use std::fmt::Display; // This compiles fn say_hello
<X, T>(x: X) -> String where X: Deref<Target = T>, T: Display, { format!("Hello, {}!", *x) } ``` --- # impl Trait Let's go back to our example of trait bounds: ``` fn say_hello<T: std::fmt::Display>
(x: T) -> String { format!("Hello, {}!", x) } ``` -- If we don't care about explicitly naming the type `T`, we can use `impl Display` as the argument type: ``` fn say_hello(x: impl std::fmt::Display) -> String { format!("Hello, {}!", x) } ``` --- # impl Trait `impl Trait` can also be used as the return type of a function. -- Here's a silly example; there's no reason to do this. Just return a `String`. ``` use std::fmt::Display; fn say_hello(x: impl Display) -> impl Display { format!("Hello, {}!", x) } ``` -- Returning an `impl Trait` type makes more sense when dealing with iterators (where types can sometimes be very long) and closures (where it is usually impossible to write out a specific type). --- # Multiple bounds Sometimes you need a type to implement multiple traits: ``` // This will NOT compile fn compare_formats
<T>(x: T) -> String { format!("Display: {}\nDebug: {:?}\n", &x, &x) } ``` -- We need `T` to implement two traits: `Debug` (which formats a value for debugging) and `Display` (which formats a value in a way that should look nice). --- # Multiple bounds You can specify multiple trait bounds by separating each bound by a `+` sign. -- Here are a few ways to do this (`use` statements omitted): ``` fn compare_formats<T: Display + Debug>
(x: T) -> String { format!("Display: {}\nDebug: {:?}\n", &x, &x) } ``` -- ``` fn compare_formats<T>(x: T) -> String where T: Display + Debug { format!("Display: {}\nDebug: {:?}\n", &x, &x) } ``` -- ``` fn compare_formats(x: impl Display + Debug) -> String { format!("Display: {}\nDebug: {:?}\n", &x, &x) } ``` --- # Marker Traits **Marker traits** have no associated functions. Instead, the compiler gives them special meaning. -- One example is the `Copy` trait: it tells the compiler that all assignments of a `Copy` type copy the value, rather than moving it. -- Other important marker traits: * `Send`: indicates that a type is safe to _send_ to another thread. * `Sync`: indicates that a type is safe to _share_ with another thread. * `Sized`: indicates that a type has a size known at compile time. --- # Trait objects We've discussed generics, which implement **static dispatch**. The compiler looks over your code at compile time, figures out which types are used in generic structs/functions/enums/etc, then generates a separate copy of the code for each concrete type. -- So if you have something like this: ``` struct Container<T>(T); ``` And you use a `Container<i32>` and `Container<u64>
`, the compiler would emit code like this: ``` struct Container_i32(i32); struct Container_u64(u64); ``` (This is a bit of an oversimplification, but this is the general idea.) This process is called **monomorphization**. --- # Trait objects Sometimes, you don't want monomorphization. Or maybe generics are cluttering your code too much. Or maybe you can't use generics, because you are interfacing with someone else's code, or because you need a collection of values of possibly different types. (The third option is the case in HW 5). -- You can instead use **trait objects**: ``` use std::fmt::Display; fn say_hello(x: Box
<dyn Display>) -> String { format!("Hello, {}!", x) } ``` -- The compiler will insert code to perform **dynamic dispatch**. That is, the call is resolved at _runtime_: the program follows a pointer stored alongside `x` to find the implementation of `Display` for whatever type `x` happens to be. -- Dynamic dispatch incurs a small performance penalty, as the code has to look up the location of the `Display` implementation in a structure called a `vtable`. --- # Trait objects Values of type `Box<dyn Trait>` are called trait objects. You can also have things like `Arc<dyn Trait>` or `Rc<dyn Trait>`. -- The `Trait` in `Box<dyn Trait>` is called the **base trait**. Not all traits can be used as a base trait. The traits that _can_ are called **object safe traits**. You can read the rules about what makes a trait object safe [here](https://doc.rust-lang.org/reference/items/traits.html#object-safety). -- You can require that a type also implement marker traits: ``` Box<dyn Trait + Send + Sync> ``` -- The additional traits can only be auto traits (marker traits such as `Send` and `Sync`). This won't work: ``` // This will NOT compile. Box<dyn Debug + Display>
``` ```text error[E0225]: only auto traits can be used as additional traits in a trait object ``` --- # Trait objects If you really need a trait object that implements multiple traits, you can define a subtrait with no methods: ``` trait DebugAndDisplay: Debug + Display {} type TraitObject = Box<dyn DebugAndDisplay>; ``` -- Any type you put in the trait object must then implement both `Debug` and `Display`. -- But you will have to add an empty `impl DebugAndDisplay` block: ``` impl DebugAndDisplay for MyType {} ``` --- # Options and Results You've seen options and results in the HTTP homework. Here's more about them. --- # `?` Operator -- `?` works for options as well! ``` fn a() -> Option<i32> { let x = b()?; let y = c()?; println!("{}", &x); Some(y) } fn b() -> Option<String> { Some("Done".to_string()) } fn c() -> Option<i32>
{ None } ``` --- # Some Advice * Avoid calling `unwrap` or `expect` unless you are _absolutely sure_ you want to panic on failure. * If you find yourself using `match` statements everywhere, try using the methods defined on `Option`/`Result`. On HW 5, you will fail tests if you panic on certain errors. --- # Options and Results Some useful methods (a rough description): - `ok_or`: transforms `Option
<T>` to `Result<T, E>` - `unwrap_or`: returns the contained `Some` value or a provided default - `map`: transforms `Option<T>` to `Option<U>` by applying a function to the underlying value - `take`: takes a value out of an option, leaving `None` in its place - `as_ref`: converts from `&Option<T>` to `Option<&T>` - `as_mut`: converts from `&mut Option<T>` to `Option<&mut T>` - `and_then`: similar to `map`, but expects the function to return an option --- # Examples (This example is taken from [here](https://www.ameyalokare.com/).) ``` struct FullName { first: String, middle: Option<String>
, last: String, } let alice = FullName {...} ``` -- Let's try to print Alice's middle initial, if it exists. --- # Try 1 ``` struct FullName { first: String, middle: Option<String>, last: String, } let alice = FullName {...} let a = match alice.middle { None => "", Some(m) => &m[0..1] }; println!("{}", a); ``` -- There's a subtle bug. Can you spot it? -- ``` | Some(m) => &m[0..1] | ^ - `m` dropped here while still borrowed | | | borrowed value does not live long enough ``` --- # Try 2 ``` struct FullName { first: String, middle: Option<String>
, last: String, } let alice = FullName {...} let a = match alice.middle { None => "", Some(ref m) => &m[0..1] }; println!("{}", a); ``` -- Instead of a `match` expression, you can write ``` let a = alice.middle.as_ref().map(|m| &m[0..1]).unwrap_or(""); ``` --- # Examples A more complicated example: ``` let mut bt = BTreeMap::new(); bt.insert(20u8, "foo"); bt.insert(42u8, "bar"); let res = [0u8, 1, 11, 200, 22]; ``` For each element of `res`: 1. Subtract 1 (`checked_sub(1)`). 2. Then multiply by 2 (`checked_mul(2)`). 3. Then index into `bt`. 4. Output "error!" on underflow/overflow or if the key does not exist. --- # Solution: ``` let mut bt = BTreeMap::new(); bt.insert(20u8, "foo"); bt.insert(42u8, "bar"); let res = [0u8, 1, 11, 200, 22] .into_iter() .map(|x| { // `checked_sub()` returns `None` on error x.checked_sub(1) // same with `checked_mul()` .and_then(|x| x.checked_mul(2)) // `BTreeMap::get` returns `None` on error .and_then(|x| bt.get(&x).copied()) // Substitute an error message if we have `None` so far .or(Some("error!")) // Won't panic because we unconditionally used `Some` above .unwrap() }) .collect::<Vec<_>>(); assert_eq!(res, ["error!", "error!", "foo", "error!", "bar"]); ``` -- - Note that `map` here is the one for iterators, not options. - Don't worry about iterators for now. --- # `anyhow::Result` - `anyhow::Result<T>
` = `std::result::Result<T, anyhow::Error>` -- - `anyhow::Error` is a popular error type. -- - Can be created from any error type. - (It's roughly equivalent to `Box<dyn std::error::Error>`) -- - Provides backtrace. --- # Helper functions Use helper functions (frequently)! -- ``` fn a() -> Result<i32, i32> { b().ok_or(0)?; c().ok_or(0)?; d().ok_or(0)?; Ok(1) } fn b() -> Option<()> {...} fn c() -> Option<()> {...} fn d() -> Option<()> {...} ``` can be replaced with ``` fn a() -> Result<i32, i32> { helper().ok_or(0)?; Ok(1) } fn helper() -> Option<()>
{ b()?; c()?; d() } ... ``` --- # Iterators An iterator is a thing that produces a sequence of values, usually for a loop to operate on. -- ``` fn triangle(n: i32) -> i32 { let mut sum = 0; for i in 1..=n { sum += i; } sum } ``` -- - `1..=n` is a `RangeInclusive<i32>` value. - `RangeInclusive<i32>` is an iterator that produces the integers from its start value to its end value. --- # Iterators Formal definition: an iterator is any value that implements the `std::iter::Iterator` trait. ``` trait Iterator { type Item; fn next(&mut self) -> Option<Self::Item>
; ... // many other methods } ``` --- # Iterables - Many types, like `Vec` and `HashMap`, are __iterables.__ -- - This means you can get an iterator from them using the `.into_iter()` method. -- - Under the hood, `for` loops make calls to `.into_iter()` and `Iterator` methods. --- # Iterables The code snippet ``` let v = vec!["Hello", "There", "General", "Kenobi"]; for word in &v { println!("{}", word); } ``` is really ``` let v = vec!["Hello", "There", "General", "Kenobi"]; let mut iterator = (&v).into_iter(); while let Some(word) = iterator.next() { println!("{}", word); } ``` under the hood. (Aside: in addition to `.into_iter()`, some types also implement `.iter()` and `.iter_mut()`, which do [similar things](https://stackoverflow.com/q/34733811).) --- # Iterator trait The `Iterator` trait provides many useful functions that you should look into! Adaptor methods (methods that consume an iterator and produce another) are especially useful: - `map` - `filter` - `flatten` - `take` - and many more! --- # References Some examples taken from - Programming Rust (O'Reilly) - Rust official documentation - https://www.ameyalokare.com/
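---

# Iterator adaptors: a sketch

The adaptor methods named on the "Iterator trait" slide (`map`, `filter`, `flatten`, `take`) chain together, and nothing runs until a consuming method such as `collect` pulls values through. The following is a minimal sketch, not taken from the original slides; the helper names `long_word_lengths` and `first_two` are made up for illustration.

```rust
/// Hypothetical helper: lengths of the words longer than five bytes.
fn long_word_lengths(words: &[&str]) -> Vec<usize> {
    words
        .iter()
        // `filter` keeps only the items for which the closure returns true.
        .filter(|w| w.len() > 5)
        // `map` transforms each remaining item.
        .map(|w| w.len())
        // `collect` consumes the iterator and builds the `Vec`.
        .collect()
}

/// Hypothetical helper: the first two items across all inner vectors.
fn first_two(nested: Vec<Vec<i32>>) -> Vec<i32> {
    nested
        .into_iter()
        // `flatten` turns an iterator of iterables into one flat iterator.
        .flatten()
        // `take` stops after at most two items.
        .take(2)
        .collect()
}
```

Calling `long_word_lengths(&["Hello", "There", "General", "Kenobi"])` keeps only "General" and "Kenobi", and `first_two(vec![vec![1, 2], vec![3], vec![]])` stops as soon as two items have been produced, so the later inner vectors are never visited.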