Borrowing & Owning

CIS 198 Lecture 17


Borrowing vs. Owning

  • Consider this struct:
  1. struct StrToken<'a> {
  2. raw: &'a str,
  3. }
  4. impl<'a> StrToken<'a> {
  5. pub fn new(raw: &'a str) -> StrToken<'a> {
  6. StrToken { raw: raw, }
  7. }
  8. }
  9. // ...
  10. let secret: String = load_secret("api.example.com");
  11. let token = StrToken::new(&secret[..]);

*Code and examples taken from From &str to Cow

???

This struct works fine for &'static strs, but is pretty inflexible. The use case this was presented in was one where StrToken represents an authentication token or application secret; in this case, you’re basically forced to store the secret key in the application’s binary, since this doesn’t let you load it ad-hoc at runtime, e.g. from a file. Also, in the last two lines of code, we’ve forced the token to only live as long as secret, so token can’t escape the current stack frame, as secret will no longer be valid if it does. This whole implementation is silly and we’d like to be able to do it differently…


Borrowing vs. Owning

  1. struct StringToken {
  2. raw: String,
  3. }
  4. impl StringToken {
  5. pub fn new(raw: String) -> StringToken {
  6. StringToken { raw: raw, }
  7. }
  8. }

???

This basically solves the problem we had before by switching out an &str for a String— now we can load our secret at runtime and we don’t have to worry about lifetimes. Because StringToken owns its contents, it is guaranteed that its contents will always be valid. This is convenient, but is also not the best implementation. For one, we can’t make StringTokens out of raw strings without first coercing them into Strings, which is not great. We could design this to do the &str -> String coercion for us, but that’s also not a good solution, since we’d either have to make an additional constructor, or force an additional heap allocation in certain cases. If we make one constructor that just takes an &str, anyone who already has a String has to slice the String up into an &str and then allocate it on the heap again, even though they’ve already done that by having a String. If we make two constructors, then we’re basically duplicating code with only minor differences, which is unergonomic. The whole thing is silly and inefficient, and there really should be a way around it without having to settle for either of these solutions.


std::convert::Into & std::convert::From

  1. pub trait Into<T> {
  2. fn into(self) -> T;
  3. }
  4. pub trait From<T> {
  5. fn from(T) -> Self;
  6. }
  7. impl<T, U> Into<U> for T where U: From<T> { /* ... */ }

???

Into is, guess what, a trait for converting types into other types by consuming the original value and producing a new one. Some standard library types implement Into in various ways, e.g. String implements Into<Vec<u8>> so you can get at the raw bytes behind a String. There’s also a symmetric trait From, that is the opposite of Into. In fact, if T can be converted Into<U>, then U can be created From<T>. We’re going to use these traits to build up a new abstraction that will let us more effectively solve the &str-String duality we were just dealing with.


Borrowing \/ Owning

  1. struct Token {
  2. raw: String,
  3. }
  4. impl Token {
  5. pub fn new<S: Into<String>>(raw: S) -> Token
  6. Token { raw: raw.into(), }
  7. }
  8. }

???

With the help of Into, we ensure our argument can always be converted into a String, and then convert it in the constructor. Into<T> is always implemented for T itself, so this works fine with Strings. &str and String have From/Into impls for each other because of the last generic impl on the previous slide; if you look for them in the standard library, you won’t find them directly. So, great, we’ve made this much more convenient to use! However, we still haven’t solved the allocation problem: an &str argument still has to be converted into a String, which costs us.


Bovine Intervention!

  1. pub enum Cow<'a, B> where B: 'a + ToOwned + ?Sized {
  2. Borrowed(&'a B),
  3. Owned(B::Owned),
  4. }

???

Cow, short for “clone-on-write”, is a type of smart pointer enum thing. It’s a little bit hard to describe. The type is constrained by some lifetime 'a and a generic type B. B is constrained by 'a + ToOwned + ?Sized, which means this:

  1. - `'a`: `B` cannot contain a lifetime shorter than `'a`
  2. - `ToOwned`: `B` must implement the `ToOwned` trait, which describes how
  3. to convert it to an owned value
  4. - `?Sized`: `B` is allowed to be unsized at compile time, which
  5. basically just means you can use trait objects with `Cow`s

Cow has two variants:

  1. - `Borrowed(&'a B)`: a reference to a `B`, bound to the lifetime of the
  2. trait bound
  3. - `Owned(B::Owned)`: the associated type from the `ToOwned` trait; this
  4. variant owns its value (obviously)

When you want to mutate a Cow (gosh that sounds mad science-y), call to_mut on it. This causes its internal representation to be converted to the Owned variant, hence the clone-on-write semantics. Note that this causes an allocation.

Aside: there’s a now-deprecated trait called IntoCow which I think is probably the best trait name in std.


Bovine Intervention!

  1. struct Token<'a> {
  2. raw: Cow<'a, str>,
  3. }
  4. impl<'a> Token<'a> {
  5. pub fn new<S: Into<Cow<'a, str>>(raw: S) -> Token<'a>
  6. Token { raw: raw.into(), }
  7. }
  8. }
  9. // ...
  10. let token = Token::new(Cow::Borrowed("it's a secret"));
  11. let secret: String = load_secret("api.example.com");
  12. let token = Token::new(Cow::Owned(secret));

???

Now we can pass either a String or an &str into our struct just fine, and &strs won’t trigger an allocation. The lifetime bound on an &str still needs to be 'static for it to escape the current stack frame, but we can’t really solve that problem anyway. We can also send Tokens across threads, so long as they’re 'static. Pretty nice, right?


Recap

  • If you want to have the potential to own or borrow a value in a struct, Cow is right for you.
  • Cow can be used with more than just &str/String, though this is a very common usage.
  • When you mutate a Cow, it might have to allocate and become its Owned variant.
  • The type inside of the Cow must be convertible to some owned form.