Rust might not be the most beginner-friendly language, but after you’ve battled and defeated the borrow checker, things become way easier and your code slowly begins to flow more and more.
This is, however, the point where you really need to learn more about Rust, and especially about idiomatic code. There are a few traits that you need to know about because they make your APIs much nicer to use and make your own life a lot easier.
The traits I’m talking about are:
- From and Into
- TryFrom and TryInto
- AsRef and AsMut
These traits are the base of converting between types in Rust, and you should really be aware of and use them actively.
So, let’s take a look at them and find out how they work, why you need them, and where you can use them.
From and Into
Let’s first talk about From and Into because they are two sides of the same coin. Into is actually just the reciprocal of the From trait, but more on that later.
Conversion happens all the time in software. Often, we have one type of data that we need to convert into another type of data. Database results need to be converted into API responses, API requests need to be converted to an internal representation our software can work with, and so on.
In Rust, there are two types of data. There is owned data and borrowed data, and it’s one of the core principles of Rust. When a code block owns some data, it’s also responsible for freeing the corresponding memory when it has finished whatever it wanted to do. If a code block only borrows it, it just has to “return the borrowed data” (so to say) and let someone else take care of freeing that block of memory later. As Rust has these two ways of dealing with data, it’s also no wonder that there are at least two ways of converting it.
From and Into are traits that deal with the conversion of data we want to get ownership of, which means that after the conversion happened, ownership is passed from the caller (whoever that might be) to us, or better our logic. This means that we decide when the lifetime of that converted data ends.
If we take a look at the implementation of the From trait, we can quickly see what it focuses on:
It takes a value of a generic source type T and returns Self, which refers to whatever type implements From. As the argument is not borrowed (there is no ampersand (&) in front of it), the source value is moved, or in other words consumed, and ownership of it passes to from. As the function only returns Self, we deal with a conversion that can’t fail. From must always produce a value, so it’s not suited for conversions that can potentially fail due to constraints.
A conversion modeled with From may or may not be computationally expensive. In general, it is safest to assume that it is expensive, which is important to keep in mind when we use it.
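To make the shape of From concrete, here is a minimal sketch. The trait definition in the comment is roughly what the standard library declares; `Seconds` and `Millis` are hypothetical types invented for this illustration.

```rust
// The standard library's trait has roughly this shape:
//
//     pub trait From<T>: Sized {
//         fn from(value: T) -> Self;
//     }

/// Hypothetical source type: a duration in whole seconds.
#[derive(Debug, PartialEq)]
struct Seconds(u64);

/// Hypothetical target type: a duration in milliseconds.
#[derive(Debug, PartialEq)]
struct Millis(u64);

impl From<Seconds> for Millis {
    // `value` is taken by value, so the `Seconds` is moved into `from`.
    // The return type is plain `Self`: there is no way to signal failure.
    fn from(value: Seconds) -> Self {
        Millis(value.0 * 1000)
    }
}

fn main() {
    let secs = Seconds(3);
    let millis = Millis::from(secs);
    // `secs` has been moved and can no longer be used here.
    println!("{:?}", millis);
}
```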
Next to From, there is Into, which is the reciprocal of From. If we take a look at it, we can see that Into takes the type that implements it and returns a target type U, which is just the same conversion From does, seen from the other side:
Looking at the signature, we can see that From implies Into. The target type is required to implement From for Into’s source type, and the blanket implementation of Into makes use of exactly that by only calling from on the target type, passing self as the argument.
By the way: Like with From, the argument, self in this case, is moved and thus consumed.
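The mechanism behind that blanket implementation can be sketched with two local stand-in traits. `MyFrom` and `MyInto` are hypothetical names used only so the blanket impl can be written out in a self-contained example; they mirror the idea, not the exact source, of the standard library.

```rust
// Local stand-ins for `From` and `Into` (hypothetical names).
trait MyFrom<T>: Sized {
    fn my_from(value: T) -> Self;
}

trait MyInto<U>: Sized {
    // `self` is taken by value, so the source is moved and consumed.
    fn my_into(self) -> U;
}

// The blanket implementation: every `T` gets `MyInto<U>`
// as soon as the target type `U` implements `MyFrom<T>`.
impl<T, U> MyInto<U> for T
where
    U: MyFrom<T>,
{
    fn my_into(self) -> U {
        U::my_from(self)
    }
}

// We only implement the "from" direction by hand ...
impl MyFrom<&str> for String {
    fn my_from(value: &str) -> Self {
        value.to_owned()
    }
}

fn main() {
    // ... and the "into" direction comes for free.
    let owned: String = "hello".my_into();
    println!("{}", owned);
}
```

This is why it is usually enough to implement From: the matching Into follows automatically.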
To make clear that Into is really the reciprocal of From (and to help you understand it), let’s take a look at how we can construct a String from a &str in Rust:
We can create that String in four ways: call String::from and pass a &str into it; call from on the From trait itself, in which case we have to state the target type explicitly; call into on the &str and explicitly type our variable as String; or call into on the Into trait, again explicitly typing our variable.
All four variants do exactly the same. In the end, we have a growable, heap-allocated String at our hands, created from an immutable UTF-8 byte sequence (a str) somewhere within our program’s memory. Which variant we use is usually a matter of taste, but we will mostly find versions 1 and 3 out in the wild: either calling from on the target type, or calling into directly on the value we have at hand.
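The four variants described above can be sketched as follows. The helper function `four_ways` is a hypothetical wrapper that only exists to collect the results.

```rust
/// Builds the same `String` in the four ways described in the text.
fn four_ways() -> [String; 4] {
    // 1. Call `from` on the target type.
    let a = String::from("hello");
    // 2. Call `from` on the `From` trait; the annotation selects the target type.
    let b: String = From::from("hello");
    // 3. Call `into` on the value; the annotation selects the target type.
    let c: String = "hello".into();
    // 4. Call `into` on the `Into` trait, again with an annotation.
    let d: String = Into::into("hello");
    [a, b, c, d]
}

fn main() {
    for s in four_ways().iter() {
        println!("{}", s);
    }
}
```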
This standardized way of converting values is already valuable on its own. If we stick to it, Rust devs will know what to search for and won’t need to go and see which method they might need to use to convert from one type to another. If our type implements From, other devs will know how to use it.
That, however, was a lot of theory, so let’s see how we can effectively use these two traits besides standardizing conversions with them.
Imagine that we have a simple struct that models the metadata of a YouTube video:
This struct has at least two properties, a title, and a description, both of type String. If we now want to create a factory function for that struct, we could just let our users pass two Strings to it. It’s already clear that we want to take ownership of that data because whatever we put inside the struct needs to live as long as the struct itself does:
This API isn’t very ergonomic, though. If we want to construct an instance of that struct, it only accepts two Strings and that’s it:
We can always let the users of our API convert their types themselves, string slices (&str) for example, but that leaves some ugly code behind:
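A minimal sketch of this String-only factory and the conversions it forces on callers. The names `VideoMetadata` and `new` are assumptions made for this illustration; only the two String fields, title and description, come from the text.

```rust
/// Hypothetical struct modelling the metadata of a video.
#[derive(Debug)]
struct VideoMetadata {
    title: String,
    description: String,
}

impl VideoMetadata {
    /// Factory function that only accepts owned `String`s.
    fn new(title: String, description: String) -> Self {
        Self { title, description }
    }
}

fn main() {
    // Callers holding string slices have to convert explicitly.
    let metadata = VideoMetadata::new(
        "Conversion traits".to_string(),
        "From, Into, and friends".to_string(),
    );
    println!("{:?}", metadata);
}
```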
If we change the signature of our factory method to accept an Into<String>, however, the ergonomics of our factory suddenly become way better:
Our factory is now generic over a type that must implement Into<String>, and it doesn’t matter which type that is.
Now the call with a &str simply looks like this:
That’s way better for users of such an API because they don’t have to explicitly convert between types anymore. They can implement Into<String> for their types, if they wish to, and then pass them to our function.
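A sketch of the generic factory, again using the hypothetical names `VideoMetadata` and `new`. It uses one type parameter per argument, which is a design choice of this sketch, so that callers can mix different source types in a single call.

```rust
/// Hypothetical struct modelling the metadata of a video.
#[derive(Debug)]
struct VideoMetadata {
    title: String,
    description: String,
}

impl VideoMetadata {
    /// Accepts anything that can be converted into an owned `String`.
    fn new<T, D>(title: T, description: D) -> Self
    where
        T: Into<String>,
        D: Into<String>,
    {
        Self {
            // The conversion happens once, inside the factory.
            title: title.into(),
            description: description.into(),
        }
    }
}

fn main() {
    // String slices work without any explicit conversion ...
    let from_slices = VideoMetadata::new("Conversion traits", "From, Into, and friends");
    // ... and so does a mix of `&str` and `String`.
    let mixed = VideoMetadata::new("Conversion traits", String::from("Owned description"));
    println!("{:?}\n{:?}", from_slices, mixed);
}
```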
We can extend this example to anything else, but in general, whenever we want to accept data that we need to take ownership of, we should at least consider making our function generic and accepting an Into of that type because it can improve the ergonomics of our API. Doing so also makes sense for internal types because it allows us to implement From for some of them exactly once for each type we need, get the Into implementation for free, and save ourselves a lot of work when we pass data of different types around within the internals of our software.
TryFrom and TryInto
Until now, we have only talked and learned about conversions that cannot fail, but what if they can? What if they are fallible?
The signature of From and Into doesn’t allow for any errors, and panics should generally be avoided because no one likes software that just shuts down whenever it hits a small inconvenience. And this is why TryFrom and TryInto exist.
If we take a look at TryFrom’s source code, we can see that it looks a little more complicated than From, but it actually only has one more advanced feature inside it, a so-called associated type, Error:
TryFrom is generic over type T, which means that T is the source, and Self is the target type that we implement TryFrom for. This, however, leaves one problem: How can we make the error generic? It’s not as easy as simply adding another generic type parameter, which is why the associated type exists. This associated type is just a simple placeholder that we fill in when implementing the trait, which allows us to specify the type of error we want to return in case the conversion fails.
Before we look at an example, though, we can pay a short visit to TryInto’s source code and see that, like Into for From, TryInto also has a blanket implementation. We only have to implement TryFrom and get TryInto for free:
Everything within TryInto is directly linked to TryFrom. Even its associated type is bound to the one within TryFrom, so there is really no work for us to do, unless we have to (in really rare circumstances).
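Both directions can be tried out with integer conversions that the standard library already provides. The comment shows roughly how TryFrom is declared; the helper functions `to_byte` and `to_byte_via_into` are hypothetical names for this sketch.

```rust
// Already in the prelude since edition 2021; imported for older editions.
use std::convert::{TryFrom, TryInto};

// The standard library's trait has roughly this shape:
//
//     pub trait TryFrom<T>: Sized {
//         type Error;
//         fn try_from(value: T) -> Result<Self, Self::Error>;
//     }

/// Converts with `TryFrom`, called on the target type.
fn to_byte(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

/// Converts with `TryInto`, called on the source value.
fn to_byte_via_into(value: i32) -> Option<u8> {
    let converted: Result<u8, _> = value.try_into();
    converted.ok()
}

fn main() {
    println!("{:?}", to_byte(200)); // fits into a u8
    println!("{:?}", to_byte(300)); // too large for a u8
    println!("{:?}", to_byte_via_into(-1)); // negative, so it cannot fit
}
```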
With that done, let’s now look at an example scenario:
HTTP and HTTPS both work over TCP (at least up to version 2). They are a large part of the foundation of the internet as it is today, with HTTPS being the preferred standard for public traffic. If we were to implement a combined HTTP and HTTPS server, we would have to think in terms of the byte streams our application receives, and interpret those bytes correctly.
At the beginning, there is a new connection, which a client opens to our server on its first request. That’s the moment we have to determine whether we are dealing with HTTP or HTTPS traffic. One way to distinguish one from the other is by looking at exactly the first byte of the first TCP packet. In the case of HTTPS, that byte will always be 22 because of the TLS handshake that occurs in the beginning. If the connection is plain HTTP, we receive any byte between 32 and 127 inclusive.
A way to model this is to use an enum with exactly two variants: one that signals that the connection is HTTP, and the other that it’s HTTPS:
This enum alone doesn’t help us, though. We deal with a byte stream, which is just a buffer of bytes (a sequence of u8 values in Rust’s case), and this already screams for some form of conversion.
Although we are already talking about the “try” versions of From and Into, it’s still good to understand why they are necessary in this case. You probably still remember that we have a very limited set of acceptable values for our enum. To be precise, we only accept 97 of the 256 possible values of a u8, which can hold any number from 0 to 255 inclusive: the single value 22 plus the 96 values from 32 to 127. This leaves many bytes that don’t fit into our enum, and what do we do if we cannot deal with certain values? Correct. We return an error if we receive one we can’t work with.
We can model exactly that by implementing TryFrom<u8> for our enum TrafficType:
As our error, we use one that we define ourselves (with friendly help from thiserror), and within the function, we use a match expression that checks the value we convert from against all possible cases.
If the byte we receive is 22, we deal with HTTPS traffic; if it is in the range between 32 and 127 inclusive, we deal with HTTP traffic; and if we receive anything else, we return an Err with our self-made ParseError, which signals that we got an invalid first byte. With this implementation, we’ve created a way to convert the first byte of a TCP stream into an enum that tells us what type of connection we deal with. And with that, we can subsequently implement logic to deal with each individual type.
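A sketch of such an implementation, under a few assumptions: the variant names `Http`, `Https`, and `InvalidFirstByte` are made up for this illustration, and the error type implements Display and Error by hand with the standard library only, instead of deriving them with thiserror as the text describes.

```rust
use std::convert::TryFrom;
use std::fmt;

#[derive(Debug, PartialEq)]
enum TrafficType {
    Http,
    Https,
}

/// Hypothetical error type; the article derives its error with `thiserror`.
#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidFirstByte(u8),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::InvalidFirstByte(byte) => write!(f, "invalid first byte: {}", byte),
        }
    }
}

impl std::error::Error for ParseError {}

impl TryFrom<u8> for TrafficType {
    type Error = ParseError;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            // 22 marks a TLS handshake record, so this is HTTPS.
            22 => Ok(TrafficType::Https),
            // Bytes from 32 to 127 inclusive are treated as plain HTTP.
            32..=127 => Ok(TrafficType::Http),
            // Everything else cannot be represented by the enum.
            other => Err(ParseError::InvalidFirstByte(other)),
        }
    }
}

fn main() {
    println!("{:?}", TrafficType::try_from(22u8));
    println!("{:?}", TrafficType::try_from(b'G'));
    println!("{:?}", TrafficType::try_from(0u8));
}
```

The literals carry an explicit u8 type here. Without it, type inference may fall back to i32, for which no conversion exists.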
This is, of course, not the end of it. We can use TryFrom and TryInto in function signatures the same way we already did with From and Into. If we want a boolean predicate function that tells us whether we deal with HTTP, HTTPS, or neither of them, we can accept a TryInto<TrafficType> as our parameter and then base our logic on the result of the conversion:
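One possible shape for such predicates is sketched below. The function names `is_https` and `is_known_traffic`, as well as the simplified unit error `InvalidFirstByte`, are assumptions of this sketch.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
enum TrafficType {
    Http,
    Https,
}

/// Simplified, hypothetical error type for this sketch.
#[derive(Debug)]
struct InvalidFirstByte;

impl TryFrom<u8> for TrafficType {
    type Error = InvalidFirstByte;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            22 => Ok(TrafficType::Https),
            32..=127 => Ok(TrafficType::Http),
            _ => Err(InvalidFirstByte),
        }
    }
}

/// True only if the value converts successfully and is HTTPS.
fn is_https<T: TryInto<TrafficType>>(value: T) -> bool {
    matches!(value.try_into(), Ok(TrafficType::Https))
}

/// True if the value converts to either variant.
fn is_known_traffic(value: impl TryInto<TrafficType>) -> bool {
    value.try_into().is_ok()
}

fn main() {
    println!("{}", is_https(22u8));
    println!("{}", is_known_traffic(b'G'));
    println!("{}", is_known_traffic(0u8));
}
```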
Admittedly, the use cases for TryInto as an argument are fewer than those for Into, but especially in crate-internal code, you can make good use of it, so it’s still a very valid use case.
AsRef
Up to now, we have only talked about conversions that take ownership, but more often than not, we don’t need any ownership to do what we want to do. In fact, we can even argue that the majority of the time, borrowing a value is more than enough. This, however, rules out using From, Into, TryFrom, and TryInto.
For cases like this, AsRef exists, and we can take a quick look at its source code and try to make a little sense of it:
AsRef is a generic trait with exactly one function to implement, as_ref. It borrows self instead of moving it like the other conversions do, and returns a reference to a target type T.
In the end, AsRef is meant for cheap conversions of references. If a type A contains a type B, implementing AsRef<B> for type A allows us to return a reference to the B nested within A, which is very cheap to do. It also saves users of our APIs from having to pass long field-access chains into our functions, much like From, Into, and the others save them explicit conversions.
But let’s, as an example, look at how we can work with files in Rust. To make it even more precise, let’s imagine that we are dealing with renaming files, which is a common operation. Whenever we download a file, for example, our browsers tend to create a temporary file until that download is finished and rename it to its real name afterward.
If we want to do the same in Rust, we can use a sweet little helper method from the standard library within the fs module, called rename:
It accepts two Paths, the source, and the target, and when it has finished its execution, the source file has the name of the target we supplied.
This function doesn’t actually need ownership of the arguments. It only needs the Paths to identify the source file, and to get a name, and an absolute path to the target file, so borrowing makes perfect sense.
Even more interestingly, it accepts more than only Paths:
We can even mix and match arguments as we like, and rename still works, and it still only borrows its arguments.
If we now take a look at rename’s source code, we can spot how the magic works:
It makes use of AsRef, and requires both of its parameters to implement AsRef<Path>. That’s it.
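The effect of those bounds can be tried out without touching the file system. The comment shows the signature of std::fs::rename; `plan_rename` is a hypothetical helper that borrows its arguments through the same bounds and only formats the two paths.

```rust
use std::path::{Path, PathBuf};

// `std::fs::rename` is declared like this:
//
//     pub fn rename<P: AsRef<Path>, Q: AsRef<Path>>(from: P, to: Q) -> io::Result<()>

/// Hypothetical helper with the same bounds as `rename`.
/// It only describes the operation instead of performing it.
fn plan_rename<P: AsRef<Path>, Q: AsRef<Path>>(from: P, to: Q) -> String {
    // A cheap reference-to-reference conversion, no allocation needed.
    let from: &Path = from.as_ref();
    let to: &Path = to.as_ref();
    format!("{} -> {}", from.display(), to.display())
}

fn main() {
    // String slices, owned Strings, PathBufs, and Paths can be mixed freely.
    println!("{}", plan_rename("download.part", "movie.mp4"));
    println!("{}", plan_rename(String::from("download.part"), PathBuf::from("movie.mp4")));
    println!("{}", plan_rename(Path::new("download.part"), "movie.mp4"));
}
```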
The actual magic, however, is nested within the individual implementations of AsRef for multiple types within the standard library:
There are quite a few types that implement AsRef<Path>, which is why you can call rename with many different types of data. The cheap conversion from whatever type to a Path reference happens under the hood, actually even within Path’s factory function, which converts whatever it receives to a reference to an OsStr and creates a Path from that:
Let’s now look at another example, to get an even better idea of how AsRef is intended to work, and how we can make use of it. For that, let’s imagine we build a blog system with various data. The most basic form of that data we need is a Post that has a few crucial properties, as seen below:
There are also advanced types of Posts, like video guides, that we need to model. As Rust lacks inheritance (for good reason), we usually model this with composition: we create a new type, add the additional properties we need, and nest the original type within it:
Now we want a function to notify all of our regular readers that a new post has been published. It needs some metadata from the post to create an email with, which is why borrowing is enough, and then does some magical work to notify all readers of the new content:
Passing the references is perfectly fine and fast, but the deeper a post is nested within advanced types, the longer the call chains become. Calling the post property on the video guide is fine, but it’s already looking a little ugly. This is where we can bring AsRef in, and implement AsRef<Post> for all our types:
Then we can change the signature of our notify function to accept any type that implements AsRef<Post>, which then allows us to get a cheap reference to a Post, no matter how deeply it is nested within a random type:
This also beautifies the calls to our function because we can now just pass any type that implements AsRef<Post>:
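Put together, the blog example might look like the sketch below. The field names (`title`, `author`, `video_url`) are assumptions, and `notify` returns the notification text as a String here as a stand-in for actually sending emails.

```rust
/// Hypothetical basic post.
struct Post {
    title: String,
    author: String,
}

/// Hypothetical advanced post type that nests a `Post`.
struct VideoGuide {
    post: Post,
    video_url: String,
}

impl AsRef<Post> for Post {
    fn as_ref(&self) -> &Post {
        self
    }
}

impl AsRef<Post> for VideoGuide {
    // Hands out a reference to the nested post, which is very cheap.
    fn as_ref(&self) -> &Post {
        &self.post
    }
}

/// Accepts anything that can cheaply hand out a `&Post`.
fn notify<P: AsRef<Post>>(post: P) -> String {
    let post = post.as_ref();
    format!("New post: \"{}\" by {}", post.title, post.author)
}

fn main() {
    let post = Post {
        title: String::from("Conversion traits"),
        author: String::from("Ann"),
    };
    let guide = VideoGuide {
        post: Post {
            title: String::from("Video guide"),
            author: String::from("Bob"),
        },
        video_url: String::from("https://example.com/video"),
    };

    // No `&guide.post` call chain needed anymore.
    println!("{}", notify(&post));
    println!("{}", notify(&guide));
}
```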
But, wait a second. Have you noticed something? If not, don’t worry. Let’s take a closer look.
notify does magically work when we pass a reference to a Post or a VideoGuide, although we have only implemented AsRef<Post> for Post and VideoGuide and not for &Post and &VideoGuide.
That’s another blanket implementation within the standard library doing some magic for us:
This blanket implementation automatically gifts us with an implementation of AsRef<Post> for both &Post and &VideoGuide as soon as we implement it for the non-reference types. Oh, and even better, the standard library even goes one step further and implements AsRef<T> for us for any mutable reference:
This means that we can also automatically pass a mutable reference to any function that expects an AsRef<T> without any compiler errors:
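A compact sketch of all three ways to call such a function. The blanket implementations in the comment are the ones from the standard library; `Post` with a single `title` field and the function `title_len` are hypothetical.

```rust
// The standard library provides these blanket implementations:
//
//     impl<T: ?Sized, U: ?Sized> AsRef<U> for &T     where T: AsRef<U>
//     impl<T: ?Sized, U: ?Sized> AsRef<U> for &mut T where T: AsRef<U>

/// Hypothetical minimal post.
struct Post {
    title: String,
}

impl AsRef<Post> for Post {
    fn as_ref(&self) -> &Post {
        self
    }
}

/// Only needs to read the post, so `AsRef<Post>` is enough.
fn title_len(post: impl AsRef<Post>) -> usize {
    post.as_ref().title.len()
}

fn main() {
    let mut post = Post {
        title: String::from("Conversion traits"),
    };

    // Shared reference: covered by the blanket impl for `&T`.
    println!("{}", title_len(&post));
    // Mutable reference: covered by the blanket impl for `&mut T`.
    println!("{}", title_len(&mut post));
    // Owned value: uses our own implementation directly.
    println!("{}", title_len(post));
}
```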
That’s some time saved, and it improves the usability of our API even more because now users don’t even have to think about what they need to pass. They can pass us what they have at hand.
AsMut
There is one last thing missing now: Borrowing a mutable reference. That’s something AsRef can’t cover, which is why AsMut exists.
A look at its source code reveals that it’s mainly the same as AsRef, with the only difference being that it requires a mutable self, and returns a mutable reference, together with a different function name:
This makes AsMut the way to go if we don’t want an immutable, but a mutable borrow somewhere within our APIs. Its function name is different from as_ref, which allows us to implement both traits, and this then allows us to make explicit calls that also mark our intention: as_ref if we want an immutable reference, and as_mut if we want a mutable one.
Using the same blog example we already used for AsRef, we could also implement a function that can mutate our Post, to store the date we notified our users, for example:
This is nearly the same as our AsRef example, just that we work with mutable references this time. Oh, and who would have thought, the standard library once again gives us a blanket implementation for mutable references, which works through multiple layers of mutable references for us. Or, in other words: Even a &mut &mut T can be passed as an AsMut<T> (as long as T itself implements AsMut<T>), and is automatically dereferenced:
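A sketch of the mutable counterpart. The field `notified_at`, its representation as a plain u64 timestamp, and the function name `mark_notified` are assumptions made for this illustration.

```rust
/// Hypothetical post that remembers when its readers were notified.
struct Post {
    title: String,
    notified_at: Option<u64>,
}

/// Hypothetical advanced post type that nests a `Post`.
struct VideoGuide {
    post: Post,
}

impl AsMut<Post> for Post {
    fn as_mut(&mut self) -> &mut Post {
        self
    }
}

impl AsMut<Post> for VideoGuide {
    fn as_mut(&mut self) -> &mut Post {
        &mut self.post
    }
}

/// Stores the notification time on whatever wraps a `Post`.
fn mark_notified<P: AsMut<Post>>(mut post: P, timestamp: u64) {
    post.as_mut().notified_at = Some(timestamp);
}

fn main() {
    let mut post = Post {
        title: String::from("Conversion traits"),
        notified_at: None,
    };
    let mut guide = VideoGuide {
        post: Post {
            title: String::from("Video guide"),
            notified_at: None,
        },
    };

    mark_notified(&mut post, 1);
    // Even a nested mutable reference is accepted via the blanket impl.
    mark_notified(&mut &mut post, 2);
    mark_notified(&mut guide, 3);

    println!("{}: {:?}", post.title, post.notified_at);
    println!("{}: {:?}", guide.post.title, guide.post.notified_at);
}
```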
Once again, something that saves us time, and makes APIs more usable. So, overall a win-win situation.
Phew, that was a lot to grasp. Take some time to process that first. After that, you will soon realize how much better your code becomes if you begin to use these traits consistently in your daily work or your side projects. Even if you don’t have the time or opportunity to use them yourself, you will still profit from that knowledge because you now probably understand a few crates and their APIs, as well as the standard library, better than you did before.
Now enjoy your newly gained knowledge, and go test it out!