Overview

Writing code that generates code is known as metaprogramming. Some languages have no native support for this, others accomplish it through textual substitution, and still others have the ability to operate on syntax as data. This last concept comes from Lisp and is called a macro. In short, a macro is a thing which takes code as input and produces code as output. Rust has a set of macro features to choose from, depending on what you are trying to accomplish and how much work you want to undertake. These are:

  • declarative macros

  • procedural custom derive macros

  • procedural attribute macros

  • procedural function-like macros

Macros serve to reduce the amount of code you need to write, but it is important to understand that anything you can do with a macro you could have done by hand. There is nothing magic going on in the world of macros even if it sometimes feels otherwise. That is not to understate the enormous power of metaprogramming. Generating code can be a significant multiplier to your productivity and is absolutely the right tool for some jobs.

But, as always, with great power comes great responsibility. Code that generates code can be hard to understand, hard to change, and hard to debug. Frequently macros are not the right tool to reach for. If you can get away with using a function or some other abstraction mechanism for reducing repetition, then you should almost certainly use that instead of a macro.

There are two primary reasons to reach for a macro. The first is if you want a function with a variable number of arguments. The second reason is if you want to do something that must happen at compile time, like implement a trait. It is impossible to write a normal function that does either of these in regular Rust, but both can be accomplished with macros.

We are going to cover declarative macros in a limited fashion as they are the less powerful but better documented type of macro. If you can accomplish your task easily with a declarative macro, then fantastic. However, they can get hairy very quickly. Moreover, as we will see, the more powerful procedural macros are actually closer to writing regular Rust and are therefore more approachable for bigger problems.

Declarative Macros

The name of this type of macro comes from the fact that the language used to implement these is a declarative style language inspired by Scheme. You can think of it as being similar to a big match statement where the conditions are matching on syntactic constructs and the result is the code you want to generate.

Rather than explain all the syntax and go through a bunch of boilerplate, let's first just write a macro that is similar to something you have seen before. Suppose the standard library did not come with the vec! macro. We find ourselves writing code like:
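A minimal sketch of the kind of code meant here, reconstructed from the description in the next paragraph. The values 1, 2, 3 and the wrapper function `build_by_hand` are assumptions for illustration; only the `let v = v;` line is taken from the text.

```rust
// Hypothetical wrapper so the snippet can be compiled and checked on its own.
fn build_by_hand() -> Vec<i32> {
    let mut v = Vec::new(); // must be mutable while we fill it
    v.push(1);
    v.push(2);
    v.push(3);
    let v = v; // rebind without `mut`: from here on the vector is immutable
    v
}

fn main() {
    println!("{:?}", build_by_hand()); // prints [1, 2, 3]
}
```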

This starts to get tedious and it requires that weird let v = v; line to get an immutable binding to our vector after we are done filling it up. We could have instead put the creation code in a block such as:
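A sketch of the block form, under the same assumptions as before (illustrative values, hypothetical wrapper function `build_in_block`). The mutable binding is confined to the inner block, so no separate rebinding line is needed.

```rust
// Hypothetical wrapper so the snippet can be compiled and checked on its own.
fn build_in_block() -> Vec<i32> {
    // The outer `v` is immutable; the mutable `v` exists only inside the block.
    let v = {
        let mut v = Vec::new();
        v.push(1);
        v.push(2);
        v.push(3);
        v // the block evaluates to the filled vector
    };
    v
}

fn main() {
    println!("{:?}", build_in_block());
}
```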

In some ways that is a little better, but we still have a lot of repetition going on. Can we instead write some function that does this for us? That would look like:

This implies this signature:
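A sketch of such a call together with the signature it implies. The name `new_vec` is hypothetical, since the text does not name the function.

```rust
// Hypothetical helper: builds a vector from exactly three i32 values.
fn new_vec(a: i32, b: i32, c: i32) -> Vec<i32> {
    let mut v = Vec::new();
    v.push(a);
    v.push(b);
    v.push(c);
    v
}

fn main() {
    // The call site is now a single line with no mutable binding in sight.
    let v = new_vec(1, 2, 3);
    println!("{:?}", v);
}
```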

That would work if we only wanted to create three-element vectors of type i32. It is easy to make the type more general by using a generic:
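One way the generic version might look, again using the hypothetical name `new_vec`. The element type becomes a type parameter `T`, but the number of arguments is still fixed at three.

```rust
// Hypothetical helper: the same three-argument function, generic over the element type.
fn new_vec<T>(a: T, b: T, c: T) -> Vec<T> {
    let mut v = Vec::new();
    v.push(a);
    v.push(b);
    v.push(c);
    v
}

fn main() {
    let numbers = new_vec(1, 2, 3); // Vec<i32>
    let words = new_vec("a", "b", "c"); // Vec<&str>
    println!("{:?} {:?}", numbers, words);
}
```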

But can we extend this to an arbitrary number of elements? Rust does not support functions with a variable number of arguments, commonly called variadic functions. There is no way to write a function in Rust that takes an unknown number of arguments. We could write out many similar functions:
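A sketch of what writing out "many similar functions" could look like. The names `new_vec_1`, `new_vec_2`, and `new_vec_3` are hypothetical; the point is that every arity needs its own hand-written function.

```rust
// Hypothetical family of helpers, one per argument count.
fn new_vec_1<T>(a: T) -> Vec<T> {
    let mut v = Vec::new();
    v.push(a);
    v
}

fn new_vec_2<T>(a: T, b: T) -> Vec<T> {
    let mut v = Vec::new();
    v.push(a);
    v.push(b);
    v
}

fn new_vec_3<T>(a: T, b: T, c: T) -> Vec<T> {
    let mut v = Vec::new();
    v.push(a);
    v.push(b);
    v.push(c);
    v
}

fn main() {
    // A fourth element would require yet another function.
    println!("{:?}", new_vec_1(1));
    println!("{:?}", new_vec_2(1, 2));
    println!("{:?}", new_vec_3(1, 2, 3));
}
```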

Depending on your use case, that might be enough: for example, if you only ever build vectors of one size but do so in a lot of places. But this certainly does not scale. Enter macros, which execute at compile time and are therefore not bound by the same constraints as the rest of the Rust language.

Let's implement this as a macro called myvec:
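A reconstruction of the macro that is consistent with the walkthrough in the following paragraphs: two arms, a repetition of `$x:expr`, a right hand side wrapped in parentheses and braces, and a recursive call for the trailing-comma case. Treat it as a sketch under those assumptions rather than the exact original listing.

```rust
macro_rules! myvec {
    // First arm: zero or more expressions separated by commas.
    ($($x:expr),*) => ({
        let mut v = Vec::new();
        // Repeat the push once for every expression captured as $x.
        $(v.push($x);)*
        v
    });
    // Second arm: every expression is followed by a comma (trailing comma form).
    // Re-invoke the macro with the commas moved between the expressions.
    ($($x:expr,)*) => (myvec![$($x),*])
}

fn main() {
    let v = myvec![1, 2, 3];
    println!("{:?}", v);
}
```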

The syntax for creating a macro is to invoke a macro called macro_rules (very meta) with the name of your new macro, myvec in this case, without the exclamation point, followed by braces to denote the body of the macro.

The body of a macro is just a series of rules with the left side of an arrow, =>, indicating what pattern to match, and the right side specifying what code to generate. The patterns are checked in order from top to bottom and the first one that matches wins, so they should be ordered from more specific to more general so that the specific patterns get a chance to match.
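As a separate illustration of why the order of rules matters (this is not part of myvec), here is a small sketch with two hypothetical macros, `classify` and `classify_shadowed`, that contain the same two rules in opposite order.

```rust
// Specific rule first: the literal token 0 gets its own arm.
macro_rules! classify {
    (0) => { "zero" };
    ($x:expr) => { "some other expression" };
}

// General rule first: `$x:expr` also matches 0, so the second arm is never reached.
macro_rules! classify_shadowed {
    ($x:expr) => { "some other expression" };
    (0) => { "zero" };
}

fn main() {
    println!("{}", classify!(0)); // prints zero
    println!("{}", classify!(1 + 1)); // prints some other expression
    println!("{}", classify_shadowed!(0)); // prints some other expression
}
```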

We have two possible match arms, so let's start by explaining the first one: ($($x:expr),*). The outer set of parentheses exists to denote the entire pattern. Inside we see $($x:expr),* which can be broken down into two parts. The inner $x:expr means to match a Rust expression and bind it to the variable $x. The outer part $(...),* means to match zero or more comma separated things inside the parentheses. So the total pattern means to match zero or more comma separated expressions, with each expression bound to the variable $x. We will see how it is possible to bind multiple expressions to a single name when we get to the right hand side.

The right hand side is surrounded by parentheses as well, which enclose the entirety of the code to generate. We then also have curly braces surrounding our code. These mean to literally generate a set of curly braces around our code so that we are actually outputting a block. If you did not have these then the macro would expand to just the literal lines of code. Sometimes that is what you want, but as we are going to use this macro as the right hand side of an assignment, we want it to expand to a single expression.

The macro definition grammar is a bit wild. As our first example of this, the outer parentheses can actually be any balanced bracket, i.e. (), [], or {} are all acceptable in that position. So sometimes you might see ($($x:expr),*) => {{ ... }} which still means to generate a single block, as the outer braces are just there so that the compiler knows where the right hand side begins and ends.
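A sketch of the double-brace spelling, using a hypothetical single-arm macro named `myvec_braces`. The outer braces only delimit the right hand side; the inner braces are emitted and form the block. The same freedom applies at the call site, where any of the three bracket kinds may be used.

```rust
macro_rules! myvec_braces {
    // Outer braces delimit the rule body, inner braces become the generated block.
    ($($x:expr),*) => {{
        let mut v = Vec::new();
        $(v.push($x);)*
        v
    }};
}

fn main() {
    // All three invocation styles expand to the same code.
    let a = myvec_braces!(1, 2, 3);
    let b = myvec_braces![1, 2, 3];
    let c = myvec_braces! {1, 2, 3};
    println!("{:?} {:?} {:?}", a, b, c);
}
```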

Then the rest of the right hand side looks just like the example code that we are replacing, except for the funky bit in the middle. We are creating a vector, doing what looks kinda like our push, and then returning that vector. On the right hand side of a match inside a macro definition the syntax $(...)* means to repeat the code inside the parentheses for each repetition captured in the match arm. Within those parentheses the expression that we captured will be substituted directly for $x. It is by expanding a repetition on the right in this way that we get access to each of the repeated captures on the left.

The code inside that repetition v.push($x); is exactly what we were using before if you mentally replace $x with the different expressions that we pass to our macro.

From just what we have covered so far, we can understand that:

will expand to
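A sketch of this step, assuming the invocation in question is myvec![1, 2, 3]. The functions `via_macro` and `by_hand` are hypothetical wrappers: the first uses the macro, the second spells out the block the first arm would generate, so the two can be compared.

```rust
// Macro definition as reconstructed from the walkthrough (an assumption).
macro_rules! myvec {
    ($($x:expr),*) => ({
        let mut v = Vec::new();
        $(v.push($x);)*
        v
    });
    ($($x:expr,)*) => (myvec![$($x),*])
}

fn via_macro() -> Vec<i32> {
    let v = myvec![1, 2, 3];
    v
}

fn by_hand() -> Vec<i32> {
    // What the invocation above turns into: one push per captured expression.
    let v = {
        let mut v = Vec::new();
        v.push(1);
        v.push(2);
        v.push(3);
        v
    };
    v
}

fn main() {
    assert_eq!(via_macro(), by_hand());
}
```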

which is exactly the code we saw earlier in our motivation to write this macro. What about if we wrote:
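A sketch of the trailing-comma invocation, assuming the same elements as before. The macro definition is repeated (as reconstructed from the walkthrough) only so that the snippet compiles on its own.

```rust
macro_rules! myvec {
    ($($x:expr),*) => ({
        let mut v = Vec::new();
        $(v.push($x);)*
        v
    });
    // Without this arm the invocation in main would fail to compile.
    ($($x:expr,)*) => (myvec![$($x),*])
}

fn main() {
    let v = myvec![1, 2, 3,];
    println!("{:?}", v);
}
```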

Note the trailing comma after the 3. This is where the second match arm comes in. It turns out that the syntax in the first match arm ($(...),*) implies matching repetitions of what is inside $(...) separated by commas, i.e. with no comma after the last one. Without the second part of our macro, having a trailing comma would be a compile error, as the first arm does not match and it is an error to invoke a macro when none of its arms match.

Our second pattern $($x:expr,)* with the comma inside the repetition, $(...)*, means that we expect to see expressions followed by commas. This arm therefore only matches if we explicitly do have a trailing comma.

We use the right hand side to convert the trailing comma version to the version without a trailing comma and then rely on our previous pattern to match. We do this by recursively calling our macro and expanding $($x:expr,)* to $($x),*. Moving the comma from inside the repetition to outside means to take it from a list of expressions each followed by a comma to a list of expressions separated by commas.

With this little syntactic trampoline we get a macro which supports optional trailing commas. Rust generally supports trailing commas so your macros should too. It is a bit unfortunate that, with this approach, you need to go through this hoop for every macro to support both styles. Further, it is reasonable to feel like this syntax is weird and really not like the rest of Rust. You are correct. This declarative style of macro may well be superseded at some point in a future Rust version, although macro_rules remains fully supported today.
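On current compilers the two-arm trick is not the only option. A sketch, assuming a toolchain that supports the `?` repetition operator in macro_rules (Rust 2018 and later): `$(,)?` matches zero or one trailing comma, so a single arm can cover both styles. The name `myvec_opt` is hypothetical and this is an alternative technique, not the approach walked through above.

```rust
macro_rules! myvec_opt {
    // Comma separated expressions, optionally followed by one trailing comma.
    ($($x:expr),* $(,)?) => {{
        let mut v = Vec::new();
        $(v.push($x);)*
        v
    }};
}

fn main() {
    let a = myvec_opt![1, 2, 3];
    let b = myvec_opt![1, 2, 3,];
    assert_eq!(a, b);
}
```

One trade-off of this form is that, because the repetition uses `*`, a lone comma with no expressions is also accepted by the pattern.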

Expanding a macro

Sometimes you think your macro is not quite working how you expect but you can't quite tell why. It would be great if you could see what the macro expands into to check whether the generated code is what you expect. Look no further than the cargo expand command. There are installation instructions on the GitHub repo for the project, but the short story is that cargo install cargo-expand should allow you to then run cargo expand from the root of a crate to see all macros expanded out. This works for all types of macros including procedural macros which we will cover later.

You must have a nightly toolchain installed for the expansion to work. It only needs to be installed; you don't have to be using it directly, because the command will find it as needed. Installing a nightly toolchain can be accomplished using rustup.

Consider the following code in our main.rs that includes the myvec macro definition:

 

This page is a preview of Fullstack Rust
