I spent some time experimenting with Mtac and figured that I’d write a little bit about my experience. Mtac is a tactic language for Coq where you get write Gallina terms that manipulate raw Coq syntax through a monadic interface.

You can get the (extensively commented) full source code as a gist.

Equality in a Commutative Monoid

I opted to explore Mtac by building a procedure for proving equalities in a commutative monoid. Since I didn’t want to bother formalizing commutative monoids I just axiomatized them to keep the dependencies small.

Parameter m : Type. (* type of the monoid *)
Parameter star : m -> m -> m.
Local Infix "*" := star.
Parameter one : m.

(* commutative *)
Axiom star_comm  : forall a b : m, a * b = b * a.
(* associative *)
Axiom star_assoc : forall a b c : m, (a * b) * c = a * (b * c).
(* one is neutral *)
Axiom one_star   : forall a : m, one * a = a.

Using just these theorems it is not too difficult to prove this goal, but the result is quite horrific.

Goal forall a b c : m, a * b * c = c * b * a.
  rewrite star_assoc. rewrite (star_comm _ a).
  rewrite (star_comm c b). reflexivity.

Of course this quickly gets cumbersome as the problem gets bigger. In addition, the proof script is not robust, i.e. small changes in the goal completely break the proof script. To avoid both of these issues, I’ll use Mtac to write some automation.

Mtac Basics

The idea of Mtac is fairly simple, Mtac is a DSL (domain-specific language) inside of Coq for writing automation. Mtac comes with a monad M of Mtac tactics that can be used to write automation and an evaluation mechanism (more on this later) for evaluating Mtac “programs”. In addition to the standard bind and return that all monads have, M also comes equipped with a set of operations that are not kosher in Gallina, but are exactly what is needed to inspect terms and build proofs. In particular, it comes with an mmatch operation that allows Mtac programs to inspect the syntax of a Gallina term similar to what Ltac’s match allows you to do. To give a flavor of how it works, here’s an example of using mmatch to optimize star.

Definition opt_star (a b : m) : M { x : m | x = star a b }.
refine (
  mmatch a as a return M { x : m | x = star a b } with
  | one => ret (@exist _ _ b _ (* first proof *))
  | [? z] z =>
    mmatch b as b return M { x : m | x = star z b } with
    | one => ret (@exist _ _ a _ (* second proof *))
    | _ => ret (@exist _ _ (star a b) _ (* third proof *))

The goal of opt_star is to construct a value that is equal to a * b but does not star together units, i.e. one * a = a and a * one = a. This is codified in the types by stating that opt_star takes two terms (a and b) and returns a monadic operation that returns a term x and a proof that x = star a b. The use of “term” rather than “value” here is important (even though it is not explicitly represented in the type) because each branch of the match can provide a syntactic representation that is unified against the discriminee (the term being matched on). mmatch is required to return values in the monad, and, like Coq’s standard dependent pattern matching, mmatch provides dependent matching, i.e. the type of each branch depends on the pattern that is matched.

I used tactics to fill in the three proof obligations that arise from the commented holes in the term. The first one says the following:

a, b : m
e : a = one
b = star one b

Which is easy to solve using one_star.

clear; abstract (symmetry; apply one_star).

Here abstract treats the result of the tactic as opaque1. abstract X constructs a new definition with the given type and solves it using X (note that X must completely solve the goal). Then it solves the goal by appealing directly to the proof. The other two proofs can be solved in a similar way.

The Mrun “function” runs the Mtac program 2.

Let x := Mrun (mstar one one).
Print x.
(* x = exist (fun x : m => x = star one one) one (mstar_subproof one)
 *    : {x : m | x = star one one}

Mrun interprets the monadic program and replaces it with the result. Here, we see that, as the type states, the result is a pair of a term x which is one and a proof of one = mstar one one (mstar_subproof one, the lemma name generated by abstract).

Implementing a Canceller

To implement the canceller, I broke the problem down into two phases. In the first phase, the automation will iterate through atomic elements on the left-hand side. If it is one then it will drop the element and continue. Otherwise, it will attempt to find a matching term on the right-hand side. If the automation can find a matching element on the right-hand side, it will drop both terms; otherwise, it keeps the term on the left and continue the iteration.

The first thing to do is build some simple problems to solve. The following code constructs some simple, arbitrarily large, test cases that we can use to test and evaluate performance and scalability.

Require Import Coq.Lists.List.
Axiom asM : nat -> m. (** Just a way to construct 'm's **)

(** Build a large 'm' from a list. **)
Fixpoint big (n : nat) : list m :=
  match n with
  | 0 => nil
  | S n => asM n :: big n

Fixpoint stars (l : list m) : m :=
  match l with
  | nil => one
  | l :: nil => l
  | l :: ls => l * stars ls

(** This will allow us to set up problems that are reasonably 'difficult'
 ** to solve.
Definition benchmark (n : nat) : Prop :=
  stars (big n) = stars (rev (big n)).

Ltac bench n :=
  let p := eval compute in (benchmark n) in
  refine p.

Goal ltac:(bench 3).
 (* asM 2 * (asM 1 * asM 0) = asM 0 * (asM 1 * asM 2) *)

Since the high-level cancellation function clearly depends on the remove program, I’ll start by writing that function.

A First Attempt

My first attempt to implement the remove function was the following. I’m usually not a user of Program3 but I was told that it was useful for writing Mtac code so I figured I would give it a try.

(** An Mtac exception **)
Definition NotFound : Exception.
  exact exception.

Program Definition remove (x : m)
: forall s, M { s' : m | s = x * s' } :=
  (mfix1 f (s : m) : M { s' : m | s = x * s' } :=
    mmatch s as s return M { s' : m | s = x * s' } with
    | x => ret (@exist _ _ one _)
    | [? l r ] star l r =>
        il <- f l ;
        ret (@exist _ _ (il * r) _)
      with _ =>
        ir <- f r ;
        ret (@exist _ _ (l * ir) _)
    | _ => raise NotFound
Next Obligation.
  rewrite star_comm. symmetry. apply one_star.
Next Obligation.
  simpl. rewrite star_assoc. reflexivity.
Next Obligation.
  simpl. rewrite <- star_assoc. rewrite (star_comm l x0). rewrite star_assoc.

Here I used Mtac’s mfix1 operation to construct a recursive tactic on a single argument. The procedure is similar to the definition of opt_star above: the tactic returns a pair of a result and proof connecting the result to the input. In order to express that the element was not found, I am using Mtac’s exception mechanism which is built into the Mtac monad. As above, I solve the subgoals with abstracted proofs.

Program made the implementation quite simple, but the resulting proof terms are really bad. For example, removing a single term from a star of 10 terms produces a term that is 74 lines long. Looking at the proof term, we see that the automation is embedded inside the proof term.

Time Check Mrun (remove (asM 9) (stars (big 10))).
(** exist (fun s' : m => asM 9 * stars (big 9) = asM 9 * s')
     (exist (fun s' : m => asM 9 = asM 9 * s') one
        (remove_obligation_1 (asM 9)
           (tfix1' M (fun (a : Type) (x : M a) => x)
              (fun (f : forall x : m, M ((fun s : m => {s' : m | s = asM 9 * s'}) x)) (s : m) => (* ... this is a copy of the automation ... *) ...

To understand the problem, it is important to understand Mtac’s evaluation strategy.

Mtac Evaluation. Mtac’s evaluation strategy is lazy, it uses Coq’s head normal formal evaluation strategy (hnf) to uncover the next monadic operation (since these are defined as an inductive type) and then implements it in the OCaml plugin. For example, to implement bind (written x <- c ; k) it evaluates the first argument and, if it is a ret then it substitutes the argument in the body (k) and continues reducing. Or, if c reduces to an exception, then the reduction discards k and the whole term throws the exception. Performing minimal evaluation is important for implementing mmatch because it ensures that the term being matched on not reduced which would change its syntactic representation (and therefore the results of the match since the match is on the syntax).

With this understanding we can look at implementation that Program constructed for us:

remove =
fun x : m =>
mfix1 f (s : m) : M {s' : m | s = x * s'} :=
   (mmatch s as s0 return M {s' : m | s0 = x * s'}
    with x =c> [H] (* H : s = x *)
       ret (exist (fun s' : m => x = x * s') one (remove_obligation_1 x f s H))
    |  [?l r : m] l * r =c> [H] (* H : s = l * r *)
         (il <- f l;
            (exist (fun s' : m => l * r = x * s') (proj1_sig il * r)
                   (remove_obligation_2 x f s l r H il)))
         (fun e : Exception =>
           tmatch (fun _ : Exception => {s' : m | l * r = x * s'}) e
                  ((with [?x0 : Exception] x0 =c> [H0]
                     ir <- f r;
                       (exist (fun s' : m => l * r = x * s')
                              (l * proj1_sig ir)
                              (remove_obligation_3 x f s l r H e x0 H0 ir)) end) ++
                   (with [?x0 : Exception] x0 =c> [_]
                      raise x0 end)))| [?x0 : m] x0 =c> [_] raise NotFound
     : forall x s : m, M {s' : m | s = x * s'}

Here, the syntactic sugar for mmatch has been partially removed. The added terms [H] after each pattern (represented [? .. ] .. =c> [H]) is the proof that the discriminee s is equal to the pattern (types notated inline).

The performance problem with this implementation is the arguments to the proof obligations constructed by Program. For example, remove_obligation_3 is applied to 10 arguments, none of which will be reduced by Mtac’s reduction mechanism since none will be in head position because the entire proof is an argument to the exists constructor. Further, the modular implementation of mfix1 simply substitutes itself in the body which can make these arguments quite large.

Revising the Automation with the Proof in Mind

There are two main things that we can do to make better proof terms.

  1. Lift matches so that Mrun reduces them by making sure that they return values of type M.
  2. Avoid using Program to get finer granularity control over the obligations and term that is constructed.

Of the two, the first is the more important; the second simply makes it easier to do this since what you write is what you get. Adopting these two techniques leads us to the following definition:

Definition remove_refine (x : m)
: forall (s : m), M { s' : m | s = x * s' }.
  (mfix1 f (s : m) : M { s' : m | s = x * s' } :=
    mmatch s as s return M { s' : m | s = x * s' } with
    | x => ret (@exist _ _ one _)
    | [? l r ] star l r =>
        il <- f l ;
        match il with
        | exist _ il1 il2 =>
          ret (@exist _ _ (il1 * r) _)
      with _ =>
        ir <- f r ;
        match ir with
        | exist _ ir1 ir2 =>
          ret (@exist _ _ (l * ir1) _)
    | _ => raise NotFound
{ clear - e.
  abstract (rewrite star_comm; symmetry; apply one_star). }
{ clear - e il2.
  abstract (subst; rewrite star_assoc; reflexivity). }
{ clear - e0 e1 ir2.
  abstract (subst; rewrite <- star_assoc; rewrite (star_comm l x0);
            rewrite star_assoc; reflexivity). }

Note here that I am using clear before abstract to ensure that the subproof does not require any parameters that are not strictly necessary.

If you inspect the term using Print then you will note that it is a bit simpler. The main difference, however, can be seen when we look at the proof term from our simple result.

Check Mrun (remove_refine (asM 9) (stars (big 16))).

The proof is now 12 lines (6x smaller) and takes less than a tenth of a second to construct. That’s a 45x performance improvement just from doing explicit pattern matches.


The accompanying gist has a similar comparison for the implementation of cancel but the ideas are mostly the same. The best implementation of cancel that I was able to write was the following:

Definition cancel_refine
: forall (l r : m), M { xy : m * m
                      | forall Z, Z * fst xy = snd xy -> Z * l = r }.
refine (
  mfix2 cancel (l : m) (r : m)
  : M { xy : m * m
      | forall Z, Z * fst xy = snd xy -> Z * l = r } :=
    mmatch l as l
      return M { xy : m * m
               | forall Z, Z * fst xy = snd xy -> Z * l = r }
    | [? a' a''] star a' a'' =>
      res <- cancel a' r ;
      (** Explicit pattern matching and casting! **)
      match res with
      | exist _ _res1 _res2 =>
        match _res1 as xy
              return (forall Z, Z * fst xy = snd xy -> Z * _ = _) -> _
        | (fst_res1,snd_res1) => fun res2 =>
          res' <- cancel a'' (snd_res1) ;
          match res' with
          | exist _ _res'1 _res'2 =>
            match _res'1 as xy
                  return (forall Z, Z * fst xy = snd xy -> Z * _ = _) -> _
            | (fst_res'1,snd_res'1) => fun res'2 =>
              ret (@exist _ _ (fst_res1 * fst_res'1, snd_res'1) _)
            end _res'2
        end _res2
    | [? x ] x =>
      (** NOTE: Here we capture the full term **)
        (** NOTE: and then we use it here **)
        res <- remove_refine x r ;
        match res with
        | exist _ res1 res2 =>
          ret (@exist _ _ (one, res1) _)
      with _ =>
        ret (@exist _ _ (x, r) _)
(** Proofs elided **)

At the high level, the type of cancel_refine states that it takes in two monoidal expressions and returns a pair of monoidal expressions such that if the returned expressions are equal, then the input expressions are equal. However, to implement the function, we need to strengthen the recursion by introducing an extra variable Z that we star with both left-hand sides. This allows us to use returned results in the calling context and is analagous to strengthening the inductive hypothesis. To see why you need this, try removing the extra quantification and see where things go wrong.

A crucial, and subtle, part of the implementation is the final branch. It might seem intuitive to replace the final branch in the mmatch with _ and then use l instead of x in the body. However, while we might think of this as being equivalent, in reality it is not. This is because the return type of the branch will mention the value bound by _ (which does not have a name but still exists, for convenience let’s say the name is _). In that case, we will need to use the fact that _ = l. Mtac provides us access to that information in the body using the proof object (we saw it above when we printed the definition of remove). However, to use this proof in Coq, we need to explicitly match on it and therefore it will pollute our proof terms. By naming the pattern, i.e. using [? x ] x, we can mention it in the branch and avoid the extra cast since all of the operations in the branch and its return type are in terms of x and not s.


The final implementation of cancellation performs pretty well but it does take a bit of care to get there. In particular, it is really important that you inspect the resulting proof terms to ensure that you are not duplicating too much information.

The chart shows how the optimized Mtac automation compares to the standard Ltac implementation of cancellation on various problem sizes 4. Note that the numbers are approximates and could vary significantly between computers and Coq implementations.

Mtac solidly beats Ltac on most problem sizes for the search but is substantially slower on checking the final proof term. This suggests that the proof that we are building with Mtac is still far from optimal. At the high level, this can be attributed to a wildly different proof search technique, but there are also low-level optimizations that are still possible.

Comparison with Computational Reflection

Even after all of the optimization, Mtac is not as fast as computational reflection, e.g. the Rtac implementation of cancelation solves problems of size 100 in under a second while Mtac takes 30 seconds on a problem half the size. While it is not very common to have such large problems, it is not uncommon when stitching automation together. For example, program verification might result in tens of calls to an entailment checker each with a goal of perhaps 10 or so conjuncts. Stitching all of these proofs together along with the reasoning about the program can quickly become costly.

Final Thoughts

Mtac is quite convenient and light-weight to use; however, getting good performance can be a bit tricky. The main thing that you need to understand to write efficient Mtac tactics is the evaluation mechanism. The various iterations took some time but the optimizations are well worth it.

While you could chalk this up to implementation, I believe that what is going on here is important. Constructing the proof term within the function is convenient, however the type confuses what is happening. Even though you have equalities in the context, if your function attempts to eliminate one (in the usual way) in the wrong place everything the term will get stuck and ultimately fail. What you need to keep in mind is that some of the variables in your context are going to be “values” that you can inspect and others are going to be terms (not necesssarily closed) so you should not inspect them using vanilla pattern matching.

Mtac does allow you to write tactics in a typed manner, but it still generates proof objects. While the terms that it constructs are not immediately re-checked by the system, they are re-checked at Qed time. Coq 8.5’s uconstr feature removes some of the type checking overhead of naive Ltac, but Mtac still seems to be a lot nicer.

The benefit of Mtac over computational reflection, is that Mtac completely avoids the need for a reflective representation which decreases the up-front cost to using it and trivially addresses the entire language including complex dependent types, which systems like MirrorCore do not currently support (though it is in progress).

  1. abstract is not strictly necessary here because Mtac’s reduction mechanism will not reduce these terms anyways. However, it does hide the proof terms which makes the tactics a little bit easier to read. 

  2. Mrun is not really a “function” in the Coq sense of the term because it does not respect equality. Essentially, if a = b, Mrun a is not necessarily equal to Mrun b

  3. I think that the idea of Program is great, but for many of the things that I work on, computation is important and with Program you sometimes loose good computational behavior. In addition, Program sometimes quietly introdues axioms which seems like bad behavior). 

  4. Results for Ltac and Rtac are taken from Extensible and Efficient Automation through Reflective Tactics, ESOP’16