In this post I’m going to discuss a problem that I ran into while working with shape indexed types in Haskell and an elgant solution that is inspired by dependent type theory.
The problem is implementing functions that produce values of shape-indexed types without accepting a value to do case analysis on in order to refine the index.
A simple example arrises when implementing an Applicative
instance for length indexed vectors.
data Nat = O | S Nat
data Vector (l :: Nat) t where
Vnil :: Vector O t
Vcons :: forall l. t -> Vector l t -> Vector ('S l) t
Implementing <*>
is no problem at all because we get to do recursion on the input parameters; in fact, our type nicely guarantees that the two Vector
s have the same length.
apVector :: Vector l (a -> b) -> Vector l a -> Vector l b
apVector Vnil Vnil = Vnil
apVector (Vcons f fs) (Vcons x xs) = Vcons (f x) (apVector fs xs)
However, we run into a problem when we try to implement pure
because we don’t have any way to inspect the l
index of the vector type.
pureVector :: a -> Vector l a
pureVector = undefined -- What can we do?
In a depenent type theory such as Coq, Agda, Idris, etc. we could simply match on the l
and determine what to do, but in Haskell, l
is a at the type level1.
If we don’t have a way to pattern match on l
, then how can we solve this problem?
Just get the Value! (A Failed Attempt)
The first thing that came to my mind was that we simply needed to be able to pattern match on the value of type Nat
.
To do that, we can use a type-class (data-kind class?) to get the value.
class KnownNat (n :: Nat) where
natValue :: Nat
instance KnownNat 'O where
natValue = O
instance KnownNat n => KnownNat ('S n) where
natValue = S (natValue @n)
Unfortunately, this doesn’t help us solve our problem because, in Haskell, l :: Nat
at the value level is really completely different from l :: Nat
at the type level.
Inspecting the value-level l :: Nat
doesn’t tell us anything about the l :: Nat
that occurs at the type level.
Dependent Pattern Matching to the Rescue
In order to solve our problem, we need to express the dependent pattern match in the type-class. Here’s what that looks like:
class ElimNat (l :: Nat) where
elimNat :: f 'O
-> (forall n. ElimNat n => f ('S n))
-> f l
Note, the function elimNat
is polymorphic in a type-level function f
, which describes the result type as a function of the index (l :: Nat
).
Thus, the caller of the function will simply need to supply two arguments, for the result is l
is O
and the result if l
is S m
for some ElimNat m
.
Perhaps surprisingly, it is very easy to provide instances for this version of ElimNat
because when we declare instances, we use Haskell’s type-level unification to selet the appropriate index.
instance ElimNat 'O where
elimNat f _ = f
instance ElimNat n => ElimNat ('S n) where
elimNat _ f = f
In the first instance, we know that l
is 'O
, so we need to produce a value of type f 'O
given a value of type f 'O
(the first parameter).
Pretty simple.
Similarly, in the 'S
case, we are asked to produce a value of type f ('S n)
given a value of type (forall n. ElimNat n => f ('S n))
.
Since GHC resolves the type class instances for us (using the one required by the super-class declaration) the implementation looks similarly trivial.
Implementing pure
If we have an instance of ElimNat n
, then it is pretty easy to implement pureVector
.
pureVector :: forall a l. ElimNat l => a -> Vector l a
pureVector a = getPureVector $ elimNat (PureVector Vnil) mkCons
where mkCons :: forall l. ElimNat l => PureVector a ('S l)
mkCons = PureVector $ Vcons a (pureVector a)
-- Local type necessary to get around GHC's lack of type-level lambdas.
newtype PureVector a l = PureVector
{ getPureVector :: Vector l a }
The caveat in the implementation is that GHC doesn’t support type-level lambdas, so we need to make a newtype
to swap the arguments of Vector
since we need to instantiate f
with a type function of kind Nat -> *
.
I’m not sure if there is a cleaner way to achieve this in Haskell, but if you know one, please leave a comment.
What else is ElimNat
good for?
One might argue that we could avoid the complexity of ElimNat
simply by incorporating it into the Applicative
instances for Vector l
.
Indeed, this would work, but the generalization allows us to write other sorts of functions.
For example, we can implement fromList
which allows us to use the OverloadedLists
extension to have nice syntax for Vector
s.
fromList :: forall l a. ElimNat l => [a] -> Maybe (Vector l a)
fromList = getFromList $ elimNat (FromList $ const $ Just Vnil) mkCons
where mkCons :: forall l. ElimNat l => FromList a ('S l)
mkCons = FromList $ \ case
[] -> Nothing
(x:xs) -> Vcons x <$> fromList xs
newtype FromList a l = FromList
{ getFromList :: [a] -> Maybe (Vector l a) }
Learning the Structure of a Nat
Unlike the KnownNat
type class in the standard library, we can actually build instances of ElimNat
at runtime if we have access to a value that is indexed (in a useful way) on the Nat
.
For example, if we have a Vector
of length l
, we can compute an instance of ElimNat l
using the definitions in the constraints library and the following function:
learnLength :: Vector l t -> Dict (ElimNat l)
learnLength Vnil = Dict
learnLength (Vcons _ xs) = case learnLength xs of
Dict -> Dict
Note that behind the scenes, GHC is handling all of the type-class resolution for us.
It supplies ElimNat O
for the first instance of Dict
and ElimNat n => ElimNat ('S n)
for the second instance (resolving the ElimNat n
using the instance we get from pattern matching on the Dict
.
Conclusions
In this post I demonstrated a technique for “matching” data kinds by encoding their dependent pattern match within a type-class. This technique allows us to encapsulate the pattern match an use it to implement a variety of functions that produce indexed values without consuming an indexed value.
The contents of this post are available as a gist.
Footnotes
-
Note that strictly speaking
l
isn’t a type, it is aNat
. Haskell, however, treats data kinds as type-like in the sense that they exist only at compile time. ↩