A case for Dependent Types
I used to think of Haskell ADTs (Algebraic Data Types) as a sort of a final, flat type inheritance definition. Sometimes they are more like a base-class design in OO without the good bits. Here I show how we can overcome that by using dependent types.
A wrong analogy
Consider the following ADT:
I used to think there were - sort of - three types here. Or, maybe better put, I used to think that this was a kind of a Haskell equivalent of this approach in Java:
In the Java example, we have three types we can play around with.
But on the Haskell ADT above, there is only one type: Value
. StringValue
and IntValue
are just different constructors of that type.
Implications
In terms of code, this means we can have:
But we can’t have a specialized g
that only accepts, say, StringValue
s:
This is fairly annoying, as now the only way of applying specialized functions to a specific Value
type constructor is by unwrapping it beforehand, which still requires the parent operation to accept any Value
if we want to be total. This is ugly and repetitive:
This implies that once you define your ADT, you have to stick with it everywhere, even when parts of it are not applicable or sensible to an operation.
In the example above, the return type is also Value
, so we can sort of pass
it for the cases that we are not interested in. But what if we want to have a function k :: Value -> String
, that is meant to only accept StringValue
s?
A workaround would be to set such case as undefined
or to throw error
for the IntValue
case, as dynamically typed languages do. Another option would be to return a Maybe String
, where Nothing
represents the non-applicable-case.
None of these options are clean or even acceptable solutions, in my opinion. They are neither type-safe neither made right by design.
A better way
This problem can be very neatly addressed by bringing dependent types onto the table, by using GADTs
(Generalized algebraic datatypes) and DataKinds
:
Going step by step:
-
First, we define the type of values (
ValueType
) we support:StringType
andIntType
. -
Then, we define
Value
, whose type depends on a concreteValueType
to be resolved. -
Now, we define the value wrappers we need and specialize their type accordingly:
StringValue
is aValue StringType
which wraps aString
.IntValue
is aValue 'IntType
which wraps anInt
.
Notes:
- If you are wondering about the
'
before eachValueType
, it is just a way of being explicit that we are promoting theStringType
constructor, which is possible by using theDataKinds
extension. - The
FlexibleInstances
extension is used so that we could define the type signature ofValue
by its kind.
The whole and the self
By using GADTs and DataKinds, each type of value can now stand for itself, instead of just existing in the context of the whole.
That means that we can have functions that accept all types of Value
s:
As well as ones that only accept specific types of values:
Batteries included
Note that this approach brings further type-safety to our system. While before one could have the following and get away with it silently:
Now, to achieve the same result, one has to be strict and explicit:
We can also guarantee that, for functions that accept all kinds of Value
s, one can’t change the type of the input value when calculating the result value:
Final notes
I am not against the use of data
. I love it, and I wish that such terse and powerful way of defining types existed in more mainstream languages. But as you see, there are cases where they are not ideal left alone and where a higher level of type-safety is both necessary and advisable.
Here you can find the gist containing the Haskell code present in this page.