I used to think of Haskell ADTs (Algebraic Data Types) as a sort of a final, flat type inheritance definition. Sometimes they are more like a base-class design in OO without the good bits. Here I show how we can overcome that by using dependent types.

A wrong analogy

Consider the following ADT:

data Value
  = StringValue String
  | IntValue Int

I used to think there were - sort of - three types here. Or, maybe better put, I used to think that this was a kind of a Haskell equivalent of this approach in Java:

abstract class Value {}

final class StringValue extends Value {
  final String value;

  public StringValue(String value) {
    this.value = value;
  }
}

final class IntValue extends IntValue {
  final Int value;

  public IntValue(Int value) {
    this.value = value;
  }
}

In the Java example, we have three types we can play around with. But on the Haskell ADT above, there is only one type: Value. StringValue and IntValue are just different constructors of that type.

Implications

In terms of code, this means we can have:

f :: Value -> Value

But we can’t have a specialized g that only accepts, say, StringValues:

g :: StringValue -> StringValue

This is fairly annoying, as now the only way of applying specialized functions to a specific Value type constructor is by unwrapping it beforehand, which still requires the parent operation to accept any Value if we want to be total. This is ugly and repetitive:

g :: Value -> Value
-- the real reason for this function
g (StringValue str) = StringValue $ str ++ "?"
-- A kind of python `pass`
g (IntValue int) = IntValue int

This implies that once you define your ADT, you have to stick with it everywhere, even when parts of it are not applicable or sensible to an operation.

In the example above, the return type is also Value, so we can sort of pass it for the cases that we are not interested in. But what if we want to have a function k :: Value -> String, that is meant to only accept StringValues?

A workaround would be to set such case as undefined or to throw error for the IntValue case, as dynamically typed languages do. Another option would be to return a Maybe String, where Nothing represents the non-applicable-case.

None of these options are clean or even acceptable solutions, in my opinion. They are neither type-safe neither made right by design.

A better way

This problem can be very neatly addressed by bringing dependent types onto the table, by using GADTs (Generalized algebraic datatypes) and DataKinds:

{-# LANGUAGE GADTs             #-}
{-# LANGUAGE DataKinds         #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures    #-}

data ValueType = StringType | IntType

data Value :: ValueType -> * where
  StringValue :: String -> Value 'StringType
  IntValue :: Int -> Value 'IntType

Going step by step:

  1. First, we define the type of values (ValueType) we support: StringType and IntType.

  2. Then, we define Value, whose type depends on a concrete ValueType to be resolved.

  3. Now, we define the value wrappers we need and specialize their type accordingly:

    • StringValue is a Value StringType which wraps a String.
    • IntValue is a Value 'IntType which wraps an Int.

Notes:

  • If you are wondering about the ' before each ValueType, it is just a way of being explicit that we are promoting the StringType constructor, which is possible by using the DataKinds extension.
  • The FlexibleInstances extension is used so that we could define the type signature of Value by its kind.

The whole and the self

By using GADTs and DataKinds, each type of value can now stand for itself, instead of just existing in the context of the whole.

That means that we can have functions that accept all types of Values:

resetWithDefault :: Value a -> Value a
resetWithDefault (StringValue _) = StringValue "Hello, world!"
resetWithDefault (IntValue _) = IntValue 3

As well as ones that only accept specific types of values:

serious :: Value 'StringType -> Value 'StringType
serious (StringValue str) = SStringValue $ str ++ "!"

plusX :: Value 'IntType -> Int -> Value 'IntType
plusX (IntValue x) y = IntValue $ x + y

Batteries included

Note that this approach brings further type-safety to our system. While before one could have the following and get away with it silently:

foo :: Value -> Value
foo (StringValue _) = IntValue 0

Now, to achieve the same result, one has to be strict and explicit:

foo :: Value 'StringType -> Value 'IntType
foo (StringValue _) = IntValue 3

We can also guarantee that, for functions that accept all kinds of Values, one can’t change the type of the input value when calculating the result value:

bar :: Value a -> Value a
-- Compiler error, as `IntValue` does not match `StringValue`
bar (StringValue str) = IntValue 3

Final notes

I am not against the use of data. I love it, and I wish that such terse and powerful way of defining types existed in more mainstream languages. But as you see, there are cases where they are not ideal left alone and where a higher level of type-safety is both necessary and advisable.

Here you can find the gist containing the Haskell code present in this page.