Introductory TyDD in Scala: Deconstructing complex Types

These types are ugly and cumbersome; they are not at all human readable. How do we get data out of such a type and format it in a way that is useful to the operator? Let's again start with a naive example.

def stringify1[A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: List[(A, (B, C))]): String = {
  in.map{case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }.mkString("(", "; ", ")")
}

Here we take a List of nested pairs and return a string that is hopefully more human readable than the toString method would provide.

Abstract the F

Like before, we will abstract the F from our function that it may be used with any type constructor of arity 1 rather than hard-coded to List.

import cats.Functor
val functorList = new Functor[List]{
  override def map[A, B](fa: List[A])(f: A => B): F[B] =
    fa.map(f)
}
def stringify2[F[_]: Functor, A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: F[(A, (B, C))]): String = {
  val F = implicitly[Functor[F]]
  F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  ???
}

The cats library provides a nifty Functor typeclass for us so, we can abstract the map call pretty easily. What of the mkString? As it turns out, cats provides a typeclass for this as well! It is called Show; let's see how it works.

import cats.Show
def stringify3[F[_]: Functor, A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: F[(A, (B, C))])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val result = F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  FS.show(result)
}

And extending this idea of Show to the Function1 instances we have

def stringify4[F[_]: Functor, A: Show, B: Show, C: Show](
  in: F[(A, (B, C))])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  val fc = implicitly[Show[C]].show _
  val result = F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  FS.show(result)
}

Abstracting over Arity

We'll attempt to build a recursive version of this function using the same principled we've used in previous posts.

def stringify5[F[_]: Functor, A: Show, B: Show](
  in: F[(A, B)])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  val result = F.map(in){case (a, b) =>
    fa(a) + ", " + fb(b)
  }
  FS.show(result)
}

This is not what we want! We need to recurse on the Show instance inside the Functor. Let's make a recursive function for Show. This will follow the code we read from Shapeless very closely.

implicit def makeShow[A: Show, B: Show]: Show[(A, B)] = {
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  new Show[(A, B)]{
    override def show(t: (A, B)): String = {
      val (a, b) = t
      "(" + fa(a) + ", " + fb(b) + ")“
    }
  }
}

This says, Given any two Show instances, a Show instances for their pair can be produced. Alternatively, it can be said that given a proof for Show[A] and a proof for Show[B] a proof for Show[(A, B)] follows.

So, now we have the following stringify function:

def stringify[F[_]: Functor, A: Show](
  in: F[A])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val result = F.map(in)(fa)
  FS.show(result)
}

Notice how simple our code has become. All of the specific type information has been absorbed into the recursive implicit functions. This is simply the Show instance for Functor itself. Furthermore, our makeShow function will produce Show instances for any nesting of Tuple2 instances; it generates Show instances for binary trees. The makeShow function is 10 lines of code (even including the Scala boilerplate) and gave us a giant boost in usefulness for our library code.

What's the Point?

Let's see an example of how this can be used. Given our individual Show instances:

implicit val ShowListString = new Show[List[String]]{
  def show(in: List[String]): String =
    in.mkString("(", "; ", ")")
}
implicit val showInt = new Show[Int]{
  override def show(in: Int): String = in.toString}
implicit val showLong = new Show[Long]{
  override def show(in: Long): String = in.toString}
implicit val showString = new Show[String]{
  override def show(in: String): String = in}
implicit val showDouble = new Show[Double]{
  override def show(in: Double): String = f"$in%.2f"}
implicit val showFloat = new Show[Float]{
  override def show(in: Float): String = f"$in%.2f"}
implicit val showArrayByte = new Show[Array[Byte]]{
  override def show(in: Array[Byte]): String = new String(in)}

Our writer and reader code becomes:

//Write the thing with our writer
val result = implicitly[Result1]
//Read the thing back with our reader
println(stringify(result))

For anyone to create an application which reads and writes data with our library they need only define the specific business logic and types for their application. If they miss writing an instance, the compiler will tell them. If they rearrange the order of the types in IO, the compiler will figure it out for them. Many common pain points of writing data readers and writers have been taken care of by the library implementor giving the rest of the team the ability to focus on business logic.

The goal here is not to have an entire company of developers who write this kind of code. Just like the goal of any business is to have employees who each bring something new to the team, the goal here is to have a few developers who write these libraries that other developers can use to develop business applications. The benefit of this is the libraries help limit the kinds of errors that can make it to production; the entire production cycle gets a confidence boost.

Introductory TyDD in Scala: Writing Type Level Functions

Type level functions consume types and produce types. This allows us to tell the compiler exactly what we want in our code that errors may be caught at compile time rather than at later stages in our product deployment cycle. The true power of this approach lies in the fact that with Scala no application can be produced without compilation. No matter how undisciplined a developer may be, she cannot skip compilation. Moving error checking into compilation gives us more confidence in the performance of our binary than confidence gained by any other means. In short, when there is a failure in production we always ask "were the tests run?" but never do we ask "was it compiled?"

A Basic Zipper

Here we have a zipper.

def zipper[A, B, C](
  a: List[A], b: List[B], c: List[C]
): List[(A, (B, C))] = a.zip(b.zip(c))

From previous posts in this series we know this can be generalized with a type class. Let's exercise this muscle immediately.

trait Zip[F[_]]{
  def apply[A, B](a: F[A], b: F[B]): F[(A, B)]
}
def zipper[F[_]: Zip, A, B, C](
  a: F[A], b: F[B], c: F[C]): F[(A, (B, C))] = {
  val F = implicitly[Zip[F]]
  F(a, F(b, c))
}

This is nicer than the first version but it is still super restrictive. It only works for zipping exactly 3 values. If we want to zip 2 or 4 or 70 values, we are out of luck! We saw how the shapeless HList allowed us to compose an arbitrary number of arbitrary types into a single type. Let's try to use the same kinds of methods here to produce a zipper of arbitrary arity.

Step 1 - Simplify as much as is possible

We will simplify a zipper to the purest form we can. The purest zipper takes two instances of a particular type constructor and produces a single instance of that type constructor on a pair. Let's write that.

def zipper[F[_]: Zip, H, T](
  h: F[H], t: F[T]): F[(H, T)] = {
  val F = implicitly[Zip[F]]
  F(h, t)
}

Now, we don't need separate functions for different arity versions of a zipper. We can simply call this function recursively to produce the desired result.

val (list1, list2, list3, list4, list5, list6) = ...
implicit val ZipList = new Zip[List]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[(A, B)] = a.zip(b)
}
val with2 = zipper(list1, list2)
val with3 = zipper(list1,
            zipper(list2, list3))
val with6 = zipper(list1,
            zipper(list2,
            zipper(list3,
            zipper(list4,
            zipper(list5, list6)))))

If we recall the shapeless HList code learned to read, we see the same pattern here of a recursive type being produced from recursive calls. This can be done in the compiler using implicits.

Step 2 - Replace explicit recursion with implicit

implicit def zipper[F[_]: Zip, H, T](implicit
  h: F[H], t: F[T]): F[(H, T)] = {
  val F = implicitly[Zip[F]]
  F(h, t)
}

This is the same code just the keyword implicit is placed in two locations. In scala we can promote an explicit call to an implicit call simply by adding a keyword. We inform the inputs with implicit so the compiler knows to find the head and tail by itself. We inform the function with implicit so the compiler knows to call the function implicitly if a value of the necessary type is not found.

We communicate our intent to the compiler with implicits and types. Type aliases help simplify business logic.

type F[A] = List[A]
type Result =
  F[
    (Int, (Long, (String, (Double, (Float, Array[Byte])
  ))))]

To tell the compiler which values it can use during the application of functions we inform the values as implicit

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

The implicitly function tells the compiler what it needs to execute.

implicitly[Result]

And that's it! The compiler assembles the recursive calls for us. No more errors from placing things in the wrong order; refactoring is as simple as rearranging the order of the types in our alias Result. One change on one line propagates throughout the entire code base automatically.

Why stop at Tuple2?

There is nothing here that requires that a tuple be used. In fact, the only thing we need is a type constructor of arity 2. We can express as

trait ZipG[F[_], G[_, _]]{
  def apply[A, B](a: F[A], b: F[B]): F[G[A, B]]
}
implicit def zipper[F[_], G[_, _], H, T](implicit
  F: ZipG[F, G], h: F[H], t: F[T]): F[G[H, T]] = {
  F(h, t)
}

And we can zip Tuple2s or Eithers by creating type class instances

implicit val zipListTuple2 = new ZipG[List, Tuple2]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[(A, B)] = a.zip(b)
}
implicit val zipListEither = new ZipG[List, Either]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[Either[A, B]] =
    for{a <- a; b <- b}yield{
      if(a.toString.size < b.toString.size) Left(a)
      else Right(b)
    }
}

For Tuple2 we have business logic like

type F[A] = List[A]
type Result =
  F[
    (Int, (Long, (String, (Double, (Float, Array[Byte])
  ))))]

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

implicitly[Result]

This is the same as before. Commonly, further abstractions at the type level have little or no effect on code at the value level. Abstractions should allow the code to be more expressive than it was prior to the abstraction exercise never less expressive.

Changing our business logic to use Either as our Zipping class is simple

type F[A] = List[A]
type Result =
  F[
    Either[Int, Either[Long, Either[String,
    Either[Double, Either[Float, Array[Byte]]
  ]]]]]

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

implicitly[Result]

This is very powerful. By changing our type aliases, we were able to entirely change the meaning of our business logic without complex refactorings. As long as there are the correct type classes in implicit scope, the business logic need not be bothered by those implementation details.

What we have here is a sort of data pipeline and writer. Now that we have formatted data that we can work with in code, how do we present that data to an operator? Next, we'll write a reader for our types.

 

Introductory TyDD in Scala: Reading Type Level Functions

This is shapeless' HList

sealed trait HList extends Product with Serializable

final case class ::[+H, +T <: HList](head : H, tail : T) extends HList {
  override def toString = head match {
    case _: ::[_, _] => "("+head+") :: "+tail.toString
    case _ => head+" :: "+tail.toString
  }
}

sealed trait HNil extends HList {
  def ::[H](h : H) = shapeless.::(h, this)
  override def toString = "HNil"
}

case object HNil extends HNil

The important thing to note is it is very similar to a standard scala List just at the type level. It has a head and tail as type parameters and the tail is itself an HList. The implementation details are not important for our purposes here. All we need to take from this is an HList is a recursive list (head, tail) which is either empty (HNil) or whose elements can be of different types.

A Type Level Operation on HList

This is a function defined in shapeless which helps client code work with HLists

trait Mapped[L <: HList, F[_]] extends Serializable {
  type Out <: HList
}
object Mapped {
  …
  type Aux[L <: HList, F[_], Out0 <: HList] =
    Mapped[L, F] { type Out = Out0 }
 implicit def hnilMapped[F[_]]: Aux[HNil, F, HNil] =
    new Mapped[HNil, F] { type Out = HNil }
  …
  implicit def hlistMapped1[
  H, T <: HList, F[_], OutM <: HList](implicit
  mt: Mapped.Aux[T, F, OutM]): Aux[H :: T, F, F[H] :: OutM] =
    new Mapped[H :: T, F] { type Out = F[H] :: OutM }
}

We can see all the parts we mentioned in the first post of this series. Recall:

  1. Keywords
  2. Input types
  3. Output types
  4. Body

Note there are multiple implicit def declarations. The modifiers implicit def are similar to case statements in value function bodies. Let's take this piece by piece.

The First Case

implicit def hnilMapped[F[_]]: Aux[HNil, F, HNil] =
new Mapped[HNil, F] { type Out = HNil }

When focusing on the types, the intent becomes clear

implicit def hnilMapped[F[_]]: Aux[HNil, F, HNil] =
  new Mapped[HNil, F] { type Out = HNil }

This reads something like: Given a type constructor, yield the empty HList. Simple enough! This stuff is comprehensible afterall!

The Second Case

implicit def hlistMapped1[
H, T <: HList, F[_], OutM <: HList](implicit
mt: Mapped.Aux[T, F, OutM]): Aux[H :: T, F, F[H] :: OutM] =
  new Mapped[H :: T, F] { type Out = F[H] :: OutM }

This is admittedly more complex. Let's take it in stride again focusing on the types.

implicit def hlistMapped1[
H, T <: HList, F[_], OutM <: HList](implicit
mt: Mapped.Aux[T, F, OutM]): Aux[H :: T, F, F[H] :: OutM] =
  new Mapped[H :: T, F] { type Out = F[H] :: OutM }
  • Given a head, a tail which is an HList, a type constructor and another HList
  • Also, given proof that this type class holds good for our tail
  • We can generate an HList where our type constructor is applied to our head and the tail follows

A brief Word on Induction

Many of the type level functions encountered in the wild are of this type. Where there is at the very least

  1. a trivial case which produces some terminal element
  2. a more complex case where given a type or arity n can produce a type of arity n+1

In other words, recursive types can be generated inductively assuming you have an instance of the recursive type and all the necessary parts to wrap that instance within the target type.

In the next post we'll create some type level code using these methods.

Introductory TyDD in Scala: Basic Type Class Development

(A more complete treatment of type classes and higer kinds.)

A simple Type Class

Take the following

trait Mapping[A, B]{
  def map(a: A): B
}

and an instance for it

val mapping: Mapping[List[Int], List[String]] =
  new Mapping[List[Int], List[String]]{
    override def map(a: List[Int]): List[String] =
      a.map(_.toString)
  }

This instance is super restrictive. It only works for taking Int into String. We want to map a List of any type. Since we know what our type parameters are, we can achieve our goal by passing in a function

trait ListMapping[A, B]{
  def map(list: List[A])(f: A => B): List[B]
}

So, given a List[A] and a function, A => B, we can get a List[B]. And by taking the type parameters from the trait definition and placing them onto the function definition, we can squeeze out a bit more freedom.

trait ListMapping{
  def map[A, B](list: List[A])(f: A => B): List[B]
}
val mapping: ListMapping =
  new ListMapping{
    override def map[A, B](a: List[A])(f: A => B): List[B] =
      a.map(f)
  }

Now, why would anyone ever do this? The List type provides a map function which does exactly this. With this approach one may provide any number of methods for mapping a list. Take this alternative:

val reverseMapping: ListMapping =
  new ListMapping{
    override def map[A, B](a: List[A])(f: A => B): List[B] =
      a.reverse.map(f)
  }

Through type classes, we can define new functionality for old data structures on the fly. Similar code can be written for sort order, string formatting or just about anything else.

Making things even more general

While, the ability to map Lists of any type in any number of ways is fairly abstract it is not abstract enough for our purposes. What if we want to map a different data structure such as an Option or a Stream or a spark Dataset?

Luckily, Scala has a language feature which can help us out here.

trait WithMap[F[_]]{
  def map[A, B](m: F[A])(f: A => B): F[B]
}

The type parameter, F[_], has a type parameter of _, this tells the compiler that our type parameter itself requires a type parameter. Notice in our definition all mention of List has been replaced by our parameter, F. This just says that given a type, F, which itself takes a type parameter, we can change the inner type or F without changing F. We can do this with any parameterized type of arity 1.

implicit val listWithMap = new WithMap[List]{
  override def map[A, B](m: List[A])(f: A => B): List[B] =
    m.map(f)
}
implicit val optionWithMap = new WithMap[Option]{
  override def map[A, B](m: Option[A])(f: A => B): Option[B] =
    m.map(f)
}
implicit val streamWithMap = new WithMap[Stream]{
  override def map[A, B](m: Stream[A])(f: A => B): Stream[B] =
    m.map(f)
}
val reverseListWithMap = new WithMap[List]{
  override def map[A, B](m: List[A])(f: A => B): List[B] =
    m.reverse.map(f)
}


With these techniques we can define super polymorphic functions. Take this pretty strinfigy function

def prettyString[F[_]: WithMap, A](m: F[A])(f: A => String): String = {
  implicitly[WithMap[F]].map(m)(f).toString
}

This takes two type parameters, F[_]: WithMap and A. The `:` character in the first type parameter tells the compiler that it needs an implicit instance of WithMap defined for our type F.

And here is a data processor defined in the same way

def processData[F[_]: WithMap, A, B, C, D](
  m1: F[A])(f1: A => B)(f2: B => C)(f3: C => D): F[D] = {
  val F = implicitly[WithMap[F]] 
  val m2 = F.map(m1)(f1)
  val m3 = F.map(m2)(f2)
  F.map(m3)(f3)
}

We have taken an implementation detail (the map function on List, Option, etc...) and brought it outside the type. This has given us the ability to talk about data which has a sensible map function without knowing what that data necessarily looks like.

Next we'll learn how to read some of the Type Level functions that exist in the shapeless library.

Introductory TyDD in Scala: Anatomy of a Type Level Function

Value Level Functions

A value level function typically looks like

def f(a: Int, b: Int): String = {
  a.toString + b.toString
}

The key parts are

  1. A keyword: def
  2. Inputs: a: Int, b: Int
  3. Outputs: String
  4. Body: a.toString + b.toString

When you see these 4 parts, you know you are reading a value level function. Nothing surprising here. Now, let's see what a similar definition looks like at the type level.

Type Level Functions

A type level function looks like

trait MyTrait[A, B]{type Out}
object MyTrait{
  def apply[A, B, C](): MyTrait[A, B]{type Out = C} =
    new MyTrait[A, B]{override type Out = C}
}

In this definition there is a type refinement, MyTrait[A, B]{type Out = C}. These are undesirable artifacts of type level development. To simplify these definitions we use the Aux alias (a document about on this). Aux helps us remove type refinements from our logic.

A type level function looks like

trait MyTrait[A, B]{type Out}
object MyTrait{
  type Aux[A, B, C] = MyTrait[A, B]{type Out = C}
  def apply[A, B, C](): Aux[A, B, C] =
    new MyTrait[A, B]{override type Out = C}
}

The type refinement from the previous example is replaced by the nicer (more readable, fewer braces, less code) Aux type.

Type level functions have the same 4 key parts as value level functions

  1. Keywords: trait object
  2. Inputs: A, B
  3. Outputs: type Out
  4. Body: def apply[A, B, C](): Aux[A, B, C]

Here the inputs are type parameters and outputs are type members. This is so the output types are not erased and can be referenced later in business logic. This is similar to value level functions as the result of a value level function does not expose the inputs required by the function.

Bodies of type level functions are value level functions. They are typically only a few lines long. Their purpose is to present the compiler with a way to construct a new type from the types provided. This is what these blog posts will focus on.

Whenever you see a set of definitions which have these 4 qualities, you know you are looking at a type level function.

The type class is the fundamental element of this style of type driven development. The next post will give an overview of this concept.