Cats (Introducing Typelevel Scala into an OO Environment)

(Examples can be found on GitHub)

Last time, we discovered Type Classes and how they give us more power to build extensible software than subclass polymorphism can. Here we introduce the Cats library so that we can use a production-quality library in place of the Adder and Chainer traits from the previous post.

We'll need the following imports from the cats library:

import cats.Monoid
import cats.Monad
import cats.implicits._

and to recall our Team definition:

case class Team[Type](members: List[Type])

The Monoid Type Class

Recall the Adder trait described the process of adding two containers of the same type together to create a new container of that type. This is the basic idea for a structure in category theory called a Monoid. A full treatment of a Monoid is beyond the scope of this post; the important thing here is they describe semantics for adding members of other types together.

In the cats library, a Monoid for our unstructured Team data is defined thus:

implicit def adder[Arg]: Monoid[Team[Arg]] =
  new Monoid[Team[Arg]]{
    override def empty: Team[Arg] = Team(Nil)
    override def combine(
        left: Team[Arg], right: Team[Arg]): Team[Arg] = {
      val newMembers = left.members ++ right.members
      Team(newMembers)
    }
  }

A Monoid has two operations, empty and combine, where empty is the identity element under the combine operation. In other words, empty is a value, e, that when combined with any other value, v, returns v. Monoid replaces our Adder trait from the previous post. Our structured Team would have this Monoid:

implicit def adder[Arg]: Monoid[Team[Arg]] =
  new Monoid[Team[Arg]]{
    override def empty: Team[Arg] = Team(Nil)
    override def combine(
        left: Team[Arg], right: Team[Arg]): Team[Arg] = {
      val (lead1, indi1) = left.members.splitAt(2)
      val (lead2, indi2) = right.members.splitAt(2)
      val newMembers = lead1 ++ lead2 ++ indi1 ++ indi2
      Team(newMembers)
    }
  }

Note, the empty value is the same for both cases. This is often the case for multiple Monoids over the same data structure; no matter how you combine elements, the identity is trivially applied.
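With either adder in implicit scope, the cats.implicits._ import above provides operator syntax for Monoid. A brief usage sketch (the member values here are made up for illustration):

val team1 = Team(List("alice", "bob"))
val team2 = Team(List("carol"))

val merged = team1 |+| team2            //|+| is syntax for Monoid.combine
val zero = Monoid[Team[String]].empty   //Team(Nil)
assert((zero |+| team1) == team1)       //empty really is the identity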

The Monad Type Class

Now, we shift our focus to the Chainer trait from the previous post. This trait described how to flatten a nesting of containers, for instance a Team[Team[_]] into a Team[_]. This is the basic operation behind the Monad. Again, we're not interested in figuring out Monads in detail; we're just trying to use a library in our work.

With Cats we'll have:

implicit def chainer: Monad[Team] = new Monad[Team]{
  //Monad also requires pure (and, on recent Cats versions, tailRecM,
  //which is omitted here for brevity).
  override def pure[Arg](arg: Arg): Team[Arg] = Team(List(arg))
  override def flatMap[Arg, Ret](
      team: Team[Arg])(f: Arg => Team[Ret]): Team[Ret] = {
    val newMembers = team.members.flatMap(f(_).members)
    Team(newMembers)
  }
}

for our unstructured Monad, and our structured one would look like:

implicit def chainer: Monad[Team] = new Monad[Team]{
  //pure is required here as well (tailRecM again omitted).
  override def pure[Arg](arg: Arg): Team[Arg] = Team(List(arg))
  override def flatMap[Arg, Ret](
      team: Team[Arg])(f: Arg => Team[Ret]): Team[Ret] = {
    val (leaders, individuals) = team.members.map{member =>
      val mems = f(member).members
      mems.splitAt(2)
    }.unzip
    Team(
        leaders.flatMap {x=>x} ++
        individuals.flatMap{x=>x})
  }
}

Monads add the flatMap (also called bind or >>= in some circles) operation to a data type. flatMap describes how to take a Team[Arg] and a function from Arg to Team[Ret] and produce a Team[Ret]. It is a flattening operation. These are important as they describe data flow and functional composition through a system. To get the individual contributors from their Directors one could:

case class Director(name: String)
case class Manager(name: String)
case class Individual(name: String)
val directors: Team[Director] = ???
def managers(director: Director): Team[Manager] = ???
def individualContributors(manager: Manager): Team[Individual] = ???
val individuals = directors >>= (managers) >>= (individualContributors)

Then, swapping out different functionality is simply a matter of recombining your function calls around your Monadic chain.

Monoids and Monads are simple to use. They describe operations to combine and process data as simple, type safe functional chains.

5. Typeclasses over Subclasses (Introducing Typelevel Scala into an OO Environment)

(Examples can be found on GitHub)

In the previous post, we introduced the Argonaut library to convert between values and JSON strings. The important part of this conversion is that there was no superclass or interface to implement in order to get the benefit of JSON across classes. All we needed to do was define values of type CodecJson for each of the types we wanted to convert. We added the functionality to the class without changing the class itself.

Argonaut allowed us to call toJson on classes with a codec and decodeOption on Strings to produce values of classes with a codec defined. This type of polymorphism, where a function's implementation depends on its inputs, is called ad-hoc polymorphism. Furthermore, when we define a type, T, which defines functionality across classes to be used in ad-hoc polymorphic functions, we call T a Type Class. Type Class polymorphism is a specific flavor of ad-hoc polymorphism.

Type Class polymorphism is a powerful tool for expressing context-based functionality, far more powerful than subclass polymorphism. As a well-known example, take the Java interfaces Comparable and Comparator. If some data is defined in a class which implements Comparable, it can be sorted one way and needs an entire second class definition to be sorted with a different method. On the other hand, using Comparator the data is defined with a single class and each sort method gets its own Comparator. Comparator is a Type Class; it allows the developer to determine which sorting method is used in which context.
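The same point can be sketched in Scala with scala.math.Ordering, which plays the Comparator role; the Employee type here is made up for illustration:

case class Employee(name: String, tenure: Int)

//Two orderings over one data definition; no second Employee class needed.
val byName: Ordering[Employee] = Ordering.by((e: Employee) => e.name)
val byTenure: Ordering[Employee] = Ordering.by((e: Employee) => e.tenure)

val roster = List(Employee("b", 2), Employee("a", 1))
val alphabetical = roster.sorted(byName)   //the calling context chooses
val bySeniority = roster.sorted(byTenure)  //the sorting method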

Subclass Method

Take the following traits:

trait Adder[Type]{
  def add(other: Type): Type
}
trait Chainer[Arg, Type[Arg]]{
  def chain[Res](f: Arg => Type[Res]): Type[Res]
}

Adder describes how to add two values of some Type together. Chainer describes how to chain operations over a parameterized type.

We'll use the idea of a team to illustrate. For the sake of simplicity we say a team consists of people of a certain profession. So we can have a team of engineers or a team of doctors or a team of cashiers or ...

Teams can (trivially) grow by hiring, but they can also grow by combining with other teams. Teams can be added.

Teams can have members who are themselves team leads. At times, the members of a lead's team must join the team the lead belongs to. This implies an operation which develops teams out of the members of teams. Teams can be chained.

Here is our implementation of Team given this functionality:

case class Team[Type](members: List[Type])
  extends Adder[Team[Type]]
  with Chainer[Type, Team]{
  override def add(other: Team[Type]): Team[Type] = {
    Team(members ++ other.members)
  }
  override def chain[Res](
      f: Type => Team[Res]): Team[Res] = {
    val list = members.flatMap(member => f(member).members)
    Team(list)
  }
}

Simple enough, but this doesn't account for an organization of structured teams. For an organization that develops teams each having one product lead and one technical lead, simple concatenation won't maintain a soft ranking of individuals within the new team. We need a new Team definition which accounts for this.

case class TeamStructured[Type](members: List[Type])
  extends Adder[TeamStructured[Type]]
  with Chainer[Type, TeamStructured]{
  override def add(
      other: TeamStructured[Type]): TeamStructured[Type] = {
    val (lead1, indi1) = members.splitAt(2)
    val (lead2, indi2) = other.members.splitAt(2)
    TeamStructured(lead1 ++ lead2 ++ indi1 ++ indi2)
  }
  override def chain[Res](
      f: Type => TeamStructured[Res]): TeamStructured[Res] = {
    val (leaders, individuals) = members.map{member =>
      val mems = f(member).members
      mems.splitAt(2)
    }.unzip
    TeamStructured(
        leaders.flatMap {x=>x} ++
        individuals.flatMap{x=>x})
  }
}

Now we have two definitions for the same data that differ only by functionality. We have a triple coupling here:

  1. Data Definition
  2. Addition Description
  3. Chaining Description

If the data needs to change (from List to Set is a good place to start), the change needs to be made in two places. Each function which accepts a Team for the purpose of team composition and combination needs to know which style of team it needs at development time. These problems get worse with each new possibility for combining and chaining teams (maybe a round-robin or reverse algorithm would fit certain situations). Type Classes solve these issues.

Type Class Method

Our traits become:

//The underscore here implies we need a parameterized type.
trait Adder[Type[_]]{
  def add[Item](
      left: Type[Item], right: Type[Item]): Type[Item]
}
trait Chainer[Type[_]]{
  def chain[Item, Res](
      arg: Type[Item], f: Item => Type[Res]): Type[Res]
}

These have the same uses as their counterparts above. However, we have a single definition of the Team type:

case class Team[Type](members: List[Type])

The data is defined in a single place. Each piece of software which requires a Team has a consistent idea about what a Team is and means. The two versions of functionality are defined by:

object unstructured{
  implicit def adder: Adder[Team] = new Adder[Team]{
    override def add[Item](
        left: Team[Item], right: Team[Item]): Team[Item] = {
      Team(left.members ++ right.members)
    }
  }

  implicit def chainer: Chainer[Team] = new Chainer[Team]{
    override def chain[Item, Res](
        arg: Team[Item], f: Item => Team[Res]): Team[Res] = {
      val list = arg.members.flatMap(
          member => f(member).members)
      Team(list)
    }
  }
}

object structured{
  implicit def adder: Adder[Team] = new Adder[Team]{
    override def add[Item](
        left: Team[Item], right: Team[Item]): Team[Item] = {
      val (lead1, indi1) = left.members.splitAt(2)
      val (lead2, indi2) = right.members.splitAt(2)
      Team(lead1 ++ lead2 ++ indi1 ++ indi2)
    }
  }

  implicit def chainer: Chainer[Team] = new Chainer[Team]{
    override def chain[Item, Res](
        arg: Team[Item], f: Item => Team[Res]): Team[Res] = {
      val (leaders, individuals) = arg.members.map{member =>
        val mems = f(member).members
        mems.splitAt(2)
      }.unzip
      Team(
          leaders.flatMap {x=>x} ++
          individuals.flatMap{x=>x})
    }
  }
}

Now, each function which accepts a team will, if needed, also accept an adder or chainer or both (wholly decoupled). The downside here is that each call to such a function requires at least one extra argument compared to the subclass versions. Scala has a fix for this limitation.

Implicits

The implicit keyword before a definition is an important part of making Type Class polymorphism beneficial to the developer. The word implicit, according to Oxford Dictionaries, means "implied though not plainly expressed". In Scala it means we can prepend the implicit keyword to an argument list and not explicitly produce the value in code, assuming a valid value is in scope. For example:

def chainTeams[Type, Result](
  team: Team[Type])(
  func: Type => Team[Result])(
  implicit chain: Chainer[Team]): Team[Result] = {
  chain.chain(team, func)
}

This has three arguments: the team to operate on, the operation to perform, and the chainer for application. However, since the final argument is implicit, if we bring a valid implicit value into scope, there is no need to pass it in directly.

import structured._
val team: Team[Person] = Team(List(???))
val func: Person => Team[Person] = {(p: Person) => ???}
val newTeam: Team[Person] = chainTeams(team)(func)//valid

Since we don't need to explicitly state the Chainer, boilerplate stays clean. A nice effect of implicit resolution is that if you have two separate valid values in scope for the implicit argument, the compiler will complain. The suggestion if you have multiple valid implicits in scope is to decouple your code functionally. No single scope should have use of more than one implicit of the same type; this is a code smell. A corollary to this is one should not explicitly provide implicit arguments; let the compiler do its work.
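For instance, a sketch of that ambiguity (the error text is paraphrased):

import structured._
import unstructured._
//chainTeams(team)(func)
//does not compile: ambiguous implicit values; both chainer in object
//structured and chainer in object unstructured match type Chainer[Team]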

In the final post of this series, we will introduce another library, Cats.

Monocle & Argonaut (Introducing Typelevel Scala into an OO Environment)

(Examples can be found on GitHub)

In the last post, we put to rest our use of mutable objects for good. Here we learn how to make use of our new found Functional Programming powers in the real world.

Every application with a sufficiently large user base requires a persistent settings store. The more users one has, the more styles one is responsible for accommodating. In my experience, JSON has been the most useful format for small-scale persistence. JSON is widely understood, works on the web and is plaintext. We'll use Monocle and Argonaut to implement a persistent settings store.

The first question here is "Why not circe?". I found circe to be a bit too ethereal for most Java developers to wrap their heads around. Implicit scope (especially when it's as magical as circe's auto mode) is an alarming feature for people who come from a language with no developer-defined implicit semantics (C++ developers, by contrast, are quite comfortable with the notion).

Monocle

The Monocle library provides semantics for defining simple accessors and combinators on nested data. Monocle is especially well suited to handling nested Case Classes, which is what we'll focus on. Take the following data definition:

case class Color(r: Byte, g: Byte, b: Byte)
case class FishTank(liters: Int, color: Color, fish: List[Fish])

We have nested Case Classes as well as a nested collection. To cover the changes that can occur here we would need to define:

  • eighteen operations,
  • six of which are compositions from a FishTank into a Color,
  • one of which is nested within a List structure.

Monocle makes this simple:

import monocle.macros.GenLens

val (tankLiters, tankColor, tankFish) = {
  val gen = GenLens[FishTank]
  (gen(_.liters), gen(_.color), gen(_.fish))
}
val (colorR, colorG, colorB) = {
  val gen = GenLens[Color]
  (gen(_.r), gen(_.g), gen(_.b))
}
val (tankColorR, tankColorG, tankColorB) = (
  tankColor.composeLens(colorR),
  tankColor.composeLens(colorG),
  tankColor.composeLens(colorB)
)

Defining your data using Case Classes provides Monocle with the information it needs in order to generate lenses (nested views) into your data structures. Lens composition in Monocle is a single straightforward call. We can get into and out of our data with very little boilerplate.
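A brief usage sketch with the lenses defined above (Monocle 1.x call style; the tank values are made up for illustration):

val tank = FishTank(20, Color(0, 0, 0), Nil)

val liters: Int = tankLiters.get(tank)   //read through a lens
val brighter: FishTank =
  tankColorR.set(100.toByte)(tank)       //copy of tank with a nested change
val bigger: FishTank =
  tankLiters.modify(_ + 10)(tank)        //update with a function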

Now that we can define settings and alter them, we need a way to persist them and communicate them to other parts of the system. We'll use JSON as our data format and Argonaut as our transcoder.

Argonaut

Argonaut provides semantics for converting between classes and JSON strings. Like Monocle, it's easiest to use with Case Classes. Taking the same classes as above we would have:

import argonaut._, Argonaut._

//ignore the implicit keyword.
//I promise we'll get to it in the next post!
implicit def codecTank: CodecJson[FishTank] =
  casecodec3(
      FishTank.apply, FishTank.unapply
  )("liters", "color", "fish")
implicit def codecColor: CodecJson[Color] =
  casecodec3(
      Color.apply, Color.unapply
  )("r", "g", "b")
implicit def codecFish: CodecJson[Fish] =
  CodecJson(
    (f: Fish) =>
      ("name" := f.name) ->:
      ("color" := f.color) ->:
      jEmptyObject,
    (c: HCursor) => for{
      name <- (c --\ "name").as[String]
      color <- (c --\ "color").as[String]
    }yield{(name, color) match{
      case ("One Fish", "Red Fish") => OneFish
      case ("Two Fish", "Blue Fish") => TwoFish
      case _ => NotFish
    }}
  )

There are quite a few operators here. This could make things tricky for Java developers at first, but there are few of them, so it's no big deal. For Case Classes, Argonaut has next to no boilerplate; one passes in the apply and unapply functions and names everything. Also, composition in Argonaut is implicit. It gets a little trickier with non-Case Classes, but it's still not much work. At most, Argonaut requires two functions: one from the class to JSON, the other from JSON to the class.

One more thing to note is that codecs for simple standard collections are implicit. The codec for List[Fish] is implicitly defined by the codec for Fish.
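A short sketch of that free lifting, assuming the codecs above are in scope:

val school: List[Fish] = List(OneFish, TwoFish)

val json: String = school.asJson.nospaces   //the List codec comes for free
val back: Option[List[Fish]] =
  json.decodeOption[List[Fish]]             //round-trips via codecFish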

Putting it all together

A settings object would look something like:

object settings{
  private val settings: mutable.Map[String, FishTank] =
    mutable.Map()
  
  def apply(key: String): Option[FishTank] = settings.get(key)
  def update(key: String, byte: Byte): Unit = {
    settings(key) = settings.get(key) match{
      case Some(tank) =>
        tankColor.modify { _ => Color(byte, byte, byte) }(tank)
      case None =>
        FishTank(0, Color(byte, byte, byte), Nil)
    }
  }
  def update(key: String, size: Int): Unit = {
    settings(key) = settings.get(key) match{
      case Some(tank) =>
        tankLiters.modify(_ => size)(tank)
      case _ =>
        FishTank(size, Color(0,0,0), Nil)
    }
  }
  def update(key: String, fish:List[Fish]): Unit = {
    settings(key) = settings.get(key) match{
      case Some(tank) =>
        tankFish.modify(_ => fish)(tank)
      case _ =>
        FishTank(1, Color(0,0,0), fish)
    }
  }
  
  def persist(): Unit = {
    val jsonRaw = settings.toList.asJson
    val json = jsonRaw.nospaces
    putOnWire(json)
    writeToDisk(json)
  }
  def recall(): Unit = {
    val str = getFromDisk()
    val opt = str.decodeOption[List[(String, FishTank)]]
    opt.foreach{list =>
      settings ++= list.toMap
    }
  }
}

But, of course, this is not threadsafe and it uses a mutable collection to perform its work. A thread-safe implementation could look something like:

val actorSystem: ActorSystem = ???
implicit val timeout: akka.util.Timeout = ???
implicit val ec: ExecutionContext = ???
object asyncSettings{
  private sealed trait Message
  private case class Get(key: String)
      extends Message
  private case class SetGrey(key: String, hue: Byte)
      extends Message
      
  private class Perform extends Actor{
    override val receive: Receive = step(Map())
    def step(map: Map[String, FishTank]): Receive = {
        case Get(key) => sender ! map.get(key) //send an Option; apply matches on Some
        case SetGrey(key, value) =>
          val newTank: FishTank = ???
          val newMap = map + (key -> newTank)
          context.become(step(newMap))
    }
    override def preStart(): Unit = ???//recall
    override def postStop(): Unit = ???//persist
  }
  
  val actor: ActorRef = actorSystem.actorOf{
    Props(new Perform())
  }
  def apply(key: String): Future[FishTank] =
    (actor ? Get(key)).collect{
      case Some(t: FishTank) => t
    }
  def update(key: String, hue: Byte) = 
    actor ! SetGrey(key, hue)
}

Here we use akka for asynchronous operations. One could also employ scalaz Task or simple Future composition or really any other asynchronous library.

Next we'll cover Type Classes to further decouple our data from functionality.

4. Objects are not Coroutines (Introducing Typelevel Scala into an OO Environment)

In the previous posts, we went over how to introduce immutability, combinators and case classes to move toward functional programming. These three points together are the basis for the idea described in this post: Objects are not Coroutines.

If you are unfamiliar with coroutines, Wikipedia has a basic description of them.

In Java, the usual application runs a little like this:

  1. Initialize an object
  2. Perform an operation
  3. Mutate the object
  4. Perform an Operation
  5. ...

This habit breaks all of the FP ideas we have developed so far.

When introducing Typelevel Scala, it is important to note we are not simply adding a library to an already existing system (the JVM). We are trying to change how people do their day to day work. Some of them have been writing OO Imperative software products for decades, making a change of paradigm difficult. Keeping to simple language is key to our goal of shifting an organization's workflow.

The workflow shift we are suggesting here is:

  1. Define
  2. Apply Combinator
  3. ...

We will be moving from a paradigm based in mutability and state to one built on immutability and functions.

OO Imperative Style

We will be defining a school of fish and how that school of fish grows.

class BadSchool(){
  private var name: String = null
  private var depth: Depth = null
  private var location: Location = null
  private var fish: mutable.Buffer[Fish] = null
  def setName(newName: String): Unit = {
    name = newName
  }
  def getName(): String = name
  def setDepth(newDepth: Depth): Unit = {
    depth = newDepth
  }
  def getDepth(): Depth = depth
  def setLocation(newLocation: Location): Unit = {
    location = newLocation
  }
  def getLocation(): Location = location
  def setFish(newFish: mutable.Buffer[Fish]): Unit = {
    fish = newFish
  }
  def removeFish(aFish: Fish): Unit = {
    fish -= aFish
  }
  def addFish(aFish: Fish): Unit = {
    fish += aFish
  }
  def getFish(): mutable.Buffer[Fish] = fish
  override def toString(): String= {
    s"School(\n\t$name,\n\t$depth,\n\t$location,\n\t$fish)"
  }
}

In my experience, this is the type of code commonly written by people who have just made the jump from an OO language into Scala. Here we initialize the object's members to null and use mutable containers to maintain and augment state. A typical use case would look like:

def asCoroutine(): Unit = {
  val coroutine = new BadSchool()
  val (name, depth, location, fish) = someInit()
  coroutine.setName(name)
  coroutine.setDepth(depth)
  coroutine.setLocation(location)
  coroutine.setFish(fish)
  convertToJsonAndPutOnTheWire(coroutine)
  var newFish: Fish = null
  for(i <- (0 to 10)){
    newFish = nextFish(coroutine)
    coroutine.addFish(newFish)
    convertToJsonAndPutOnTheWire(coroutine)
  }
}

This application is difficult to follow and very cluttered. In order to create a new valid school of fish, one must initialize the object and call four set methods. When growing a school of fish, the developer needs to destroy the previous school of fish, forcing any interaction with the object to be synchronous.

This is like a very messy coroutine. The addFish method is like a yield and there is no way to get back to the previously returned yield state.

Typelevel Style

The above impure code can be fairly simply converted to a more FP style. First we define our school of fish.

case class School(
    name: String,
    depth: Depth,
    location: Location,
    fish: immutable.Queue[Fish])

This is short and to the point. The intent of the code is clear and there are no messy methods defined for maintaining, getting and mutating state. The same use case would be implemented functionally like:

def aBetterWay(): Unit = {
  @annotation.tailrec
  def perform(qty: Int, acc: List[School]): List[School] = {
    if(qty > 0 && acc.nonEmpty){
      val head :: tail = acc
      val currentFish = head.fish.last
      val next = nextFish(currentFish)
      val result = head.copy(fish = head.fish.enqueue(next))
      perform(qty - 1, result :: acc)
    }else acc
  }
  val school = School(
      "Bikini Bottom",
      Deep,
      South,
      immutable.Queue(OneFish))
  val result = perform(10, List(school))
  result.foreach(convertToJsonAndPutOnTheWire)
}

There are two main ideas:

  1. In lieu of initializing state we define data. 
  2. Once data is defined, it cannot be redefined.

The recursive function perform builds new schools of fish from a provided school of fish. State is created, never destroyed, and there is no initialization step. Moreover, the webservice call can be made asynchronous without worry about synchronization or heap issues.

Next, we'll introduce our first libraries: Monocle and Argonaut.


3. Case Classes and Auto-Encapsulation (Introducing Typelevel Scala into an OO Environment)

(Examples can be found on GitHub)

We have decoupled data from its usage by replacing more complicated control flow ideas (like loops, null and throw) with combinators, passing functions as parameters. This allows us to think differently about how we write programs.

Instead of thinking about software as objects acting on themselves, we can start to think of it as two parts:

  1. Data
  2. Functions which define Data flow

To illustrate, we'll redefine our color function as data.

sealed trait Fish{
  val name: String
  val color: String
}
case object OneFish extends Fish{
  override final val name: String = "One Fish"
  override final val color: String = "Red Fish"
}
case object TwoFish extends Fish{
  override final val name: String = "Two Fish"
  override final val color: String = "Blue Fish"
}
case object NotFish extends Fish{
  override final val name: String = "Ahab"
  override final val color: String = "White Whale"
}

Here, we model our fish data in three points: two good states and one bad state. Instead of modeling a two point set of data with Strings, we model it as three discrete states using case objects.

Case Classes are automatically encapsulated and immutable by default and, as the states are encoded into the types, there is no need to use null or throw.
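A small sketch of what this buys us: because Fish is sealed, the compiler can check a match for exhaustiveness, so every state is handled with no null checks or catch blocks:

def describe(fish: Fish): String = fish match{
  case OneFish | TwoFish => s"${fish.name}, ${fish.color}"
  case NotFish => "not a fish at all"   //the bad state is just another case
}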

At this point, we have built up a nice base for functional programming. We can define robust data structures which model our problem space very closely and functional constructs which allow us to transform that data. This brings us to the salient feature of this process: Objects are not Coroutines.

What was it like when you switched from Java to Scala?

I get asked this question a lot. My answer is always similar. This post represents the long form of that answer.

I started programming in 2005, during my freshman year at Rutgers University. My first course was in C. C is well suited to procedural programming and I seemed to take to this style very well.

After C was C++. Object Orientation did not click for me. I didn't understand why someone would:

  1. Build a data structure
  2. Seal all the data from access
  3. Add semantics to the data structure itself for mutation

It seemed quite unintuitive to me. Why not just create a struct and write functions to process instantiations? Why couple data to usage? C++ seemed to me like an unnecessary complication over an otherwise simple idea.

When I was instructed in Java, things were even more confusing. Like C++, Java preferred Object Orientation. Even more problematic, Java enforced OO practices. Now, in order to do anything, I needed to first build a class. Need to add two integers? Build a class. Need to print "Hello World!" to console? Build a class. Need to find the 17th prime? Build a class.

Given my attraction to procedural programming and aversion to OO, it's not surprising a lot of my Java looked like:

class Clazz{public int a; public String b;}
class Processing{
  public static Clazz doSomething(Clazz c){...}
  public static String asCsv(Clazz c){...}
}

After some time, I found the demands of my profession starting to become too large for the time I could put into it. I had to find another way of doing things. I hit the books and papers. I read about many programming languages and many paradigms. I had two immovable restrictions:

  1. I was bound to the JVM
  2. I needed to interoperate with legacy Java and C++ code

I had decided I would choose between Ceylon and Scala. For me Scala was a better choice.

Functional Programming meshed nicely with my procedural programming style. The scalaz and typelevel communities focused on functional, pure constructs and turned away from effectful OO design. Furthermore, Scala's interoperability with Java was far superior to Ceylon's at the time making it easy for me to build interfaces that could integrate seamlessly with legacy POJO and JNI code.

For me, switching from Java to Scala was easy and straightforward. I already had a propensity toward the functional paradigm and I never got into the OO craze to begin with.

2. Combinators over loops, null & throw (Introducing Typelevel Scala into an OO Environment)

(examples can be found on GitHub)

In the previous section we found ways to transform mutable thinking into the immutable. Here we will take this one step further and introduce combinators into the organization.

A combinator is a method on a data structure which takes a function as an argument and maps values of the data structure into other values of the data structure. This fits well with our immutable approach. There are four main ideas here (a quick sketch follows the list):

  1. Functions produce new state; they do not destroy old state.
  2. Methods on structures are functions.
  3. Data and usage should be separate.
  4. Bad state should be handled as early as possible.
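A minimal sketch of the idea, reusing the color helper from the Immutability as Default post:

val names = List("One Fish", "Two Fish")

val colors = names.map(color)   //map is a combinator: new state, old state kept
val pairs = names.zip(colors)   //combinators compose without loops or vars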

OO Imperative Style

When developers first get into Scala, it is pretty typical to see this kind of code:

class BadFish(
    private var m_name: String,
    private var m_color: String
){
  def this() = this(null, null)
  def getName(): String = m_name
  def getColor(): String = m_color
  def setName(name: String){
    m_name = name
  }
  def setColor(color: String){
    m_color = color
  }
  def isValid(): Boolean = try{
    check()
    true
  }catch{
    case _: IllegalArgumentException => false
  }
  def check(): Unit = {
    check(m_name, m_color)
  }
  def check(newName: String, newColor: String){
    if(!color(newName).equals(newColor))
      throw new IllegalArgumentException(
          "Fish color and name do not match"
      )
  }
}

The use case for this data is very tightly coupled to the data itself. In order to create a new value based on this value, one must make at least 5 calls: 2 get calls, a new and 2 set calls:

val current: BadFish = ...
val (name, color) = (current.getName(), current.getColor())
val next = new BadFish()
next.setName(name)
next.setColor(color)

This puts a burden on any user of this class. One must know the entire structure of the class and the precise method calls for access in order to use it.

The set methods destroy old state to provide new state and the state of the object is not validated until a check method is invoked. When any one reference to this object has a set method called on it, the object could be made invalid for all references without any indication. It is very difficult to follow the logical flow of data through an application full of these kinds of classes.

This class also initializes values to null and provides semantics for its own data being overwritten. A null value is not a value of any type; it's essentially the type system lying to the developer.

Typelevel Style

The same functionality can be provided by the following class:

class Fish(val fishName: String){
  val fishColor: String = color(fishName)
  def spawnFish(f: String => String): Fish = {
    new Fish(f(fishName))
  }
}

First things first: it's much shorter than the other snippet. The brevity affords a great deal of clarity about the class and its intended usage.

It has a single input rather than two; the second value is dependent upon the first for a valid state. Bad state is found at construction time; there is no longer any need to throw an exception in the class based on the class' own data. Everything is handled up front.

The single function doesn't destroy the state of the object, it only creates a new object based on the data in the current object. This allows us to make the data immutable as well as give the user of the class all the flexibility she needs for instantiation and processing of objects.
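A brief usage sketch:

val one = new Fish("One Fish")
val two = one.spawnFish(_ => "Two Fish")

two.fishColor   //"Blue Fish", computed once at construction
one.fishColor   //still "Red Fish"; nothing was mutated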

Next, we'll use Case Classes to encapsulate our data and define it as immutable by default.

1. Immutability as Default (Introducing Typelevel Scala into an OO Environment)

(examples can be found on GitHub)

The first point is using Immutability as Default.

What does "as default" mean? We are not barring mutability from applications wholesale; there are practical reasons for using mutability. For instance,

  • performance
  • global settings.

However, the mutable code should be bounded by its defining scope. This idea can be captured with four rules of thumb:

  1. Function inputs are immutable.
  2. Function outputs are immutable.
  3. var and collection.mutable values are local and temporary.
  4. Function return values are placed into a val.

OO Imperative Style

First, we have a helper function for our examples.

def color(str: String): String = {
  str match{
    case "One Fish" => "Red Fish"
    case "Two Fish" => "Blue Fish"
  }
}

This snippet defines an imperative style One Fish Two Fish Red Fish Blue Fish. Note that we have two valid data points:

  1. ("One Fish", "Red Fish")
  2. ("Two Fish", "Blue Fish")

Yet, we model it with a function from String to String, which is a space much larger than 2 points. Also, it throws an exception on bad input; the function has a return type of String, which is a lie since a thrown exception is a very different result from a String. I found code like this to be very common in the OO imperative world. In the coming posts we will seek to replace this with something better.
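A quick demonstration of the lie (an unhandled match throws scala.MatchError at runtime):

val ok = color("One Fish")   //"Red Fish"
val boom = color("Ahab")     //throws scala.MatchError despite the
                             //signature promising a String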

Some typical code which breaks our general rules looks a little like:

def bad(): mutable.Buffer[String] = {
  val fish = mutable.Buffer[String]()
  var one = "One Fish"
  var two = "Two Fish"

  fish.append(one)
  fish.append(two)

  one = color(one)
  two = color(two)

  fish.append(one)
  fish.append(two)
  
  fish
}

Here, mutable inputs or outputs can very easily poison threaded data in an application. This makes it difficult to follow the data through an application and reason about control flow.

Also, the vars one and two are used first to hold constant data and then to hold the results of function calls. Reassigning a var has similar negative cognitive effects as mutable arguments and return values; it needlessly complicates control flow.

Typelevel Style

The typelevel way to write the same functionality would be close to:

def better(): List[String] = {
  val one = "One Fish"
  val two = "Two Fish"

  List(
      one,
      two,
      color(one),
      color(two)
  )
}

The function is self contained. No mutable state inside the function escapes its defining scope and all function calls are nested readably within other function calls. The important part is the intent of the function is clear. Also, any threading performed around it is safe from accidental data poisoning; no messy synchronization calls are necessary.

Now that we have a foundation of immutability, we'll add combinators to our toolset.

Introducing Typelevel Scala into an OO Environment

These posts seek to describe a process by which a Scala dev can reasonably (responsibly) introduce Functional Programming (FP) in Scala into an Object Oriented (OO) Java development atmosphere. It can also serve as an entry point for an OO dev into the Scala community. They are based on a talk I gave at the Typelevel Summit in Philadelphia in 2016. Here are the slides and index cards from that talk. A video is available on YouTube.

We'll transform from OO paradigms to FP paradigms through five key ideas:

  1. Immutability as Default
  2. Combinators over loops, null & throw
  3. Case Classes for auto-encapsulation
  4. Objects are not Coroutines
  5. Type Classes over Subclasses

And provide practical examples using 3 libraries:

  1. Monocle
  2. Argonaut
  3. Cats

Each of these ideas builds upon the last and by the end we'll have a solid foundation for introducing FP and Category Theory without even saying any of the "M-words".

Before starting it is important to note that many of the people who develop software do not have Computer Science (CS) degrees. They will sometimes have a degree in another science or engineering discipline. Often, they will have no degree at all. An extension of this idea is you don't need a degree to define flatMap.

This fact is important to us because the vernacular of CS is different from the vernacular of other disciplines. On the other hand, we are assuming an audience of OO developers, so some CS terminology is OK and indeed will be present throughout. Wherever OO words fail us, we will find "plain old English" ways of describing idioms and ideas. We will be coaching usage, strategies and tactics, not terms and vocabulary.

Typelevel Summit Philadelphia

I just spoke at the Typelevel Summit in Philadelphia and it was fantastic!

I thank all the people who were involved in planning it as well as the kind people who approached me after I did my bit to share their kind words with me. It really made me feel like I was not only part of a community but also welcomed and valued as a member. I am truly touched.

This was my first talk ever. I spoke about introducing Typelevel Scala into an OO environment. Here are the slides. Here are the Index Cards.

The talk seemed to have gone over well and I got a lot of good feedback from attendees and fellow speakers that I can use to make any future talks I might do better.

In the coming weeks, I'll write a series of posts that further detail my process for bringing people into the ecosystem.

Type Class over Coupled Monads

Scala has support for built-in, language-level Monads. They go a little something like this:

case class MyMonad[Type](item: Type){
  def map[Ret](f: Type => Ret): MyMonad[Ret] = ???
  def flatMap[Ret](f: Type => MyMonad[Ret]): MyMonad[Ret] = ??? 
}

The issue here is the Monadic flatMap (bind, >>=, whatever) is coupled to the data structure; there is a single operation which defines the Monadic functionality across all the code which uses the data structure. For instance, the Buffer flatMap flattens a Buffer[Buffer] into a Buffer by concatenating the first with the second with the third... This flattening style is the only flattening style Buffer provides with Scala's built-in for-comprehension Monadic chaining. This is very limiting.

If one wanted a different kind of flattening, such as a round-robin (taking the first element from each Buffer, then the second from each...), one would need to implement their own Buffer structure and bake into that new Buffer a flatMap operation detailing the round-robin. Now, there are two separate Buffer data structures presenting two separate methods of Monadic chaining. Each function which accepts a Buffer needs to enforce which Buffer type it needs to ensure the flatMap provides the necessary functionality. Type Classes provide a solution to this issue.

Cats to the rescue

The Typelevel Cats library defines a Monad Type Class and implicits for syntax. Keeping to Buffer for our example, we would have two Monads over Buffer which look like:

import scala.collection.mutable.Buffer
import cats.Monad
import cats.implicits._

//Note: recent Cats versions also require tailRecM on Monad;
//these sketches omit it for brevity.
val mMonad = new Monad[Buffer]{
  override def pure[Type](items: Type): Buffer[Type] = pure(Seq(items))
  def pure[Type](items: Seq[Type]): Buffer[Type] = Buffer(items: _*)
  override def flatMap[Arg, Ret](c: Buffer[Arg])(f: Arg => Buffer[Ret]): Buffer[Ret] = {
    c.flatMap(f)
  }
}
val rrMonad = new Monad[Buffer]{
  override def pure[Type](items: Type): Buffer[Type] = pure(Seq(items))
  def pure[Type](items: Seq[Type]): Buffer[Type] = Buffer(items: _*)
  override def flatMap[Arg, Ret](c: Buffer[Arg])(f: Arg => Buffer[Ret]): Buffer[Ret] = {
    val mapped = c.map(f)
    val tupled = mapped.map(i => (i, i.size))
    val max = tupled.unzip._2.max
    val seq = (0 to max).flatMap{index =>
      tupled.collect{
        case (seq, size) if size > index => seq(index)
      }
    }
    seq.toBuffer
  }
}

Then to use them:

object simple{
  implicit val monad = mMonad
  def apply(values: Seq[Int]){
    val mMulti = mMonad.pure(values) >>={ int =>
      mMonad.pure((0 to int))
    }
    println(mMulti)
  }
}
object roundRobin{
  implicit val monad = rrMonad
  
  def apply(values: Seq[Int]){
    val rrMulti = rrMonad.pure(values) >>={int =>
      rrMonad.pure((0 to int))
    }
    println(rrMulti)
  }
}
val values = Seq(1,2,3,4,5)
simple(values)
roundRobin(values)

And that's it! Ad-hoc Monadic chaining any way you'd like without redefining classes or parameterized methods.

D3js defines a Functor over the DOM

D3js is a wonderful library for parsing datasets and manipulating the DOM based on what was parsed. With the right frame of mind, even complex data manipulations into the DOM become simple enough to implement in short order. The code here is written in JavaScript.

Those unfamiliar with the library can visit the D3js website for a set of very nice tutorials to get started.

Those unfamiliar with Category theory and Functors can find myriad resources on the internet which describe the concepts. In my opinion, the best treatment of Category Theory is in the book Categories for the Working Mathematician written by Saunders Mac Lane. The first few chapters are all that's needed here.

The basic idea behind D3js is you select one or many DOM elements and manipulate the set of them without cluttering your code with loops and control statements. Take:

<ul id='list'>
  <li id='idx1'>first</li>
  <li id='idx2'>second</li>
  <li id='idx3'>third</li>
  <li id='idx4'>fourth</li>
</ul>

This represents the array ['first','second','third','fourth']. This is the example we will use to show how D3js can be seen as a Functor.

A Category over the DOM

We will define a Category over the DOM by taking the DOM elements as the objects and containment as the arrows.

Given an arrow, f, the domain is the parent DOM element and the codomain is the child DOM element. So, an arrow f: a -> b shows that b is a child of a. Trivially, the identity arrow maps DOM elements to themselves, f: a -> a.

These arrows are composable. Given an arrow, f: a -> b, and an arrow, g: b -> c, an arrow, h: a -> c, can be produced by first applying f to a then applying g to b = f(a); h = g(f). This would show a grandchild relationship within the DOM.

This composition is associative. Take arrows, f: a -> b, g: b -> c, h: c -> d. Let z = h(g(f)), y = g(f), w = h(g). z = h(y) and z = w(f). The grandchild's child is the same as the child's grandchild. This is the great-grandchild relationship within the DOM.

This composition obeys the left and right unit laws. Take f: a -> b, f(id(a)) = f(a) = b. id(f(a)) = id(b) = b.

d3.select is a Functor over the DOM

The first part to discovering a Functor is to define a unit function which maps an element of the DOM, a, to the Functor, F(a). In D3js this function is d3.select.

var element = document.getElementById('list');
var selection = d3.select(element);

d3.select returns a d3 selector object which wraps the DOM element given as its argument. It "lifts" the DOM element into the d3 selector.

Next we need a function which maps arrows in the DOM, f: a -> b, to arrows in the Functor, F(a) -> F(b). We took our arrows to refer to containment; getElementsByTagName will serve as our f.

var element = document.getElementById('list');
var selection = d3.select(element);

var childElement = element.getElementsByTagName('li')[0];
var childText = childElement.textContent;
var child = selection.select(function(){
  var current = this.getElementsByTagName('li')[0];
  return current;
});

var same = childText === child.text();
if(!same){
  alert('It didn\'t work!');
}
alert(same);

Higher Kinded Types

A higher kinded type is simply an abstraction over types. The examples on this page use the Scala language.

A proper type is something such as String. You can only instantiate values of proper types directly. These are represented as kind *.

val str: String = "seven"

In the above snippet, a String can simply be instantiated. No bells, whistles or hoops. (Some proper types cannot be instantiated, these are usually abstract types.)

On the other hand, a higher kinded type is a type which needs another type provided to it in order to create a proper type. It is a type constructor. Values cannot be instantiated for types of these kinds. These have myriad kind representations, the simplest of which is kind * -> *, which says: given a type of kind *, you get a type of kind *. List is a common example.

var list: List = List()//compile error
var list: List[String] = List("sev", "en")

In the above example a List cannot simply be instantiated. It must be provided with a proper type before the compiler can reason about it.

An extension of this is that a type like Map, which takes two type parameters, is kinded * -> * -> *. This says: given two types of kind *, a type of kind * is produced (currying for types).

Furthermore, a type like Monad is of kind (* -> *) -> *, which implies that given a type of kind * -> *, a type of kind * is produced.
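A small sketch of these kinds in Scala; the StringKeyed alias and Wrapper trait are made up for illustration (Wrapper itself has kind (* -> *) -> *, while Wrapper[List] is a proper type):

//Partially applying Map fixes one parameter: StringKeyed has kind * -> *.
type StringKeyed[Value] = Map[String, Value]

trait Wrapper[F[_]]{
  def unit[A](a: A): F[A]
}
val listWrapper: Wrapper[List] = new Wrapper[List]{
  def unit[A](a: A): List[A] = List(a)
}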

Category Theory: Cheat Sheet

I have found Category Theory has increased my productivity as a programmer more than any other mathematic tool at my disposal. This is my cheat sheet. 

Metagraph

  • Objects: a, b, c,...  
  • Arrows: f, g, h,...  
  • Domain: for each arrow, f, there is an object a = dom f 
  • Codomain: for each arrow, f, there is an object b = cod f 

A Metacategory is a Metagraph plus

  • Identity: for each object, a, there is an arrow f: a -> a
  • Composition: arrows, <f, g>, where dom g = cod f, admit an arrow h: dom f -> cod g
  • Composition is associative
  • Composition obeys left and right unit laws 

A Category is an interpretation of a Metacategory within Set Theory. 

 

A Functor, T: C -> B, is a morphism of categories.

  • Object function: mapping objects c in C to objects Tc in B
  • Arrow function: mapping arrows f in C to arrows Tf in B where the unit and composition are preserved
  • T is Covariant if for each arrow f: a -> b in C, Tf: Ta -> Tb in B
  • T is Contravariant if for each arrow f: a -> b in C, Tf: Tb -> Ta (arrows are reversed).
  • T is an Endofunctor if B = C

A Natural Transformation, t: S -*> T, is a morphism of Functors S, T: C -> B.

  • Each object c in C yields an arrow, tc: Sc -> Tc, in B
  • Each arrow f: a -> b in C satisfies tb(Sf) = Tf(ta); the naturality square commutes

A Monad in a Category, C, is an endofunctor, T, with two Natural Transformations: e: Id(C) -*> T, m: T(T) -*> T.

  • m(Tm) = m(mT)
  • m(Te) = m(eT) = Id(T)

 

A Monic is an arrow m: a -> b in a Category, C, that is left cancellable. 

  • Given parallel arrows f, g: d -> a, m(f) = m(g) implies f = g

An Epi is an arrow h: a -> b in a Category, C, that is right cancellable. 

  • Given parallel arrows f, g: b -> c, f(h)  = g(h)  implies f = g. 

 

An Idempotent is an arrow, f: b -> b, where f(f) = f. 

An Initial object, a, of a Category, C, yields for each object, c, in C exactly one arrow f: a -> c. 

A Terminal object, a, of a Category, C, yields for each object, c,  in C exactly one arrow f: c -> a.  

 

A Monoid is a Category with a single object.  

A Groupoid is a Category, C, where for each arrow, f, in C there is an inverse of f. 

 

A Dual Category, D, of a Category, C, is C with all arrows reversed

  • An arrow, f: a -> b, in C yields an arrow, g: b -> a, in D.

 
