Tuesday, April 8, 2014

Kotlin: Pattern matching

The past year I have evolved a simple pattern matching scheme that I now automatically incorporate into new Kotlin class hierarchies. The scheme is trivial to include, costs nothing to program and offers some beneficial features. Its development is thanks to the many nice features offered by Kotlin.

Consider the development of an immutable list type. The architecture comprises the trait ListIF, the abstract class ListAB, and the two concrete classes Nil and Cons. The ListIF trait operates as an interface, introducing the behaviours supported. The abstract class ListAB is the home for the implementation of most of the common behaviours. The concrete classes Nil and Cons are used to create list instances.

The implementation for this architecture is shown in Example 1. Note the match member function introduced in the trait. This is the pattern matching scheme for this class hierarchy. The function is given two parameters, one for each concrete class. The first parameter is a function to convert a Nil into the result type B. The second parameter is a function from a Cons to the result type B.

The concrete subclasses Nil and Cons override the definition for match. In the class Nil the first parameter is invoked against the Nil instance. In the class Cons the second parameter is invoked against the Cons instance. This pattern matching scheme is effectively the strategy design pattern.

Example 1: Pattern matching

package example1

import kotlin.test.*

trait ListIF<A> {
    fun size(): Int
    fun length(): Int
    fun interleave(xs: ListIF<A>): ListIF<A>

    fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B
}

abstract class ListAB<A> : ListIF<A> {
    override fun size(): Int {...}

    override fun length(): Int {...}
    
    override fun interleave(xs: ListIF<A>): ListIF<A> {...}
}

class Nil<A> : ListAB<A>() {
    override fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B = nil(this)
}

class Cons<A>(val hd: A, val tl: ListIF<A>) : ListAB<A>() {
    override fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B = cons(this)
}

fun main(args: Array<String>) {
    val xs: ListIF<Int> = Cons(1, Cons(3, Cons(5, Nil<Int>())))
    val ys: ListIF<Int> = Cons(2, Cons(4, Nil<Int>()))

    assertEquals(3, xs.size())
    assertEquals(2, ys.length())
    assertEquals(4, xs.interleave(ys).length())
}

The definition for member function size is:

override fun size(): Int {
    return this.match(
        {(nil: Nil<A>) -> 0},
        {(cons: Cons<A>) -> 1 + cons.tl.size()}
    )
}

A match is performed against the recipient list and if it is empty then zero is returned. If the list is non empty then we return 1 plus the size of the remaining list. Of course we could achieve the same using a when clause and a number of is selector clauses. However, it is incumbent on the programmer to provide all the necessary choices. With the match function if you omit a choice you get a compiler error.

The second function literal given to match reveals another feature. The formal parameter cons is of type Cons and is effectively a smart cast of this. Hence in the body of this function literal we can reference the tl property of cons.

The definition for member function interleave shows how we handle nested pattern matching:

override fun interleave(xs: ListIF<A>): ListIF<A> {
    return this.match(
        {(nil1: Nil<A>) -> Nil<A>()},
        {(cons1: Cons<A>) ->
            xs.match(
                {(nil2: Nil<A>) -> Nil<A>()},
                {(cons2: Cons<A>) -> Cons(cons1.hd, Cons(cons2.hd, cons1.tl.interleave(cons2.tl)))}
            )
        }
    )
}

Interleaving the values from two lists requires a number of considerations. They are fully captured by the nested pattern matching. If the first list (this) is empty then an empty list is delivered. If the first list is not empty then we look to the second list (xs). If this is empty then an empty list is returned. Otherwise, for two non empty lists we prefix the head of the first and the head of the second on to interleaving their tails. As well as a type safe implementation of our logic it also allows us to reason about its correctness before we test it.

The recursive definition of the list data type naturally leads to a recursive definition of the classes and its member functions. Here is member function length:

override fun length(): Int {
    fun recLength(xs: ListIF<A>, acc: Int): Int {
        return xs.match(
            {(nil: Nil<A>) -> acc},
            {(cons: Cons<A>) -> recLength(cons.tl, 1 + acc)}
        )
    }

    return recLength(this, 0)
}

The nested function recLength has an accumulating parameter so that the recursive call might exploit tail call optimization. If I understand the Kotlin annotation tailRecursive this does not apply here since the recursive call is nested in the function literal. Perhaps the clever people at IntelliJ could find a way so that I can have my cake and eat it.

Of course not all my class hierarchies are defined recursively and so this issue does not arise. The pattern matching scheme is inexpensive to implement, is type safe, supports smart casts and a clarity in function bodies that we can reason about their correctness.

Friday, February 14, 2014

Kotlin: An Option type #2

The first blog introduced the OptionIF<T> trait and some of the supporting member functions. The package level functions inject and bind are used to make the OptionIF<T> trait into a monad. A monad is a way to structure computations in terms of values and sequences of computations using those values. Monads allow the programmer to build up computations using sequential building blocks, which can themselves be sequences of computations. The monad determines how combined computations form a new computation and frees the programmer from having to code the combination manually each time it is required. It is useful to think of a monad as a strategy for combining computations into more complex computations. The inject and bind functions have the signatures:

fun <T> inject(t: T): OptionIF<T>
fun <T, V> bind(op: OptionIF<T>, f: (T) -> OptionIF<V>): OptionIF<V>

The inject function is simply a factory method to create an instance of OptionIF<T>.

The basic intuition behind the bind operation is that it allows us to combine two computations into one more complex computation and allows them to interact with one another. The first parameter type OptionIF<T> represents the first computation. Significantly, the second parameter type is T -> OptionIF<V> which can, given the result of the first computation, produce a second computation to run. In other words bind(op){t -> ... } is a computation which runs op and binds the value wrapped in op to the function literal parameter t. The function body then decides what computation to run using the value for t.

Example 5 demonstrated the map member function of OptionIF<T>. Example 6 shows how we might achieve the same using inject and bind. The mapped function applies the function parameter f to the content of the option type parameter op. Significantly the definition of mapped does not have to distinguish between the concrete OptionIF<A> types of op.

Example 6: The monadic option type

package example6

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*

fun <A, B> mapped(f: (A) -> B, op: OptionIF<A>): OptionIF<B> = bind(op){(a: A) -> inject(f(a))}

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    assertEquals(Some(11), mapped({(str: String) -> str.length()}, name))
    assertEquals(None<Int>(), mapped({str -> str.length()}, absentName))
}

Example 7 demonstrates how we might use the monadic option type. Running the program with the inputs 18, 27 and 9 delivers the result Some(Pair(2, 3)), demonstrating that 18 and 27 are exactly divisible by 9. The inputs 18, 27 and 6 produce the result None, showing that 18 and 27 and not both exactly divisible by 6.

Example 7: Staircasing

package example7

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*


fun divide(num: Int, den: Int): OptionIF<Int> {
    return if (num % den != 0) None<Int>() else Some(num / den)
}


fun division(a: Int, b: Int, c: Int): OptionIF<Pair<Int, Int>> {
    val ac: OptionIF<Int> = divide(a, c)
    return when (ac) {
        is None<Int> -> None<Pair<Int, Int>>()
        is Some<Int> -> {
            val bc: OptionIF<Int> = divide(b, c)
            when (bc) {
                is None<Int> -> None<Pair<Int, Int>>()
                is Some<Int> -> Some(Pair(ac.get(), bc.get()))
                else -> None<Pair<Int, Int>>()
            }
        }
        else -> None<Pair<Int, Int>>()
    }
}


fun bindDivision(a: Int, b: Int, c: Int): OptionIF<Pair<Int, Int>> {
    return bind(divide(a, c)){(ac: Int) -> bind(divide(b, c)){(bc: Int) -> inject(Pair(ac, bc))}}
}


fun main(args: Array<String>) {
    assertEquals(Some(Pair(2, 3)), division(18, 27, 9))
    assertEquals(None<Pair<Int, Int>>(), division(18, 27, 6))
    assertEquals(Some(Pair(2, 3)), bindDivision(18, 27, 9))
    assertEquals(None<Pair<Int, Int>>(), bindDivision(18, 27, 6))
}

Function division threatens to march off to the right side of the listing if it became any more complicated. It is very typical imperative code that is crying out for some refactoring.

We can bring the staircasing effect under control with bind. Consider the implementation of bindDivision. The outer call to bind(divide(a, c)){ ac -> ...} has an OptionIF value produced from the expression divide(a, c). Should this be a Some value, then the function literal is called and its formal parameter ac is bound to the result from the Some value. In the inner call bind(divide(b, c)){ bc -> ...} the inner function literal is called with the formal parameter bc bound to the Some value produced from divide(b, c). If the two calls to divide both deliver Some values then a final Some result is produced carrying the pair we seek. If a None value is delivered by either call to divide then a None value is the final result.

This simplification is also present in the two following examples. Often we make lookups of a map data structure, a database or a web service and receive an optional response. In some scenarios we take the result of a first lookup and use it to perform a second lookup. Example 8a is an application based on a set of products differentiated by a unique code. The products are organized by country, city, supplier and code. A product stock is realized by overlapping maps.

The immutable MapIF<K, V> trait and its lookUpKey member function provide the implementation. The function getProductFrom aims to deliver a Product given the country, city, supplier and code, and the product stock:

fun getProductFrom(country: String, ...,
productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product>

The product, if it exists, is obtained by making repeated lookUpKey function calls down through the nested maps. If anyone fails than we bail out with None<Product>.

Example 8a: Product searching

package example8a

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.map.*

data class Product(val code: String, val name: String)

fun getProductFrom(country: String, city: String, supplier: String, code: String, productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product> {
    val productsByCity: OptionIF<MapIF<String, MapIF<String, MapIF<String, Product>>>> = productsByCountry.lookUpKey(country)
    if (productsByCity.isDefined()) {
        val productsBySupplier: OptionIF<MapIF<String, MapIF<String, Product>>> = productsByCity.get().lookUpKey(city)
        if (productsBySupplier.isDefined()) {
            val productsByCode: OptionIF<MapIF<String, Product>> = productsBySupplier.get().lookUpKey(supplier)
            if (productsByCode.isDefined()) {
                return productsByCode.get().lookUpKey(code)
            } else
                return None<Product>()
        } else
            return None<Product>()
    } else
        return None<Product>()
}

// By country, by city, by supplier and by code
val productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>> =
        fromSequence(
                "USA" to fromSequence(
                        "San Francisco" to fromSequence(
                                "Graham" to fromSequence(
                                        "SH" to Product("SH", "Shiraz"),
                                        "ZI" to Product("ZI", "Zinfandel")
                                )
                        )
                ),
                "France" to fromSequence(
                        "Bordeaux" to fromSequence(
                                "Jacques" to fromSequence(
                                        "SE" to Product("SE", "Saint Emilion"),
                                        "ME" to Product("ME", "Medoc")
                                )
                        ),
                        "Beaune" to fromSequence(
                                "Marcel" to fromSequence(
                                        "AC" to Product("AC", "Aloxe-Corton"),
                                        "ME" to Product("ME", "Mersault"),
                                        "PO" to Product("PO", "Pommard"),
                                        "SA" to Product("SA", "Savigny"),
                                        "VO" to Product("VO", "Volnay")
                                )
                        )
                )
        )

fun main(args: Array<String>) {
    assertEquals(Some(Product("PO", "Pommard")), getProductFrom("France", "Beaune", "Marcel", "PO", productsByCountry))
    assertEquals(Some(Product("SH", "Shiraz")), getProductFrom("USA", "San Francisco", "Graham", "SH", productsByCountry))
    assertEquals(None<Product>(), getProductFrom("France", "Beaune", "Marcel", "ZZ", productsByCountry))

}

The implementation of function getProductFrom just does not look great. It looks ugly and not code you want to be associated with in your place of work.

However, since our OptionIF<T> is monadic, we can sequence the lookUpKey calls much more gracefully. The implementation of the function getProductFrom in Example 8b is now much simpler and is reminiscent of the bindDivision function from Example 7.

Example 8b: Product searching

package example8b

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.map.*


data class Product(val code: String, val name: String)


fun getProductFrom(country: String, city: String, supplier: String, code: String, productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product> {
    return bind(productsByCountry.lookUpKey(country)) {productsByCity ->
        bind(productsByCity.lookUpKey(city)){productsBySupplier ->
            bind(productsBySupplier.lookUpKey(supplier)){productsByCode ->
                productsByCode.lookUpKey(code)
            }
        }
    }
}


// By country, by city, by supplier and by code
val productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>> = ...


fun main(args: Array<String>) {
    assertEquals(Some(Product("PO", "Pommard")), getProductFrom("France", "Beaune", "Marcel", "PO", productsByCountry))
    assertEquals(Some(Product("SH", "Shiraz")), getProductFrom("USA", "San Francisco", "Graham", "SH", productsByCountry))
    assertEquals(None<Product>(), getProductFrom("France", "Beaune", "Marcel", "ZZ", productsByCountry))
}

Alongside the OptionIF<T> trait and its concrete classes None<T> and Some<T> there are a number of package level utility functions that make working with options very easy and natural. The sequenceM function combines a list of monadic options into one option monad that has a list of the results of those options inside it:

fun <T> sequenceM(bs: ListIF<OptionIF<T>>): OptionIF<ListIF<T>>

Example 9 demonstrates how this might be used. Consider we have a MapIF<String, Int> for the frequency distribution of words in a document, and a ListIF<String> of words whose total occurrences we wish to obtain.

Example 9: Utility function sequenceM

package example9

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.list.*
import com.adt.kotlin.data.immutable.map.*

val frequencies: MapIF<String, Int> = fromSequence(
        "software" to 5,
        "engineering" to 3,
        "kotlin" to 9,
        "object" to 8,
        "oriented" to 3,
        "functional" to 3,
        "programming" to 20
)
val words: ListIF<String> = fromSequence("object", "functional")

fun getWordFrequencies(words: ListIF<String>, frequencies: MapIF<String, Int>): ListIF<OptionIF<Int>> =
        words.map{(word: String) -> frequencies.lookUpKey(word)}

fun distributions(frequency: ListIF<OptionIF<Int>>): Int {
    fun ListIF<Int>.sum(): Int = this.foldLeft(0){(res: Int) -> {(elem: Int) -> res + elem}}
    val freq: OptionIF<ListIF<Int>> = sequenceM(frequency)
    return freq.getOrElse(emptyList<Int>()).sum()
}

fun main(args: Array<String>) {
    val wordFrequencies: ListIF<OptionIF<Int>> = getWordFrequencies(words, frequencies)
    val dist: Int = distributions(wordFrequencies)
    assertEquals(11, dist)
}

Given the words and the frequencies the function getWordFrequencies maps each word into a ListIF of OptionIF<Int> through application of the lookUpKey member function. The function distributions then finds the total occurences in the ListIF<Int> (wrapped in an OptionIF) obtained from sequenceM. Note that if any of the OptionIF<Int> in the ListIF passed to sequenceM is a None<Int> then a None<ListIF<Int>> is returned.

The OptionIF<T> provides a type for representing optional values. It also acted as a vehicle for demonstrating powerful abstractions that can hide mundane implementation details. Using higher order abstractions such as higher order functions, monads, etc. we can simplify the implementation of function bodies. 

These observations have inspired me to reconsider how I implement function logic. I aim to reduce the complexity of function bodies by reducing or removing my use of control structures such as if statements nested in while statements, representative of much of my existing imperative code. By presenting function bodies as simple sequential actions they are much easier to write and much easier to reason about their correctness. Functions that are much more trustworthy. Functions such as bindDivision and distributions are simple illustrations of what I seek to achieve.

Kotlin: An Option type #1

This pair of blogs introduces the Kotlin Option type that allows the programmer to specify where something is present or absent. It can be used in place of null to denote absence of a result. The Option type offers more than a new data type: more powerful abstractions, such as higher order functions, that hide many of the details of mundane operations such as mapping a function across a data type. We get to focus on the what, not the how making our code more declarative: incentivizing us to make our code blocks simpler.

Probably all Java developers have experienced a NullPointerException. Usually this occurs because some function returns it when not expected and when not dealing with that possibility in your client code. The value null is also used to represent the absence of a value such as when performing the member function get on a Map.

Kotlin, of course, allows you to work safely with null. The null-safe operator for accessing properties is foo?.bar?.baz will not throw an exception if either foo or its bar property is null. Instead, it returns null as its result.

An alternative approach that seeks to reduce the need for null is to provide a type for representing optional values, i.e. values that may be present or not: the OptionIF<T> trait. By stating that a value may or mat not be present on the type level, you and other developers are required by the compiler to deal with this possibility. Further, by providing various combinators we can chain together function calls on option values.

Kotlin OptionIF<T> is a container for zero or one element of a given type. An OptionIF<T> can be either a Some<T> wrapping a value of type T, or can be a None<T> which represents a missing value. You create an OptionIF<T> by instantiating the Some or None classes:

val name: OptionIF<String> = Some("Ken Barclay")
val absentName: OptionIF<String> = None<String>()

Given an instance of OptionIF<T> and the need to do something with it, how is this done? One way would be to check if a value is present by means of the isDefined member function, and if that is the case obtain the value with the get member function. This is shown in Example 1.

Example 1Accessing options

package example1

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val op: OptionIF<Int> = Some(5)
    if (op.isDefined())
        println("op: ${op.get()}")  // => op: 5
}

Deliberately, I have chosen to include types explicitly as an aid to the reader. Of course, Kotlin's type inferencing would allow me to omit some and, in turn, make the code more compact.

The most common way to take optional values apart is through a pattern match. Example 2 introduces the show function which pattern matches on the OptionIF parameter and for a Some instance the member function get is used to retrieve the enclosed value.

Example 2: Pattern matching options

package example2

import com.adt.kotlin.data.immutable.option.*

fun show(op: OptionIF<String>) {
    when (op) {
        is Some<*> -> println(op.get())
        else -> println("Missing")
    }
}

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    show(name)              // => Ken Barclay
    show(absentName)    // => Missing
}

If you think this is clunky and expect something more elegant from Kotlin OptionIF<T> you are correct. If you use the member function get, you may forget about checking with isDefined leading to a runtime exception, and this scenario is no different from using null. You should avoid using options this way. One simple improvement we can make to these use cases is provided by the getOrElse member function as shown in Example 3.

Example 3: Provide a default

package example3

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    println(name.getOrElse("Missing"))              // => Ken Barclay
    println(absentName.getOrElse("Missing"))    // => Missing

}

I suggested that we consider an option as a collection of zero or one elements. Consequently it is provided with many of the behaviors we associate with containers such as filtering, mapping and folding. Example 4 illustrates transforming an OptionIF<String> into an OptionIF<Int>. When you map an OptionIF<String> that is a None<String> then you get a None<Int>.

Example 4: Mapping

package example4

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    println(name.map{(str: String) -> str.length()}.getOrElse(0))         // => 11
    println(absentName.map{str -> str.length()}.getOrElse(0))           // => 0
}

The member function map is a higher order function, accepting a transformer function as parameter. Using customizable higher order functions allows us to think about solutions differently. Function map allows us to apply a generic operation to the data type and means we can focus on the result.

A somewhat more elaborate illustration is given in Example 5 which implements a repository of users. We need to be able to find a user by their unique id. A request made with a non-existent id suggests a return type of OptionIF<User>.

Example 5: User repository

package example5

import com.adt.kotlin.data.immutable.option.*

class User(val id: Int, val firstName: String, val lastName: String, val age: Int)

class UserRepository {

    fun findById(id: Int): OptionIF<User> {
        return if (users.containsKey(id)) {
            val user: User = users.get(id)!!
            Some(user)
        } else
            None<User>()
    }

// ---------- properties ----------------------------------

    val users: Map<Int, User> = hashMapOf(
            1 to User(1, "Ken", "Barclay", 25),
            2 to User(2, "John", "Savage", 31)
    )
}

fun main(args: Array<String>) {
    val repository: UserRepository = UserRepository()
    val user: OptionIF<User> = repository.findById(1)
    val userName: OptionIF<String> = user.map{u -> "${u.firstName} ${u.lastName}"}
    if (userName.isDefined())
        println("User: ${userName.get()}")      // => User: Ken Barclay
}

Our dummy implementation uses a HashMap. Perhaps we might develop a Map<K, V> implementation with a lookUpKey member function that returns an OptionIF<V>.

In the follow-on blog I will consider some more advanced use-cases with OptionIF<T>. I aim to show how more powerful abstractions over the OptionIF<T> type can reduce the complexity of functions, encouraging us to cede control to these abstractions and remove the need to code each atomic step in an algorithm.