Thursday, March 31, 2022

Kotlin: Functional Domain Modeling #3

 Kotlin: Functional Domain Modeling #3


In this blog we introduce the applicative functor pattern. The central idea behind the pattern is the sequencing of effects. Validation is one use case for which applicative functors are useful. Given a series of validation effects, regardless of the result (failure or success) they guarantee all will be executed independently.



Applicative functor

Consider filling out a web form to sign up for an account. You enter your user name and password and the system responds by saying that a user name may not contain dashes. When you make the necessary changes and resubmit you might now be informed that the password must have at least one capital letter.

It would be nice to have all the errors reportedly simultaneously. We have seen that the monadic Either fails fast, reporting the first validation error it detects. Another approach is to use the Validation class that is comparable to Either:

sealed class Validation<E, A> {
    data class Failure<E, A>(val value: E) : Validation<E, A>()
    data class Success<E, A>(val value: A) : Validation<E, A>()
}

in which the type parameter E represents the error type and the type parameter A represents the obtained value.

The following shows validating the LastName with the Validation class replacing our earlier version with Either. Function create returns a Validation object with a String for its error type and LastName for its payload. The asserts show a successful creation and a failure creation.

data class LastName(val lastName: String) {

    companion object {

        fun create(lastName: String): Validation<String, LastName> {
            return if(lastName.length < 2)
                failure("Last name too short: $lastName")
            else if (!regex.matches(lastName))
                failure("Last name malformed: $lastName")
            else
                success(LastName(lastName))
        }   // create

        private val regex: Regex = Regex("[A-Z][A-Z]*")

    }

}   // LastName


assertEquals(success(LastName("BARCLAY")), LastName.create("BARCLAY"))
assertEquals(failure("Last name malformed: Barclay"), LastName.create("Barclay"))

The Validation class is accompanied with the typealias:

typealias ValidationNel<E, A> = Validation<NonEmptyList<E>, A>

where the error type for ValidationNel is a NonEmptyList of errors of type E. The NonEmptyList reflects the fact that if there are validation errors then one or more will be reported. ValidationNel is supported with the factory functions failureNel and successNel, substitutes for the functions failure and success.

The next example repeats the above, this time using ValidationNel. Note how the failing assert packs the single error into a NonEmptyList.

data class LastName(val lastName: String) {

    companion object {

        fun create(lastName: String): ValidationNel<String, LastName> {
            return if(lastName.length < 2)
                failureNel("Last name too short: $lastName")
            else if (!regex.matches(lastName))
                failureNel("Last name malformed: $lastName")
            else
                successNel(LastName(lastName))
        }   // create

        private val regex: Regex = Regex("[A-Z][A-Z]*")

    }

}   // LastName


assertEquals(successNel(LastName("BARCLAY")), LastName.create("BARCLAY"))
assertEquals(failureNel("Last name malformed: Barclay"), LastName.create("Barclay"))
typealias ValidationNelError<A> = ValidationNel<ValidationError, A>

Rebuilding our LastName, FirstName, MiddleName and Name using ValidationNel we have:

data class Name(val lastName: LastName, val firstName: FirstName, val middleName: Option<MiddleName> = none()) {

    companion object {

        fun create(lastName: String, firstName: String, middleName: String = ""): ValidationNel<String, Name> {
            val vLastName: ValidationNel<String, LastName> = LastName.create(lastName)
            val vFirstName: ValidationNel<String, FirstName> = FirstName.create(firstName)

            return if (middleName == "")
                fmap2(vLastName, vFirstName){ln, fn -> Name(ln, fn) }
            else {
                val vMiddleName: ValidationNel<String, MiddleName> = MiddleName.create(middleName)
                fmap3(vLastName, vFirstName, vMiddleName){ln, fn, mn -> Name(ln, fn, some(mn)) }
            }
        }   // create

        fun create(name: String): ValidationNel<String, Name> {
            val names: List<String> = name.split(" ")
            val size: Int = names.size

            return if (size == 0)
                failureNel("Name.create: empty name not supported")
            else if (size == 1)
                failureNel("Name.create: name must exceed 1 part")
            else if (size > 3)
                failureNel("Name.create: name must not exceed 3 parts")
            else if (size == 2) {
                val vLastName: ValidationNel<String, LastName> = LastName.create(names[0])
                val vFirstName: ValidationNel<String, FirstName> = FirstName.create(names[1])
                fmap2(vLastName, vFirstName){ln, fn -> Name(ln, fn)}
            } else {
                val vLastName: ValidationNel<String, LastName> = LastName.create(names[0])
                val vFirstName: ValidationNel<String, FirstName> = FirstName.create(names[1])
                val vMiddleName: ValidationNel<String, MiddleName> = MiddleName.create(names[2])
                fmap3(vLastName, vFirstName, vMiddleName){ln, fn, mn -> Name(ln, fn, some(mn)) }
            }
        }   // create

    }

}   // Name





If all three ValidationNel<String, ...> instances indicate a successful validation we extract the validated arguments and pass them to a pure function f that constructs the validated result object. In case of errors they are reported to the caller of Name.create. The functions that we use are fmap2 and fmap3 with the signature for the former:

fun <E, A, B, C> fmap2(
    vea: ValidationNel<E, A>, veb: ValidationNel<E, B>, f: (A, B) -> C
): ValidationNel<E, C>

This is an example of the applicative functor pattern of functional programming. The pattern is used when you deal with effects in functional programming. These applicative effects refer to the way the effects are applied. Our application of fmap3 demonstrates that the effects are sequenced through all the validation functions regardless of the failure or success they produce. This is shown in the following asserts:

assertEquals(
    successNel(Name(LastName("BARCLAY"), FirstName("Ken"), some(MiddleName("Andrew")))),
    Name.create("BARCLAY", "Ken", "Andrew")
)
assertEquals(
    failureNel(NonEmptyListF.of("Last name malformed: Barclay")),
    Name.create("Barclay", "Ken", "Andrew")
)
assertEquals(
    failureNel(NonEmptyListF.of("First name malformed: ken")),
    Name.create("BARCLAY", "ken", "Andrew")
)
assertEquals(
    failureNel(NonEmptyListF.of("Middle name malformed: andrew")),
    Name.create("BARCLAY", "Ken", "andrew")
)
assertEquals(
    failureNel(NonEmptyListF.of("First name malformed: ken", "Middle name malformed: andrew")),
    Name.create("BARCLAY", "ken", "andrew")
)

The final assert demonstrates how the applicative pattern fails slow, identifying two separate validation failures.



The applicative functor

The functions fmap2 and fmap3 are generalizations of fmap, to functions with more than one parameter. With F the actual data type place-marker, the overloaded function fmap2 has the signatures:

fun <A, B, C> fmap2(fa: F<A>, fb: F<B>, f: (A) -> (B) -> C): F<C>
fun <A, B, C> fmap2(fa: F<A>, fb: F<B>, f: (A, B) -> C): F<C> = fmap2(fa, fb, C2(f))

Note how the second version with the function f presented in its uncurried form can be defined in terms of the first version using the Dogs function C2 to curry its function parameter. The Dogs library includes the currying functions C2, C3, C4 and C5.

Whilst the Dogs library includes functions fmap2 and fmap3 across all types that are applicative functors, the question is whether we should also include fmap4, fmap5, etc, making the code very tedious.

The solution gives rise to the notion of an applicative functor. With F the place-marker for the actual type for which it applies, the applicative functor is described with the ap extension function (representing sequential application) and the function pure:

fun <A, B> F<A>.ap(f: F<(A) -> B>): F<B>
fun <A> pure(a: A): F<A>

With these two basic functions we can construct any mapping function fmapN. The function pure converts a value of type A into the context F<A>. If the context is, say List, then function pure creates a singleton List with that single value. Function ap is a generalized form of function application. We have a function (A) -> B, a value of type A, and a result of type B, all wrapped in a context F.

A typical use of these functions is shown in the following asserts:

assertEquals(some(3), some(2).ap(pure{m: Int -> m + 1}))
assertEquals(some(3), pure{m: Int -> m + 1} appliedOver some(2))
assertEquals(some(5), pure{m: Int -> {n: Int -> m + n}} appliedOver some(2) appliedOver some(3))

The function appliedOver is an infix version of function ap with the function given first. The final assert is often known as the applicative style because of its similarity to normal function application.

This final example informs us how we can now define fmap2:

fun <A, B, C> fmap2(fa: F<A>, fb: F<B>, f: (A) -> (B) -> C): F<C> =
    pure(f) appliedOver fa appliedOver fb

We see that whilst we could continue with this to define fmap3, fmap4, etc, we can simply employ the applicative style to get what we require.

Here it is defined for the Option type:

fun <A, B, C> fmap2(oa: Option<A>, ob: Option<B>, f: (A) -> (B) -> C): Option<C> =
    pure(f) appliedOver oa appliedOver ob

assertEquals(some(5), fmap2(some(2), some(3)){m -> {n -> m + n}})



The code for the Dogs library can be found at:

https://github.com/KenBarclay/TBA
https://github.com/KenBarclay/TBA


Tuesday, March 29, 2022

Kotlin: Functional Domain Modeling #2

 Kotlin: Functional Domain Modeling #2


In this blog we investigate creation strategies for our domain elements. Factories offer a unified strategy for object creation. But other considerations for object creation include: how do we ensure that the factory returns a valid object; where to put the validation logic; and what happens if the validation fails.



The Integrity of Simple Values

We can create an object in a multitude of ways - the simplest being a direct invocation of the class constructor. But, the simplest technique always has some pitfalls. We need to apply some good software engineering principles when creating domain model objects. We must ensure our object are valid. It is very unusual to have an unbounded Int or String in a real world domain. In the previous blog our product id was represented by an Int but it is unlikely that the business uses a negative Int. Equally, the last name for a customer is represented by a String, but we would not expect it to include a tab character.

We see some of these issues in the following code. The class represents someones last name but there are no constraints on what values we provide.

class LastName(val lastName: String)

val me = LastName("BARCLAY")
val doppelganger = LastName("Ken")

assertEquals("BARCLAY", me.lastName)
assertEquals("Ken", doppelganger.lastName)

We want to ensure the values of these types cannot be created unless they satisfy the domain constraints. Thereafter, because the object is immutable, the value never needs to be checked again.

To ensure the constraints are enforced we make the constructor private/internal and provide a separate function that creates only valid values. The standard technique for the construction of objects that honors some constraint is known as the smart constructor idiom. Here is an example:

@JvmInline
value class LastName private constructor(val lastName: String) {

    companion object {

        fun create(text: String): LastName = LastName(text)

    }

}   // LastName


// compilation error: cannot access private LastName
// val me = LastName("BARCLAY")
val doppelganger = LastName.create("Barclay")

assertEquals("Barclay", doppelganger.lastName)

So now the LastName object cannot be created due to the private constructor. This is shown in the commented code. The create function is our (not yet) smart constructor factory function.

In functional programming, an effect adds some capabilities to a computation. An effect is modeled in the form of a type constructor that constructs types with additional capabilities. Consider some arbitrary type LastName, then, for example, Option<LastName> adds optionality to the type LastName.

In the next example the smart constructor function create makes two checks on the text supplied for the name. If the text contains only one character or if the text is not all uppercase letters then None is returned to indicate that a valid name could not be constructed. Otherwise, a LastName object is created and wrapped in a Some to denote success.

data class LastName(val lastName: String) {

    companion object {

        fun create(lastName: String): Option<LastName> {
            return if(lastName.length < 2)
                none()
            else if (!regex.matches(lastName))
                none()
            else
                some(LastName(lastName))
        }   // create

        private val regex: Regex = Regex("[A-Z][A-Z]*")

    }

}   // LastName


assertEquals(some(LastName("BARCLAY")), LastName.create("BARCLAY"))
assertEquals(none(), LastName.create("Barclay"))



Managing failure with Option type

In a business application a customer name is modeled as a last name and a first name. Like above the last name is expected to be fully capitalized. The first name is expected to have a leading capital letter and any number of lowercase letters. Here is class FirstName modeled as earlier:

data class FirstName(val firstName: String) {

    companion object {

        fun create(firstName: String): Option<FirstName> {
            return if(firstName.length < 2)
               none()
            else if (!regex.matches(firstName))
                none()
            else
                some(FirstName(firstName))
        }   // create

        private val regex: Regex = Regex("[A-Z][a-z]*")

    }

}   // FirstName

The Name class declaration starts with:

data class Name(val lastName: LastName, val firstName: FirstName) ...

It too adopts the smart constructor idiom with its own create function:

data class Name(val lastName: LastName, val firstName: FirstName) {

    companion object {

        fun create(lastName: String, firstName: String): Option<Name> {
            return ...
        }   // create

    }

}   // Name

The body of the function create will pass lastName to the function LastName.create returning an Option<LastName>, and pass firstName to the function FirstName.create returning an Option<FirstName>. Option is a monad as defined in the Dogs library. Monads allow the programmer to build up computations using sequential building blocks. The monad determines how combined computations form a new computation and frees the programmer from having to code the combination manually each time it is required. It is useful to think of a monad as a strategy for combining computations into more complex computations.

We use Option's bind (aka flatMap) function to compose the results from LastName.create and FirstName.create. This bind function makes Option a monad and helps to compose them in a clean manner. Here is the final version of the Name class:

data class Name(val lastName: LastName, val firstName: FirstName) {

    companion object {

        fun create1(lastName: String, firstName: String): Option<Name> {
            val oLastName: Option<LastName> = LastName.create(lastName)
            val oFirstName: Option<FirstName> = FirstName.create(firstName)
            return oLastName.bind{ln ->
                oFirstName.bind{fn ->
                    some(Name(ln, fn))
                }
            }
        }   // create1

        fun create2(lastName: String, firstName: String): Option<Name> {
            val oLastName: Option<LastName> = LastName.create(lastName)
            val oFirstName: Option<FirstName> = FirstName.create(firstName)
            return bind2(oLastName, oFirstName,){ln ->
                {fn ->
                    some(Name(ln, fn))
                }
            }
        }   // create2

        fun create3(lastName: String, firstName: String): Option<Name> {
            val oLastName: Option<LastName> = LastName.create(lastName)
            val oFirstName: Option<FirstName> = FirstName.create(firstName)
            return bind2(oLastName, oFirstName){ln, fn -> some(Name(ln, fn)) }
        }   // create3

        fun create(name: String): Option<Name> {
            val names: List<String> = name.split(" ")
            val size: Int = names.size

            return if (size != 2)
                none()
            else {
                val oLastName: Option<LastName> = LastName.create(names[0])
                val oFirstName: Option<FirstName> = FirstName.create(names[1])
                bind2(oLastName, oFirstName){ln, fn -> some(Name(ln, fn))}
            }
        }   // create

    }

}   // Name

The three functions create1, create2 and create3 are presented to show the different ways of using bind. Here is the signature of (extension) function bind from the Option class:

fun <A, B> Option<A>.bind(f: (A) -> Option<B>): Option<B>

The bind function operates on an Option<A> along with the function parameter that converts an A into an Option<B>. The result type is an Option<B>. If the Option<A> is a Some then the function parameter is applied to its wrapped value, producing the required result. If the Option<A> is a None then its value is bypassed as a None value.

If in Name.create1 the function call LastName.create(lastName) is a Some, then the wrapped LastName is associated with the bind parameter ln. If the function call FirstName.create(firstText) is a Some, then the wrapped FirstName is associated with the bind parameter fn. A Name is created from these two values and wrapped in a Some to denote success.

Where the chain of operations involves no other processes then the function bind2 can be used. In function create2 we use the version of bind2 that expects a curried binary function. The overloaded bind2 in create3 uses an uncurried function to give the most compact coding.

The first assert creates a successful Name. The other assert statements fail with, respectively, invalid last name, invalid first name, and invalid last and first names. Note carefully the final assert. Both names are invalid but only the last name is the cause of the resulting None. The call to LastName.create("Barclay") returns an Option<LastName> which is a None, and the chain of binds breaks. Calls to FirstName.create("ken") is never executed. For that reason monads are said to fail fast.

assertEquals(
    some(Name(LastName("BARCLAY"), FirstName("Ken"))),
    Name.create1("BARCLAY", "Ken")
)
assertEquals(none(), Name.create1("Barclay", "Ken"))   // bad last name
assertEquals(none(), Name.create2("BARCLAY", "ken"))   // bad first name
assertEquals(none(), Name.create3("BARCLAY", "ken"))   // bad last + first name



The monad

We present a formal definition of the monad through the following two functions. Again, F is a place-marker for the actual type for which it applies:

fun <A, B> F<A>.bind(f: (A) -> F<B>): F<B>
fun <A> inject(a: A) -> F<A>

The bind function is the monadic sequencing operation. If the receiver F<A> produces an A, then bind applies the function f converting it to an F<B>. The function bind returns an F<B>. The inject function is like the applicative pure function and injects a value into the F context.

An important aspect of the function bind is understanding how the chaining of computations takes place when you plug multiple binds together. They compose sequentially, and the chain breaks when one of the binds break. We saw exactly this when using the Option monad in the smart constructor for the Name class.

Here it is defined for the List type:

fun <A, B> List<A>.bind(f: (A) -> List<B>): List<B> =
    concat(this.fmap(f))

assertEquals(ListF.of(-1, 1, -2, 2, -3, 3), ListF.of(1, 2, 3).bind{n -> ListF.of(-n, +n)})

When defining this bind function we have a List<A> as the receiver and a function on the elements of the List: (A) -> List<B>. Then a natural thing to do is to map that function over the List as in the sub-expression this.fmap(f). This sub-expression has the type List<List<B>>. The helper function concat transforms this into a List<B> as required.

The next example defines a function pairs that takes two List parameters and returns a List of Pairs of elements from those two Lists. The function is expected to deliver all possible pairs from the two Lists. Since Lists are monadic we can define pairs with:

fun <A, B> pairs(la: List<A>, lb: List<B>): List<Pair<A, B>> =
    la.bind{a: A ->
       lb.bind{b: B ->
            inject(Pair(a, b))
        }
    }

The first application of bind selects one value from the first list while the second application of bind selects one value from the second list. We then inject the Pair into the result List.

The monad as a functional design pattern is, like all others, accompanied with a set of laws which must be satisfied for each implementation. We use KwikCheck to check the monad laws on the monad instances.




Managing failure with Either type

Consider the smart constructor function create in the class LastName. Under normal circumstances the caller supplies a capitalized String of at least two characters. But what about the exceptions?  Both exceptions report failure by returning None but give no indication of the nature of the error.

Our creation process can result in two possible errors: the text is too short or it is not fully capitalized. So to report success for the two types of error we employ the Either<A, B> type which has two type parameters and the two specializations: Left and Right. We inject a value of type A by using the Left constructor, or we inject a value of type B by using the Right constructor. Conventionally, when using Either for validations Left is used to denote failure and Right for success.

In the following version of function create we wrap the output with an Either type which returns a String if a problem occurs and a LastName if successful. The first assert is a valid name while the remaining two asserts are malformed names.

data class LastName(val lastName: String) {

    companion object {

        fun create(lastName: String): Either<String, LastName> {
            return if(lastName.length < 2)
                left("Last name too short: $lastName")
            else if (!regex.matches(lastName))
                left("Last name malformed: $lastName")
            else
                right(LastName(lastName))
        }   // create

        private val regex: Regex = Regex("[A-Z][A-Z]*")

    }

}   // LastName


assertEquals(right(LastName("BARCLAY")), LastName.create("BARCLAY"))
assertEquals(left("Last name malformed: Barclay"), LastName.create("Barclay"))
assertEquals(left("Last name too short: B"), LastName.create("B"))

Here is how the Name class when using the Either type:

data class Name(val lastName: LastName, val firstName: FirstName) {

    companion object {

        fun create(lastName: String, firstName: String): Either<String, Name> {
            val eLastName: Either<String, LastName> = LastName.create(lastName)
            val eFirstName: Either<String, FirstName> = FirstName.create(firstName)

            return bind2(eLastName, eFirstName){ln, fn -> right(Name(ln, fn)) }
        }   // create

        fun create(name: String): Either<String, Name> {
            val names: List<String> = name.split(" ")
            val size: Int = names.size

            return if (size == 0)
                left("Name.create: empty name not supported")
            else if (size == 1)
                left("Name.create: name must exceed 1 part")
            else if (size > 3)
                left("Name.create: name must not exceed 3 parts")
            else {
                val eLastName: Either<String, LastName> = LastName.create(names[0])
                val eFirstName: Either<String, FirstName> = FirstName.create(names[1])
                bind2(eLastName, eFirstName){ln, fn -> right(Name(ln, fn))}
            }
        }   // create

    }

}   // Name

We are avoiding using primitives such as Strings and instead create domain-specific types. Errors deserve the same treatment and should be modeled like everything else in the domain. To keep our code samples brief we continue to use Strings but more typically we might model errors as:

sealed class PaymentError {
    data class PaymentRejected(val amount: Int) : PaymentError()
    data class CardTypeNotRecognized(val type: String) : PaymentError()
    // ...
}



Modeling Optional Values

In an application a name is modeled as a last name, a first name and an optional middle name. The last name and the first name are as above. The middle name follows the pattern for the first name. The Option type ensures the correct semantics for a name. Here is the new Name class:

data class Name(val lastName: LastName, val firstName: FirstName, val middleName: Option<MiddleName> = none()) {

    companion object {

        fun create(lastName: String, firstName: String, middleName: String = ""): Either<String, Name> {
            val eLastName: Either<String, LastName> = LastName.create(lastName)
            val eFirstName: Either<String, FirstName> = FirstName.create(firstName)

            return if (middleName == "")
                bind2(eLastName, eFirstName){ln, fn -> right(Name(ln, fn, none())) }
            else {
                val eMiddleName: Either<String, MiddleName> = MiddleName.create(middleName)
                bind3(eLastName, eFirstName, eMiddleName){ln, fn, mn -> right(Name(ln, fn, some(mn))) }
            }
        }   // create

        fun create(name: String): Either<String, Name> {
            val names: List<String> = name.split(" ")
            val size: Int = names.size

            return if (size == 0)
                left("Name.create: empty name not supported")
            else if (size == 1)
                left("Name.create: name must exceed 1 part")
            else if (size > 3)
                left("Name.create: name must not exceed 3 parts")
            else if (size == 2) {
                val eLastName: Either<String, LastName> = LastName.create(names[0])
                val eFirstName: Either<String, FirstName> = FirstName.create(names[1])
                bind2(eLastName, eFirstName){ln, fn -> right(Name(ln, fn))}
            } else {
                val eLastName: Either<String, LastName> = LastName.create(names[0])
                val eFirstName: Either<String, FirstName> = FirstName.create(names[1])
                val eMiddleName: Either<String, MiddleName> = MiddleName.create(names[2])
                bind3(eLastName, eFirstName, eMiddleName){ln, fn, mn -> right(Name(ln, fn, some(mn))) }
            }
        }   // create

    }

}   // Name

The middle name is modeled with Option and is None if absent from the constructor call. The create function chains together three calls to function bind to assemble a valid Name. The following assert statements should be self-explanatory:

assertEquals(
    right(Name(LastName("BARCLAY"), FirstName("Ken"), some(MiddleName("Andrew")))),
    Name.create("BARCLAY", "Ken", "Andrew")
)
assertEquals(left("Last name malformed: Barclay"), Name.create("Barclay", "Ken", "Andrew"))
assertEquals(left("First name malformed: ken"), Name.create("BARCLAY", "ken", "Andrew"))
assertEquals(left("Middle name malformed: andrew"), Name.create("BARCLAY", "Ken", "andrew"))
assertEquals(left("First name malformed: ken"), Name.create("BARCLAY", "ken", "andrew"))



The code for the Dogs library can be found at:

https://github.com/KenBarclay/TBA
https://github.com/KenBarclay/TBA


Kotlin: Functional Domain Modeling #1

Kotlin: Functional Domain Modeling #1

In this blog we start investigating how to use the Kotlin type system to accurately capture the domain model in code. We will see that types can act as documentation; documentation that does not get out of sync with the design because the latter is represented in the code itself.



Modeling Simple Values

As developers we have a tendency to focus on technical issues. However, the domain experts for whom we are developing an application think in terms of domain concepts such an order id, a product code, a customer name or a customer address. If we are to have a successful project it is important that we, as developers, fully embrace the domain experts requirements. To do that we must use the same vocabulary. So, instead of thinking in terms of Int and String, we think in terms of order id and address, even where the order id is an Int and the address is a String.

As our development proceeds it is important that, for example, a product code and a name are not mixed up. Just because they are both represented by Strings, say, they are not interchangeable. To make clear these types are distinct we employ a wrapper type that wraps the primitive representation.

In the following the classes CustomerId and OrderId are wrapper classes around a primitive Int. Creating simple types like this ensures that they cannot be accidentally mixed. The final comment line shows we cannot compare them.

data class CustomerId(val id: Int)
data class OrderId(val id: Int)


val customerId = CustomerId(123)
val orderId = OrderId(456)

// compilation error: operator == cannot be applied
// val isSame = (customerId == orderId)

Equally, if we have the function processCustomer that expects a CustomerId as input, then calling that function with an OrderId is another error.

val customerId = CustomerId(123)
val orderId = OrderId(456)
fun processCustomer(customerId: CustomerId): Unit = TODO()

processCustomer(customerId)

// compilation error: required: CustomerId; found: OrderId
// processCustomer(orderId)

Sometimes it is necessary for business logic to create a wrapper around some type. However, it introduces runtime overhead due to additional heap allocations. Moreover, if the wrapped type is primitive, the performance hit is terrible, because primitive types are usually heavily optimized by the runtime, while their wrappers don't get any special treatment.

To solve such issues, Kotlin introduces a special kind of class called the inline classTo declare an inline class, use the value modifier before the name of the class. To declare an inline class for the JVM backend, use the value modifier along with the @JvmInline annotation before the class declaration. An inline class must have a single property initialized in the primary constructor. At runtime, instances of the inline class will be represented using this single property. This is the main feature of inline classes, which inspired the name inline: data of the class is inlined into its usages (similar to how the content of inlined functions is inlined to call sites). All this is shown in the next example:

@JvmInline
value class LastName(val lastName: String)

// No actual instantiation of class LastName happens
// At runtime surname contains just a String
val surname = LastName("Barclay")

assertEquals("Barclay", surname.lastName)

If a class has two or more simple properties used directly in the class declaration then it can lead to problems as shown in the following example:

data class Name(val lastName: String, val firstName: String)

val me = Name("BARCLAY", "Ken")
val doppelganger = Name("Ken", "Barclay")

assertEquals(true, me.firstName == doppelganger.lastName)

We simply cannot avoid inadvertently mixing the two properties. First, we introduce wrapper classes as shown earlier. Observe how the final comment line would produce a compilation error when we mix the first and last names.

data class LastName(val lastName: String)
data class FirstName(val firstName: String)
data class Name(val lastName: LastName, val firstName: FirstName)

val me = Name(LastName("BARCLAY"), FirstName("Ken"))

assertEquals("Ken", me.firstName.firstName)

// compilation error: type mismatch
// val doppelganger = Name(FirstName("Ken"), LastName("BARCLAY"))

Then we remove the overhead of using simple types without loss of type-safety:

@JvmInline
value class LastName(val lastName: String)
@JvmInline
value class FirstName(val firstName: String)

data class Name(val lastName: LastName, val firstName: FirstName)

// No actual instantiation of class LastName happens
// At runtime surname contains just a String
val surname = LastName("BARCLAY")

val me = Name(surname, FirstName("Ken"))

assertEquals("Ken", me.firstName.firstName)

Almost always simple types are constrained in some way, such as having to be in a certain range or match a certain pattern. It is very unusual to have an unbound Int or String in a real-world domain. In the next blog we will discuss how to enforce these constraints.



The code for the Dogs library can be found at:

https://github.com/KenBarclay/TBA
https://github.com/KenBarclay/TBA


Sunday, March 27, 2022

Kotlin: Map type

 Kotlin: Map type

A map is an unordered collection of key/value pairs. Inserting into a map requires two values: the key and the corresponding value. Indexing the map with that same key retrieves the associated value. The usual operations for a map include an operation to add a new key/value pair, an operation to delete a key/value pair given the key, and an operation to determine if the key is present in the map.

The Dogs library supports two types of maps: the immutable and persistent Map implemented as a binary search tree; and the immutable and persistent Map based on a hash-array mapped trie. We will discus both in this blog.



Maps

A Map is a collection of key-value pairs. It is sometimes known as a dictionary or an association list. The keys in a Map are unique and each key is associated with a value that need not be unique. The Map data type has two type parameters, for the key type and for the value type. So a Map<String, Int> maps Strings to Ints, while a Map<Int, List<String>> maps an Int to a List of Strings.

The usual operations for a Map include an operation to add a new key/value pair, an operation to delete a key/value pair given the key, and an operation to determine if the key is present in the MapThere is no significance to the order in which the elements are added to the MapThe Map data type is realized as a balanced binary search treeOf course, a balanced binary tree implies an ordering of the elements but this will be invisible to the user. The key type must be Comparable so that the ordering can be maintained.



Immutable and Persistent Maps

Like the other data types in the Dogs library, Maps are immutable and persistent. Immutable Maps have many advantages over their mutable siblings. They are thread-safe and can be passed to untrusted libraries without any side effects. A persistent Map, like a persistent List, is a data structure that always preserves the previous version of itself when it is modified. It does so by sharing parts of its structure.

The following figure shows the effect of inserting the key value Kim into the balanced binary tree xs forming the new tree ys. The resulting tree is no longer balanced since the subtree rooted at Ken has a depth of two for its right subtree and a depth of zero for its left subtree. A rebalancing is achieved by a rotation around the node labelled Ken.


The next figure illustrates a typical balanced insertion. Since the insertion into xs will first apply to the right subtreee of the root node labelled John then the left subtree of John can be shared with the new tree ys. Inserting key Kim into the subtree labelled Ken will cause the rebalancing and this new tree structure becomes part of the new tree ys.







Working with Immutable Maps

The following example demonstrates how to construct a Map and then perform the operations containsKey and lookUpKey. The Map values are the name of the states in the USA and uses their abbreviation for the key.

val states: Map<String, String> = MapF.of(
    "AL" to "Alabama",
    "AZ" to "Arizona",
    "CA" to "California",
    "FL" to "Florida",
    "KS" to "Kansas",
    "NV" to "Nevada",
    "SC" to "South Carolina",
    "TX" to "Texas"
)

assertEquals(true, states.containsKey("FL"))
assertEquals(false, states.containsKey("WY"))

assertEquals(OptionF.some("California"), states.lookUpKey("CA"))
assertEquals(OptionF.none(), states.lookUpKey("UT"))

In the next example the Map pairs the name of a color with its hexadecimal code:

val colors: Map<String, String> =
    MapF.empty<String, String>()
        .insert("Red" to "#FF0000")
        .insert("Blue" to "#0000FF")
        .insert("DarkBlue" to "#00008B")
        .insert("Yellow" to "#FFFF00")
        .insert("White" to "#FFFFFF")

assertEquals(true, colors.containsKey("White"))
assertEquals(OptionF.some("#FFFF00"), colors.lookUpKey("Yellow"))
assertEquals(OptionF.none(), colors.lookUpKey("Green"))

In the following example the Map has a String for the key and an Int for the corresponding value. The code demonstrates the effect of inserting a new key/value pair into an existing MapIf the key is already present in the Map, then the value is replaced.

val people: Map<String, Int> = MapF.of(
    "Ken" to 25,
    "John" to 31,
    "Jessie" to 22,
    "Andrew" to 35,
    "Gordon" to 35
)
val updated: Map<String, Int> = people.insert("Andrew" to 40)

assertEquals(OptionF.some(22), people.lookUpKey("Jessie"))
assertEquals(OptionF.some(22), updated.lookUpKey("Jessie"))

assertEquals(OptionF.some(35), people.lookUpKey("Andrew"))
assertEquals(OptionF.some(40), updated.lookUpKey("Andrew"))

In the next example we show the effect of the member function delete. When the key is not a member of the Map then the original Map is returned.

val people: Map<String, Int> = MapF.of(
    "Ken" to 25,
    "John" to 31,
    "Jessie" to 22,
    "Andrew" to 35,
    "Gordon" to 35
)
val removed: Map<String, Int> = people.delete("Ken").delete("John")

assertEquals(true, people.containsKey("Ken"))
assertEquals(false, removed.containsKey("Ken"))

assertEquals(OptionF.some(22), people.lookUpKey("Jessie"))
assertEquals(OptionF.some(22), removed.lookUpKey("Jessie"))

In Dogs we implemented the data structures List and Stream, both of which could be folded. When writing code that needs to process data contained in one of these structures, we often do not care about the shape of the structure. Folding is one such operation that can be applied to them. We can fold over the values of a Map with the following example:

val people: Map<String, Int> = MapF.of(
    "Ken" to 25,
    "John" to 31,
    "Jessie" to 22,
    "Andrew" to 35,
    "Gordon" to 35
)

assertEquals(148, people.foldLeft(0){acc -> {age -> acc + age}})
assertEquals(0, MapF.empty<String, Int>().foldLeft(0){acc -> {age -> acc + age}})




Hash-Array Mapped Tries

The prime candidate data structure for efficient Map collections is the Hash-Array Mapped Trie (HAMT) by Bagwell. A general trie is a lookup structure for finite strings that acts like a deterministic finite automaton. In a HAMT, the strings are the bits of the hash codes of the elements stored in the trie. Depending on the branching factor we may use different sizes of chunks from the hash code for indexing into the (sparse) array of child nodes.

Figures (a), (b) and (c) graphically illustrate the structure of a small HAMT collection with branching factor 32 after step-by-step inserting the objects A, B, C, and D. HAMTs encode the hash codes of the elements in their tree structure (cf. (d)). The prefix tree structure grows lazily upon insertion until the new element can be distinguished unambiguously from all other elements by its hash code prefix.


The index numbers in the left top corner of each node refer to the positions of elements in an imaginary sparse array. This array is actually implemented as a 32-bit bitmap and a completely filled array with length equal to the node’s arity.

Immutable HAMTs perform path-copying on updates: the edited node and all its parent nodes are reallocated. The resulting new root node satisfies the immutability property by resembling the updated path-copied branch and, for the remainder, references to unmodified branches.

The Dogs implementation of an HAMT has an interface similar to that of the Map shown in the examples above. It includes, for example, the operations containsKey, lookUpKey, insert and foldLeft.

Map is an object that maps keys to values. A Map cannot contain duplicate keys; each key can map to at most one value. However, we come across scenarios wherein more than one value has to be mapped to a key. In such scenarios where multiple values need to be mapped to a single key, we end up creating a Map with a single key but a collection of values: the MultiMap.

The Dewey Decimal Classification (DDC), colloquially known as Dewey Decimal System, is a proprietary library classification system which allows new books to be added to a library in their appropriate location based on subject.

We define a library catalog based on the Dewey classification with the Book class and a typealias for a MultiMap with the Dewey code for the key and a List of Books associated with that classification:

data class Book(val isbn: String, val title: String, val author: String)
typealias Dewey = Map<String, List<Book>>

Using the HAMT Map we associate a collection of Books with the Dewey classification. In the following listing the first assert determines the number of algebra books, while the second assert finds how many books are authored by Saumont in the programming category:

val dewey: Dewey = MapF.of(
    "005" to ListF.of(      // Programming
        Book("9781617295362", "The Joy of Kotlin", "Pierre-Yves Saumont"),
        Book("9781617293290", "Kotlin in Action", "Dmitry Jemerov"),
        Book("9781530075614", "Kotlin for Android Developers", "Antonio Leiva"),
        Book("9781617292736", "Functional Programming in Java", "Pierre-Yves Saumont")
    ),
    "512" to ListF.of(      // Algebra
        Book("9789083136608", "Linear Algebra", "Michael Cohen"),
        Book("9780471433347", "Abstract Algebra", "David Dummit")
    ),
    "520" to ListF.of(      // Astronomy
        Book("9780008389321", "The Universe", "Andrew Cohen")
    )
)

assertEquals(2, dewey["512"].size())    // number of algebra books
assertEquals(2, dewey["005"].filter{book -> (book.author == "Pierre-Yves Saumont")}.size())



Performance

A benchmark comparing our immutable persistent Map, immutable persistent HAMT Map, Kotlin's mutable MutableMap and the PersistentMap class from the Immutable Collections Library for Kotlin at [https://github.com/Kotlin/kotlinx.collections.immutable]. The timings measure (in ms) inserting 10000 elements into the collections:

Insert 10000 elements into our immutable/persistent Map:    9.2
Insert 10000 elements into our immutable/persistent HAMT Map:    8.2
Insert 10000 elements into Kotlin's MutableMap type:    1.1
Insert 10000 elements into kotlinx PersistentMap type:        14.6

A second benchmark compares looking up the values for these 10000 keys in our immutable persistent Map, immutable persistent HAMT Map, Kotlin's mutable MutableMap and the PersistentMap class from the Immutable Collections Library for Kotlin at [https://github.com/Kotlin/kotlinx.collections.immutable]. The timings measure (in ms) searching 10000 elements in the collections:

Lookup 10000 elements in our immutable/persistent Map:    324.8
Lookup 10000 elements in our immutable/persistent HAMT Map:    3.1
Lookup 10000 elements in Kotlin's MutableMap type:    1.4
Lookup 10000 elements in kotlinx PersistentMap type:        3.9



The code for the Dogs library can be found at:

https://github.com/KenBarclay/TBA