Friday, March 25, 2022

Dogs

 Dogs: A library of immutable and persistent data types

Dogs is a Kotlin library of immutable and persistent data types such as the List type or the MultiMap type. Through Kotlin we provide statically type safe functional data structures that are inspired by Haskell's data types.



Motivation

A data structure is a way of storing and organizing data so that it can be used efficiently. Different kinds of data structures are suited to different applications. For example, compilers usually employ hash tables to manage information about program identifiers.

Ordinary data structures, such as those provided through Kotlin, are ephemeral in the sense that making a change to the structure destroys the old version, leaving only the new one. This is the case for the Java/Kotlin List, Set and Map types. These imperative data structures operate through in-place mutation of data - the data structure is mutable. In our everyday usage of Kotlin data structures, we mutate lists, maps, sets and other types to realize the algorithm that we implement.

An important property of functional data structures is that they always persist - updating a functional data structure does not destroy the existing version, but rather creates a new version that coexists with the old version. Persistence is achieved by copying the affected nodes of a data structure and making all changes in the copy rather than the original. Because they are never modified directly, all nodes that are unaffected by an update can be shared between the old and the new versions of the structure without worrying that a change in one version will inadvertently be visible to the other.

Despite their name, persistent data structures do not refer to data structures stored on disk. Instead they are immutable data structures which are copy-on-write. That is, whenever you mutate a persistent data structure a new one is created that contains the data in the original plus the mutation while the original is left untouched. To make this reasonably efficient, structural sharing is used between the data structures.

Functional languages exploit immutable and persistent data structures, many of them based on the seminal book by Chris Okasaki, Purely Functional Data Structures.



Organisation

A data structure in Dogs is defined over three Kotlin source files. For example, the immutable and persistent List data structure is defined in the source files List.kt, ListE.kt and ListF.kt. The class declaration is defined in the source file List.kt. Contravariant functions which would otherwise conflict with the covariant type parameter in the List class appear as extension functions in the source file ListE.kt. The object declaration ListF from the file ListF.kt is used to present functions that deliver a List but are not natural members of the List class.



Never null

Kotlin treats null in a special way so that we may work safely with values that might be null. For instance, the null-safe operator for accessing properties foo?.bar?.baz will not throw an exception if either foo or its bar property is null, instead directly returning null.

Another approach to this problem is to provide a type for representing optional values. Values that may be present or absent are supported by the Option<A> class from the Dogs library. Thus, for example, a function to open an Account in a personal banking domain might be:

fun open(accNo: String, accName: String, initDeposit: Int): Option<Account>

returning an Option<Account> rather than an Account. When we return Option<Account> we are returning an abstraction over the evaluation. The caller of function open can decide whether to extract the value out of it or compose it with other abstractions.

Option as an abstraction follows other abstractions in the Dogs library. Like a List it can be mapped over; it is foldable and can be collapsed into a single value; it follows the monadic model of computation in which we can chain a sequence of the above function invocations.

The Dogs library supports migrating from using null to adopting Option. The generic types in Dogs have Any as their explicit upper bounds. Without this, Kotlin considers Any? as the default upper bounds and would allow null types as members. Thus, the Option and List classes appear as:

class Option<A : Any> ...
class List<A: Any> ...



Testing

An extensive collection of unit tests are provided by the separate IntelliJ project DogsTests. The project contains a number of class files that provide the unit tests for the Dogs classes. For example, the file OptionTests carries the unit tests for the Option class.

The IntelliJ project DogsPBTests is a collection of property-based tests for some of the fundamental laws that apply to the Dogs classes. These property-based tests are conducted by the KwikCheck framework. KwikCheck is a property-based testing framework written entirely in Kotlin and build atop the Dogs library.



Documentation

The following series of blogs present a user description of some of the data structures included in Dogs. The Function type article describes Kotlin functions especially those features used across the library, such as curried functions, function composition or partial function application.




The code for the Dogs library can be found at:

https://github.com/KenBarclay/TBA

No comments: