In this blog we introduce the notion of function types. A function type is syntactic sugar for a (function) interface which is not used directly. These function types have a special notation that corresponds to the signature of functions describing their parameter types and their return type.
In
this blog we also introduce higher
order functions: functions that are parameters to other functions and the
results of executing functions. It turns out that they are a highly expressive mechanism for defining
behaviors without recourse to more primitive programming constructs such as
for loops and if statements. They are a really
powerful way of solving problems and thinking about programs.
This material is a precursor to a series of blogs on immutable and persistent data types in Kotlin. The features described here are included in the custom library Dogs consisting of immutable and persistent data types as well as a mix of functional primitives.
Functions
Kotlin supports
functions at a number of levels. A function can be declared at package level.
We might develop an application in terms of these functions. This might be appropriate for a utility application that performs some simple processing of,
say, data stored in a file or spreadsheet. The data might be represented as,
say, data classes but the program architecture is procedural.
Functions can also be declared in
the body of other functions. Such local functions have a support role to the
enclosing function. These nested functions are inaccessible to other parts of
the code and so reduce the concepts surrounding the supported function.
Member
functions in a class define the behaviors of object instances of that class.
Extension functions in Kotlin enable you to add new
functions to existing classes without creating a new derived class,
recompiling, or otherwise modifying the original class. Extension functions are
a special kind of ‘static’ function, but they are called as if they were
instance functions on the extended type. For client code there is no apparent
difference between calling an extension function and the member functions that are
actually defined in a class.
Below are a number of simple functions.
fun negate(n: Int): Int = -n // negate: (Int) -> Int
fun sum(m: Int, n: Int): Int = m + n // sum: (Int, Int) -> Int
fun same(text: String, len: Int): Boolean = // same: (String, Int) -> Boolean
(text.length == len)
assertEquals(-4, negate(increment(3)))
assertEquals(12, sum(3, sum(4, 5)))
assertEquals(true, same("four + nine", 13))
Function types
In Kotlin
functions are first class citizens. The implication of this is that a
function can be treated like any other object. For example, a String object can be referenced in a
value binding, can be passed as the actual parameter in a function call, can be
the return value from a function call, can be a property of a class, can be
an element of a container such as a List,
and can be considered an object instance of the String class and call one of its member functions on it (such as
the length member function or the indexOf member function).
These
features of a String object are also
applicable to a function object. A
function can be referenced in a value binding, can be passed as the actual
parameter in a function call or be the return value from a function call
(giving rise to higher order functions), can be the property of a class, can be
an element of a container such as a List,
and can be considered an object instance of some function class and call one of
its member functions on it (such as the invoke
member function).
Kotlin
is a statically typed programming language so we are required to provide the
type of each object either explicitly or by type inference. To use a function
we must have a way to present a function
type. A function type is syntactic sugar for a (function) interface which
is not used directly. The syntax of a function type is:
(parameter type,
parameter type, …) -> return type
with
the left side specifying the function parameter types enclosed in parentheses,
and the right side specifying the return type of the function.
These function types have a special notation that corresponds to the signature of functions describing their parameter types and their return type. In the following listing the val bindings for inc, neg and add include these function types.
fun increment(n: Int): Int = 1 + n // increment: (Int) -> Int
fun negate(n: Int): Int = -n // negate: (Int) -> Int
fun sum(m: Int, n: Int): Int = m + n // sum: (Int, Int) -> Int
val inc: (Int) -> Int = ::increment
val neg: (Int) -> Int = ::negate
val add: (Int, Int) -> Int = ::sum
assertEquals(-4, negate(increment(3)))
assertEquals(12, sum(3, sum(4, 5)))
assertEquals(-4, neg(inc(3)))
assertEquals(12, add(3, add(4, 5)))
assertEquals(-4, neg.invoke(inc.invoke(3)))
assertEquals(12, add.invoke(3, add.invoke(4, 5)))
To obtain an instance of a function type we use a callable reference to an existing function declaration with the expression ::increment which is the value of the function type (Int) -> Int.
There are several other ways to obtain an instance of a function type. Two of the more common are anonymous functions and lambda expressions. An
anonymous function (or function expression) has a regular
function declaration except its name is omitted. The parameter types and return
type are specified as for regular functions. The body of an anonymous function
can be an expression or a block. The return type of the anonymous function
can be inferred automatically from the function if it is an expression and has
to be specified explicitly for the anonymous function if it is a block. Here are some examples:
val isOdd = fun(n: Int): Boolean = (n % 2 == 1)
val isEven: (Int) -> Boolean = fun(n) = (n % 2 == 0)
val add: (Int, Int) -> Int = fun(m: Int, n: Int): Int = (m + n)
val sum: (Int, Int) -> Int = add
assertEquals(true, isOdd(5))
assertEquals(false, isEven(5))
assertEquals(7, add(3, 4))
assertEquals(7, sum(3, 4))
The val binding isOdd has the regular function declaration (with its name omitted). The val binding for isEven includes a function type so the types of the parameter and the return type can be omitted. The val binding for add redundantly includes both the function type and the function parameter and return type. Finally, the sum val binding includes the function type but this could be omitted as the Kotlin compiler will infer the correct type.
Lambda expressions (or
function literals) are essentially anonymous functions that we can treat as
values – we can, for example,
pass them as arguments to functions, return them, or do any other thing we
could do with a normal object. A lambda expression is presented as a code
block:
{parameter: type, parameter: type, … -> code body}
where the type can be omitted if
it can be inferred. Like anonymous
functions, lambda expressions are instances of some appropriate interface with an invoke member function. Then { ...
}.invoke(...) executes the function invoke
on the lambda expression { ... } and passing the actual parameters ( ... ). Here are some examples:
assertEquals(9, {n: Int -> n * n}.invoke(3))
assertEquals(9, {n: Int -> n * n}(3)) // invoke is inferred
assertEquals(3, {n: Int -> n + 1}(2)) // invoke is inferred
Since
a lambda expression is a Kotlin object, then we can bind an identifier to it.
The identifier is a reference to that function and the function can be called
through the identifier:
val square = {n: Int -> n * n}
val increment = {n: Int -> n + 1}
assertEquals(9, square.invoke(3))
assertEquals(9, square(3)) // invoke is inferred
assertEquals(3, increment(2)) // invoke is inferred
The
full syntactic form for binding a lambda expression would be shown by the
following example:
val sum: (Int, Int)
-> Int = {m: Int, n: Int -> m + n}
The
formal parameters are presented exactly as for a normal function: the name and
type for each parameter. The lambda expression is enclosed by matching braces
and the parameter list and function body is separated by ->.
More commonly we will associate the
function type with the bound reference. The parameter types and the return
type represent the function type and would look like:
val sum: (Int, Int)
-> Int = {m, n -> m + n}
Here,
sum is bound to a function that
expects two Ints and returns an Int.
Simple as these function type
signatures may appear, they are the key to unlocking some of the more advanced
programming we shall encounter with functions. The higher-order extension
functions for the custom List
class are examples of the more powerful capabilities possible with lambda
expressions.
The arrow notation in function types is right associative. The function type (String) -> (Int) -> Boolean describes a function that accepts a single String parameter and returns a function with the type (Int) -> Boolean. Since parentheses can be used in function types the latter is equivalent to (String) -> ((Int) -> Boolean). Notice that the function type ((String) -> Int) -> Boolean describes a function that returns a Boolean and take a single function parameter of the type (String) -> Int.
The arrow notation in function types is right associative. The function type (String) -> (Int) -> Boolean describes a function that accepts a single String parameter and returns a function with the type (Int) -> Boolean. Since parentheses can be used in function types the latter is equivalent to (String) -> ((Int) -> Boolean). Notice that the function type ((String) -> Int) -> Boolean describes a function that returns a Boolean and take a single function parameter of the type (String) -> Int.
Higher order functions
A function can
be passed as a parameter to another function. A function can be the return value
from the call to a function. Such functions are called higher order
functions. Through functions as parameters and functions as results we are
able to fully exploit the machinery of functions, such as function composition.
The resulting function declarations can be both more concise and more readable
than traditional approaches.
In the following listing function doTwice is a higher order function as it expects a function as its second parameter. The interpretation of the signature for doTwice is that it returns an Int as its value. Two parameters are required, the first of which is an Int. The second parameter is the function parameter. The effect of doTwice is to apply the function twice to the numeric parameter.
In the following listing function doTwice is a higher order function as it expects a function as its second parameter. The interpretation of the signature for doTwice is that it returns an Int as its value. Two parameters are required, the first of which is an Int. The second parameter is the function parameter. The effect of doTwice is to apply the function twice to the numeric parameter.
fun triple(n: Int): Int = 3 * n // as function
val quadruple: (Int) -> Int = {n -> 4 * n} // as lambda expression
val twelveTimes = fun(n: Int): Int = triple(quadruple(n)) // anonymous function
fun doTwice(n: Int, f: (Int) -> Int): Int = f(f(n))
assertEquals(6, triple(2))
assertEquals(12, quadruple(3))
assertEquals(36, twelveTimes(3))
assertEquals(18, doTwice(2, ::triple))
assertEquals(48, doTwice(3, quadruple))
assertEquals(144, doTwice(1, twelveTimes))
assertEquals(75, doTwice(3, {n -> 5 * n}))
assertEquals(75, doTwice(3){n -> 5 * n})
Consider
a function with two parameters of
type String and Int returning a Boolean
result. If we are required to partially
apply the function (see below) on the Int parameter then we require the Int to be the first parameter. We define a
function that takes a binary function as parameter, and which returns a binary
function that accepts its parameters in the opposite order. The flip function is declared as:
fun <A, B, C> flip(f: (A, B) -> C):
(B, A) -> C =
{b:
B, a: A -> f(a, b)}
We use a lambda
expression for the result. The function is defined generically so it can be
applied to all parameter types. Function flip
is a generic higher order function taking a function parameter and delivering a
function result.
fun isSameSize(str: String, n: Int): Boolean = (str.length == n)
val isSame: (Int, String) -> Boolean = flip(::isSameSize)
assertTrue(isSame(3, "Ken"))
assertFalse(isSame(3, "John"))
The
function flip is provided by the Dogs custom library. It is defined in the object declaration FunctionF from the com.adt.kotlin.dogs.fp package. It is overloaded to operate on both an uncurried binary function and a curried binary function (see later).
Function composition
Perhaps one of the more important
characteristics of functions is composition, wherein you can define one
function whose purpose is to combine other functions. Using composition, two or
more simple functions can be combined to produce a more elaborate one. For example,
suppose we have two arithmetic functions f
and g. One way of composing these two
functions would be to first compute y, and
then to compute z from y,
as in y = g(x) followed by z = f(y). One could get the same
result with the single composition written z
= f(g(x)). We write this as f ● g, meaning function f
composed with function g (sometimes also described as f after g).
Function
composition lets us compose a solution by combining separately designed and
implemented functions. Structuring a large solution into separate functions
simplifies the programming activity. The functions can be defined, implemented
and tested separately. Simple functions can be used to construct more elaborate
functions, reducing the burden and complexity of a system. The functions can
also be defined for one application and re-used in other applications.
fun f(n: Int): Int = 2 * n + 3
fun g(n: Int): Int = n * n
val gf: (Int) -> Int = compose(::g, ::f)
val fg1: (Int) -> Int = ::f compose ::g
val fg2: (Int) -> Int = ::f o ::g
assertEquals(121, gf(4))
assertEquals(35, fg1(4))
assertEquals(35, fg2(4))
Pay attention to the signature of compose.
It is a function that takes two functions as its parameters and returns a
function as its result. The second function g accepts an input of the type A and delivers a result of type B.
The result is passed as input to the first function f and delivers a result of type C. Hence the composed function has
the signature (A) -> C:
fun <A, B, C> compose(f: (B) -> C, g: (A) -> B): (A) -> C = {a: A -> f(g(a))}
The function compose is provided by the Dogs library. Note the complementary infix functions compose and o.
Both are defined as an infix function on
the interface that represents the function type (B) -> C. They expects a
function parameter with type (A) -> B and composes them using compose to deliver the function (A)
-> C.
In the following example function compose is used to compose isEven and strLength. In some circumstances the Kotlin compiler is unable to infer the actual types for the type parameters A, B and C. In that case you must assist the compiler and provide the actual type parameters. This is shown in the call to compose, though in this case it is unnecessary since the compiler correctly infers these types.
fun strLength(str: String): Int = str.length
fun isEven(n: Int): Boolean = (n % 2 == 0)
val strIsEven: (String) -> Boolean = compose(::isEven, ::strLength)
assertEquals(false, strIsEven("Ken"))
assertEquals(true, strIsEven("John"))
In the following example function compose is used to compose isEven and strLength. In some circumstances the Kotlin compiler is unable to infer the actual types for the type parameters A, B and C. In that case you must assist the compiler and provide the actual type parameters. This is shown in the call to compose, though in this case it is unnecessary since the compiler correctly infers these types.
fun strLength(str: String): Int = str.length
fun isEven(n: Int): Boolean = (n % 2 == 0)
val strIsEven: (String) -> Boolean = compose(::isEven, ::strLength)
assertEquals(false, strIsEven("Ken"))
assertEquals(true, strIsEven("John"))
The order in f compose g is significant and can be confusing: f compose g means apply g and then apply f to the result. The Dogs library also declares the forwardCompose function which takes its parameters in the opposite order to compose:
fun <A, B, C> forwardCompose(f: (A) -> B, g: (B) -> C): (A) -> C = {a: A -> g(f(a))}
The library also declares the infix forms forwardCompose, fc and pipe:
val successor: (Int) -> Int = {n -> 1 + n} // as function literal
fun triple(n: Int): Int = 3 * n // as function
val quadruple = fun(n: Int): Int = 4 * n // as function expression
val twelveTimes: (Int) -> Int = compose(::triple, quadruple)
val threePlusOne: (Int) -> Int = ::triple fc successor
assertEquals(60, twelveTimes(5))
assertEquals(16, threePlusOne(5))
val up2: (Int) -> Int = successor o successor
val upDown: (Int) -> Int = successor o {n -> n - 1}
val up2Down2: (Int) -> Int = up2 pipe {n -> n - 2}
assertEquals(5, up2(3))
assertEquals(5, upDown(5))
assertEquals(5, up2Down2(5))
The function type signatures are an
important resource in this work. They alert us when compositions would not
work. For example, consider that we have the function f: (Int) -> Boolean
and the function g: (String) -> Int. The signature for compose tells us compose(f, g) will produce a function with the
signature (String) -> Boolean. Equally, the signature reveals that
compose(g, f) will not compose correctly.
It is relatively easy to prove that map(f ● g, xs) = (map f ● map g)(xs) for a List xs. Applying f ● g to every element of a list is the same as applying g to every member of the list and then applying f to every member of the resulting list. The equality demonstrates how we might refactor our code to make it more efficient. If our code includes (map f ● map g) then we optimize it with map(f ● g). The former makes one pass across the initial list and then a second pass across the result list. The latter makes only a single pass across the initial list reducing the amount of work to be done.
val f: (Int) -> Int = {n -> 2 * n}
val g: (Int) -> Int = {n -> n + 1}
val map: ((Int) -> Int) -> (List<Int>) -> List<Int> = {f -> {list -> list.map(f)}}
val fg: (Int) -> Int = f compose g
val mapfmapg: (List<Int>) -> List<Int> = map(f) compose map(g)
val numbers: List<Int> = ListF.of(1, 2, 3, 4)
assertEquals(map(fg)(numbers), mapfmapg(numbers))
It is relatively easy to prove that map(f ● g, xs) = (map f ● map g)(xs) for a List xs. Applying f ● g to every element of a list is the same as applying g to every member of the list and then applying f to every member of the resulting list. The equality demonstrates how we might refactor our code to make it more efficient. If our code includes (map f ● map g) then we optimize it with map(f ● g). The former makes one pass across the initial list and then a second pass across the result list. The latter makes only a single pass across the initial list reducing the amount of work to be done.
val f: (Int) -> Int = {n -> 2 * n}
val g: (Int) -> Int = {n -> n + 1}
val map: ((Int) -> Int) -> (List<Int>) -> List<Int> = {f -> {list -> list.map(f)}}
val fg: (Int) -> Int = f compose g
val mapfmapg: (List<Int>) -> List<Int> = map(f) compose map(g)
val numbers: List<Int> = ListF.of(1, 2, 3, 4)
assertEquals(map(fg)(numbers), mapfmapg(numbers))
Perhaps
rather pedantically the full signature for function literals will be given.
Kotlin's type inferencing engine would make some parts of the signature
unnecessary. Their inclusion is for the benefit of the reader. Their absence may make some of the code opaque to the reader and so the full signatures are
retained.
Curried functions
In
this section we demonstrate how a single function can effectively define a
complete family of functions. Having defined the parent function, we are able
to specialize from it a number of related functions. To achieve this we use the
notion of currying, named after the
mathematician Haskell Curry.
Currying is the process of transforming a function that takes
multiple parameters into a function that takes just a single parameter and
returns another function which accepts further parameters.
Consider the function add
which adds together two Int
parameters. Its definition is given by:
fun add(m: Int, n:
Int): Int = m + n
The
type signature informs us that add is a function which when given a pair of Ints
will return an Int. If we were
able to call function add and give only one actual parameter, then
we would get a result which is itself another function. For example, if we
could call add with the parameter 1, then effectively the
first formal parameter m is replaced throughout the definition for add
with this value. Further, since the value for the parameter m
is now specified, then this formal parameter can be removed from the
definition. We then effectively produce the function:
fun add1(n: Int): Int =
1 + n
one that takes a single numeric parameter and
returns its successor to that value.
In
the following listing we declare the add
function in the usual way and use it in the first assert. Since a function can
be the result for a function, then any multi-parameter function can be
converted into a series of one parameter functions. The function sum is defined to operate like add but
in its curried form. We execute sum with the call sum(6)(7). Observe how sum10 is defined by calling sum with its first parameter delivering a function that adds 10 to its parameter.
fun add(m: Int, n: Int): Int = m + n
val sum: (Int) -> (Int) -> Int = {m: Int -> {n: Int -> m + n}}
val sum10: (Int) -> Int = sum(10)
assertEquals(7, add(3, 4))
assertEquals(13, sum(6)(7))
assertEquals(17, sum10(7))
Suppose
we want to develop a curried version of a binary function f which is uncurried and
of the type (A, B) -> C. This binary function f expects its parameters as a pair but its curried version will
apply them sequentially. We therefore have to form them into a pair before
applying f to them. The custom function C2 does exactly this:
fun <A, B, C> C2(f:
(A, B) -> C): (A) -> (B) -> C = {a: A -> {b: B -> f(a, b)}}
Again,
the signature reveals what is happening. Function C2 expects a function as its parameter. That parameter function has
the signature (A, B) -> C, an uncurried binary function. Function C2 returns the curried variant with the
signature (A) -> (B) -> C, as required. We can use C2 where we need to transform an uncurried binary function into its
curried form:
fun add(m: Int, n: Int): Int = m + n
val sum: (Int) -> (Int) -> Int = C2(::add)
assertEquals(7, add(3, 4))
assertEquals(13, sum(6)(7))
The Dogs library define this function C2 as well as the related functions C3, C4, C5 and C6. The library also includes the complementary functions U2, U3, U4, U5 and U6 to uncurry curried functions.
Partial function application
A key value of curried functions is that they can be partially applied. For example consider
the curried binary function sum
declared in the earlier example. In the listing the binding sum10 is a function from Int to Int obtained by partially applying the
function sum. In effect the function
sum10 is defined as:
val sum10: (Int) ->
Int = {n: Int -> 10 + n}
The Dogs library includes a number of variants of the function partial. Here, we are using:
fun <A, B, C> partial(a: A, f: (A) -> (B) -> C): (B) -> C =
{b: B -> f(a)(b)}
in the following code:
fun add(m: Int, n: Int): Int = m + n
val sum: (Int) -> (Int) -> Int = {m: Int -> {n: Int -> m + n}}
val sum1: (Int) -> Int = partial(1, sum)
val add10: (Int) -> Int = partial(10, ::add)
assertEquals(11, sum1(10))
assertEquals(12, add10(2))
Further callable references
A callable reference to an existing function declaration was introduced earlier. A callable reference to a class member function or to a an extension function is also supported. In the following listing the first three val bindings refer to function members of the String class. In the callable reference String::length the function length of class String has no parameters and returns an Int. The function type for this includes the class String as the first function type parameter.
val strLength: (String) -> Int = String::length
val subStr: (String, IntRange) -> String = String::substring
val subStrC: (String) -> (IntRange) -> String = C2(String::substring)
val crop: (String) -> String = String::trim
val clip: (String, (Char) -> Boolean) -> String = String::trim
val take: (Int) -> (String) -> String = flip(C2(String::take))
val take4: (String) -> String = take(4)
assertEquals(3, strLength("Ken"))
assertEquals("nn", subStr("Kenneth", 2..3))
assertEquals("nn", subStrC("Kenneth")(2..3))
assertEquals("ken", crop(" ken "))
assertEquals("ken", clip("123ken4", Character::isDigit))
assertEquals("Kenn", take4("Kenneth"))
assertEquals("Barc", take4("Barclay"))
The callable reference operator :: can also be used with overloaded functions when the expected type is known from the context. This is shown using the overloaded substring function from class String.
At the time of writing (April 2020) there are some problems when using callable references to function members of a generic class. Workarounds are possible and need to be used.
Case Study
This example includes a number of individual functions which together compile the index of the words in a text document. The index is an alphabetical list of the words and, for each word, the line numbers in the document on which it appears. The solution is a composition of the functions that perform the various operations on the text.
We start by introducing a number of aliases that make it easier to interpret the signatures of the various functions:
typealias Document = String
typealias Line = String
typealias Word = String
typealias Number = Int
typealias LineNumber = Pair<Line, Number>
typealias WordNumber = Pair<Word, Number>
typealias WordNumbers = Pair<Word, List<Number>>
The use of Document, Line and Word, for example, help to distinguish how a String is being used.
We start with a Document and break it into one or more Line at the line breaks. The function to do this is called lines and has the signature:
fun lines(document: Document): List<Line>
Once we have the lines from the document we use the function numberedLines to convert each Line to a LineNumber, pairing the line with its line number:
fun numberedLines(lines: List<Line>): List<LineNumber>
At this point we have the means to convert a document into its lines and their corresponding line number:
::lines pipe ::numberedLines
The next task is to break the numbered lines into words numbered with their line number:
fun numberedWords(lines: List<LineNumber>): List<WordNumber>
This function additionally transforms each word to lowercase so that, for example, all the 'anonymous' and 'Anonymous' are considered identical. To group the word number pairs we sort them:
fun sort(numberedWords: List<WordNumber>): List<WordNumber>
then merge the line numbers of adjacent matching words:
fun merge(words: List<WordNumber>): List<WordNumbers>
Finally we provide the means whereby we select the first n elements:
val take: (Int) -> (List<WordNumbers>) -> List<WordNumbers>
Now we can define:
fun wordConcordance(n: Int): (Document) -> List<WordNumbers> =
::lines pipe ::numberedLines pipe ::numberedWords pipe ::sort pipe ::merge pipe take(n)
The declaration of wordConcordance is a pipeline of six component functions glued through functional composition.
We run our code against the following document:
There are several other ways to obtain an instance of a function type
Two of the more common are anonymous functions and lambda expressions
An anonymous function or function expression has a regular function
declaration except its name is omitted The parameter types and return
type are specified as for regular functions The body of an anonymous
function can be an expression or a block The return type of the anonymous
function can be inferred automatically from the function if it is an expression
and has to be specified explicitly for the anonymous function if it is a block
Executing concordance(7)(document) and printing the result delivers:
<[(a, <[1, 3, 6, 8]>), (an, <[1, 3, 5, 6, 7]>), (and, <[2, 4, 8]>), (anonymous, <[2, 3, 5, 6, 8]>), (are, <[1, 2, 5]>), ... ]>
val strLength: (String) -> Int = String::length
val subStr: (String, IntRange) -> String = String::substring
val subStrC: (String) -> (IntRange) -> String = C2(String::substring)
val crop: (String) -> String = String::trim
val clip: (String, (Char) -> Boolean) -> String = String::trim
val take: (Int) -> (String) -> String = flip(C2(String::take))
val take4: (String) -> String = take(4)
assertEquals(3, strLength("Ken"))
assertEquals("nn", subStr("Kenneth", 2..3))
assertEquals("nn", subStrC("Kenneth")(2..3))
assertEquals("ken", crop(" ken "))
assertEquals("ken", clip("123ken4", Character::isDigit))
assertEquals("Kenn", take4("Kenneth"))
assertEquals("Barc", take4("Barclay"))
The callable reference operator :: can also be used with overloaded functions when the expected type is known from the context. This is shown using the overloaded substring function from class String.
At the time of writing (April 2020) there are some problems when using callable references to function members of a generic class. Workarounds are possible and need to be used.
Case Study
This example includes a number of individual functions which together compile the index of the words in a text document. The index is an alphabetical list of the words and, for each word, the line numbers in the document on which it appears. The solution is a composition of the functions that perform the various operations on the text.
We start by introducing a number of aliases that make it easier to interpret the signatures of the various functions:
typealias Document = String
typealias Line = String
typealias Word = String
typealias Number = Int
typealias LineNumber = Pair<Line, Number>
typealias WordNumber = Pair<Word, Number>
typealias WordNumbers = Pair<Word, List<Number>>
The use of Document, Line and Word, for example, help to distinguish how a String is being used.
We start with a Document and break it into one or more Line at the line breaks. The function to do this is called lines and has the signature:
fun lines(document: Document): List<Line>
Once we have the lines from the document we use the function numberedLines to convert each Line to a LineNumber, pairing the line with its line number:
fun numberedLines(lines: List<Line>): List<LineNumber>
At this point we have the means to convert a document into its lines and their corresponding line number:
::lines pipe ::numberedLines
The next task is to break the numbered lines into words numbered with their line number:
fun numberedWords(lines: List<LineNumber>): List<WordNumber>
This function additionally transforms each word to lowercase so that, for example, all the 'anonymous' and 'Anonymous' are considered identical. To group the word number pairs we sort them:
fun sort(numberedWords: List<WordNumber>): List<WordNumber>
then merge the line numbers of adjacent matching words:
fun merge(words: List<WordNumber>): List<WordNumbers>
Finally we provide the means whereby we select the first n elements:
val take: (Int) -> (List<WordNumbers>) -> List<WordNumbers>
Now we can define:
fun wordConcordance(n: Int): (Document) -> List<WordNumbers> =
::lines pipe ::numberedLines pipe ::numberedWords pipe ::sort pipe ::merge pipe take(n)
The declaration of wordConcordance is a pipeline of six component functions glued through functional composition.
We run our code against the following document:
There are several other ways to obtain an instance of a function type
Two of the more common are anonymous functions and lambda expressions
An anonymous function or function expression has a regular function
declaration except its name is omitted The parameter types and return
type are specified as for regular functions The body of an anonymous
function can be an expression or a block The return type of the anonymous
function can be inferred automatically from the function if it is an expression
and has to be specified explicitly for the anonymous function if it is a block
Executing concordance(7)(document) and printing the result delivers:
<[(a, <[1, 3, 6, 8]>), (an, <[1, 3, 5, 6, 7]>), (and, <[2, 4, 8]>), (anonymous, <[2, 3, 5, 6, 8]>), (are, <[1, 2, 5]>), ... ]>
showing the word 'anonymous' appearing on lines 2, 3, 5, 6 and 8.
If we run concordance(4)(document) we have the following correct assertion:
assertEquals(
ListF.of(
WordNumbers("a", ListF.of(1, 3, 6, 8)),
WordNumbers("an", ListF.of(1, 3, 5, 6, 7)),
WordNumbers("and", ListF.of(2, 4, 8)),
WordNumbers("anonymous", ListF.of(2, 3, 5, 6, 8))
),
concordance
)
Here, our aim was to build a concordance of the words in a document, namely, a pairing of each word and the line numbers on which it appears. The Dogs library includes the MultiMap data type - a MultiMap is a special kind of Map: one that allows a collection of elements to be associated with each key. Using a MultiMap would have simplified this solution.
This simple idea scales to the more realistic business problem of an order taking system for a manufacturing company. The workflow for the order taking system can be broken into smaller and simpler stages: validation, pricing, etc. We can create a pipeline to represent the business process as:
class UnvalidatedOrder
class ValidatedOrder
class PricedOrder
class CompletedOrder
fun validateOrder(uvo: UnvalidatedOrder): ValidatedOrder = TODO()
fun priceOrder(vo: ValidatedOrder): PricedOrder = TODO()
fun completeOrder(po: PricedOrder): CompletedOrder = TODO()
val placeOrder: (UnvalidatedOrder) -> CompletedOrder = ::validateOrder pipe ::priceOrder pipe ::completeOrder
Of course, any of the stages might give rise to some sort of failure. For example, when validating an order we might find the unvalidated order may have omitted the quantity for an order item or referenced an unknown item and we will have to consider how to address failure.
The code for the Dogs library can be found at:
No comments:
Post a Comment