Thursday, June 21, 2012

function(groovy) #5


function(groovy) #5

If all you have is a hammer, everything looks like a nail.

Abstract: One of the challenges of a new paradigm is learning to think in an entirely new way. The ability to pass code as a parameter introduces a new style to the imperative programmer. A declarative style abstracts out generic pieces of machinery and customizes them via higher-order functions.

Ken Barclay
kenbarc@googlemail.com

Previous:           The iterator methods
Resources:        Chapter 12 Higher Order Functions

The iterator functions

Many functions to perform functional style List manipulation are defined as static methods and closures in the companion class GListF. The class is defined as abstract so there is no intention of creating an instance. It simply acts as a repository for these useful functions. The arrangement is not unlike the latest Groovy extension methods.

For example, this class defines the function map that transforms a list of values in the manner of the GDK's collect method using a function as its transformer. The map function is provided in two flavors: a curried form and an uncurried form. Of course, the function C is used to define the curried version from its uncurried version. The curried version is given the suffix C in its name:

/**
 * Function map takes two parameters: a function and a list. Function map
 *   applies the function parameter to each item in the list, delivering
 *   a new list.
 *
 * map:: (X -> Y) * [X] -> [Y]
 * mapC:: (X -> Y) -> [X] -> [Y]
 *
 * @param fxy     pure function:: X -> Y
 * @param ys      existing list
 * @return        new list of transformed values
 */
private static final <X, Y> List<Y> _map(final Closure<Y> f, final List<X> xs) { xs.collect(f) }
public static final map = GListF.&_map
public static final mapC = C(map)

For most applications function map will suffice. However, in some advanced examples we will find we wish to call mapC with its single function parameter.

Observe how the behavior is first defined with the _map method. The closures map and mapC are defined from it. The &. operator is used to get a reference to a method, and from it create a reference to a closure. The aim of this scheme is to define the _map method with the generic type parameters X and Y. Including both the parameter type and result type fully documents the values involved in using this method. Groovy does not support generic type parameters when defining closures and so this approach circumvents this limitation. A Groovy static type checker would assist greatly.

All the methods/closures defined in GListF present the Groovy list with an immutable interface. As a consequence, some of the operations are not especially efficient. This issue is revisited in a later blog where we implement a more efficient immutable list. Notwithstanding this, we can begin presenting solutions that are more declarative than would be possible using Groovy lists directly.



Mapping

The GDK's collect iterator method is used to iterate through the elements of the receiving list object and transforms each element into a new value using the closure parameter as a transformer, returning a list of transformed values.

A pictorial representation of how map behaves is shown in Figure 5.1. On the left we show some function (closure) f which when applied to a single parameter (represented by the circle) delivers a value (the square). Now, if we have a list of values of type circle, then map f list delivers a list of values of type square produced by applying f to each of the originals.

Figure 5.1: Application of map


As we noted, the companion class GListF contains a rich selection of functions (static closure definitions) for processing lists. Example 5.1 revisits the previous example, this time using the mapC function from GListF.

Example 5.1: Curried map

import static com.adt.groovy.data.immutable.list.GListF.mapC

    // multiply:: Integer * Integer -> Integer
final multiply = { final int x, final int y -> x * y}
    // twice:: Integer -> Integer
final twice = multiply.curry(2)
    // quadruple:: Integer -> Integer
final quadruple = multiply.curry(4)

    // twiceAll:: [Integer] -> [Integer]
final twiceAll = mapC(twice)
    // quadrupleAll:: [Integer] -> [Integer]
final quadrupleAll = mapC(quadruple)

final List<Integer> numbers = [1, 2, 3, 4]
assert twiceAll(numbers) == [2, 4, 6, 8]
assert quadrupleAll(numbers) == [4, 8, 12, 16]

It is relatively easy to prove that map(f g, xs) = (map f map g)(xs), where ● represents function composition. Applying f g to every element of a list is the same as applying g to every member of the list and then applying f to every member of the resulting list. The equality demonstrates how we might refactor our code to make it more efficient. If our code includes (map f map g) then we optimize it with map(f g). The former makes one pass across the initial list then a second pass across the result list. The latter makes only a single pass across the initial list reducing the amount of work to be done. This equality is demonstrated in Example 5.2.

Example 5.2: Map composition equality

import static com.adt.groovy.data.immutable.list.GListF.mapC
import static com.adt.groovy.data.immutable.list.GListF.map

    // twice:: Integer -> Integer
final twice = { final int x -> 2 * x}
    // successor:: Integer -> Integer
final successor = { final int x -> 1 + x}

    // composeTwiceSuccessor:: Integer -> Integer
final composeTwiceSuccessor = twice >> successor

    // composeMapTwiceMapSuccessor:: [Integer] -> [Integer]
final composeMapTwiceMapSuccessor = mapC(twice) >> mapC(successor)

final List<Integer> numbers = [1, 2, 3, 4]
assert map(composeTwiceSuccessor, numbers) == composeMapTwiceMapSuccessor(numbers)

The function composeTwiceSuccessor transforms an Integer into another Integer. The left side of the assert applies this transformation to the list of numbers. On the right side of the assert the function composeMapTwiceMapSuccessor first doubles the values in the numbers list, then increments by one the values in the resulting list.



Filtering

It is also common to select all the elements from a list with some given property. For example we might wish to choose only those overdrawn bank accounts from a list of accounts. We usually describe this as filtering. Example 5.3 introduces the higher order function filter in which the selection criterion is specified by the function parameter. The selection function is sometimes known as a predicate ― a function returning a Boolean value which describes whether the criterion has been met.

Example 5.3: The filter function

import static com.adt.groovy.data.immutable.list.GListF.filter

    // isEven:: Integer -> Boolean
final isEven = { final int x -> (x % 2 == 0)}

final List<Integer> numbers = [11, 12, 13, 14]
assert filter(isEven, numbers) == [12, 14]
assert filter(isEven, []) == []

def bk1 = new Book(title: 'Groovy Programming', author: 'Barclay', price: 22.90, category: 'CompSci')
def bk2 = new Book(title: 'Groovy in Action', author: 'Konig', price: 32.00, category: 'CompSci')
def bk3 = new Book(title: 'Grails', author: 'Rocher', price: 29.00, category: 'CompSci')
def bk4 = new Book(title: 'Thinking in Java', author: 'Eckel', price: 24.50, category: 'CompSci')

def List<Book> books = [bk1, bk2, bk3, bk4]

assert filter({bk -> (bk.price < 25.00)}, books) == [bk1, bk4]

The GDK's iterator method findAll is used to implement the filter function.

Earlier we showed how proving equality of two mappings can lead to more efficient programs. We can also show that:

filter p ● map f = map f ● filter(p ● f)

The equation says that a map followed by a filter can be replaced by a filter followed by a map. The right side is potentially more efficient than the left since the map could be applied to a shorter list. This is demonstrated by Example 5.4.

Example 5.4: Maps and filters

import static com.adt.groovy.data.immutable.list.GListF.filterC
import static com.adt.groovy.data.immutable.list.GListF.mapC

    // isWrittenBy:: String -> Book -> Boolean
final isWrittenBy = { final String auth -> { final Book book -> (book.author == auth)}}

    // priceIncrease:: BigDecimal -> Book -> Book
final priceIncrease = { final BigDecimal amount -> { final Book book -> new Book(title: book.title, author: book.author, price: book.price + amount, category: book.category)}}

final mapThenFilter = mapC(priceIncrease(2.00)) >> filterC(isWrittenBy('Ken Barclay'))
final filterThenMap = filterC(priceIncrease(2.00) >> isWrittenBy('Ken Barclay')) >> mapC(priceIncrease(2.00))

final book1 = new Book(title: 'Groovy Programming', author: 'Ken Barclay', price: 20.45, category: 'CompSci')
final book2 = new Book(title: 'Groovy in Action', author: 'Dierk Konig', price: 32.00, category: 'CompSci')
final book3 = new Book(title: 'Object Oriented Design', author: 'Ken Barclay', price: 25.50, category: 'CompSci')
final book4 = new Book(title: 'Programming Groovy', author: 'Venkat Subramaniam', price: 29.50, category: 'CompSci')
final List<Book> books = [book1, book2, book3, book4]

assert mapThenFilter(books) == filterThenMap(books)

Figure 5.2 describes the filter closure. If p represents some predicate function (closure, returning a boolean value), then when applied to a triangle it delivers true and when applied to a square it produces false. Now filter p list delivers a new list containing only those triangles that satisfy the predicate.

Figure 5.2: Application of filter




Folding

It is also common to reduce a collection to a single value. For example we might wish to sum or multiply all the elements in a list. This is the basis for a particular kind of higher order function called folding. Folding is used to fold a function (such as a summation) into a list of values.

The fold function reduces a list to a single value by "folding" a binary operator between successive elements of the list. For example, consider we have the list [x1, x2, x3, ..., xn], an initial value e, and the binary operator □, then:

fold □ e [x1, x2, x3, ..., xn] = e □ x1 □ x2 □ x3 □ ... □ xn

The fold function takes a binary function □, together with an initial value e and a list. Notice that the version of fold above is left-associative. In other words, the result it produces is equivalent to doing:

( ... (((e □ x1) □ x2) □ x3) ... □ xn)

Example 5.5 introduces the function foldLeft implemented using the GDK's each iterator method. It has three parameters: the binary function to be folded in, the initial value when starting the folding, and the list of values. The type signature for the function given as:

/**
 * foldLeft is a higher-order function that folds a binary function into a
 *   list of values. Effectively:
 *
 *   foldLeft(f, e, [x1, x2, ..., xn]) = (...((e f x1) f x2) f...) f xn
 *
 * foldLeft:: (Y -> X -> Y) * Y * [X] -> Y
 * foldLeftC:: (Y -> X -> Y) -> Y -> [X] -> Y
 *
 * @param f     pure binary function:: Y -> X -> Y
 * @param e       initial value
 * @param xs      existing list
 * @return        folded result
 */
private static final <X, Y> Y _foldLeft(final Closure<Y> f, final Y e, final List<X> xs) {
  Y res = e
  xs.each{ x -> res = f(res)(x) }
  return res
}
public static final foldLeft = GListF.&_foldLeft
public static final foldLeftC = C3(foldLeft)

Note that the function parameter is expected in its curried form. The function parameter combines a Y and an X to produce a Y. The initial value for foldLeft is of type Y and is combined with the first member of the list. The result of type Y is then combined with the second member of the list, and so on. Ultimately, a Y value is delivered. The companion class GListF includes the two variants foldLeft and foldLeftC.

Example 5.5: The foldLeft function

import static com.adt.groovy.fp.FunctionF.C
import static com.adt.groovy.data.immutable.list.GListF.appendC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // add:: Integer * Integer -> Integer
final add = { final int x -> { final int y -> x + y }}

final List<Integer> numbers = [11, 12, 13, 14]
final int sum = foldLeft(add, 0, numbers)
assert sum == 50

    // logicalAnd:: Boolean -> Boolean -> Boolean
final logicalAnd = { final boolean x -> { final boolean y -> x && y }}

final List<Boolean> truthful = [true, true, true]
final boolean allTruthful = foldLeft(logicalAnd, true, truthful)
assert allTruthful == true

final List<Integer> flatten = foldLeft(appendC, [], [[1, 2, 3], [4, 5], [6]])
assert flatten == [1, 2, 3, 4, 5, 6]

Folding is a surprisingly fundamental operation. Its generality allows us to define many standard list processing operations. Example 5.6 defines the primitive operation reverse in terms of a left fold. The initial value given to foldLeft is the empty list [] into which the reversed list is assembled. The binary function required by foldLeft must then have the signature [X] -> X -> [X] where the second parameter is prepended on to the list first parameter. Of course consC does just this but has its parameter in the opposite order. Functions flip and flipC performs the necessary transformation, converting the function X -> Y -> Z to Y -> X -> Z.

Example 5.6: Folding and primitive operations

import static com.adt.groovy.fp.FunctionF.flipC
import static com.adt.groovy.data.immutable.list.GListF.consC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // reverse:: [A] -> [A]
final reverse = {xs -> foldLeft(flipC(consC), [], xs)}

assert reverse([1, 2, 3, 4]) == [4, 3, 2, 1]
assert reverse(['Ken', 'John', 'Jessie']) == ['Jessie', 'John', 'Ken']

These examples illustrate how the functional style gives us new and different building blocks. Some of the examples used the curried variants mapC and filterC to produce specialist processing functions. No new code had to be developed other than the closure parameters. The curried forms are our factories for functions.

The examples also show a reduced use of control flow such as for statements. This offers many advantages. First, the code is easier to read and write. That means it will be easier to reason about its correctness. Control flow statements are usually accompanied by state variables that are updated. In its place we see code in which we sequentially bind results to identifiers then finish the code with the resulting expression.

These extension methods are not, in some cases. especially efficient. For example, the concatenate method produces a new list by joining together two existing lists. Immutable data structures, the subject of the next blog, can offer a solution.

2 comments:

Unknown said...

Sorry Ken, but it appears that the Figures are not showing up in the document?

Sean J. said...

Hello Ken,
I have really appreciated your writing on Groovy both here in your blog and in your book.

Apologies for using your blog comment to ask this, as it's not related to the post, but I'm running out of paths to the answer.

The Groovy Programming Book solutions/example/code site put up via the University is as I'm sure you know, down. Do you have an alternate location where I might find those. Or can you email them to me directly. I'm a new programmer and would like the assurance that I'm going about groovy properly.
Thanks.
seanjohnson20@gmail.com