Thursday, June 21, 2012

function(groovy) #5


function(groovy) #5

If all you have is a hammer, everything looks like a nail.

Abstract: One of the challenges of a new paradigm is learning to think in an entirely new way. The ability to pass code as a parameter introduces a new style to the imperative programmer. A declarative style abstracts out generic pieces of machinery and customizes them via higher-order functions.

Ken Barclay
kenbarc@googlemail.com

Previous:           The iterator methods
Resources:        Chapter 12 Higher Order Functions

The iterator functions

Many functions to perform functional style List manipulation are defined as static methods and closures in the companion class GListF. The class is defined as abstract so there is no intention of creating an instance. It simply acts as a repository for these useful functions. The arrangement is not unlike the latest Groovy extension methods.

For example, this class defines the function map that transforms a list of values in the manner of the GDK's collect method using a function as its transformer. The map function is provided in two flavors: a curried form and an uncurried form. Of course, the function C is used to define the curried version from its uncurried version. The curried version is given the suffix C in its name:

/**
 * Function map takes two parameters: a function and a list. Function map
 *   applies the function parameter to each item in the list, delivering
 *   a new list.
 *
 * map:: (X -> Y) * [X] -> [Y]
 * mapC:: (X -> Y) -> [X] -> [Y]
 *
 * @param fxy     pure function:: X -> Y
 * @param ys      existing list
 * @return        new list of transformed values
 */
private static final <X, Y> List<Y> _map(final Closure<Y> f, final List<X> xs) { xs.collect(f) }
public static final map = GListF.&_map
public static final mapC = C(map)

For most applications function map will suffice. However, in some advanced examples we will find we wish to call mapC with its single function parameter.

Observe how the behavior is first defined with the _map method. The closures map and mapC are defined from it. The &. operator is used to get a reference to a method, and from it create a reference to a closure. The aim of this scheme is to define the _map method with the generic type parameters X and Y. Including both the parameter type and result type fully documents the values involved in using this method. Groovy does not support generic type parameters when defining closures and so this approach circumvents this limitation. A Groovy static type checker would assist greatly.

All the methods/closures defined in GListF present the Groovy list with an immutable interface. As a consequence, some of the operations are not especially efficient. This issue is revisited in a later blog where we implement a more efficient immutable list. Notwithstanding this, we can begin presenting solutions that are more declarative than would be possible using Groovy lists directly.



Mapping

The GDK's collect iterator method is used to iterate through the elements of the receiving list object and transforms each element into a new value using the closure parameter as a transformer, returning a list of transformed values.

A pictorial representation of how map behaves is shown in Figure 5.1. On the left we show some function (closure) f which when applied to a single parameter (represented by the circle) delivers a value (the square). Now, if we have a list of values of type circle, then map f list delivers a list of values of type square produced by applying f to each of the originals.

Figure 5.1: Application of map


As we noted, the companion class GListF contains a rich selection of functions (static closure definitions) for processing lists. Example 5.1 revisits the previous example, this time using the mapC function from GListF.

Example 5.1: Curried map

import static com.adt.groovy.data.immutable.list.GListF.mapC

    // multiply:: Integer * Integer -> Integer
final multiply = { final int x, final int y -> x * y}
    // twice:: Integer -> Integer
final twice = multiply.curry(2)
    // quadruple:: Integer -> Integer
final quadruple = multiply.curry(4)

    // twiceAll:: [Integer] -> [Integer]
final twiceAll = mapC(twice)
    // quadrupleAll:: [Integer] -> [Integer]
final quadrupleAll = mapC(quadruple)

final List<Integer> numbers = [1, 2, 3, 4]
assert twiceAll(numbers) == [2, 4, 6, 8]
assert quadrupleAll(numbers) == [4, 8, 12, 16]

It is relatively easy to prove that map(f g, xs) = (map f map g)(xs), where ● represents function composition. Applying f g to every element of a list is the same as applying g to every member of the list and then applying f to every member of the resulting list. The equality demonstrates how we might refactor our code to make it more efficient. If our code includes (map f map g) then we optimize it with map(f g). The former makes one pass across the initial list then a second pass across the result list. The latter makes only a single pass across the initial list reducing the amount of work to be done. This equality is demonstrated in Example 5.2.

Example 5.2: Map composition equality

import static com.adt.groovy.data.immutable.list.GListF.mapC
import static com.adt.groovy.data.immutable.list.GListF.map

    // twice:: Integer -> Integer
final twice = { final int x -> 2 * x}
    // successor:: Integer -> Integer
final successor = { final int x -> 1 + x}

    // composeTwiceSuccessor:: Integer -> Integer
final composeTwiceSuccessor = twice >> successor

    // composeMapTwiceMapSuccessor:: [Integer] -> [Integer]
final composeMapTwiceMapSuccessor = mapC(twice) >> mapC(successor)

final List<Integer> numbers = [1, 2, 3, 4]
assert map(composeTwiceSuccessor, numbers) == composeMapTwiceMapSuccessor(numbers)

The function composeTwiceSuccessor transforms an Integer into another Integer. The left side of the assert applies this transformation to the list of numbers. On the right side of the assert the function composeMapTwiceMapSuccessor first doubles the values in the numbers list, then increments by one the values in the resulting list.



Filtering

It is also common to select all the elements from a list with some given property. For example we might wish to choose only those overdrawn bank accounts from a list of accounts. We usually describe this as filtering. Example 5.3 introduces the higher order function filter in which the selection criterion is specified by the function parameter. The selection function is sometimes known as a predicate ― a function returning a Boolean value which describes whether the criterion has been met.

Example 5.3: The filter function

import static com.adt.groovy.data.immutable.list.GListF.filter

    // isEven:: Integer -> Boolean
final isEven = { final int x -> (x % 2 == 0)}

final List<Integer> numbers = [11, 12, 13, 14]
assert filter(isEven, numbers) == [12, 14]
assert filter(isEven, []) == []

def bk1 = new Book(title: 'Groovy Programming', author: 'Barclay', price: 22.90, category: 'CompSci')
def bk2 = new Book(title: 'Groovy in Action', author: 'Konig', price: 32.00, category: 'CompSci')
def bk3 = new Book(title: 'Grails', author: 'Rocher', price: 29.00, category: 'CompSci')
def bk4 = new Book(title: 'Thinking in Java', author: 'Eckel', price: 24.50, category: 'CompSci')

def List<Book> books = [bk1, bk2, bk3, bk4]

assert filter({bk -> (bk.price < 25.00)}, books) == [bk1, bk4]

The GDK's iterator method findAll is used to implement the filter function.

Earlier we showed how proving equality of two mappings can lead to more efficient programs. We can also show that:

filter p ● map f = map f ● filter(p ● f)

The equation says that a map followed by a filter can be replaced by a filter followed by a map. The right side is potentially more efficient than the left since the map could be applied to a shorter list. This is demonstrated by Example 5.4.

Example 5.4: Maps and filters

import static com.adt.groovy.data.immutable.list.GListF.filterC
import static com.adt.groovy.data.immutable.list.GListF.mapC

    // isWrittenBy:: String -> Book -> Boolean
final isWrittenBy = { final String auth -> { final Book book -> (book.author == auth)}}

    // priceIncrease:: BigDecimal -> Book -> Book
final priceIncrease = { final BigDecimal amount -> { final Book book -> new Book(title: book.title, author: book.author, price: book.price + amount, category: book.category)}}

final mapThenFilter = mapC(priceIncrease(2.00)) >> filterC(isWrittenBy('Ken Barclay'))
final filterThenMap = filterC(priceIncrease(2.00) >> isWrittenBy('Ken Barclay')) >> mapC(priceIncrease(2.00))

final book1 = new Book(title: 'Groovy Programming', author: 'Ken Barclay', price: 20.45, category: 'CompSci')
final book2 = new Book(title: 'Groovy in Action', author: 'Dierk Konig', price: 32.00, category: 'CompSci')
final book3 = new Book(title: 'Object Oriented Design', author: 'Ken Barclay', price: 25.50, category: 'CompSci')
final book4 = new Book(title: 'Programming Groovy', author: 'Venkat Subramaniam', price: 29.50, category: 'CompSci')
final List<Book> books = [book1, book2, book3, book4]

assert mapThenFilter(books) == filterThenMap(books)

Figure 5.2 describes the filter closure. If p represents some predicate function (closure, returning a boolean value), then when applied to a triangle it delivers true and when applied to a square it produces false. Now filter p list delivers a new list containing only those triangles that satisfy the predicate.

Figure 5.2: Application of filter




Folding

It is also common to reduce a collection to a single value. For example we might wish to sum or multiply all the elements in a list. This is the basis for a particular kind of higher order function called folding. Folding is used to fold a function (such as a summation) into a list of values.

The fold function reduces a list to a single value by "folding" a binary operator between successive elements of the list. For example, consider we have the list [x1, x2, x3, ..., xn], an initial value e, and the binary operator □, then:

fold □ e [x1, x2, x3, ..., xn] = e □ x1 □ x2 □ x3 □ ... □ xn

The fold function takes a binary function □, together with an initial value e and a list. Notice that the version of fold above is left-associative. In other words, the result it produces is equivalent to doing:

( ... (((e □ x1) □ x2) □ x3) ... □ xn)

Example 5.5 introduces the function foldLeft implemented using the GDK's each iterator method. It has three parameters: the binary function to be folded in, the initial value when starting the folding, and the list of values. The type signature for the function given as:

/**
 * foldLeft is a higher-order function that folds a binary function into a
 *   list of values. Effectively:
 *
 *   foldLeft(f, e, [x1, x2, ..., xn]) = (...((e f x1) f x2) f...) f xn
 *
 * foldLeft:: (Y -> X -> Y) * Y * [X] -> Y
 * foldLeftC:: (Y -> X -> Y) -> Y -> [X] -> Y
 *
 * @param f     pure binary function:: Y -> X -> Y
 * @param e       initial value
 * @param xs      existing list
 * @return        folded result
 */
private static final <X, Y> Y _foldLeft(final Closure<Y> f, final Y e, final List<X> xs) {
  Y res = e
  xs.each{ x -> res = f(res)(x) }
  return res
}
public static final foldLeft = GListF.&_foldLeft
public static final foldLeftC = C3(foldLeft)

Note that the function parameter is expected in its curried form. The function parameter combines a Y and an X to produce a Y. The initial value for foldLeft is of type Y and is combined with the first member of the list. The result of type Y is then combined with the second member of the list, and so on. Ultimately, a Y value is delivered. The companion class GListF includes the two variants foldLeft and foldLeftC.

Example 5.5: The foldLeft function

import static com.adt.groovy.fp.FunctionF.C
import static com.adt.groovy.data.immutable.list.GListF.appendC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // add:: Integer * Integer -> Integer
final add = { final int x -> { final int y -> x + y }}

final List<Integer> numbers = [11, 12, 13, 14]
final int sum = foldLeft(add, 0, numbers)
assert sum == 50

    // logicalAnd:: Boolean -> Boolean -> Boolean
final logicalAnd = { final boolean x -> { final boolean y -> x && y }}

final List<Boolean> truthful = [true, true, true]
final boolean allTruthful = foldLeft(logicalAnd, true, truthful)
assert allTruthful == true

final List<Integer> flatten = foldLeft(appendC, [], [[1, 2, 3], [4, 5], [6]])
assert flatten == [1, 2, 3, 4, 5, 6]

Folding is a surprisingly fundamental operation. Its generality allows us to define many standard list processing operations. Example 5.6 defines the primitive operation reverse in terms of a left fold. The initial value given to foldLeft is the empty list [] into which the reversed list is assembled. The binary function required by foldLeft must then have the signature [X] -> X -> [X] where the second parameter is prepended on to the list first parameter. Of course consC does just this but has its parameter in the opposite order. Functions flip and flipC performs the necessary transformation, converting the function X -> Y -> Z to Y -> X -> Z.

Example 5.6: Folding and primitive operations

import static com.adt.groovy.fp.FunctionF.flipC
import static com.adt.groovy.data.immutable.list.GListF.consC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // reverse:: [A] -> [A]
final reverse = {xs -> foldLeft(flipC(consC), [], xs)}

assert reverse([1, 2, 3, 4]) == [4, 3, 2, 1]
assert reverse(['Ken', 'John', 'Jessie']) == ['Jessie', 'John', 'Ken']

These examples illustrate how the functional style gives us new and different building blocks. Some of the examples used the curried variants mapC and filterC to produce specialist processing functions. No new code had to be developed other than the closure parameters. The curried forms are our factories for functions.

The examples also show a reduced use of control flow such as for statements. This offers many advantages. First, the code is easier to read and write. That means it will be easier to reason about its correctness. Control flow statements are usually accompanied by state variables that are updated. In its place we see code in which we sequentially bind results to identifiers then finish the code with the resulting expression.

These extension methods are not, in some cases. especially efficient. For example, the concatenate method produces a new list by joining together two existing lists. Immutable data structures, the subject of the next blog, can offer a solution.

Monday, June 4, 2012

function(groovy) #4


function(groovy) #4

If all you have is a hammer, everything looks like a nail.

Abstract: The iterator methods on the collection classes take a higher order closure as a parameter. But they achieve much more when we recognize that we are relinquishing control over iteration in return for a general mechanism for use with collection frameworks. Higher order functions allow us to raise the abstraction level at which we operate.

Ken Barclay
kenbarc@googlemail.com

Previous:           Advanced closure
Resources:        Chapter 11 Closures

The iterator methods

Table 4.1 lists some of the iterator methods that can be used with the collection classes such as List and Map. Excepting method inject, all these methods are effectively defined on the Groovy Object class. The inject method is defined in the Collection class. Not all of the iterator methods are illustrated. For details of these and other methods see the Groovy documentation.

Table 4.1: Iterator methods

Name
Signature/description
any
boolean    any(Closure predicate)
Iterates over every element of this collection, and check that the predicate is valid for at least one element.
collect
List    collect(Closure transformer)
Iterates through this collection transforming each element into a new value using the closure as a transformer, returning a new List of transformed values.
each
void    each(Closure action)
Iterate through this object and execute the closure.
eachWithIndex
void    eachWithIndex(Closure action)
Iterate through this object and execute the closure with a counter.
every
boolean    every(Closure predicate)
Iterates over every element of this collection, and check that the predicate is valid for all elements.
find
Object    find(Closure predicate)
Find the first value matching the condition defined by the predicate closure. If no match is found, then return null.
findAll
List    findAll(Closure predicate)
Find all the values matching the condition defined by the predicate closure. If no match is found, then return the empty List.
grep
List    grep(Object filter)
Iterate over every element of this collection and return each object that matches the given filter – calling the isCase() method.
inject
Object    inject(Collection collection, Object init, Closure closure)
Iterates through this collection, passing the init value to the closure along with the first item. On subsequent iterations pass the previous closure value and the next iterated item. Upon completion return with the final closure value.

This group of methods take a closure (or function) as parameter. They are usually referred to as higher order functions. This capability is important for a number reasons. Higher order functions means that you can make assumptions how language elements will compose. Through this you can eliminate entire categories of methods in a class hierarchy by building general mechanisms that traverse structures such as lists and apply one or more functions (closures) to each element. This is the subject of this blog, illustrated with some examples.

            Example 4.1 shows method collect used with collections of people and 2-dimensional points. Class Person is defined with name and age properties. Class Point is defined with x and y properties and an implementation for the method equals. It also defines the method moveBy that produces a new Point instance with coordinates moved by the displacement given as parameters.

Example 4.1: Method collect

final Person p1 = new Person(name: 'Ken', age: 25)
final Person p2 = new Person(name: 'John', age: 31)
final Person p3 = new Person(name: 'Jessie', age: 22)

final List<Person> people = [p1, p2, p3]

assert people.collect{ p -> p.name } == ['Ken', 'John', 'Jessie']

final Point pt1 = new Point(x: 2, y : 3)
final Point pt2 = new Point(x: 4, y : 5)
final Point pt3 = new Point(x: 8, y : 12)

final List<Point> points = [pt1, pt2, pt3]

assert points.collect{ p -> p.moveBy(1, 2) } == [new Point(x: 3, y: 5), new Point(x: 5, y: 7), new Point(x: 9, y:14)]
.
The collect method is better known by the name map in the functional world ─ map some function across the elements of a collection.

            The findAll method is used to obtain the elements from a List or Map that match some criteria. In either case a List of successes is returned. If no such elements are found, then [] is returned. The criterion is specified by a predicate function. For a Map, the predicate must be defined in terms of one parameter, representing the map entry. Example 4.2 contains short illustrations. In the functional world this is usually known as filter.

Example 4.2: The findAll method

final List<Integer> numbers = [11, 12, 13, 14]

assert numbers.findAll{ elem -> elem > 12 } == [13, 14]
assert numbers.findAll{ elem -> elem > 14 } == []

assert ['John', 'Ken', 'Jessie'].findAll{ elem -> elem == 'Ken'} == ['Ken']

final Map<String, Integer> staffTel = ['Ken' : 1234, 'John' : 5678, 'Jessie' : 9012]

assert staffTel.findAll{entry -> entry.value > 5000}.collect{ elem -> elem.key}.contains('Jessie')

The inject method iterates through a collection, passing the initial value to the closure along with the first item of the collection. On subsequent iterations pass the previous closure result and the next iterated item from the collection. Upon completion of the iteration, return with the final closure value. Figure 4.1 describes the effect. First, the init value is used to initialize the res formal parameter of the closure. At the same time the first element from the list initializes the elem parameter. The final closure expression returns a value that re-initializes the res parameter with the second list element initializing the elem parameter. The process continues until the final list element is processed and the return from the closure is returned as the result of the inject method.

Figure 4.1: Description of the inject method

Example 4.3 shows the method inject used with collections of people and points. The first assert uses method inject to sum the ages of the people. In the second assert method inject is used to create a Map of People instances indexed by their name. In the third assert a List of the x values from the List of Points is produced. In the final assert a List of the pairs of the x and y values from the Point instances is created. The pairs are presented as sub-lists of length 2.

Example 4.3: Collections of people and 2D points

final Person p1 = new Person(name: 'Ken', age: 25)
final Person p2 = new Person(name: 'John', age: 31)
final Person p3 = new Person(name: 'Jessie', age: 22)

final List<Person> people = [p1, p2, p3]

assert people.inject(0){ res, p -> res += p.age } == 78
assert people.inject([:]){ res, p -> res[p.name] = p; res} == ['Ken': p1, 'John': p2, 'Jessie': p3]

final Point pt1 = new Point(x: 2, y : 3)
final Point pt2 = new Point(x: 4, y : 5)
final Point pt3 = new Point(x: 8, y : 12)

final List<Point> points = [pt1, pt2, pt3]

assert points.inject([]){ res, p -> res += p.x } == [2, 4, 8]
assert points.inject([]){ res, p -> res.add([p.x, p.y]); res} == [[2, 3], [4, 5], [8, 12]]

As we have shown, higher order functions means we can eliminate entire categories of methods on a collection class hierarchy. We build a general mechanism that traverses a collection and applies a higher order function to each element. For example, the collect method expects a function parameter that describes how to transform the elements of the collection. The findAll method expects a predicate closure that identifies the elements to be selected.

By comparison, re-implementing the examples in an imperative style would mean deploying for statements or iterator controlled while loops to manage the iteration. Further, we would be coding each atomic step when implementing the algorithm, often with state.

Relinquishing control over low-level details has been a constant trend in software development. Memory management and garbage collection are now given over to the runtimes. A functional or declarative style continues this development, taking control over iteration as demonstrated. Introducing a functional or declarative style as presented by the examples means we spend less effort in solving a problem (the how) and more on the processes (the what).