Friday, May 11, 2012

function(groovy) #2

function(groovy) #2

If all you have is a hammer, everything looks like a nail.

Abstract: The Groovy closure is central to making my coding more declarative. Different ways of employing the Groovy closure have all contributed to a more declarative style. The ability to combine simple closures into more complex closures encourages a compositional style not used in OO-land. Method bodies can be simplified by defining simple and complex closures that provide the required logic, often without recourse to any local method state variables.

Ken Barclay

Previous:           Introduction
Resources:        Chapter 11 Closures
                        Chapter 12Higher Order Functions

The Groovy closure

Fundamental to this work is the Groovy closure. A closure is an object that represents an executable block of code. Like other objects, a closure can be presented with a literal, can be  referenced by an identifier, have a method invoked on it, passed as a parameter in a method call or closure call, can be the property of some other object, or be a member of a collection such as a list. The closure was probably what first attracted me to Groovy.

A closure carries an implicit binding to all the variables referenced within it. In other words, the closure encloses a context around what it references. This means that any variables in scope when the closure is defined can be used and modified by the closure. This will prove very important in the next and following blogs.

Groovy supports writing anonymous closures that can be used in our programming when there is no requirement to define an identifier for it. Anonymous closures are expressed as closure literals. Since a closure literal is an instance of the class Closure, then to obtain some behavior from it, we invoke one of the class methods. For the moment we are principally interested in the method call that executes the closure. Example 2.1 illustrates.

Example 2.1: Closure literals

assert { int x -> return x * x }.call(2) == 4   // explicit return
assert { int x -> return x * x }(2) == 4           // implicit call
assert { int x -> 1 + x }(2) == 3                     // implicit return
assert { x -> 1 + x }(2) == 3                           // optional types

The closure maximum3 in Example 2.2 finds the larger of its three integer parameters. It defines the identifier maximum and binds it to a local closure. This is referred to as a nested closure since it is introduced in the body of the enclosing closure. Nested closures have the enclosing closure body in which they are defined as their scope. This means that they can only be referenced in their scope and have no existence outside of this scope. We will use this in what follows.

The context for the nested closure maximum includes those identifiers in the scope of the enclosing closure maximum3. This means that within the body of maximum we could refer directly to the identifiers x, y and z. It also means that the definition of maximum may not include definitions for these identifiers otherwise an error reporting that the current scope already includes the variable definition. As shown, we have had to name the formal parameters of maximum as anything other than x, y or z.

Example 2.2: Scope

    // maximum3:: Integer * Integer * Integer -> Integer
final Closure<Integer> maximum3 = { final int x, final int y, final int z ->

        // maximum:: Integer * Integer -> Integer
  final Closure<Integer> maximum = { final int a, final int b ->
    (a > b) ? a : b
  maximum(x, maximum(y, z))

assert maximum3(12, 34, 56) == 56
assert maximum3(56, 34, 12) == 56
assert maximum3(12, 34, 34) == 34
assert maximum3(12, 12, 12) == 12

A recursive closure is one that calls or invokes itself. What we cannot do is make a direct recursive call through the name of the closure as suggested by:

final factorial = { final int n ->
    return (n == 0) ? 1 : n * factorial(n - 1)                   // ERROR

The problem is that the identifier factorial is not bound to the closure literal until the closure definition is complete. Hence the recursive call in the closure body refers to the, as yet, unknown identifier factorial.

Examples 2.3 and 2.4 show alternative strategies. In Example 2.3 the identifier recFactorial is defined but uninitialized. On the next statament it is assigned to the closure literal. This time the recursive call is accepted since the recFactorial identifier is known to the compiler. The approach, of course, is dependent on assignment in which an identifier is bound once to its default null value, then assigned a closure reference.

Example 2.3: Recursive closures

    // factorial:: Integer -> Integer
final Closure<Integer> factorial = { final int n ->
  def recFactorial
  recFactorial = { final int m ->                     // recFactorial:: Integer -> Integer
    (m == 0) ? 1 : m * recFactorial(m - 1)

assert factorial(5) == 120
assert factorial(1) == 1
assert factorial(0) == 1

In Example 2.4 we avoid this problem by defining recFactorial to have an additional formal parameter, self, that when the closure is called is bound to the closure itself. Within the closure body we can use this identifier to make the recursive call. The enclosing closure factorial calls recFactorial and passes recFactorial as the first actual parameter.

Example 2.4: Recursive closures

    // factorial:: Integer -> Integer
final Closure<Integer> factorial = { final int n ->
  final recFactorial = { final self, final int m ->          // recFactorial:: Closure * Integer -> Integer
    (m == 0) ? 1 : m * self(self, m - 1)
  recFactorial(recFactorial, n)

assert factorial(5) == 120
assert factorial(1) == 1
assert factorial(0) == 1

Closures that call themselves as their last action are called tail recursive. Closure recFactorial is made tail recursive by introducing a second accumulating parameter. We see in Example 2.5 how Groovy's trampoline method is then used to make calls sequentially using only a single stack frame until the result (a very, very, very large number) is delivered.

Example 2.5: Tail recursive factorial closure

    // factorial:: BigInteger -> BigInteger
final factorial = { final BigInteger n ->
  def recFactorial
  recFactorial = { final BigInteger m, final BigInteger acc ->   // recFactorial:: BigInteger -> BigInteger
    (m == 0) ? acc : recFactorial.trampoline(m - 1, m * acc)
  recFactorial = recFactorial.trampoline()
  recFactorial(n, 1G)

println factorial(500G)

Recursive functions generally bring one of our declarative features to the table, namely the absence, or at least a reduction, of state. One could imagine an implementation for factorial in terms of a program loop with a control variable as well as a local variable to accumulate the result. So where has the state gone in the recursive solutions? In effect, the state is conveyed through the actual parameter values. Tagging the formal parameters as final further emphasizes our intention not to vary these references. Further, it is generally easier to reason about a recursive function than an imperative one since it is expressed in terms of use-cases ─ a base case and one or more than recursive cases.

Closures can also be parameters to methods and to other closures. As we will shortly demonstrate, closures can also be the return value from a method/closure call. Invariably the closures have one or more parameters and deliver a result value. In effect they operate like mathematical functions. As a consequence we shall consider the terms closure and closure function (or simply function) as synonyms. We will use the term function when describing a transformation of inputs into an output. The term closure will be reserved for when we are discussing some technical aspect of the Groovy constructs.

Through closure functions as parameters and closure functions as results we are able to fully exploit the machinery of these general functions, such as function composition. The resulting closure function definitions can be both more concise and more readable than traditional approaches. This can sometime prove  a difficult and alien concept for the imperative, object-oriented programmer, but one worth learning.

In Example 2.6 function twice is a higher order function as it expects a function as its first parameter. The interpretation of the signature for twice is that it returns an Integer as its value. Two parameters are required, the second of which is an Integer. The first parameter is the function parameter. Its signature is Integer -> Integer showing that we expect a function of this type as the first actual parameter. The effect of twice is to repeatedly apply the function to the numeric parameter.

Example 2.6: Simple higher order closure function

    // triple:: Integer -> Integer
final Closure<Integer> triple = { final Integer x -> 3 * x }

    // quadruple:: Integer -> Integer
final Closure<Integer> quadruple = { final Integer x -> 4 * x }

assert triple(2) == 6
assert quadruple(3) == 12

    // twice:: (Integer -> Integer) * Integer -> Integer
final Closure<Integer> twice = { final Closure<Integer> f, final Integer x ->
  return f(f(x))

assert twice(triple, 2) == 18
assert twice(quadruple, 3) == 48

Perhaps one of the more important characteristics of functions is composition, wherein you can define one function whose purpose is to combine other functions. Using composition, two or more simple functions can be combined to produce a more elaborate one. For example, suppose we have two arithmetic functions f and g, as in z = f(t) and y = g(t). One way of composing these two functions would be to first compute y, and then to compute z from y, as in y = g(x) followed by z = f(y). One could get the same result with the single composition written z = f(g(x)). We write this as f g, meaning function f composed with function g.

Function composition lets us compose a solution by combining separately designed and implemented functions. Structuring a large solution into separate functions simplifies the programming activity. The functions can be defined, implemented and tested separately. Simple functions can be used to construct more elaborate functions, reducing the burden and complexity of a system. The functions can also be defined for one application and re-used in other applications.

The latest compiler releases for Groovy now includes overloaded versions of the leftShift operator (<<) and the rightShift operator (>>) that apply to closures. The << operator acts exactly as function composition as described above. The expression c1 << c2 first applies the function c2 to its argument then function c1 to the result. Equally c1 >> c2 is the forward composition operator, applying c1 to its argument then c2 to the result. Example 2.7 illustrates.

Example 2.7: Function composition

final successor = { final int x -> return 1 + x }
final triple = { final int x -> return 3 * x }
final quadruple = { final int x -> return 4 * x }

assert successor(5) == 6
assert triple(5) == 15
assert quadruple(5) == 20

final twelveTimes = triple << quadruple
final threePlusOne = triple >> successor

assert twelveTimes(5) == 60
assert threePlusOne(5) == 16

Since a closure is an object it can be an element in a list. A list of closures could then act as the list of handlers for a menu. A dynamically changing menu could be mirrored by a dynamically changing list of closures. Anyone of these closures might be assembled dynamically as we have presented here.

The command design pattern provides a way to encapsulate behavior into a class. But wrapping the behavior I want in a command class is an artifact of OO programming. The command design pattern can be achieved with little ceremony using Groovy functions.

Various factory design patterns are used throughout OO-land. The earlier examples showed currying as a mechanism for developing a family of functions from a single definition. In effect currying is a form of factory for functions.


Michael said...

Nice postings ken, but that font is really hard to read a lot of copy in. A non italic or non-cursive style would be far easier on my eye and probably a lot of others who stare at a monitor for 10+ hrs a day.

Tim Yates said...

FYI, the recursive factorial closure just before example 2.3 can be rewritten to wrap the recursive closure call in another closure, and then use to start the recursion