<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3255923023602927617</id><updated>2011-07-29T00:48:05.386-07:00</updated><category term='External DSLs'/><category term='Scala SwingBuilder'/><category term='Combinator Parsers'/><category term='MapReduce Groovy closure functional programming'/><category term='Groovy'/><title type='text'>Ken Barclay</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>7</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-6072610500764784738</id><published>2010-09-02T02:40:00.000-07:00</published><updated>2010-11-18T04:09:22.349-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scala SwingBuilder'/><title type='text'>Scala: SwingBuilder Essentials</title><content type='html'>&lt;style&gt;  code {    color: #151B8D;    font: normal 100% bold Courier  }    pre {    color: black;    background: #EAE5E5;    font: normal 100% bold Courier    border: 1px solid black;    width: 100%;  }/* Syntax highlighting----------------------------------------------- */  .type { color: rgb(0,44,221); }  .keyword { color: #151B8D; font-weight: bold; }  .javadoc_comment { color: #E42217; font-style: italic; }  .comment { color: #E42217; font-style: italic; }  .operator { color: rgb(0,124,31); }  .plain { color: rgb(0,0,0); }  .literal { color: rgb(188,0,0); }  .javadoc_tag { color: rgb(147,147,147); background-color: rgb(247,247,247); font-style: italic; font-weight: bold; }  .separator { color: rgb(0,33,255); }/* Page Structure----------------------------------------------- */  #outer-wrapper {    width:940px;    margin:0 auto;    text-align:left;    font: normal normal 100% 'Trebuchet MS',Verdana,Arial,Sans-serif;  }/* Header----------------------------------------------- */  #header {    background:#335577 no-repeat left top;    margin:22px 0 0 0;    padding:8px 0 0 0;    color:#ffffff;  }    #header h1 {    margin:0;    padding:10px 30px 5px;    line-height:1.2em;    font: normal bold 200% Verdana,Arial,Sans-serif;    text-align: center;  }  #header .description {    margin:0;    padding:5px 30px 10px;    line-height:1.5em;    font: normal normal 100% Verdana,Arial,Sans-serif;  }/* Main----------------------------------------------- */  #main {    background:#FAF8CC no-repeat left top;    margin:22px 0 0 0;    padding:8px 0 0 0;    color:#ffff;    font: normal normal 90% Verdana,Arial,Sans-serif;  }  &lt;/style&gt;&lt;br /&gt;&lt;br /&gt;&lt;div id="header"&gt;&lt;h1&gt;Scala Programming&lt;/h1&gt;&lt;div class="description"&gt;A programmer's blog&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div id="main"&gt;A distinguishing feature of the Groovy programming language is its builders.&lt;br /&gt;A Groovy builder lets you imbed hierarchical structures directly into your&lt;br /&gt;code. Groovy supports a range of builders including XML builder, Ant builder&lt;br /&gt;and Swing builder. The Grails web application development framework, built on&lt;br /&gt;Groovy, also supports JSON and Hibernate criteria builders. Interestingly, this&lt;br /&gt;range of builders is developed from a single common framework.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Builders offer a convenient way to construct hierarchical models with a direct&lt;br /&gt;correspondence between the hierarchy and the generated code. The following&lt;br /&gt;snippet demonstrates how, using a Scala &lt;code&gt;SwingBuilder&lt;/code&gt;, we might specify a&lt;br /&gt;simple UI:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;frame('title -&amp;gt; "Centigrade to Fahrenheit conversion", 'location -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; Point(100, 100), 'size -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; Dimension(500, 300)){&lt;br /&gt;  panel('layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")){&lt;br /&gt;      &lt;br /&gt;    panel('constraints -&amp;gt; "dock north", 'layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) {&lt;br /&gt;      label('text -&amp;gt; "Centigrade to Fahrenheit conversion", 'constraints -&amp;gt; "alignx right")&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;    panel('layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) {&lt;br /&gt;      label('text -&amp;gt; "Centigrade")&lt;br /&gt;      label('text -&amp;gt; "Fahrenheit", 'constraints -&amp;gt; "wrap")&lt;br /&gt;      textField('text -&amp;gt; "", 'columns -&amp;gt; 20)&lt;br /&gt;      textField('text -&amp;gt; "", 'columns -&amp;gt; 20, 'editable -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;)&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;    panel('constraints -&amp;gt; "dock south", 'opaque -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;) {&lt;br /&gt;        button('text -&amp;gt; "Close")&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;Widgets are created by &lt;i&gt;factory method calls&lt;/i&gt; into the &lt;code&gt;SwingBuilder&lt;/code&gt; object.&lt;br /&gt;Simple widgets such as &lt;code&gt;label&lt;/code&gt; initialize their properties through a series&lt;br /&gt;of &lt;i&gt;named parameters&lt;/i&gt;. The property is identified with a &lt;code&gt;Symbol&lt;/code&gt; and its initial&lt;br /&gt;value. Container widgets such as &lt;code&gt;frame&lt;/code&gt; and &lt;code&gt;panel&lt;/code&gt; also include a code block&lt;br /&gt;of the sub-components managed by it.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The widgets are named after their Swing counterpart. Hence the method call&lt;br /&gt;&lt;code&gt;frame&lt;/code&gt; constructs a &lt;code&gt;JFrame&lt;/code&gt; object while &lt;code&gt;label&lt;/code&gt; constructs a &lt;code&gt;JLabel&lt;/code&gt; object. The&lt;br /&gt;methods have two variants as illustrated by &lt;code&gt;label&lt;/code&gt;:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;label('property -&amp;gt; value, ...)&lt;br /&gt;label(&lt;span class="keyword"&gt;new&lt;/span&gt; MyLabel(...), 'property -&amp;gt; value, ...)&lt;/pre&gt;The first version creates a &lt;code&gt;JLabel&lt;/code&gt; and initializes it properties. The second&lt;br /&gt;version initializes the given &lt;code&gt;JLabel&lt;/code&gt;. This can be used to introduce specialized&lt;br /&gt;widgets. Both calls are also supplied with an implicit &lt;code&gt;SwingBuilder&lt;/code&gt; object&lt;br /&gt;responsible for the creation an initialization of the widgets. For example, the main panel&lt;br /&gt;above might use the custom class &lt;code&gt;RadialGradientPanel&lt;/code&gt; to deliver a panel decorated&lt;br /&gt;with a gradient painter:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_3wxYrWcJcFI/TOUUqXcdFcI/AAAAAAAAAM0/Z_CCYC2wJbM/s1600/demo02.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="192" ox="true" src="http://4.bp.blogspot.com/_3wxYrWcJcFI/TOUUqXcdFcI/AAAAAAAAAM0/Z_CCYC2wJbM/s320/demo02.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;pre&gt;panel(&lt;span class="keyword"&gt;new&lt;/span&gt; RadialGradientPanel(0.0F, 0.0F, 400.0F, List(0.0F, 0.5F, 1.0F), List(royalBlue, royalBlue4, black)), 'layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) { ... }&lt;/pre&gt;Running this code produces the application:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;To interact with the application we introduce &lt;i&gt;event handlers&lt;/i&gt; through the same&lt;br /&gt;named parameter mechanism. The names are the event handlers of the corresponding&lt;br /&gt;Swing listener class. Pressing a button or entering RETURN into a text field&lt;br /&gt;produces an &lt;code&gt;Action&lt;/code&gt; event processed by the &lt;code&gt;actionPerformed&lt;/code&gt; method in the&lt;br /&gt;registered &lt;code&gt;ActionListener&lt;/code&gt; object.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In the above application we introduce the &lt;i&gt;synthetic property&lt;/i&gt; &lt;code&gt;id&lt;/code&gt; into the&lt;br /&gt;second text field. The implicit &lt;code&gt;SwingBuilder&lt;/code&gt; object maintains a mapping between&lt;br /&gt;the id and the widget which can be retrieved with the method &lt;code&gt;getWidgetID&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;frame('title -&amp;gt; "Centigrade to Fahrenheit conversion", 'location -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; Point(100, 100), 'size -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; Dimension(500, 300)){&lt;br /&gt;  panel(&lt;span class="keyword"&gt;new&lt;/span&gt; RadialGradientPanel(0.0F, 0.0F, 400.0F, List(0.0F, 0.5F, 1.0F), List(royalBlue, royalBlue4, black)), 'layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) {&lt;br /&gt;      &lt;br /&gt;    panel('opaque -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;, 'constraints -&amp;gt; "dock north", 'layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) {&lt;br /&gt;      label('text -&amp;gt; "Centigrade to Fahrenheit conversion", 'foreground -&amp;gt; Color.WHITE, 'constraints -&amp;gt; "alignx right")&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;    panel('opaque -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;, 'layout -&amp;gt; &lt;span class="keyword"&gt;new&lt;/span&gt; MigLayout("fill")) {&lt;br /&gt;      label('text -&amp;gt; "Centigrade", 'foreground -&amp;gt; Color.WHITE)&lt;br /&gt;      label('text -&amp;gt; "Fahrenheit", 'foreground -&amp;gt; Color.WHITE, 'constraints -&amp;gt; "wrap")&lt;br /&gt;      textField('text -&amp;gt; "", 'columns -&amp;gt; 20, 'actionPerformed -&amp;gt; doConvert)&lt;br /&gt;      textField('id -&amp;gt; "fahrenheitTF", 'text -&amp;gt; "", 'columns -&amp;gt; 20, 'editable -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;)&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;    panel('opaque -&amp;gt; &lt;span class="keyword"&gt;false&lt;/span&gt;, 'constraints -&amp;gt; "dock south") {&lt;br /&gt;        button(&lt;span class="keyword"&gt;new&lt;/span&gt; GlossButton(), 'text -&amp;gt; "Close", 'actionPerformed -&amp;gt; doClose)&lt;br /&gt;    }&lt;br /&gt;      &lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The first text field includes the synthetic property &lt;code&gt;actionPerformed&lt;/code&gt;.&lt;br /&gt;The value for this property will refer to a Scala function. The &lt;code&gt;SwingBuilder&lt;/code&gt;&lt;br /&gt;creates an instance of a class that refers to this function. The class implements&lt;br /&gt;the Swing &lt;code&gt;ActionListener&lt;/code&gt; interface and its &lt;code&gt;actionPerformed&lt;/code&gt; method calls the function.&lt;br /&gt;The instance of the class is registered as a listener on the text field.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The event handlers are simple Scala functions:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span class="keyword"&gt;val&lt;/span&gt; doClose = (event: ActionEvent) =&amp;gt; System.exit(0)&lt;br /&gt;&lt;br /&gt;&lt;span class="keyword"&gt;val&lt;/span&gt; doConvert = (event: ActionEvent) =&amp;gt; {&lt;br /&gt;  &lt;span class="keyword"&gt;val&lt;/span&gt; centigradeTF = event.getSource().asInstanceOf[JTextField]&lt;br /&gt;  &lt;span class="keyword"&gt;val&lt;/span&gt; centigrade = centigradeTF.getText().toDouble&lt;br /&gt;  &lt;span class="keyword"&gt;val&lt;/span&gt; fahrenheitTF = builder.getWidgetID("fahrenheitTF").asInstanceOf[JTextField]&lt;br /&gt;  fahrenheitTF.setText(((9 * centigrade + 160)/5).toString)&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The builder code is, of course, just Scala. We are, therefore, free to mix in&lt;br /&gt;any other Scala code. The factory methods return a reference to the widget it&lt;br /&gt;creates so we could, for example, say:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span class="keyword"&gt;val&lt;/span&gt; centigrade = textField('text -&amp;gt; "", 'columns -&amp;gt; 20)&lt;/pre&gt;&lt;br /&gt;then use this identifier elsewhere, providing we respect the usual scoping&lt;br /&gt;rules.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The project is very much in its infancy. Presently the library supports&lt;br /&gt;a sizeable subset of the Swing components: label, textField, button, toggleButton,&lt;br /&gt;checkBox, radioButton (buttonGroup), list, scrollPane, panel, tabbedPane&lt;br /&gt;and frame. More will follow.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In the next blog I will show how to introduce an MVC architecture into an&lt;br /&gt;application.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-6072610500764784738?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/6072610500764784738/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=6072610500764784738' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/6072610500764784738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/6072610500764784738'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2010/09/scala-swingbuilder-essentials.html' title='Scala: SwingBuilder Essentials'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_3wxYrWcJcFI/TOUUqXcdFcI/AAAAAAAAAM0/Z_CCYC2wJbM/s72-c/demo02.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-8033147255603878083</id><published>2008-03-16T05:27:00.000-07:00</published><updated>2008-03-16T05:40:53.530-07:00</updated><title type='text'>Groovy Combinator Parsing</title><content type='html'>&lt;span style="font-family:arial;"&gt;&lt;strong&gt;Groovy Combinator Parsing&lt;br /&gt;Part 4: Abstract Syntax Trees&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;Associated with parsing is the simultaneous construction of an abstract&lt;br /&gt;syntax tree (AST). In this final article we will investigate how this&lt;br /&gt;is achieved with combinator parsers.&lt;br /&gt;&lt;br /&gt;When a parse succeeds the success object carries a result value that&lt;br /&gt;represents the recognised element (together with the remaining input&lt;br /&gt;to be parsed). In the case of the letter parser shown in part 1, the&lt;br /&gt;recognised element is the single character:&lt;br /&gt;&lt;br /&gt;assert letter('abc###') == ... success with 'a' as the result ...&lt;br /&gt;&lt;br /&gt;Equally, the parser:&lt;br /&gt;&lt;br /&gt;def letterAndDigit = seqC(letter, digit)&lt;br /&gt;assert letterAndDigit('a123###') == ... success with ['a', '1'] as the result ...&lt;br /&gt;&lt;br /&gt;Because we are using a &lt;em&gt;seqC&lt;/em&gt; combinator, it returns a list of two elements&lt;br /&gt;produced by the two parsers.&lt;br /&gt;&lt;br /&gt;In part 2 we saw the parser for a Groovy identifier:&lt;br /&gt;&lt;br /&gt;def identifier = seqC(identifierStart, noneOrMoreC(identifierRest))&lt;br /&gt;&lt;br /&gt;where the permissible characters that may start a Groovy identifier are&lt;br /&gt;defined by the parser &lt;em&gt;identifierStart&lt;/em&gt;, and all subsequent characters by&lt;br /&gt;the parser &lt;em&gt;identifierRest&lt;/em&gt;. The &lt;em&gt;noneOrMoreC&lt;/em&gt; combinator will produce a&lt;br /&gt;list of the result characters:&lt;br /&gt;&lt;br /&gt;assert identifier('bbc2###') == ... success with ['b', ['b', 'c', '2']] ...&lt;br /&gt;&lt;br /&gt;Here, the outer list is produced by &lt;em&gt;seqC&lt;/em&gt; and the inner list through &lt;em&gt;noneOrMoreC&lt;/em&gt;.&lt;br /&gt;Normally, a lexer would wrap these individual characters into some token&lt;br /&gt;that denotes the completed identifier. Consider that we have such a class:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;class Identifier {&lt;br /&gt;  def identifier // String&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The combinator &lt;em&gt;mapC&lt;/em&gt; transform the value of a parser &lt;em&gt;p&lt;/em&gt; into a different value:&lt;br /&gt;&lt;br /&gt;def mapC = { p, f -&gt; ... transform the result of parser p by f ... }&lt;br /&gt;&lt;br /&gt;The parameter &lt;em&gt;f&lt;/em&gt; is the closure that performs the transformation. Given:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;def toIdentifier = { chList -&gt; return new Identifier(identifier: chList[0] + chList[1].join()) }&lt;br /&gt;def identifier = mapC(seqC(identifierStart, noneOrMoreC(identifierRest)), toIdentifier)&lt;br /&gt;assert identifier('bbc2###') == ... success with Identifier object ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A larger example would be the Groovy while statement:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;whileStatement: "while" LPAREN strictContextExpression RPAREN compatibleBodyStatement&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The corresponding definition is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;def whileStatement = mapC(seqCs([whileKeyword, lParen, expression, rParen, compatibleBodyStatement]), toWhileStatement)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;with the mapping transformer constructing an object of the class &lt;em&gt;WhileStatement&lt;/em&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;class WhileStatement {&lt;br /&gt;  def expression&lt;br /&gt;  def body&lt;br /&gt;}&lt;/pre&gt;&lt;pre&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Further Developments&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The combinators described in this series of articles have been stable since the&lt;br /&gt;summer of 2007. They have allowed me to develop a number of simple external DSLs.&lt;br /&gt;&lt;br /&gt;Since a BNF grammar defining some language has a fixed structure, then it is&lt;br /&gt;also possible to define combinators that will parse BNF production rules. Here&lt;br /&gt;the BNF rules act as the external DSL. Action code added to the parser can&lt;br /&gt;produce the combinators for the grammar automatically. In effect we have a&lt;br /&gt;simple compiler-compiler.&lt;br /&gt;&lt;br /&gt;Work is underway to produce a variant of the current combinators to give&lt;br /&gt;support to error reporting when a parse detects an input error. This has&lt;br /&gt;involved four small classes and relatively minor changes to the combinators.&lt;br /&gt;This new work is presently being tested.&lt;br /&gt;&lt;br /&gt;I am also investigating how the combinators scale when defining parsers for&lt;br /&gt;large grammars. Specifically, I am using the combinators to produce parsers&lt;br /&gt;for large subsets of Groovy. Currently I have successfully created parsers for&lt;br /&gt;expressions, statements, declarations, class and interface definitions. I&lt;br /&gt;anticipate being able to parse all the Groovy language, which will then be&lt;br /&gt;used for various tools.&lt;br /&gt;&lt;br /&gt;The work, however, is still very much a work-in-progress. For example, results&lt;br /&gt;from developing a parser for a large grammar has revealed opportunities for&lt;br /&gt;optimisations. So expect to see continual enhancements of the library.&lt;br /&gt;The sources for this work will be posted shortly on the Groovy site. Users are welcome&lt;br /&gt;to experiment and use the library. Feedback would be welcomed.&lt;br /&gt;&lt;br /&gt;Ken Barclay&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-8033147255603878083?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/8033147255603878083/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=8033147255603878083' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/8033147255603878083'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/8033147255603878083'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2008/03/groovy-combinator-parsing_16.html' title='Groovy Combinator Parsing'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-8100617369840641512</id><published>2008-03-09T05:49:00.000-07:00</published><updated>2008-03-09T06:25:39.586-07:00</updated><title type='text'>Groovy Combinator Parsing</title><content type='html'>&lt;strong&gt;&lt;span style="font-family:arial;"&gt;Groovy Combinator Parsing&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:arial;"&gt;Part 3: Ambiguous Grammars&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;The previous article introduced the basic combinators provided by theGParsec library. We showed how they can be combined and how they define parsers that mirror the production rules of the grammar.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;We regularly find that our production rules are mutually dependent. For example, with Groovy we have:&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  whileStatement : 'while' '(' expression ')' compatibleBodyStatement&lt;br /&gt;  ifStatement : ...&lt;br /&gt;  statement : whileStatement ifStatement ...&lt;br /&gt;  blockBody : ( statement separators )*&lt;br /&gt;  compoundStatement : '{' blockBody '}'&lt;br /&gt;  compatibleBodyStatement : compoundStatement statement&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;In this extract the &lt;em&gt;compatibleBodyStatement&lt;/em&gt; is used to define a &lt;em&gt;whileStatement&lt;/em&gt;. The &lt;em&gt;compatibleBodyStatement&lt;/em&gt; is either a &lt;em&gt;compoundStatement&lt;/em&gt; or a &lt;em&gt;statement&lt;/em&gt;, and the latter is any of the permissible statements, including the &lt;em&gt;whileStatement&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;To define a Groovy parser for a &lt;em&gt;whileStatement&lt;/em&gt;, we would first have to define a parser for &lt;em&gt;compatibleBodyStatement&lt;/em&gt;. This would require a parser definition for &lt;em&gt;statement&lt;/em&gt; which would, in turn, require &lt;em&gt;whileStatement&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;Mutual dependencies are readily handled using the &lt;em&gt;lazyC&lt;/em&gt; combinator:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def lazyC = { p -&gt;&lt;br /&gt;    return { inp -&gt;&lt;br /&gt;      return p[0](inp)&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here &lt;em&gt;p&lt;/em&gt; represents a proxy for the actual parser, held as the first element of a list. Our productions are now defined in Groovy as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def compatibleBodyStatementProxy = []&lt;br /&gt;  def whileStatement = seqCs([whileKeyword, lParen, expression, rParen, lazyC(compatibleBodyStatementProxy)])&lt;br /&gt;  def ifStatement = ...&lt;br /&gt;  def statement = altCs([whileStatement, ifStatement, ...])&lt;br /&gt;  def blockBody = noneOrMoreC(seqC(statement, separators))&lt;br /&gt;  def compoundStatement = seq3C(lBrace, blockBody, rBrace)&lt;br /&gt;  def compatibleBodyStatement = altC(compoundStatement, statement)&lt;br /&gt;  compatibleBodyStatementProxy &amp;lt;&amp;lt; compatibleBodyStatementProxy&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;that can then be used to define a &lt;em&gt;whileStatement&lt;/em&gt;. At the end of the code we place the actual &lt;em&gt;compatibleBodyStatement&lt;/em&gt; into the proxy.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Ambiguous Grammars&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Many grammars are ambiguous and require an arbitrary lookahead to determine the correct alternative in a production rule to follow. For example, Groovy statements can begiven an identifying label. Potentially, that identifier may be a label on a statement or an identifier used on the left of an assignment statement. The Groovy production rule is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  statement : whileStatement ifStatement ... labelledStatement&lt;br /&gt;  labelledStatement : identifier ':' statement statement&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;tryC&lt;/em&gt; combinator behaves like its parser parameter, but pretends its has notconsumed any input when it fails. Consider the parser &lt;em&gt;altC(tryC(p), q)&lt;/em&gt;. Even when the parser &lt;em&gt;p&lt;/em&gt; fails while consuming some input, the choice combinator will try thealternate &lt;em&gt;q&lt;/em&gt;. The resulting Groovy definitions are:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def statement = altCs([whileStatement, ifStatement, ..., labelledStatement])&lt;/pre&gt;&lt;br /&gt;def labelledStatement = altC(tryC(seq3C(identifier, colon, statement)), statement) &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-family:Courier New;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;p&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Compiling&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;In the final installment we shall consider how we can adapt our parsers to construct abstract syntax trees to represent the parsed input. We will also look at further developments of this work.&lt;br /&gt;&lt;br /&gt;Ken Barclay&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-8100617369840641512?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/8100617369840641512/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=8100617369840641512' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/8100617369840641512'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/8100617369840641512'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2008/03/groovy-combinator-parsing_09.html' title='Groovy Combinator Parsing'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-3084111871181125164</id><published>2008-03-02T08:42:00.000-08:00</published><updated>2008-03-02T09:04:13.443-08:00</updated><title type='text'>Groovy Combinator Parsing</title><content type='html'>&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;strong&gt;Groovy Combinator Parsing&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;strong&gt;Part 2: Combinators&lt;br /&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/span&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;The first article introduced the &lt;em&gt;satisfyC&lt;/em&gt; combinator. From it we are able to generate any number of parsers by providing a predicate test closure. For example, we were able to define simple parsers that recognised a letter or a digit in the inputstream.&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;The GParsec library comprises a number of such combinators. They are summarised in the table below.&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;em&gt;satisfyC&lt;/em&gt;: The &lt;em&gt;satisfyC&lt;/em&gt; combinator consumes a single input when the predicate succeeds.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;em&gt;altC (alt3C, altCs):&lt;/em&gt; is the choice combinator. Given two parsers it only looks at its second alternative if the first has not consumed any input - regardless of the final value. The two variants work the same way, &lt;em&gt;alt3C&lt;/em&gt; is given three parsers and &lt;em&gt;altCs&lt;/em&gt; a list of parsers.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;em&gt;seqC (seq3C, seqCs):&lt;/em&gt; is the sequencing combinator. It runs two parsers in succession and if successful, returns the result of the two parsers. Again, &lt;em&gt;seq3C&lt;/em&gt; and &lt;em&gt;seqCs&lt;/em&gt; are simple variants. &lt;/span&gt;&lt;/p&gt;&lt;span style="font-family:arial;"&gt;&lt;p&gt;&lt;br /&gt;&lt;em&gt;noneOrMoreC, oneOrMoreC:&lt;/em&gt; The combinator &lt;em&gt;noneOrMoreC&lt;/em&gt; applies a parser zero or more times to an input stream. The result from each application of the parser are returned in a list. Not surprisingly, the combinator &lt;em&gt;oneOrMoreC&lt;/em&gt; is used for non-empty sequences. &lt;/p&gt;&lt;p&gt;&lt;em&gt;optionalC:&lt;/em&gt; The combinator &lt;em&gt;optionalC&lt;/em&gt; may succeed in parsing some input. It always returns success. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;Many of these combinators are programmed in a manner similar to combinator &lt;em&gt;satisfyC&lt;/em&gt;. Some are defined in terms of the others. For example we might define:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def seq3C = { p, q, r -&gt; return seqC(seqC(p, q), r) }&lt;br /&gt;  def oneOrMoreC = { p -&gt; return seqC(p, noneOrMorec(p)) }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;These definitions reveal how we might combine the combinators to construct useful parsers. For example, given the definition for the permissible characters that may start a Groovy identifier as &lt;em&gt;identifierStart&lt;/em&gt;, and all subsequent characters by the parser &lt;em&gt;identifierRest&lt;/em&gt;, then we might define the parser for an &lt;em&gt;identifier&lt;/em&gt; as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def identifierStart = altC(letter, charP('_'))&lt;br /&gt;  def identifierRest = altC(identifierStart, digit)&lt;br /&gt;  def identifier = seqC(identifierStart, noneOrMoreC(identifierRest))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Other simple parsers are:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def digits = oneOrMoreC(digit)&lt;br /&gt;  def integer = digits&lt;br /&gt;  def decimal = seqC(digits, seqC(period, digits))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A key characteristic of combinator parsers is they give rise to parser definitions that mimic the production rules of the grammar. For example, a variable definition in Groovy comprises one or more variable declarators separated by commas. A variable declarator is an identifier with an optional initializer. The productions are:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  variableDeclarator : identifier ( '=' expression )?&lt;br /&gt;  variableDefinitions : variableDeclarator ( ',' variableDeclarator)*&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Assuming we have parsers for single characters, identifier and expression, then we define the parsers as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def variableDeclarator = seqC(identifier, optionalC(seqC(assign, expression)))&lt;br /&gt;  def variableDefinitions = seqC(variableDeclarator, noneOrMoreC(seqC(comma, variableDeclarator)))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Observe the one-to-one correspondence between the production rules and the parser definitions.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Problem grammars&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;We have shown how we define parsers using the range of combinators provided by GParsec. However, some grammars present problems as a result of including ambiguities. This is the subject of our next article.&lt;br /&gt;&lt;br /&gt;Ken Barclay&lt;/span&gt; &lt;p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-3084111871181125164?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/3084111871181125164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=3084111871181125164' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/3084111871181125164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/3084111871181125164'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2008/03/groovy-combinator-parsing.html' title='Groovy Combinator Parsing'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-4850637584824649533</id><published>2008-02-27T10:28:00.000-08:00</published><updated>2008-02-28T00:40:03.165-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Groovy'/><category scheme='http://www.blogger.com/atom/ns#' term='Combinator Parsers'/><title type='text'>Groovy Combinator Parsing</title><content type='html'>&lt;div align="left"&gt;Groovy Combinator Parsing&lt;br /&gt;Part 1: Combinator Parsing&lt;br /&gt;&lt;br /&gt;This work was inspired by the many recent articles on Domain Specific Languages (DSLs). Most authors when describing DSLs consider only internal DSLs i.e. those embedded in a host language such as Groovy and exploiting the meta-programming facilities. Commonly, external DSLs are given little coverage due to the perceived difficulties. Generally the problems are due to external DSLs requiring input from compiler technology. For most application programmers this is considered a specialised area and requiring a steep learning curve.&lt;br /&gt;&lt;br /&gt;Most DSL articles refer the reader to generator parsers such as Lex/Yacc or Antlr. Undoubtedly these are specialist tools that will be unfamiliar to most application developers.&lt;br /&gt;&lt;br /&gt;In contrast to parser generators, the functional programming community have developed combinator parsers. They offer a set of combinators to express grammars. These combinators are manipulated as first class values and can be combined to define new combinators for the application. One advantage of combinator parsers is they avoid the integration of differing compiler tools.&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;JParsec&lt;/em&gt; library offers a Java implementation for a combinator parser. This is a high quality product with some excellent features. However, it is a Java implementation constructed in a classical OO manner. It features a number of classes, some quite complex with a large number of methods.&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;GParsec&lt;/em&gt; library described in this series of articles demonstrates the construction and application of a combinator parser library based on first principles. Further, in developing the library I showcase some of the excellent features of Groovy.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;A Simple Combinator&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;A common requirement of a parser is to ensure the next part of the input satisfies some grammar rule. For example in Groovy we might expect an identifier to follow the class keyword in a class declaration. The combinator &lt;em&gt;satisfyC&lt;/em&gt; is used for this purpose. It is developed as a Groovy closure and has the form:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;def satisfyC = { pred -&gt;&lt;br /&gt;  return { inp -&gt;&lt;br /&gt;    if(inp.size() == 0)&lt;br /&gt;      return ... fail ...&lt;br /&gt;    else if(pred(inp[0]))&lt;br /&gt;      return ... success ...&lt;br /&gt;    else&lt;br /&gt;      return ... fail ...&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;satisfyC&lt;/em&gt; closure accepts a single parameter and returns a closure object.&lt;/div&gt;&lt;div align="left"&gt;The returned closure is the required parser. Observe how calling this closure with differing actual parameters will deliver a family of different parsers.&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;pred&lt;/em&gt; parameter of closure &lt;em&gt;satisfyC&lt;/em&gt; is used in the definition of the returned closure. This is sometimes known as a free variable. It is bound to the actual parameter when satisfyC is called and is the reason for the differing parsers that can be delivered by this one closure definition.&lt;br /&gt;&lt;br /&gt;The &lt;em&gt;pred&lt;/em&gt; parameter of &lt;em&gt;satisfyC&lt;/em&gt; is a predicate closure, one that is applied to a single parameter and returns a boolean result. It is used to test the first item in the input stream, denoted by the &lt;em&gt;inp&lt;/em&gt; parameter. If the test succeeds then the parser reports success otherwise it reports failure when the test fails or when it encounters the end of the input.&lt;br /&gt;&lt;br /&gt;The success and failure of a parse is captured by simple objects. A successful parse will generate and object that has the result of the parse and the remaining input as its state.&lt;br /&gt;&lt;br /&gt;We can start to produce simple character parsers with &lt;em&gt;satisfyC&lt;/em&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def isLetter = { ch -&gt; return Character.isLetter(ch) }&lt;br /&gt;  def isDigit = { ch -&gt; return Character.isDigit(ch) }&lt;br /&gt;  def letter = satisfyC(isLetter)&lt;br /&gt;  def digit = satisfyC(isDigit)&lt;br /&gt;  assert letter('abc###') == ... success with bc### still in the input stream ...&lt;br /&gt;  assert letter('123###') == ... fail ...&lt;br /&gt;  assert digit('123###') == ... success with 23### still in the input stream&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Where the parse succeeds, then the character that has been recognised is maintained&lt;br /&gt;as the result by the success object. The remainder of the input is also maintained&lt;br /&gt;by the success object. This remaining input is then made available for processing&lt;br /&gt;by another parser.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The GParsec Library&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The library comprises a number of such combinators. Many are no more complex than &lt;em&gt;satisfyC&lt;/em&gt;. Some are simply combinations of others. The complete library is only some 200 LOC! In the second article in this series we shall consider these combinators and how they are combined to construct useful parsers.&lt;br /&gt;&lt;br /&gt;Ken Barclay&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p align="left"&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-4850637584824649533?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/4850637584824649533/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=4850637584824649533' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/4850637584824649533'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/4850637584824649533'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2008/02/groovy-combinator-parsing.html' title='Groovy Combinator Parsing'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-6030082779675818205</id><published>2008-01-08T10:50:00.000-08:00</published><updated>2008-01-10T06:36:33.606-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='MapReduce Groovy closure functional programming'/><title type='text'>Google's MapReduce as Groovy closures</title><content type='html'>Google’s MapReduce programming model [&lt;a href="http://labs.google.com/papers/mapreduce.html"&gt;http://labs.google.com/papers/mapreduce.html&lt;/a&gt;] serves for processing and generating large data sets. Users specify a mapper function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reducer function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Here we illustrate two simple examples: (a) for counting occurrences of words in documents; and (b) for creating an inverted index of the words in documents.&lt;br /&gt;&lt;br /&gt;Both examples use the same mapReduce function, implemented as a Groovy closure. The function accepts two parameters representing, respectively, the mapper and reducer functions (also closures). The mapper, reducer and mapReduce functions exploit many of the functional programming techniques supported through Groovy closures. To aid the development we have prepared an abstract class ListFunctions which defines many of the standard list processing functions (closures) such as map, filter, composition, etc. The abstract class MapFunctions provides similar functions when applied to Maps.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Example 1: Word count&lt;br /&gt;&lt;br /&gt;import com.adt.fp.ListFunctions as LF&lt;br /&gt;import com.adt.fp.MapFunctions as MF&lt;br /&gt;&lt;br /&gt;def mapReduce(mapper, reducer) {&lt;br /&gt;def reducePerKey = LF.composition.curry(MF.mapWithKey.curry({ k, v -&gt; v}),&lt;br /&gt;LF.composition.curry(MF.filterWithKey.curry({ k, v -&gt; (v != null)}),&lt;br /&gt;MF.mapWithKey.curry(reducer)))&lt;br /&gt;&lt;br /&gt;def groupByKey = { kvList -&gt;&lt;br /&gt;def insert = { mp, kv -&gt; return MF.insertWith(LF.join, kv[0], [kv[1]], mp) }&lt;br /&gt;return LF.foldL(insert, MF.empty(), kvList)&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;def mapPerKey = LF.composition.curry(LF.concat,&lt;br /&gt;LF.composition.curry(LF.map.curry(LF.uncurry(mapper)), MF.toList))&lt;br /&gt;&lt;br /&gt;return LF.composition.curry(reducePerKey, LF.composition.curry(groupByKey, mapPerKey))&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;def flip = { f, x, y -&gt; return f(y, x) }&lt;br /&gt;def mkPair = { x, y -&gt; return [x, y] }&lt;br /&gt;&lt;br /&gt;def tokenize = { str -&gt; return str.tokenize() }&lt;br /&gt;def sum = LF.foldL.curry({ x, y -&gt; x + y }, 0)&lt;br /&gt;&lt;br /&gt;def occurMapper = { key, words -&gt; return LF.map(flip.curry(mkPair, 1), tokenize(words)) }&lt;br /&gt;def occurReducer = { key, occurs -&gt; sum(occurs) }&lt;br /&gt;&lt;br /&gt;def wordOccurrenceCount = mapReduce(occurMapper, occurReducer)&lt;br /&gt;&lt;br /&gt;def docs = ['doc1': 'fold the fold', 'doc2': 'appreciate the unfold']&lt;br /&gt;&lt;br /&gt;assert wordOccurrenceCount(docs) == ["unfold":1, "the":2, "appreciate":1, "fold":2]&lt;br /&gt;&lt;br /&gt;Example 2: Inverted index&lt;br /&gt;&lt;br /&gt;import com.adt.fp.ListFunctions as LF&lt;br /&gt;import com.adt.fp.MapFunctions as MF&lt;br /&gt;&lt;br /&gt;def mapReduce(mapper, reducer) {&lt;br /&gt;// …&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;def flip = { f, x, y -&gt; return f(y, x) }&lt;br /&gt;def mkPair = { x, y -&gt; return [x, y] }&lt;br /&gt;&lt;br /&gt;def tokenize = { str -&gt; return str.tokenize() }&lt;br /&gt;&lt;br /&gt;def indexMapper = { key, words -&gt; return LF.map(flip.curry(mkPair, key), tokenize(words)) }&lt;br /&gt;def indexReducer = { key, docs -&gt; docs.unique().sort() }&lt;br /&gt;&lt;br /&gt;def invertedIndex = mapReduce(indexMapper, indexReducer)&lt;br /&gt;&lt;br /&gt;def docs = ['doc1': 'fold the fold', 'doc2': 'appreciate the unfold']&lt;br /&gt;&lt;br /&gt;assert invertedIndex(docs) ==&lt;br /&gt;["unfold":["doc2"], "the":["doc1", "doc2"], "appreciate":["doc2"], "fold":["doc1"]]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The mapReduce function returns the function (closure) that performs the required processing. It is defined in terms of its mapper function parameter and its reducer function parameter. The definition for mapReduce is in terms of its support functions mapPerKey, groupByKey and reducePerkey [http://www.cs.vu.nl/~ralf/MapReduce/]. The helper function mapPerKey exports the Map input data to a List then maps the mapper function over this List. The support function groupByKey groups intermediate values by intermediate keys. Finally, reducePerKey maps the reducer function over the groups of intermediate data.&lt;br /&gt;&lt;br /&gt;In Example 1, the mapper function makes the pairs [word, 1] for every word in the source input. It is implemented by mapping the function flip.curry(mkPair, 1) over each word. This clever function uses mkPair, to produce the pair and flip ensures they are in the correct order. The reducer function for this example simply sums each of the 1s in the intermediate values.&lt;br /&gt;&lt;br /&gt;In Example 2, the mapper function delivers the pairs [documentedID, word] for every word in the source input. The reducer function removes any duplicates and sorts the List of documentIDs from the intermediate values.&lt;br /&gt;&lt;br /&gt;Sample code and a more expansive decription can be found at [&lt;a href="http://www.dcs.napier.ac.uk/~cs05/groovy/googleMAPREDUCE.zip"&gt;Groovy MapReduce&lt;/a&gt;].&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-6030082779675818205?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/6030082779675818205/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=6030082779675818205' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/6030082779675818205'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/6030082779675818205'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2008/01/googles-mapreduce-programming-model.html' title='Google&apos;s MapReduce as Groovy closures'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3255923023602927617.post-2183154262016838114</id><published>2007-11-08T10:31:00.000-08:00</published><updated>2008-01-08T10:49:50.847-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Groovy'/><category scheme='http://www.blogger.com/atom/ns#' term='External DSLs'/><category scheme='http://www.blogger.com/atom/ns#' term='Combinator Parsers'/><title type='text'>Combinator Parsers and External DSLs</title><content type='html'>&lt;span style="font-family:arial;"&gt;For the past few months I have been investigating the opportunities and challenges of Domain Specific Languages (DSL) in Groovy. One issue surrounding external DSLs (eDSL) is the need to compile the DSL into code that accesses the application API. This often suggests parser generators such as javaCC or Antlr. If they are unfamiliar to the developer they can present a steep learning curve. They also require that additional code for the DSL has to be developed in addition to the application code.&lt;/span&gt;&lt;br /&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;Another consideration would be to use a parser combinator framework such as Jparsec. This is a classic Java implementation of a combinator parser exhibiting all the usual OO characteristics. It is also a very large library with many classes, many methods, etc.&lt;/p&gt;While seeking a solution to some of these issues I also sought one that would exhibit some of the strengths of Groovy. The approach described in [&lt;/span&gt;&lt;a href="http://www.dcs.napier.ac.uk/~cs05/groovy/externalDSL.zip"&gt;&lt;span style="font-family:arial;"&gt;Combinator Parser&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;] exploits Groovy closures. Each closure defines a simple combinator parser that can be combined into more elaborate parsers. The parser is then transformed with yet another combinator into a compiler [&lt;/span&gt;&lt;a href="http://www.dcs.napier.ac.uk/~cs05/groovy/externalDSL.zip"&gt;&lt;span style="font-family:arial;"&gt;External Domain Specific Languages&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;]. Writing a parser/compiler is then no more complex than writing assignments that reflect the DSL grammar. For example:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;pre&gt;&lt;br /&gt;def factor = altC(identifier, integer)  // factor: identifier  integer&lt;br /&gt;def term = sepByC(factor, multiply)     // term: factor { '*' factor }*&lt;br /&gt;def expr = sepByC(term, add)            // expr: term { '+' term }*&lt;br /&gt;&lt;/pre&gt;Here, the &lt;em&gt;altC&lt;/em&gt; combinator parsers alternatives, while the &lt;em&gt;sepByC&lt;/em&gt; parses one or more occurences of the first parser separated by the second parser. The sepByC combinator is, in fact, an assembly of the more primitive combinators &lt;em&gt;seqC&lt;/em&gt; (a sequence of two parsers) and &lt;em&gt;noneOrMoreC&lt;/em&gt; (none or more occurrences of a parser).&lt;br /&gt;&lt;br /&gt;Each combinator is defined as a method that returns a closure (the parser). For example, the &lt;em&gt;satisfyC&lt;/em&gt; combinator appears as:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;def satisfyC = { pred -&gt;&lt;br /&gt;    return { inp -&gt; if(pred(inp[0])) ... success ... else ... fail ... }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;The closure determines if the first item in the input stream satisfies some predicate (a closure with a single parameter, returning a boolean). Observe how the predicate is a 'free' variable in the closure definition and is set by the call to &lt;em&gt;satisfyC&lt;/em&gt;. Examples might include:&lt;br /&gt;&lt;br /&gt;def isLetter = { ch -&gt; ... is ch a letter? ... }&lt;br /&gt;def isDigit = { ch -&gt; ... is ch a digit? ... }&lt;br /&gt;def letter = satisfyC(isLetter)&lt;br /&gt;def digit = satisfyC(isDigit)&lt;br /&gt;&lt;br /&gt;from which it is easy to define:&lt;br /&gt;&lt;br /&gt;def identifier = seqC(letter, noneOrMoreC(altC(letter, digit)))&lt;br /&gt;def integer = oneOrMoreC(digit)&lt;br /&gt;&lt;br /&gt;Pleasingly, each combinator is only a few lines of Groovy code [&lt;/span&gt;&lt;a href="http://www.dcs.napier.ac.uk/~cs05/groovy/externalDSL.zip"&gt;&lt;span style="font-family:arial;"&gt; Combinators.groovy&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;]. The complete set is defined in less than 200 LOC! The combinators are generic insofar as the input stream may be a character sequence (a character parser) or a token stream ( a parser).&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Currently, I am working to extend the code with support for error reporting. Additionally, I have defined a combinator parser that reads a BNF definition and generates the combinator parser for that grammar - a kind of combinator parser compiler compiler.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I should be pleased to receive feedback from other developers.&lt;br /&gt;&lt;br /&gt;Ken Barclay&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3255923023602927617-2183154262016838114?l=kenbarclay.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kenbarclay.blogspot.com/feeds/2183154262016838114/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3255923023602927617&amp;postID=2183154262016838114' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/2183154262016838114'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3255923023602927617/posts/default/2183154262016838114'/><link rel='alternate' type='text/html' href='http://kenbarclay.blogspot.com/2007/11/combinator-parsers-and-external-dsls.html' title='Combinator Parsers and External DSLs'/><author><name>Ken Barclay</name><uri>http://www.blogger.com/profile/04361889154891370742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry></feed>
