I’d like to share something I’ve been working on over the last few days (between other things), which I think is an interesting case study at the intersection of FP and Java.
I started from some existing code that sorely needed refactoring. The feature covered by that code is easy to state: XML transformations. Basically, the problem statement is this: we want to define some generic transformations over BPMN process definitions, which are represented as XML documents.
The transformations needed are:
adding elements
removing elements
adding attributes
removing attributes
XPath expressions are used to select elements and attributes; for example, we need them to define where to add elements/attributes or which elements/attributes to remove. In general, the existing solution was implemented using the standard Java DOM model.
The original solution had a number of problems, and there was a clear need for a redesign.
I’m going to sketch a solution to this problem using Java 8 code and type signatures written in a Haskell-Java Frankenstein notation. Since this solution is oriented towards FP, I’m going to use Functional Java (which, BTW, is a good library for this kind of stuff) to support me in the task.
The core idea is that, conceptually, you can think of an XML transformation as a pure function:
XmlTransformation :: Document -> Document
This function can be expressed using the Functional Java (FJ) F interface:
public interface XmlTransformation extends F<Document, Document> {}
Since we don’t want to suffer from primitive obsession, we define some useful types like:
newtype XPathExpression = XPathExpression String
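For illustration, here is a minimal sketch of what such a wrapper type might look like in Java. The xpathExpression static factory is the one used later in this post; the value() accessor is a hypothetical name I’m introducing here.
public final class XPathExpression {

    private final String expression;

    private XPathExpression(String expression)
    {
        this.expression = expression;
    }

    // Static factory, meant to be statically imported at the call site.
    public static XPathExpression xpathExpression(String expression)
    {
        return new XPathExpression(expression);
    }

    // Hypothetical accessor for the raw XPath string.
    public String value()
    {
        return expression;
    }
}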
With just this basic abstraction in place, we can implement our basic transformations. For example:
public class AddNode implements XmlTransformation {

    private AddNode(XPathExpression parentExpr, Element e)
    {
        // [...]
    }

    public static XmlTransformation addNode(XPathExpression parentExpr,
                                            Element e)
    {
        return new AddNode(parentExpr, e);
    }

    @Override
    public Document f(Document d)
    {
        // [...]
    }
}
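To make this a bit more concrete, here is one possible way f could be implemented with the standard javax.xml.xpath API. This is only a hedged sketch, not necessarily the original implementation: it assumes AddNode stores its constructor arguments in parentExpr and e fields, and that XPathExpression exposes its raw string through the hypothetical value() method shown earlier.
// Requires javax.xml.xpath.{XPath, XPathFactory, XPathConstants, XPathExpressionException}
// and org.w3c.dom.Node.
@Override
public Document f(Document d)
{
    try {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Locate the parent node selected by the XPath expression.
        Node parent = (Node) xpath.evaluate(parentExpr.value(), d, XPathConstants.NODE);
        if (parent != null) {
            // Import the new element into the target document and append it.
            parent.appendChild(d.importNode(e, true));
        }
        return d;
    } catch (XPathExpressionException ex) {
        // A failed transformation returns the input unchanged (see the simplification below).
        return d;
    }
}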
For simplicity, I won’t bother with exceptions and I’ll assume that if a transformation fails it simply returns the input document unchanged. Everything remains more or less the same if we redefine XmlTransformation as:
public interface XmlTransformation extends Try1<Document, Document, Exception> {}
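Under the Try1 variant, the only visible change in a transformation is that its f method is allowed to throw; a minimal sketch, assuming Functional Java’s fj.function.Try1, whose single method may throw the declared exception type:
@Override
public Document f(Document d) throws Exception
{
    // [...] same transformation logic as before, but failures may now
    // propagate to the caller instead of being swallowed
}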
But at the end of this post we’ll see an alternative way to manage transformation errors.
At this point we have the terms (basic transformations) but we lack a way to combine them. Well, we can already combine them since in the end they are just functions, but the syntax is not the most convenient:
// Document document;
// Element e1, e2;
XmlTransformation a = addNode(xpathExpression("/a/b"), e1);
XmlTransformation b = addNode(xpathExpression("//c"), e2);
Document transformed = b.f(a.f(document));
I’ve statically imported the AddNode.addNode and XPathExpression.xpathExpression static factory methods for brevity.
Well, XmlTransformations are just functions, so we could use the composition operator to combine them. Unfortunately, we need some upcasts to make this work:
// Document document;
// Element e1, e2;
XmlTransformation a = addNode(xpathExpression("/a/b"), e1);
XmlTransformation b = addNode(xpathExpression("//c"), e2);
Document transformed = Function.compose((F<Document, Document>) b,
(F<Document, Document>) a)
.f(document);
Anyway, what we want here is an easy and elegant way to combine two transformations and obtain another one. Basically:
composeT :: (XmlTransformation, XmlTransformation) -> XmlTransformation
This way, starting with basic transformations as building blocks, we can define arbitrarily complex transformations just by composing simpler ones. It’s like growing a language for XML transformations.
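A possible composeT is sketched below. It follows the usual mathematical composition order (composeT(f, g) applies g first, then f, just like Function.compose above), which would also explain why the Monoid instance later flips its arguments. It assumes XmlTransformation can be instantiated with a lambda, since it inherits a single abstract method, f, from F.
public static XmlTransformation composeT(XmlTransformation f,
                                         XmlTransformation g)
{
    // Mathematical composition order: apply g first, then f.
    return d -> f.f(g.f(d));
}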
If this sounds interesting, you may want to read the paper “Why Functional Programming Matters” by John Hughes.
But what if we want to compose an arbitrary number of transformations? What we need now is a way to build a data transformation pipeline by composing our transformations:
transformWith :: [XmlTransformation] -> XmlTransformation
Yes, we could just write a loop with a call to composeT, but by recognizing a much more fundamental abstraction lurking underneath this problem we’ll be able to write a very concise, flexible and, IMO, elegant combinator for our XmlTransformation functions.
It all becomes much easier when you recognize that you are staring into the eyes of a monoid. The gist of it is that unary functions of type a -> a form a monoid under function composition. In other words, XmlTransformation forms a monoid over our composeT function. Indeed:
The identity function a -> a is the identity element: composing it with any transformation leaves that transformation unchanged.
composeT is a binary operation, and it is associative.
Since all the monoid laws are satisfied, we can define a proper Monoid instance for our XmlTransformation type:
Monoid<XmlTransformation> m = Monoid.monoid((t1, t2) -> composeT(t2, t1),
Function.<XmlTransformation>identity());
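One small caveat: the zero of this monoid has to be an XmlTransformation itself, while Function.identity() yields a plain F. If the compiler complains, a trivial identity transformation can serve as the zero instead. A minimal sketch, with identityTransformation being a hypothetical helper (again assuming XmlTransformation can be written as a lambda):
// Hypothetical identity transformation: leaves the document untouched.
public static XmlTransformation identityTransformation()
{
    return d -> d;
}

Monoid<XmlTransformation> m = Monoid.monoid((t1, t2) -> composeT(t2, t1),
                                            identityTransformation());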
Then we can use this monoid instance to write transformWith
in a much easier way:
public static XmlTransformation transformWith(List<XmlTransformation> xs)
{
return m.sumLeft(xs);
}
We could even define another method/function to make things nicer:
public static Document transform(Document d, XmlTransformation... xs)
{
List<XmlTransformation> ts = Array.array(xs).toList();
return m.sumLeft(ts).f(d);
}
What we’ve done here is actually define a very small DSL. We have some basic terms (our transformations):
addElement
removeElement
addAttribute
removeAttribute
And ways to combine them:
composeT
transformWith
This is all we need (with a lot of static imports and some tactical use of varargs) to define XML transformations in a declarative way:
Document transformed =
transform(document,
addElement(xpathExpression("//b"), node("<foo/>")),
addAttribute(xpathExpression("/a/d"), attribute("bar", "baz")),
removeElement(xpathExpression("//g"))
//[...]
);
This solution IMHO is clean, concise, expressive, flexible and very powerful. It’s very easy to define new basic transformations (remember: each of them can be tested independently!) and to compose them to build more complex ones (the power of abstraction). Their composition has already been taken care of, once and for all.
It’s also easy to extend it to have a more sophisticated error-handling strategy, for example by accumulating errors in Either values:
XmlTransformation :: Either<NonEmptyList<String>, Document>
                  -> Either<NonEmptyList<String>, Document>
The implementation of composeT
is just a little bit longer (but easy), and the Monoid definition doesn’t change at all!
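As an illustration of what a basic transformation might look like under this encoding, here is a hedged sketch (not the author’s code): an already-failed Either is passed along untouched, while a failure in the current step is reported as a one-element error list. The remove helper is hypothetical.
// Assumes XmlTransformation now extends
// F<Either<NonEmptyList<String>, Document>, Either<NonEmptyList<String>, Document>>,
// using fj.data.Either and fj.data.NonEmptyList.
public abstract class RemoveElement implements XmlTransformation {

    @Override
    public Either<NonEmptyList<String>, Document> f(Either<NonEmptyList<String>, Document> input)
    {
        if (input.isLeft()) {
            return input; // propagate errors accumulated by earlier steps
        }
        Document d = input.right().value();
        try {
            return Either.right(remove(d));
        } catch (Exception ex) {
            return Either.left(NonEmptyList.nel("removeElement failed: " + ex.getMessage()));
        }
    }

    // Hypothetical helper doing the actual DOM work.
    protected abstract Document remove(Document d) throws Exception;
}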
Perhaps a comparison like this is a bit unfair, but with Haskell you get all this already for free!
Prelude> :i (->)
data (->) a b -- Defined in `GHC.Prim'
instance Monad ((->) r) -- Defined in `GHC.Base'
instance Functor ((->) r) -- Defined in `GHC.Base'
instance Applicative ((->) a) -- Defined in `GHC.Base'
instance Monoid b => Monoid (a -> b) -- Defined in `GHC.Base'
;)