Archive | Book Review RSS feed for this section

Skimmer’s Guide for Week 8 of The Well-Grounded Java Developer

24 Mar

“As you can imagine, people are somewhat passionate about their favorite web framework!”

What did we read about?

This week we finish our first book.

  • Chapter 13. Rapid web development

  • Chapter 14. Staying well-grounded

The chapter on rapid development begins by looking at why Java is less than ideal for rapid web development.  It explains how static typing and the need to build and deploy slow down web development.  Matt Raible’s rating scheme is used to assess the available web frameworks, and Grails comes out on top.  An extended practical introduction to Grails, a Groovy-based framework, follows.  The chapter concludes with a brief look at Clojure’s Compojure framework.

The final chapter looked to the future and the new features promised for Java 8.  The introduction of Lambdas will allow for a more functional style of programming using plain old Java.  We also read about the modularity of Project Jigsaw, but the text is a little out of date.  Jigsaw has been postponed to Java 9.  The book concludes by looking further into the future and the promises of Meta Object Protocols, Coroutines and Tuples.
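As a taste of what lambdas bring, here is a minimal sketch using the final Java 8 syntax (my own illustration, not an example from the book, and slightly different from the early proposals that were current when the book was written):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LambdaSketch {
    public static List<String> shortWords(List<String> words) {
        // A lambda replaces an anonymous inner class implementing a
        // single-method interface (here, Predicate<String>).
        return words.stream()
                    .filter(w -> w.length() <= 4)
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(shortWords(Arrays.asList("lambda", "java", "jvm")));
        // prints [JAVA, JVM]
    }
}
```

The same filter/map pipeline written with anonymous inner classes would be several times longer, which is the “more functional style” the chapter promises.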

What stood out?

  • The Grails walkthrough covers a LOT of ground.  If you can spare the time to work through it you won’t be disappointed.

  • For those who are not yet ready to abandon Java there are plenty of useful tips showing how to make Java web development less painful.  They point to tools like Spring Roo and JRebel.

  • The comprehensive tooling of Grails and the design simplicity of Compojure make for an interesting contrast.

If you read nothing else this week…

  • “Criteria in selecting a web framework” uses Matt Raible’s rating scheme to take an objective view of the frameworks available.  It ranks the frameworks using a weighted rating based on 20 criteria.  It’s good to see rational reasoning in a field dominated by tool evangelism.  Remember to change the weightings to reflect your needs.  How does your favourite framework score?

  • If you haven’t got time to work through the Grails example then please spend a couple of hours on Compojure.  It shows how good design can make things easy through simplicity.
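For a sense of how a rating scheme like Raible’s works mechanically, a weighted score is just a sum of weight × rating over the criteria. A tiny sketch (the criteria count, weights and ratings below are hypothetical, not Raible’s actual values):

```java
public class FrameworkRating {
    /** Weighted sum: score = Σ weights[i] * ratings[i]. */
    public static double score(double[] weights, double[] ratings) {
        double total = 0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i] * ratings[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical criteria -- the real scheme uses 20 of them.
        double[] weights = {1.0, 0.5, 0.5};   // how much you care
        double[] ratings = {0.9, 0.6, 1.0};   // how the framework scores
        System.out.println(score(weights, ratings)); // approximately 1.7
    }
}
```

Changing the weights array is exactly the “change the weightings to reflect your needs” advice above.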

Programming Concurrency on the JVM

15 Feb

“Programming Concurrency on the JVM: Mastering Synchronization, STM, and Actors” by Venkat Subramaniam

Are you an experienced Java programmer? Do you want to explore your options for managing and making use of concurrency on the JVM, using either Java or other JVM languages such as Scala, Clojure, Groovy and JRuby? Or maybe you’ve heard about Software Transactional Memory (STM) or the actor-based model and are wondering if there is a way you could make use of them while still writing for the JVM? If so, this book is for you.

The book is the fourth by this author to be featured by the Pragmatic Bookshelf. As was the case for all the books under this label that I’ve read, this book is pleasing to read. At 280 pages it is not very long, and a good part of it is code. The book is available both as a paper book and as an eBook in three formats: epub, PDF and mobi/Kindle.

The author uses simple but nicely chosen problems as a basis for the discussion of different approaches to concurrent programming. The source code for the examples used in the book is available from the publisher’s site. Unfortunately, these are not runnable programs. For instance, in the Java files packages are declared, but there is no corresponding folder structure. So, in order to run the code, you will need to move some bits around. This is an inconvenience that could have been easily avoided.

The book is well structured and, besides the usual introduction and summary parts, consists of four parts: “Strategies for Concurrency”, “Modern Java/JDK Concurrency”, “Software Transactional Memory” and “Actor-Based Concurrency”. I found the “Recap” paragraph at the end of every chapter very useful. It provides a nice summary and will be a useful reference when reaching for the book next time.

The book starts by highlighting the problems that concurrency is trying to solve and points out that the benefits are not free – “The Power and Perils of Concurrency”. What follows is a discussion of possible ways of turning your sequential program into a concurrent one, with consideration of whether the application is IO or computation intensive and how that affects the number of threads needed. One of the first questions you will need to answer when writing a concurrent application is how to deal with state. The author quickly explores three available options: shared mutability, isolated mutability and pure immutability. These first chapters nicely prepare the ground for the later parts.

In the “Modern Java/JDK Concurrency” part, the author looks at the new threading APIs introduced in Java 5 and later. The message for the reader is to forget about the old threading APIs and benefit from:

  • new and better ways of managing a pool of threads and scheduling tasks for concurrent execution,
  • fine-grained synchronization using locks,
  • new concurrent data structures

The section on the Java 7 Fork-Join API is a definite sign that the book is up to date.
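To give a flavour of that Fork-Join API, here is a minimal divide-and-conquer sum (my own sketch, not one of the book’s examples):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] values;
    private final int from, to;

    public SumTask(long[] values, int from, int to) {
        this.values = values; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {          // small enough: sum directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += values[i];
            return sum;
        }
        int mid = (from + to) / 2;             // otherwise split in two
        SumTask left = new SumTask(values, from, mid);
        SumTask right = new SumTask(values, mid, to);
        left.fork();                           // run the left half asynchronously
        return right.compute() + left.join();  // compute right here, then join left
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        long sum = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum); // 50005000
    }
}
```

The pool’s work-stealing scheduler keeps all cores busy without the programmer managing threads directly.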

As a nice bonus, at the end of this part, the author presents a small program with a few flaws and walks through the process of eliminating them.

As the author himself suggests, the book just gives an overview of the solutions available in Java and the JDK. If you need one that goes into more detail you should grab “Java Concurrency in Practice” by Brian Goetz.

Almost half of the book is taken up by the parts on Software Transactional Memory and Actors. These parts have a similar structure. One chapter is devoted to introducing and discussing the concept, with examples in one or two of the JVM languages. The chapter that follows shows how to implement a solution in different JVM languages, often using a library discussed previously. The author diligently highlights potential problems when using libraries across JVM languages and shows how to work around them. I personally liked the few pages of theory before digging into the code examples.

STM is explained in the context of Clojure, which has support for it built into the core language. We are shown how STM enables lock-free programming and eliminates the need for synchronization. The author discusses the suitability of STM for different problems and shows with an example that it is not meant to be used in applications that have a lot of write collisions. The book also covers Multiverse, an alternative implementation of STM in Java, and the support for it in Akka.

In the part on actor-based concurrency, the author uses the Java and Scala APIs of Akka to walk us through different types of actors, ways of exchanging messages between them and coordinating them. Since the examples in Java and Scala are not that different, I think it would be better if only one language were used in the introduction chapter and the examples in the other language were moved to the chapter that follows. Having examples in both is slightly distracting. As with STM, the author concludes the chapter with a discussion of the limitations of the actor model.

The second chapter in this part contains examples in JRuby, as well as in Groovy using GPars.
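For readers new to the model, the essence of an actor can be sketched in plain Java without any framework: private state, a mailbox, and a single thread draining messages one at a time, so the state needs no locks. This is an illustration of the idea only, not Akka’s API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class CounterActor {
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final AtomicInteger count = new AtomicInteger();
    private final Thread worker = new Thread(() -> {
        try {
            while (true) {
                String msg = mailbox.take();     // block until a message arrives
                if (msg.equals("stop")) return;  // poison pill terminates the actor
                if (msg.equals("increment")) count.incrementAndGet();
            }
        } catch (InterruptedException ignored) { }
    });

    public CounterActor()        { worker.start(); }
    public void tell(String msg) { mailbox.add(msg); }  // asynchronous send

    public int awaitCount() {
        try { worker.join(); }                  // wait for the actor to finish
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return count.get();
    }

    public static void main(String[] args) {
        CounterActor actor = new CounterActor();
        for (int i = 0; i < 5; i++) actor.tell("increment");
        actor.tell("stop");
        System.out.println(actor.awaitCount()); // 5
    }
}
```

Akka adds supervision, routing, remoting and much more on top of this core loop, which is what the chapters explore.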


Although not my daily bread, I write some concurrent code every now and then. I reached for this book mainly to learn about STM and actors and to see when and how I could benefit from these approaches to concurrency. And I was not disappointed. The author did a very good job of explaining them and provided a comprehensive overview of the available implementations.

Regarding the title, I would say that the word “mastering” may be a slight overstatement. I think the book is an excellent starting point, and once you choose an approach for your problem there is further reading to be done.

The book provides nice coverage of the options available when programming concurrent applications on the JVM, and although the implementations might not be written in the same language you are working with, they are still very viable options.

I really enjoyed reading the book.

About the Reviewer

Tomasz Wróbel is a developer who spends most of his work day building web applications in Java.  In his free time he explores functional programming with Scala and plays a little bit with Android.

Book Review – Clojure In Action

25 Nov

Disclaimer: The reviewed version was from the Manning Early Access Program in PDF form.


Clojure In Action is a great book for a developer looking to expand their skills into a new language area. It assumes the reader has a reasonable level of understanding of major technologies and practices. This is not a beginner’s book. The writing is thoughtful and clear. A common theme throughout the book is that concepts are developed in much the same way that one might develop a solution in the real world. For example, in the section on messaging, we first look at a cheap and cheerful message passing system; then the author observes that the solution mixes “interfacing with the messaging system” and “handling the messages”. The simple solution is then enhanced to allow a clean separation. The system is then further developed into a large scale distributed processing system with robust error handling etc.

My feeling after reading the book is that learning clojure will make you a better developer, even if you don’t use the language directly. The approach taken by the language, as discussed in this book, will help you think about problems in different ways. Be warned though, if you don’t have familiarity with LISPs there is a lot to learn before you can progress. The author addresses this early on, when he talks in chapter 3 about the ‘productivity ravine’. This is that familiar feeling when, learning a new way of doing something, you experience a steep drop in productivity and have to work hard to climb back up and (hopefully) surpass your previous level.

I’d argue that because the LISP concepts are less familiar than most, the ravine is particularly deep in this case. There are certain concepts that you really have to conquer before you can do anything at all. The first few forays into the clojure REPL will be a frustrating experience. Having said that, the book is structured with this in mind. I would advise studying part 1 closely to build a good foundation before moving on to the real world examples in part 2, or you won’t be making the most of the book.

Structure of the Book

The book is divided into two parts

  1. Getting Started – Need to know all this to ‘grok’ clojure
  2. Getting Real – Can dip in and out as required as a reference to learn the clojure slant on a particular technology

Part I – Getting Started

The Getting Started section is a well pitched overview of the clojure language structure and delves into some of the compelling features that make clojure attractive for certain classes of programs. It tackles head on the barriers, perceived or otherwise, to learning the language. The most common phrase you will hear when anyone discusses a lisp based language like clojure is that the syntax is ‘hard’, or that ‘all those parentheses’ make life difficult. The author makes a good case for why these criticisms are unwarranted.

In my view it is the unfamiliarity, rather than the complexity, of the syntax that jars with newcomers to the language. One of the key points about LISP is that there really is very *little* syntax and that it’s so uniform. An interesting observation is that since clojure is so much more compact than java, if you compare two equivalent programs, one written in clojure and the other in java, it will often be the clojure program that has *fewer* parentheses, and they won’t be as structured!

I found this first section really useful in consolidating my rudimentary knowledge. I enjoyed (re)discovering some aspects of software that I had forgotten. The section on how clojure handles polymorphism via multi-methods was a real eye-opener about the limitations of C++ “inheritance based” dispatch. The great Alan Kay is quoted here to highlight that there are other ways to achieve the goal…

“Actually, I invented the term Object Orientation and I can tell you that C++ is not what I had in mind”.

The author avoids the trap of simply telling us what’s wrong with a particular approach and instead takes the positive path of explaining which of the limitations clojure overcomes.

Rich Hickey (Clojure’s creator) realised that, with the myriad java libraries out there, any JVM language MUST interact with java. I was a bit disappointed with the coverage of this key area in chapter 5. Sure, all the key points are covered, but I felt that they deserved a more practical approach, with more examples to properly illustrate the clojure forms here.

Chapter 6, about state and concurrency, follows a, hopefully by now, familiar pattern. First we read about the problems with the mish-mash of state/identity and traditional approaches to handling these conflicts. Immutability as a default is well justified in the text. We then learn about the tools clojure provides to tackle the different aspects of this thorny problem. This is an area where clojure has made some bold choices. The success of the language as a whole will hinge on whether these choices allow us to solve the upcoming problems with an acceptable level of fuss and ceremony.

Part II – Getting Real

The Getting Real section takes a practical approach to real world problems. It will serve as a useful reference for how to get started with various tasks in clojure.

It starts off in chapter 8 with an introduction to support for Test Driven Development (TDD) in clojure. The section expands into how to deal with method stubbing and mocking and ends with advice about organizing tests. In common with the earlier sections, a good conceptual level of TDD knowledge is assumed; this section is not going to convince you that TDD is the right or wrong thing for you, but if you’re already drinking the TDD kool-aid, you’ll feel very at home here.

The book starts shifting up through the gears in chapter 9, about data storage. There is example code for connecting to MySQL and HBase, and Redis is discussed as well. Obviously each of these technologies has its own chunk of bookshelf dedicated to it, so here we’re just learning enough to be dangerous! The book shows how easy evaluating and hacking around on these technologies is in the clojure REPL. The HBase example stores a deep object graph and retrieves it in a few dozen lines, quite refreshing if you’re used to battling with JPA / Hibernate annotations in Java (the text doesn’t provide anything like a production ready solution, but the potential is clear).

This demonstrates the power of clojure but also highlights one of the major barriers to learning it. Clojure code is very ‘dense’. You can get a lot done in a relatively small number of lines, but to understand those lines you need to have internalized the key APIs. I found it useful to have a REPL open as I read so I could invoke the ‘doc’ and ‘find-doc’ functions easily to help me decode the code I was reading.

The chapter on “clojure and the web” talks about the leading clojure web frameworks. One of the key attractions of web programming is that you can very easily see the fruits of your labours, and this chapter will help you see something in your browser pretty quickly. However, I don’t believe that, currently, clojure is really the right tool for writing complex web apps, and the text doesn’t do much to change my mind. There is very active development in the clojure community in this space, so I’ll certainly revisit this view in a few months’ time.  Simple web-services may be a good fit here, however.

Chapter 11, about scaling through messaging, is where the power of clojure really starts to come through. By the end of the chapter we have developed a distributed processing system in around 200 lines of code. If you’re like me you will spend a long time studying those 200 lines to really decipher their meaning. The amazing possibilities of macros come through here, but again this is where the reader really needs to put the time in to understand what’s going on. This isn’t a criticism of the book; I found myself going back to earlier descriptions a few times and realised that I hadn’t properly followed the author’s explanations the first time round, but that with a bit more knowledge on my part the fog cleared.

Chapter 12 introduces some common ‘large scale’ tasks and talks about clojure / Hadoop integration. Again the author’s style here is helpful in quickly seeing the opportunities clojure can offer in this space.

Chapters 13 – 15 fill in the gaps in our clojure learning with information about the last few features of clojure that haven’t been covered yet. The macro section is excellent but difficult. The author notes that:

“Finally, we’ll look at writing macros that generate other macros. This can be tricky, and we’ll look at a simple example of such a macro. Understanding macro-generating macros is a sign of being on the path to macro zen”

(page 358)

He also addresses the motivations for moving computation from run-time to compile time, a key benefit of macros, that I hadn’t properly appreciated before reading this book. We see an example of generating, at compile time, a cryptographic tableau for a rotation based cypher.

I don’t feel I’ve achieved clojure zen or even macro zen yet, but I’m more confident that I’m on the right path with the help of this book.


About the Reviewer

Neale Swinnerton (@sw1nn) has been a professional software developer for over 20 years, mostly in the financial services industry. He recently started investigating startup opportunities.

Working Effectively with Legacy Code

4 Nov

There are some books that change the way in which we think about code. These are the books that are always close to hand. They become dog-eared and worn through constant use. We come to know their pages intimately and they become more a creed than a reference for us.

For me, Michael Feathers’ Working Effectively with Legacy Code is one such book.  It is impossible for me to write an objective review; I’m just too close to the text.  I’m sure my gushing about how great the book is would soon get boring.  Instead I will try to give an overview of what the book is about so that you can make up your own mind.

There is a crisis coming: a software maintenance crisis.  Simple economic analysis tells us that the crisis is inevitable.  The effort needed to change software increases as it ages.  New software is being written all the time.  The code that companies depend upon is constantly growing and ageing.  The number of skilled programmers is finite and increases slowly.  Something has got to give.

Colossus, the world's first electronic programmable computer.

If we are to cope with this crisis then we must learn better ways to work with our legacy code.  Despite the vital need, very few books have been written on the subject.  Working Effectively with Legacy Code is one of the few, and by far the most practical.  The author defines legacy code as code without tests, and therefore focuses on the task of introducing tests as the most important skill required for dealing with legacy code.

The book is split into three sections.  The first section explains the problem of changing existing code.  The second section is set out as a FAQ for maintenance programmers.  The final section defines a set of dependency breaking techniques that developers can rely upon for safely introducing tests.

The Mechanics of Change

The book starts by introducing a model of software change.  Instead of rehashing tired old material, the author provides his own personal perspective on the subject.  This is a pragmatic approach that has clearly been developed the hard way in the legacy software trenches.  It also has a strong agile bias, which is clear in the definition given for Legacy Software:

Legacy code is simply code without tests…  Code without tests is bad code.  It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is.  With tests we can change the behaviour of our code quickly and verifiably.  Without them, we really don’t know if our code is getting better or worse.

Page xvi

How can we change software?  We can alter its structure, modify existing functionality, introduce new functionality or change the way in which resources are used.  We can make these changes by adding features, fixing bugs, refactoring or optimising.

The biggest challenge lies not in what we change, but in ensuring that we don’t inadvertently change something we shouldn’t. Every time we make a change there is a considerable risk that we introduce new bugs. This is why tests are so important; without them we can break important functionality without realising.

If the code doesn’t have automated tests, then our first goal should be to introduce them. We should never try to make changes without the tests being in place to serve as a safety net. This leads us to a dilemma: how can we introduce the tests if we cannot make changes?  Is it realistic to refuse to do any bug fixes or change requests until 100% test coverage has been achieved?

To navigate through this dilemma a Legacy Code Change Algorithm is presented:

  1. Identify change points.
  2. Find test points.
  3. Break dependencies.
  4. Write tests.
  5. Make changes and refactor.

All of the techniques described in the book have the goal of achieving one of these steps.
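For the “write tests” step, the book leans on characterization tests: tests that pin down whatever the code currently does, right or wrong, so that later changes cannot alter behaviour unnoticed. A minimal sketch (the formatPrice method is an invented stand-in for some legacy code, not an example from the book):

```java
public class CharacterizationTest {
    // The "legacy" method under test -- a hypothetical example.
    static String formatPrice(double price) {
        return "$" + Math.round(price * 100) / 100.0;
    }

    public static void main(String[] args) {
        // A characterization test records current behaviour, whatever it is.
        // We don't yet judge whether "$3.0" is *right*; we pin it down so
        // that later refactorings cannot change it silently.
        String actual = formatPrice(2.999);
        if (!actual.equals("$3.0")) {
            throw new AssertionError("behaviour changed: " + actual);
        }
        System.out.println("pinned: " + actual);
    }
}
```

Once a net of such tests is in place, the refactoring in step 5 can proceed safely.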

There are three critical concepts that must be understood before these techniques can be successfully adopted: sensing, separation and seams.  Sensing and separation are the two goals of dependency breaking.  Sensing involves accessing the values computed so that they can be tested.  Separation involves accessing the code so that it can be loaded into a test harness.

The key method is the faking of collaborators.  Here the book is showing its age a little, since it talks about creating fake objects and only briefly mentions mocks.  Obviously, this area has moved on a lot since the book was written, but the theory is still sound.

Seams are the places that allow us to inject new behaviour without having to change the code.  With seams the code can be pulled open to allow access.  There are three types of seam: pre-processor, link and object.  As Java programmers we are mostly concerned with object seams.
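As a concrete illustration of an object seam in Java (the class names here are invented, not from the book): the production class routes a dependency through a method that a testing subclass can override, giving us both separation and sensing:

```java
// Production code: sending mail is the dependency we want out of our tests.
class MailSender {
    void send(String to, String body) {
        // ...would talk to a real SMTP server in production...
        System.out.println("sending real mail to " + to);
    }
}

class OrderProcessor {
    private final StringBuilder log = new StringBuilder();

    void process(String customer) {
        log.append("processed:").append(customer);
        sendConfirmation(customer);          // the object seam
    }

    // Enabling point: a subclass can substitute behaviour here.
    protected void sendConfirmation(String customer) {
        new MailSender().send(customer, "Thanks for your order!");
    }

    String log() { return log.toString(); }
}

// Test code: override the seam so no real mail is sent.
public class SeamDemo {
    static String runWithFake() {
        final StringBuilder sent = new StringBuilder();
        OrderProcessor p = new OrderProcessor() {
            @Override protected void sendConfirmation(String customer) {
                sent.append("fake-mail:").append(customer);  // sense the call
            }
        };
        p.process("alice");
        return p.log() + " " + sent;
    }

    public static void main(String[] args) {
        System.out.println(runWithFake()); // processed:alice fake-mail:alice
    }
}
```

The override separates the class from its mail dependency and lets the test sense that the confirmation was requested.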

The section concludes with a discussion of the tools available at the time of writing.  Again, the book’s age is apparent in this discussion.

A centrifugal governor

Changing Software

The second section reads like an extended FAQ.  It presents a list of problems such as “my application has no structure”, “I need to make a change, what methods should I test” and “I need to make many changes in one area.”  For each problem there is a discussion of techniques that may be used to overcome it.  There is a wide range of possible techniques, adapted to fit the problem being discussed.

The techniques for “my application has no structure” are high level and designed to promote discussion and collaboration.    One technique is to tell the story of the system: describing the most important aspects of the system in a few simple sentences and then drilling down into the details.  Another technique is to use Naked CRC cards where blank cards are placed on the table but not written on.  Instead they act as counters in the discussion, being moved around to represent the various activities within the code base.

The techniques for “I need to make a change, what methods should I test” are more practical.  In this chapter one of the most interesting tools in the book is described: effect sketches.  Effect sketches describe how the different elements of the code affect each other, allowing us to reason forward about the effects that a change in one place may have elsewhere.  Mark Needham has written an interesting blog post on effect sketches.

The chapter “I need to make many changes in one area” demonstrates how effect sketches might be used.  The problem arises when there are several closely related classes that all need to be changed.  Breaking dependencies for every class to introduce tests requires a lot of effort before any changes can be made, which may not be practical.  The answer is to take a step back and write a single well placed test that will cover the whole area being changed.  The trick is in finding the right place to put the covering test.

The place to put the covering test is called a pinch point:

A pinch point is a natural encapsulation boundary.  When you find a pinch point, you’ve found a narrow funnel for all of the effects of a large piece of code.  If the method BillingStatement.makeStatement is a pinch point for a bunch of invoices and items, we know where to look when the statement isn’t what we expected.

Page 182

Section two is full of techniques like these that help resolve some of the conflicting concerns faced whilst working with Legacy Code.  You probably won’t use all of them, but you are sure to find some of them invaluable.

Advice is given for all of the following situations related to each stage in the Legacy Code Change Algorithm:

  1. Identify change points
    • I don’t understand the code well enough to change it.
    • My application has no structure.
  2. Find test points.
    • I need to make a change.  What methods should I test?
    • I need to make many changes in one area.
  3. Break dependencies.
    • I can’t get this class into a test harness.
    • I can’t run this method in a test harness.
    • Dependencies on libraries are killing me.
    • My application is all API calls.
  4. Write tests.
    • I need to make a change, but I don’t know what tests to write.
    • My test code is in the way.
    • I need to change a monster method and I can’t write tests for it.
  5. Make changes and refactor.
    • I don’t have much time and I have to change it.
    • It takes forever to make a change.
    • How do I add a feature?
    • My project is not object oriented.  How do I make safe changes?
    • This class is too big and I don’t want it to get any bigger.
    • I’m changing the same code all over the place.
    • How do I know that I’m not breaking anything?

Finally there is a pep talk to help keep us going.

  • We feel overwhelmed.  It isn’t going to get any better.

Dependency-Breaking Techniques

The final section provides a catalogue of refactorings that allow the structure to be changed while preserving behaviour.  There is some overlap here with the refactorings described in Martin Fowler’s book, but these differ because they are adapted for use before the unit tests are in place.  The following is a brief overview of each technique:

  1. Adapt Parameter
    Introduce a simple interface to encapsulate parameters that are difficult to fake. For example, if a method is only using one or two methods of a complex object then a simple interface with just those two methods is introduced.  A production implementation of that interface wraps the complex object while the test implementation is a simple mock.
  2. Break Out Method Object
    Separate a long method into a method object of its own.  Local variables within the method can then become instance variables that can be accessed externally.
  3. Definition Completion
    This technique is specific to C and C++ but may be adapted.  The definitions are included from the header file and completed with test implementations.  Test code must be kept strictly separate to avoid clashes at link time.
  4. Encapsulate Global Reference
    It is a truth universally acknowledged that global variables are bad.  Making those global references instance references on a global class is one way to deal with them.
  5. Expose Static Method
    Move some code out of a method and into a static method where it is accessible to testing.  This is an intermediate technique that allows us to get some tests in place.  Then bolder refactorings can be used to make the code right.
  6. Extract and Override Call
    If there is just one method call that is getting in the way of introducing tests we can extract and override it.  This is used often and can be achieved using the Extract Method refactoring.
  7. Extract and Override Factory Method
    Object creation within constructors can be a nuisance.  Extracting that code into a factory method where it can be overridden is a powerful technique if the language will allow it.  It’s not possible in C++ because a constructor cannot call virtual functions in its subclasses.
  8. Extract and Override Getter
    Where object creation is causing problems it can be encapsulated in a getter that can then be overridden in a testing subclass.  This can be used to deal with object creation within constructors for C++ programmers.
  9. Extract Implementer
    Extracting an interface is a powerful method, but naming can be a problem without an automated refactoring tool.  In that case you might want to extract all of the implementation into a subclass and turn the class into an interface.  This is a tricky refactoring and thankfully Java developers don’t need to use it much.
  10. Extract Interface
    Programming to interfaces is the first principle of reusable object oriented design.  Where this principle has not been applied it is easily and safely introduced with a simple refactoring.
  11. Introduce Instance Delegator
    Static methods can be useful, but they can cause grief with static cling if the method’s behaviour is difficult to achieve in test.  Creating an instance method that acts as a proxy can solve this problem.  
  12. Introduce Static Setter
    Some consider singletons to be an anti-pattern.  If a singleton is causing you problems then introducing a static setter will allow you to fake its behaviour.  Be careful, though, because you will have to make the constructor non-private.
  13. Link Substitution
    Some poor deprived programmers are still working in a procedural language and are denied the pleasure of polymorphism.  An alternative technique is to replace one function with another through link substitution.  Alternative definitions are placed in include files that can be switched at build time.
  14. Parameterize Constructor
    An easy way to avoid object construction within a constructor is to pass the object in as a parameter.  One small concern is that this can widen dependencies across the system.
  15. Parameterise Method
    Object construction within methods can also be a problem.  Again, passing the object in as a parameter can help.
  16. Primitivise Parameter
    Sometimes we have to make things worse before we can make them better;  especially when dealing with monster dependencies within a complex domain model.  This technique requires us to develop new functionality in a free function that takes primitive parameters and then have the domain object delegate to that function.  Not desirable, but sometimes necessary.
  17. Pull Up Feature
    Move methods into an abstract superclass so that they can be accessed from a subclass for testing.
  18. Push Down Dependency
    Move problematic dependencies into a sub class and out of the way.
  19. Replace Function with Global Pointer
    Used in procedural languages to introduce a Link Seam.
  20. Replace Global Reference with Getter
    Global variables are a real nuisance when introducing tests.  By encapsulating the global with a getter method it allows a test subclass to override the global with something more suitable.
  21. Subclass and Override Method
    Problematic dependencies can be avoided by creating a subclass and overriding the troublesome method.
  22. Supersede Instance Variable
    Used with C++ because virtual function calls made from within a constructor do not dispatch to subclass overrides.
  23. Template Redefinition
    Used in languages like C++ to exploit generic templates.
  24. Text Redefinition
    Used in interpreted languages like Ruby that allow methods to be redefined at runtime.
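To make item 14 (Parameterize Constructor) concrete, here is a minimal Java sketch; the class and method names are my own invention, not taken from the book:

```java
// Illustrative names only. Before the change, Notifier's constructor
// called `new SmtpSender()` itself, leaving no way to substitute a stub.
interface MailSender { String send(String to); }

class SmtpSender implements MailSender {
    public String send(String to) { return "smtp:" + to; }
}

class Notifier {
    private final MailSender sender;

    // The new seam: the collaborator is passed in as a parameter.
    Notifier(MailSender sender) { this.sender = sender; }

    // A forwarding constructor preserves the old signature, so most
    // callers never see the widened dependency the text warns about.
    Notifier() { this(new SmtpSender()); }

    String notifyUser(String user) { return sender.send(user); }
}
```

A test simply passes in a stub (`new Notifier(to -> "stub:" + to)`), while production code keeps calling the no-argument constructor.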
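Item 20 (Replace Global Reference with Getter) might look like this in Java, where the "global" is a static field; names are invented for the sketch:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative names only. STOCK stands in for the problematic global.
class Inventory {
    static Map<String, Integer> STOCK = new HashMap<>();

    // The seam: all access to the global goes through an overridable getter.
    protected Map<String, Integer> getStock() { return STOCK; }

    public int available(String sku) {
        return getStock().getOrDefault(sku, 0);
    }
}
```

A test subclass overrides `getStock()` to return a map of its own choosing, leaving the real global untouched.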
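Item 21 (Subclass and Override Method) is perhaps the workhorse of the catalogue; a hedged Java sketch, again with invented names:

```java
// Illustrative names only. In production generate() uploads the report;
// the test subclass overrides just the troublesome method.
class Report {
    public String generate() {
        String body = "totals: 42";
        upload(body);              // the dependency we want to avoid in tests
        return body;
    }

    // Protected so that a subclass can replace it.
    protected void upload(String body) {
        throw new IllegalStateException("no network available in this sketch");
    }
}

class TestableReport extends Report {
    @Override protected void upload(String body) { /* skip the network */ }
}
```

The production class is unchanged apart from loosening the method's visibility; all the test-enabling behaviour lives in the subclass.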

Coherence 3.5 Book Review

11 Oct

I’ve noticed Oracle Coherence is a technology that is frequently mentioned in job postings, and I’ve decided to get to know it better.  To be honest, I’ve been avoiding these distributed cache solutions for a while now.  I was involved in a project that used Terracotta and the experience left me rather cynical about the approach.  I have always loved my relational database and believed in achieving scale through a clustered RDBMS and a load balanced, stateless web application.

I’ve believed that the whole NoSQL movement can safely be ignored.  For some reason the Object Oriented community have always had a problem with Relational Databases despite lacking a credible alternative.  Different approaches have come and they’ve gone.  The early efforts, such as Smalltalk’s image files and Python’s ‘pickling’, always seemed to me to be like reinventing the wheel and ending up with a triangle.

Over the years the wheels have gained more facets and become almost bearable.  The funny thing is that when you move far enough away from the three-sided ‘circle’, you start to get something that begins to resemble a real circle.  A ten-sided decagon looks very like a circle, and a thousand-sided chiliagon is indistinguishable from a circle to the human eye.

I tell you this so that you can understand that I am a reluctant reader of this book.  We developers are always having to learn new technologies to keep our skills fresh.  I love reading a good book, but I came to this one as a utilitarian task to be completed.  I have to learn about this technology, and I look for a book written by a competent author who will share with me not just the flawed theory but, more importantly, their experiences of dealing with all the subsequent difficulties.

Packt Publishing have a good record in this area, producing niche books written by people with an enthusiasm for their esoteric areas, which is why I chose the book “Oracle Coherence 3.5” by Aleksandar Seovic, Mark Falco and Patrick Peralta.  Seovic wrote the majority of the book, while Falco and Peralta each contributed individual chapters.  They did a great job, far exceeding what I had hoped for.  The book helped me to see that I have been wrong.

The first chapter gets off to an excellent start. It begins by looking at the problem that we want to solve: the challenges of achieving performance, scalability and availability. The problem is well framed, with the underlying issues of latency, bandwidth, performance and state quickly introduced. There follows a brief survey of the database solutions of replication, clustering and sharding. It is assumed that the reader is an architect who is already familiar with these concepts, but that doesn’t stop the authors from providing an excellent overview. As they provide an objective review of the pros and cons of each solution it is clear that they have a solid grasp of the subject.

With the groundwork done Oracle Coherence is introduced, and here the objectivity disappears. The authors’ bias for their chosen subject is clear, but this isn’t a problem. The whole reason for wanting to read this book is to get the expert’s perspective, and we should expect experts to be biased.  Thankfully the authors do not attempt to hide their bias behind a dry tone. Instead they allow their enthusiasm to shine through with a conversational style and flowing text.

The following pages are preoccupied with Coherence’s pros. This worried me, as I had once inherited a system where the architect had believed the ‘Snap In, Speed Up and Scale Out’ claims of the Terracotta marketing. They had used it to solve performance problems without addressing issues with the application architecture and database design.  It didn’t work. If the authors attempted to claim that the problems of speed and scalability could be solved simply by introducing Coherence they would lose all credibility. I was pleased to see that they did not:

Coherence can take you a long way towards meeting your performance, scalability and availability objectives, but that doesn’t mean that you can simply add it to the system as an afterthought and expect all your problems will be solved… Doing this requires careful consideration of the problems and evaluation of the alternatives.

Page 30

The chapter then concludes by considering the importance of design, monitoring and team education.  Quite right.  The author had won me over and I was looking forward to what was to follow.

Moving to the second chapter involves a shift in gears: from discussing the high level architectural issues to the very low level activities of downloading and running Coherence locally. So many books fall down in this regard, providing instructions that simply don’t work and forcing the reader to solve difficult problems in order to keep up. This is the first hurdle where readers are lost.

First I have to download Coherence, then get it up and running, and finally start writing some code. At each step anything can go wrong. Installation involves finding the distribution, signing up to Oracle’s developer network and unzipping the content. This all goes smoothly.  The links provided still work and signing up to the Oracle Developer Network was painless.  The book told me everything I needed to know.  It seems strange that the author uses JetBrains’ IDEA rather than Eclipse, but this doesn’t cause me any problems.  The dependencies are simple and the ideas are easily adapted.

Some hands-on tutorials follow and I’m impressed by Coherence’s simplicity.  It only requires one JAR, with some optional extras.  I can create and populate caches either from the command line or through the simple API.  It’s all very simple, perhaps too simple.  The chapter concludes with a useful cache loader example and some sage advice for testing and debugging.  Comments here directly address my concerns regarding oversimplification:

However, you should bear in mind that by doing so you are not testing your code in the same environment it will eventually run in. All kinds of things change internally when you move from a one-node cluster to two-node cluster.

Page 72

If I am to use Coherence within my own architecture I need to understand what lies beneath.  What concerns need to be addressed?  What strategies might I adopt?  The two chapters that follow, covering cache planning and implementing a domain model, set all of these things out with clarity.  After the gentle warm-up of the preceding chapters the reader has to work hard to get through, but the effort is well worth it.  Some of the concepts were familiar to me from database clustering, such as the replicated and partitioned topologies.  New concepts, such as backing maps and the near cache, are also introduced in the third chapter.

In the fourth chapter a Domain Driven strategy is presented.  Familiarity with Eric Evans’ book is assumed here, and I would hate to have to work through this chapter without knowing it well.  The concepts from Chapter 3 are given practical application through Domain Driven patterns such as entities, aggregates and repositories.

The discussion around Entities is worth the price of the book alone.  Consider, for example, the following observation:

One of the most common mistakes that beginners make is to treat Coherence as an in-memory database and create caches that are too finely grained. For example, they might configure one cache for orders and a separate cache for line items.

While this makes perfect sense when using a relational database, it isn’t the best approach when using Coherence. Aggregates represent units of consistency from a business perspective, and the easiest way to achieve atomicity and consistency when using Coherence is to limit the scope of mutating operations to a single cache entry.

Page 118

As an architect looking to use Coherence this is exactly the type of knowledge I am looking for.  Learning this the hard way could be so very expensive.  It also challenges my own perception of Coherence as a type of database.

The chapter continues with deep issues such as identity management and data affinity, and concludes with a discussion of the implications of Object Serialisation and schema evolution.  It’s tough going, and it took me a long time to get through.  I found myself regularly having to go back and reread sections before I could begin to understand them.  This does not reflect badly on the authors; they have made this information as accessible as they possibly could without losing substance.

Making my way through these chapters was a rewarding experience.  I learned a lot, but I couldn’t help the nagging doubt that all of this detail justified my belief in the Relational Database approach.  The Relational Model provides abstractions that allow a developer to avoid having to understand these things.

Chapter 5 reinforced this opinion.  Querying the data grid involves the definition of Value Extractors and Aggregators, which are clearly explained.  Practical strategies are introduced to lighten the load, such as the definition of a FilterBuilder that enables queries in the following format:

Filter filter = new FilterBuilder(ReflectionExtractor.class)
    .equals("getCustomerId", 123)
    .greater("getTotal", 1000.0);

That’s nice but isn’t it easier to just use SQL?  Isn’t this a lot of hard work to reinvent the database?  Compare the above with the equivalent SQL:

select * from Customer
where id = 123
and total > 1000

Isn’t this exactly the case I was talking about earlier, where the many faceted complexity begins to resemble the simple solution?

Had all my hard work been a waste, the type of accidental complexity I’ve been trying to avoid?  Perhaps not.  As I read on, the use of Aggregators showed one possible benefit:

By using an aggregator, we limit the amount of data that needs to be moved across the wire to the aggregator instance itself, the partial results returned by each Coherence node the aggregator is evaluated on, and the final result. This reduces the network traffic significantly and ensures that we use the network as efficiently as possible. It also allows us to perform the aggregation in parallel, using full processing power of the Coherence cluster.

Page 184

Certainly, this is also possible with an RDBMS and some good design, but the developer does not have direct control over it.  Anybody who has ever spent their days poring over execution plans and statistics, trying to introduce just the right indexes and hints to persuade a reluctant optimiser, will know just how frustrating it can be.  The ability to directly define the parallel paths to take is powerful and desirable.
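The idea can be illustrated with a toy sketch in plain Java, with streams standing in for Coherence nodes (this is not the Coherence API): each “node” reduces its own partition to a partial result, and only the partials travel back to the caller.

```java
import java.util.List;

class GridAggregation {
    // What a single node would compute over its local partition.
    static double partialSum(List<Double> partition) {
        return partition.stream().mapToDouble(Double::doubleValue).sum();
    }

    // Only one double per partition "crosses the network"; each partial
    // could be computed in parallel, one per node.
    static double aggregate(List<List<Double>> partitions) {
        return partitions.stream().mapToDouble(GridAggregation::partialSum).sum();
    }
}
```

The real cluster does the same thing with an aggregator object shipped to each node, but the shape of the data flow is the point.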

The next chapter, 6, builds on this by introducing Parallel and In-Place Processing, a powerful technique that shows just why Coherence might be chosen over an RDBMS.  Three methods are provided: Entry Processors, the Invocation Service and the CommonJ Work Manager specification.  These methods allow the processing to be distributed across the cluster along with the data.  Not only does this avoid the need to move data across the network, it also allows processing to be completed in parallel.  Chapter 7 discusses the processing of Data Grid Events and expands further on the potential for an alternative architecture based on processing map events.  Listeners can be registered to respond after a change has occurred.  Triggers can execute before the event, with the option of transforming or rejecting the update.
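A toy sketch of the entry-processor idea, using a plain concurrent map rather than the real Coherence API: the mutation travels to the entry and is applied in place, so the value itself never makes a round trip.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Not the Coherence API; a stand-in to show the shape of the technique.
class ToyCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    public void put(K key, V value) { entries.put(key, value); }
    public V get(K key) { return entries.get(key); }

    // Analogue of invoking an entry processor against a single key:
    // the function runs atomically where the entry lives.
    public V invoke(K key, BiFunction<K, V, V> processor) {
        return entries.compute(key, processor);
    }
}
```

In Coherence the processor is serialised and executed on the node that owns the key, which is what makes the in-place mutation cheap.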

It was reading these chapters that I finally started to understand the author’s enthusiasm for Coherence.  The passion was clear in the previous chapters, but unfathomable.  I had considered it a piece of middleware, something to be introduced later in the development cycle to improve performance and scalability.  The potential for concurrency changes all of this.  Coherence becomes a platform to be targeted, an alternative architecture.  To be honest, I can already think of a project or two that I could have used it on.  I am now regretting my past reluctance to consider new ideas.

When I compare SQL with the Coherence filter I fail to take into consideration all of the JDBC code or Hibernate configuration that is needed to join the Object and Relational worlds, and the cultural separation that has grown between developers and database administrators.  Working through the examples, the hard work of the preceding chapters pays off as I get to build upon the foundations laid.  True, the result is almost relational, but the integration between code and data is so much more elegant.  It provides the architect and developers with more control and more opportunities to discover a clean and effective solution.  It shows a path towards a simpler solution that combines both the data and the code.

The relational database returns in chapter 8, with a discussion of how the persistence layer might be implemented.  The patterns of cache-aside, read-through, write-through and write-behind are all familiar.  The relevant implementation details are described, practical matters considered, and the relevant low-level details of Coherence introduced.  The RDBMS isn’t the only possible persistent store: the backing map could also front services or a legacy application.

Chapters 9 to 11 introduce the details of the transport layer.  There are two proprietary application-layer protocols available: the Tangosol Cluster Message Protocol (TCMP) and Coherence*Extend.  TCMP is UDP based and used internally by Coherence; it is intended for clusters that sit together on a single LAN.  Coherence*Extend is intended for access across wide area networks, and can be used to access a Coherence cluster from .NET or C++.

Chapter 12 concludes the book with sage advice on selecting the right tool to achieve performance, scalability and high availability.  This chapter puts Coherence into its context within the software architect’s toolset.  The last page quotes Charles Connell on the topic of beautiful software.

Beautiful programs work better, cost less, match user needs, have fewer bugs, run faster, are easier to fix, and have a longer life span.

Beautiful software is achieved by creating a wonderful whole which is more than the sum of its parts. Beautiful software is the right solution, both internally and externally, to the problem presented to its designers.

The author concludes that Coherence is beautiful software, and he has made a strong case.  I began the book with a utilitarian purpose but I have finished with an aesthetic appreciation.  I had hoped that the book might help move my career forward a few steps; instead it has set me upon a whole new path.