There are some books that change the way in which we think about code. These are the books that are always close to hand. They become dog-eared and worn through constant use. We come to know their pages intimately and they become more a creed than a reference for us.
For me Michael Feathers’ Working Effectively with Legacy Code is one such book. It is impossible for me to write an objective review; I’m just too close to the text. I’m sure my gushing about how great the book is would soon get boring. Instead I will try to give an overview of what the book is about so that you can make up your own mind.
There is a crisis coming: a software maintenance crisis. Simple economic analysis tells us that the crisis is inevitable. The effort needed to change software increases as it ages. New software is being written all the time. The code that companies depend upon is constantly growing and ageing. The number of skilled programmers is finite and increases slowly. Something has got to give.
If we are to cope with this crisis then we must learn better ways to work with our legacy code. Despite the vital need, very few books have been written on the subject. Working Effectively with Legacy Code is one of the few and by far the most practical. The author defines legacy code as code without tests, and therefore focuses on the task of introducing tests as the most important skill required for dealing with legacy code.
The book is split into three sections. In the first section the problem of changing existing code is explained. The second section is set out as a FAQ for maintenance programmers. The final section defines a set of dependency-breaking techniques that developers can rely upon for safely introducing tests.
The Mechanics of Change
The book starts by introducing a model of software change. Instead of rehashing tired old material, the author provides his own personal perspective on the subject. This is a pragmatic approach that has clearly been developed the hard way in the legacy software trenches. It also has a strong agile bias, which is clear in the definition given for legacy code:
Legacy code is simply code without tests… Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests we can change the behaviour of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.
How can we change software? We can alter its structure, modify existing functionality, introduce new functionality or change the way in which resources are used. We can make these changes by adding features, fixing bugs, refactoring or optimising.
The biggest challenge lies not in what we change, but in ensuring that we don’t inadvertently change something we shouldn’t. Every time we make a change there is a considerable risk that we introduce new bugs. This is why tests are so important: without them we can break important functionality without realising.
If the code doesn’t have automated tests, then our first goal should be to introduce them. We should never try to make changes without the tests being in place to serve as a safety net. This leads us to a dilemma: how can we introduce the tests if we cannot make changes? Is it realistic to refuse to do any bug fixes or change requests until 100% test coverage has been achieved?
To navigate through this dilemma a Legacy Code Change Algorithm is presented:
- Identify change points.
- Find test points.
- Break dependencies.
- Write tests.
- Make changes and refactor.
All of the techniques described in the book have the goal of achieving one of these steps.
There are three critical concepts that must be understood before these techniques can be successfully adopted: sensing, separation and seams. Sensing and separation are the two goals of dependency breaking. Sensing involves accessing the values computed so that they can be tested. Separation involves accessing the code so that it can be loaded into a test harness.
The key method is the faking of collaborators. Here the book is showing its age a little, since it talks about creating fake objects and only briefly mentions mocks. Obviously, this area has moved on a lot since the book was written, but the theory is still sound.
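As a rough illustration of sensing through a fake collaborator, here is a minimal Java sketch. The names `Notifier`, `OrderProcessor` and `FakeNotifier` are hypothetical and do not come from the book; the point is only that the fake records what the code under test asked it to do, so that a test can inspect it afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator that would normally talk to the outside world.
interface Notifier {
    void send(String message);
}

// Code under test: it depends only on the Notifier interface.
class OrderProcessor {
    private final Notifier notifier;

    OrderProcessor(Notifier notifier) {
        this.notifier = notifier;
    }

    void process(String orderId, int quantity) {
        if (quantity <= 0) {
            notifier.send("Rejected order " + orderId);
        } else {
            notifier.send("Accepted order " + orderId);
        }
    }
}

// Fake used only in tests: it records every message it receives (sensing).
class FakeNotifier implements Notifier {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String message) {
        sent.add(message);
    }
}

class FakeNotifierDemo {
    public static void main(String[] args) {
        FakeNotifier fake = new FakeNotifier();
        new OrderProcessor(fake).process("A-1", 0);
        System.out.println(fake.sent); // the recorded messages are what a test would check
    }
}
```

A modern mocking library could generate the fake for you, but a hand-written one like this is often enough to get the first tests in place.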
Seams are the places that allow us to inject new behaviour without having to change the code. With seams the code can be pulled open to allow access. There are three types of seam: pre-processor, link and object. As Java programmers we are mostly concerned with object seams.
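An object seam might look something like the following Java sketch. The classes `Sensor`, `Thermostat` and `FixedSensor` are invented for illustration. The call `sensor.read()` is the seam: which implementation runs depends on the object passed in, so the behaviour can be changed from a test without editing `Thermostat` at all. The parameter list is what Feathers calls the enabling point.

```java
// Hypothetical class that cannot run in a test because it needs real hardware.
class Sensor {
    double read() {
        throw new UnsupportedOperationException("requires real hardware");
    }
}

class Thermostat {
    // sensor.read() is the seam: the method that runs is chosen by the
    // object the caller passes in, not by this code.
    boolean shouldHeat(Sensor sensor, double targetCelsius) {
        return sensor.read() < targetCelsius;
    }
}

// Test-only subclass that returns a canned reading.
class FixedSensor extends Sensor {
    private final double value;

    FixedSensor(double value) {
        this.value = value;
    }

    @Override
    double read() {
        return value;
    }
}

class ObjectSeamDemo {
    public static void main(String[] args) {
        Thermostat thermostat = new Thermostat();
        System.out.println(thermostat.shouldHeat(new FixedSensor(18.0), 20.0));
    }
}
```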
The section concludes with a discussion of the tools available at the time of writing. Again, the book’s age is apparent in this discussion.
The second section reads like an extended FAQ. It presents a list of problems such as “my application has no structure”, “I need to make a change, what methods should I test” and “I need to make many changes in one area.” For each problem there is a discussion of techniques that may be used to overcome it. There is a wide range of possible techniques, adapted to fit the problem being discussed.
The techniques for “my application has no structure” are high level and designed to promote discussion and collaboration. One technique is to tell the story of the system: describing the most important aspects of the system in a few simple sentences and then drilling down into the details. Another technique is to use Naked CRC cards where blank cards are placed on the table but not written on. Instead they act as counters in the discussion, being moved around to represent the various activities within the code base.
The techniques for “I need to make a change, what methods should I test” are more practical. In this chapter one of the most interesting tools in the book is described: effect sketches. Effect sketches are used to describe how the different elements of the code affect each other, allowing us to reason forward about the effects that a change in one place may have elsewhere. Mark Needham has written an interesting blog post on effect sketches: http://www.markhneedham.com/blog/2009/11/04/reading-code-unity/.
The chapter “I need to make many changes in one area” demonstrates how effect sketches might be used. The problem arises when several closely related classes all need to be changed. Breaking dependencies for every class to introduce tests requires a lot of effort before any changes can be made, which may not be practical. The answer is to take a step back and write a single well-placed test that will cover the whole area being changed. The trick is in finding the right place to put the covering test.
The place to put the covering test is called a pinch point:
A pinch point is a natural encapsulation boundary. When you find a pinch point, you’ve found a narrow funnel for all of the effects of a large piece of code. If the method BillingStatement.makeStatement is a pinch point for a bunch of invoices and items, we know where to look when the statement isn’t what we expected.
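To make the idea concrete, here is a hedged Java sketch of what testing at such a pinch point could look like. Only the name `BillingStatement.makeStatement` comes from the quotation; the `Invoice` and `Item` classes, their fields and the statement format are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes sitting behind the pinch point.
class Item {
    final String name;
    final int priceInCents;

    Item(String name, int priceInCents) {
        this.name = name;
        this.priceInCents = priceInCents;
    }
}

class Invoice {
    private final List<Item> items = new ArrayList<>();

    void add(Item item) {
        items.add(item);
    }

    int total() {
        int sum = 0;
        for (Item item : items) {
            sum += item.priceInCents;
        }
        return sum;
    }
}

class BillingStatement {
    private final List<Invoice> invoices = new ArrayList<>();

    void add(Invoice invoice) {
        invoices.add(invoice);
    }

    // The pinch point: every effect of Item and Invoice funnels through here,
    // so one test on this method covers changes to all three classes.
    String makeStatement() {
        int total = 0;
        for (Invoice invoice : invoices) {
            total += invoice.total();
        }
        return "invoices=" + invoices.size() + " total=" + total;
    }
}

class PinchPointDemo {
    public static void main(String[] args) {
        Invoice invoice = new Invoice();
        invoice.add(new Item("widget", 250));
        BillingStatement statement = new BillingStatement();
        statement.add(invoice);
        System.out.println(statement.makeStatement());
    }
}
```

A covering test would build a few invoices, call `makeStatement` and compare the result with the text the current code produces, before any of the classes behind it are changed.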
Section two is full of techniques like these that help resolve some of the conflicting concerns faced whilst working with Legacy Code. You probably won’t use all of them, but you are sure to find some of them invaluable.
Advice is given for all of the following situations related to each stage in the Legacy Code Change Algorithm:
- Identify change points.
  - I don’t understand the code well enough to change it.
  - My application has no structure.
- Find test points.
  - I need to make a change. What methods should I test?
  - I need to make many changes in one area.
- Break dependencies.
  - I can’t get this class into a test harness.
  - I can’t run this method in a test harness.
  - Dependencies on libraries are killing me.
  - My application is all API calls.
- Write tests.
  - I need to make a change, but I don’t know what tests to write.
  - My test code is in the way.
  - I need to change a monster method and I can’t write tests for it.
- Make changes and refactor.
  - I don’t have much time and I have to change it.
  - It takes forever to make a change.
  - How do I add a feature?
  - My project is not object oriented. How do I make safe changes?
  - This class is too big and I don’t want it to get any bigger.
  - I’m changing the same code all over the place.
  - How do I know that I’m not breaking anything?
Finally there is a pep talk to help keep us going.
- We feel overwhelmed. It isn’t going to get any better.
The final section provides a catalogue of refactorings that allow the structure to be changed while preserving behaviour. There is some overlap here with the refactorings described in Martin Fowler’s book, but these differ because they are adapted for use before the unit tests are in place. The following is a brief overview of each technique:
- Adapt Parameter
Introduce a simple interface to encapsulate parameters that are difficult to fake. For example, if a method is only using one or two methods of a complex object then a simple interface with just those methods is introduced. A production implementation of that interface wraps the complex object while the test implementation is a simple fake.
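A minimal Java sketch of Adapt Parameter might look like this. `LegacyRequest` is a stand-in for some heavyweight framework type, and `ParameterSource`, `RequestParameterSource`, `FakeParameterSource` and `PageFilter` are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a heavyweight type that is awkward to construct in a test.
// Imagine dozens of further methods that the code under test never uses.
class LegacyRequest {
    private final Map<String, String> params;

    LegacyRequest(Map<String, String> params) {
        this.params = params;
    }

    String getParameterValue(String name) {
        return params.get(name);
    }
}

// Narrow interface exposing only what the method under test needs.
interface ParameterSource {
    String getParameterForName(String name);
}

// Production implementation: wraps the complex object.
class RequestParameterSource implements ParameterSource {
    private final LegacyRequest request;

    RequestParameterSource(LegacyRequest request) {
        this.request = request;
    }

    @Override
    public String getParameterForName(String name) {
        return request.getParameterValue(name);
    }
}

// Test implementation: returns a canned value for every name.
class FakeParameterSource implements ParameterSource {
    String value;

    @Override
    public String getParameterForName(String name) {
        return value;
    }
}

class PageFilter {
    // Before the refactoring the parameter type was LegacyRequest.
    boolean isAdminPage(ParameterSource source) {
        return "admin".equals(source.getParameterForName("section"));
    }
}

class AdaptParameterDemo {
    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("section", "admin");
        ParameterSource source = new RequestParameterSource(new LegacyRequest(params));
        System.out.println(new PageFilter().isAdminPage(source));
    }
}
```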
- Break Out Method Object
Separate a long method into a method object of its own. Local variables within the method can then become instance variables that can be accessed externally.
- Definition Completion
This technique is specific to C and C++ but may be adapted. The definitions are included from the header file and completed with test implementations. Test code must be kept strictly separate to avoid clashes at link time.
- Encapsulate Global Reference
It is a truth universally acknowledged that global variables are bad. Gathering those globals into a class and reaching them through an instance of that class is one way to deal with them.
- Expose Static Method
Move some code out of a method and into a static method where it is accessible to testing. This is an intermediate technique that allows us to get some tests in place. Then bolder refactorings can be used to make the code right.
- Extract and Override Call
If there is just one method call that is getting in the way of introducing tests we can extract and override it. This is used often and can be achieved using the Extract Method refactoring.
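A hedged Java sketch of Extract and Override Call follows. `ScreenMetrics`, `PageLayout` and `TestingPageLayout` are hypothetical names; the troublesome static call is moved into a small protected method, which a testing subclass then overrides.

```java
// Hypothetical helper that cannot be called from a test environment.
class ScreenMetrics {
    static int currentWidth() {
        throw new IllegalStateException("no display attached");
    }
}

class PageLayout {
    // Number of lines needed to show the text at the current screen width.
    int lineCount(String text) {
        // Before the refactoring this line read:
        //   int width = ScreenMetrics.currentWidth();
        int width = screenWidth();
        return (text.length() + width - 1) / width; // ceiling division
    }

    // The extracted call. Production behaviour is unchanged.
    protected int screenWidth() {
        return ScreenMetrics.currentWidth();
    }
}

// Testing subclass: overrides only the extracted call.
class TestingPageLayout extends PageLayout {
    @Override
    protected int screenWidth() {
        return 10;
    }
}

class ExtractAndOverrideCallDemo {
    public static void main(String[] args) {
        System.out.println(new TestingPageLayout().lineCount("abcdefghijklmnopqrstuvwxy"));
    }
}
```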
- Extract and Override Factory Method
Object creation within constructors can be a nuisance. Extracting that code into a factory method where it can be overridden is a powerful technique if the language will allow it. It’s not possible in C++ because a virtual function called from a constructor does not dispatch to the overrides in its subclasses.
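In Java the technique could be sketched as below. `Connection`, `RealConnection`, `CustomerRepository` and `TestingCustomerRepository` are illustrative names only. Note the usual caveat: the overriding factory method runs before the subclass’s own fields are initialised, so it should not depend on them.

```java
// Hypothetical collaborator created inside a constructor.
interface Connection {
    String query(String sql);
}

class RealConnection implements Connection {
    RealConnection() {
        throw new IllegalStateException("database not reachable in this sketch");
    }

    @Override
    public String query(String sql) {
        return "";
    }
}

class CustomerRepository {
    private final Connection connection;

    CustomerRepository() {
        // Before the refactoring: this.connection = new RealConnection();
        this.connection = makeConnection();
    }

    // Extracted factory method. Java dispatches this call to a subclass
    // override even though it is made from the constructor.
    protected Connection makeConnection() {
        return new RealConnection();
    }

    String nameOf(int id) {
        return connection.query("SELECT name FROM customer WHERE id = " + id);
    }
}

// Testing subclass: supplies a stub connection that echoes the query.
class TestingCustomerRepository extends CustomerRepository {
    @Override
    protected Connection makeConnection() {
        return sql -> "stub:" + sql;
    }
}

class FactoryMethodDemo {
    public static void main(String[] args) {
        System.out.println(new TestingCustomerRepository().nameOf(7));
    }
}
```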
- Extract and Override Getter
Where object creation is causing problems it can be encapsulated in a getter that can then be overridden in a testing subclass. This can be used to deal with object creation within constructors for C++ programmers.
- Extract Implementer
Extracting an interface is a powerful method, but naming can be a problem without an automated refactoring tool. In that case you might want to extract all of the implementation into a subclass and turn the class into an interface. This is a tricky refactoring and thankfully Java developers don’t need to use it much.
- Extract Interface
Programming to interfaces is the first principle of reusable object oriented design. Where this principle has not been applied it is easily and safely introduced with a simple refactoring.
- Introduce Instance Delegator
Static methods can be useful, but they can cause grief with static cling if the method’s behaviour is difficult to achieve in test. Creating an instance method that acts as a proxy can solve this problem.
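One possible Java sketch of Introduce Instance Delegator is shown here. `AuditLog`, `AccountCloser` and `RecordingAuditLog` are hypothetical. The static method stays where it is; an instance method that forwards to it gives callers something that can be overridden in a test.

```java
import java.util.ArrayList;
import java.util.List;

class AuditLog {
    // Existing static method that is hard to exercise in a test.
    static void write(String entry) {
        throw new IllegalStateException("audit store unavailable");
    }

    // Instance delegator: simply forwards to the static method.
    void record(String entry) {
        write(entry);
    }
}

class AccountCloser {
    // Before the refactoring this method called AuditLog.write(...) directly.
    void close(AuditLog log, String accountId) {
        log.record("closed " + accountId);
    }
}

// Test double: overrides the delegator and remembers the entries.
class RecordingAuditLog extends AuditLog {
    final List<String> entries = new ArrayList<>();

    @Override
    void record(String entry) {
        entries.add(entry);
    }
}

class InstanceDelegatorDemo {
    public static void main(String[] args) {
        RecordingAuditLog log = new RecordingAuditLog();
        new AccountCloser().close(log, "ACC-9");
        System.out.println(log.entries);
    }
}
```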
- Introduce Static Setter
Some consider singletons to be an anti-pattern. If a singleton is causing you problems then introducing a static setter will allow you to fake its behaviour. Be careful, though, because you will have to make the constructor non-private.
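The following Java sketch shows roughly what this could look like. `Settings`, `Greeter`, `FakeSettings` and the method name `setTestingInstance` are assumptions for the example, and the lazy initialisation is deliberately simple rather than thread-safe.

```java
class Settings {
    private static Settings instance;
    private final String region;

    // Was private before the refactoring; relaxed so a test subclass can exist.
    protected Settings(String region) {
        this.region = region;
    }

    // Simple lazy initialisation, not thread-safe; enough for a sketch.
    static Settings getInstance() {
        if (instance == null) {
            instance = new Settings("production");
        }
        return instance;
    }

    // The static setter: lets a test swap in a fake, or pass null to reset.
    static void setTestingInstance(Settings replacement) {
        instance = replacement;
    }

    String region() {
        return region;
    }
}

// Code that reaches for the singleton directly.
class Greeter {
    String greet() {
        return "Hello from " + Settings.getInstance().region();
    }
}

class FakeSettings extends Settings {
    FakeSettings() {
        super("test");
    }
}

class StaticSetterDemo {
    public static void main(String[] args) {
        Settings.setTestingInstance(new FakeSettings());
        System.out.println(new Greeter().greet());
        Settings.setTestingInstance(null); // restore default behaviour
    }
}
```

Resetting the instance at the end of each test matters, since otherwise the fake leaks into whatever runs next.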
- Link Substitution
Some poor deprived programmers are still working in a procedural language and are denied the pleasure of polymorphism. An alternative technique is to replace one function with another through link substitution. Alternative definitions are placed in include files that can be switched at build time.
- Parameterize Constructor
An easy way to avoid object construction within a constructor is to pass the object in as a parameter. One small concern is that this can widen dependencies across the system.
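As a sketch under stated assumptions, Parameterize Constructor in Java might look like this. `TimeSource`, `SystemTimeSource` and `SessionTimer` are hypothetical names. Keeping the original no-argument constructor and having it delegate to the new one means existing callers do not need to change.

```java
// Hypothetical dependency that used to be created inside the constructor.
interface TimeSource {
    long now();
}

class SystemTimeSource implements TimeSource {
    @Override
    public long now() {
        return System.currentTimeMillis();
    }
}

class SessionTimer {
    private final TimeSource timeSource;
    private final long startedAt;

    // Original signature, kept so that existing callers compile unchanged.
    SessionTimer() {
        this(new SystemTimeSource());
    }

    // New constructor: the dependency is passed in rather than created here.
    SessionTimer(TimeSource timeSource) {
        this.timeSource = timeSource;
        this.startedAt = timeSource.now();
    }

    long elapsedMillis() {
        return timeSource.now() - startedAt;
    }
}

class ParameterizeConstructorDemo {
    public static void main(String[] args) {
        long[] fakeTime = {1000L};
        SessionTimer timer = new SessionTimer(() -> fakeTime[0]);
        fakeTime[0] = 1500L; // move the fake clock forward
        System.out.println(timer.elapsedMillis());
    }
}
```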
- Parameterize Method
Object construction within methods can also be a problem. Again, passing the object in as a parameter can help.
- Primitivize Parameter
Sometimes we have to make things worse before we can make them better, especially when dealing with monster dependencies within a complex domain model. This technique requires us to develop new functionality in a free function that takes primitive parameters and then have the domain object delegate to that function. Not desirable, but sometimes necessary.
- Pull Up Feature
Move methods into an abstract superclass so that they can be accessed from a subclass for testing.
- Push Down Dependency
Move problematic dependencies into a subclass and out of the way.
- Replace Function with Global Pointer
Used in procedural languages to introduce a Link Seam.
- Replace Global Reference with Getter
Global variables are a real nuisance when introducing tests. Encapsulating the global with a getter method allows a test subclass to override the getter and return something more suitable.
- Subclass and Override Method
Problematic dependencies can be avoided by creating a subclass and overriding the troublesome method.
- Supersede Instance Variable
Used in C++, where virtual function calls made from a constructor do not dispatch to subclass overrides.
- Template Redefinition
Used in languages like C++ to exploit generic templates.
- Text Redefinition
Used in interpreted languages like Ruby that allow methods to be redefined at runtime.