
Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.


A question that we are often asked is: “What is the difference between Acceptance Test Driven Development (ATDD) and Test Driven Development (TDD)?” These two activities are related by name but otherwise seem to have little to do with each other. 

ATDD is a whole-team practice where the team members discuss a requirement and come to an agreement about the acceptance criteria for that requirement. Through the process of accurately specifying the acceptance criteria -- the acceptance test -- the team fleshes out the requirement, discovering and corroborating the various assumptions made by the team members and identifying and answering the various questions that, unanswered, would prevent the team from implementing or testing the system correctly.

The word acceptance is used in a wide sense here:

  • The customer agrees that if the system, which the team is about to implement, fulfills the acceptance criteria then the work was done properly
  • The developers accept the responsibility for implementing the system
  • The testers accept the responsibility for testing the system

This is a human-oriented interaction that focuses on the customer, identifying their needs. These needs are specified using the external, public interfaces of the system. 

TDD, on the other hand, is a developer-oriented activity designed to assist the developers in writing the code by strict analysis of the requirements and the establishment of functional boundaries, work-flows, significant values, and initial states. TDD tests are written in the developer’s language and are not designed to be read by the customers. These tests can use the public interfaces of the system, but are also used to test internal design elements.

We often see the developers take the tests written through the ATDD process and implement them with a unit testing framework.

Requirements from the customer

Before we continue, we need to ask ourselves -- what is a requirement? It is something that the customer needs the system to do. But who is the customer? 

In truth, every system has more than one customer... dozens at times:

  • Stakeholders
  • End users, of different types
  • Operators
  • Administrators (DB, network, user, storage)
  • Support (field, customer, technical)
  • Sales, marketing, legal, training
  • QA and developers (e.g., traces and logs, simulators for QA)
  • etc...

All requirements coming from all of these different customers must be addressed, identified and expressed through the ATDD process. For example:

  • The legal department needs an End User License Agreement (EULA) to be displayed when the software is first run, and for the end user to check off the agreement before the system can be used.  This is of no interest to the end users (who we sometimes think of as ‘the customers’), and in fact might be an annoyance to them, but it is required for the system to be acceptable to the lawyers.
  • The production support team needs all error messages in the system to be accompanied by error codes that can be reported along with the condition that caused the error.  Here again, end users are not interested in these codes, but they can be crucial for the system to be acceptably supported.

And let us not forget that the developers are customers too; who else do we build tracers and loggers for? This is an obvious, publicly visible facet of the developer’s work. But when do we need these facilities? When we try to fix bugs. When we want to understand how the system works. When we work on the system for any reason.

In other words, when we do maintenance on the system.

Maintainability is a requirement

We need our maintenance to be as easy as possible. No car owner would like to disassemble the car’s engine just to change a windshield wiper; nor would they want to worry that by changing a tire they have damaged the car’s entertainment system. 

Maintainability is a crucial requirement for any software system. Software system maintenance should be fast, safe and predictable. You should be able to make a change fast, without breaking anything, and you need to be able to tell me reliably how long it will take. We expect this of our car mechanic as well as our software developer. So although maintainability is primarily the concern of the developer it definitely affects the non-technical customers. 

The way maintainability manifests itself in software is through design. Design principles are to developers as mathematics is to physicists. It’s the basis of everything that we do. If we do not pay attention to the system’s design as it is developed, it will quickly become unmanageable.

How often, however, have you seen “maintainability” as a requirement? We’ve never seen it. We call it the “hidden requirement.” It’s always there but no one talks about it. And because we don't talk about it, we forget about it; we focus on fulfilling the written requirements, thinking that we will be done when we complete them. And very quickly, the system becomes very hard and unsafe to change.  We are accumulating technical debt, which we could just call “the silent killer.”

If maintainability is such a crucial requirement, where are the acceptance criteria for it? Who is the customer for this requirement? The development team.

We need to prove to the customer that the design was implemented as conceived, and that this design is in fact maintainable: that the correct abstractions exist, that object factories do what they are supposed to, that the functional units operate the way they should, etc.

Indeed there is something that we can do, in the developers’ own language -- code -- that does precisely these things. It’s TDD. 

One key purpose of TDD is to prove and document the design of the system, hence proving and documenting its maintainability.

TDD is developer-facing ATDD

ATDD is about the acceptability of the system to its various customers.  When the specific customer is the development team then the tests are about the acceptability of the system’s design and resulting maintainability.  Our focus in this work is acceptability in this sense: is the design acceptable?  Is our domain understanding sufficient and correct?  Have we asked enough questions, and were they the right ones?  A system that fails to meet these acceptance criteria will quickly become too expensive to maintain and thus will fail to meet the needs of those who use it.

Software that fails to meet a need is worthless.  It dies.  So, here again, failing to pass the “maintainability” acceptance criteria is the silent killer.  TDD is the answer to this ailment. 

Note to readers: This was a philosophical treatise. Specific, practical examples abound and will constitute much of our work here, so, read on.

Posted on: February 12, 2021 02:21 AM

The Importance of Test Failure

The typical process of Test-Driven Development goes something like this:

  1. Write a test that expresses one required behavior of the system.
  2. Create just enough production code (a “stub”) to allow the test to compile, and fail.
  3. Run the test and watch it fail (red).
  4. Modify the production code just enough to allow the test to pass.
  5. Run the test and watch it pass (green).
  6. Refactor the production code for quality, running the tests as you do so.
  7. Return to Step 1 until all required behaviors are implemented (aka: rinse, repeat).
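
As a minimal sketch of steps 1 through 5, assume a hypothetical PriceCalculator with a bulk-discount rule (the names and the 10% rule are invented for illustration). In real work the stub would be edited in place; here the stub and the finished code sit side by side so that the same test can be seen going from red to green:

```java
// Hypothetical example: PriceCalculator and the discount rule are assumptions.
class RedGreenDemo {
    // Step 1: one test expressing one required behavior:
    // orders of 10 or more units get a 10% discount.
    static boolean bulkOrderGetsDiscount(PriceCalculator calculator) {
        return calculator.totalCents(10, 200) == 1800;
    }

    public static void main(String[] args) {
        // Step 3: run against the stub and watch it fail (red).
        System.out.println("stub passes? "
                + bulkOrderGetsDiscount(new StubCalculator()));
        // Step 5: run against the implementation and watch it pass (green).
        System.out.println("real passes? "
                + bulkOrderGetsDiscount(new DiscountingCalculator()));
    }
}

interface PriceCalculator {
    int totalCents(int quantity, int unitPriceCents);
}

// Step 2: just enough production code to let the test compile.
class StubCalculator implements PriceCalculator {
    public int totalCents(int quantity, int unitPriceCents) {
        return 0;
    }
}

// Step 4: just enough logic to make the test pass.
class DiscountingCalculator implements PriceCalculator {
    public int totalCents(int quantity, int unitPriceCents) {
        int total = quantity * unitPriceCents;
        return quantity >= 10 ? total * 90 / 100 : total;
    }
}
```

The test itself never changes between the two runs; only the production code does.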

There are variations (we’ll suggest a few in another blog), but this is fairly representative. We create process steps like this to guide us, but also to create agreement across the team about what we’re going to do and when we’re going to do it.  Sometimes, however, it seems unnecessary to follow every step every time, rigidly, when we’re doing something that appears to be simple or trivial. 

In particular, developers new to TDD often skip step #3 (Run the test and watch it fail) when it appears completely obvious that the test is going to fail.  The attitude here is often something like “it’s silly to run a test I absolutely know is going to fail, just so I can say I followed the process.  I’m not a robot, I’m a smart, thinking person, and the steps are just a guideline anyway.  Running a test that’s obviously going to fail is a waste of my time.”

In general, we agree with the sentiment that we don’t want to blindly follow process steps without thinking.  We are also very pragmatic about the value of developer time, and agree that it should not be wasted on meaningless activities.

However, it is absolutely crucial that every test is run in the failure mode before implementation code is created to make it pass.  Every time, always, no exceptions. Why do we say this?

First of all, this step really does need to be habitual.  We don’t want to have to decide each and every time whether to run the test or not, as this decision-making itself takes time.  If it’s a habit, it becomes like breathing in and out; we do it, but we don’t think about it. [1]

Frankly, running the tests should not feel like a big burden anyway -- if it is, then we suspect the tests are running too slowly, and that’s a problem in and of itself.  We may not be managing dependencies adequately, or the entity we’re testing may have excessive coupling to other entities in our design, etc...[2]  The pain of slow tests is an indicator that we’re making mistakes, and if we avoid the pain we don’t fix the mistakes. Once the tests start feeling “heavy”, we won’t run them as often, and the TDD process will start to gradually collapse. 

Running the tests can never have zero cost, but we want the cost to be so low (in terms of time) that we treat it as zero.  Good TDD practitioners will sometimes run their tests just because, at the moment, they are not sure what to do next.  When in doubt, we run the tests.  Why not? 

And let’s acknowledge that writing tests takes effort.  We want all our effort to be paid back, otherwise it is waste.  One place where a test repays us for writing it is whenever it fails.  If we’re working on a system and suddenly a test fails, one thing we will surely think is “whoa, we’re glad we wrote that test”... because it just helped us to avoid making a mistake. 

If... it can fail

It is actually very easy (everyone does it eventually) to accidentally write a test which in truth can never fail under any circumstances.  This is very bad.  This is worse than no test at all.  A test which can never fail gives us confidence we don’t deserve, makes us think we’ve clearly specified something about the system when we have not, and will provide no regression coverage when we later need to refactor or enhance the system. Watching the test fail, even once, proves that it can fail and thus has value. 
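
One way this can happen, sketched with a hypothetical Counter class (all names here are assumptions made for illustration): the test compares the result with itself rather than with an independently known value. Run against a stub that does nothing, the vacuous test still reports success, while the meaningful one fails as it should:

```java
// Hypothetical example: Counter and both "tests" are illustrative assumptions.
class VacuousTestDemo {
    // Can never fail: it compares the observed value with itself.
    static boolean vacuousTest(Counter counter) {
        counter.increment();
        int actual = counter.value();
        return actual == counter.value();
    }

    // Can fail: it compares the observed value with a known expectation.
    static boolean meaningfulTest(Counter counter) {
        counter.increment();
        return counter.value() == 1;
    }

    public static void main(String[] args) {
        // increment() is still an empty stub, so a valid test must be red.
        System.out.println("vacuous test passes?    " + vacuousTest(new Counter()));
        System.out.println("meaningful test passes? " + meaningfulTest(new Counter()));
    }
}

// Stub production code: increment() does nothing yet.
class Counter {
    private int count = 0;

    void increment() {
        // not implemented yet
    }

    int value() {
        return count;
    }
}
```

Only by running both against the stub do we learn that the first one is worthless.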

Furthermore, the “surprise” passing of a test can often be a source of useful information. When a test passes unexpectedly we now must stop and investigate why this has happened.  There are multiple possibilities:

  • We’ve written a test that cannot fail, as we said.  The test is therefore badly written.
  • This test is a duplicate of another test already written.  We don’t want that.
  • We got lucky; something in our language or framework already does what we want, we just didn’t know that or we forgot.  We were about to write code we don’t need.  This would be waste.

So: test failure validates that the test is meaningful and unique, and it also confirms that the code we’re about to write is useful and necessary.  For such a simple thing, it provides an awful lot of value. 

Also, the practice should really be to run all the tests in step 3, and observe that only the test we just wrote is failing.  Similarly, we should really run all the tests in step 5 (Run the test and watch it pass), and observe that all the tests are now green; that the only change was that the test we just wrote went from red to green. 

Running all the tests gives us a level of confidence about what we’ve done that simply cannot be replaced by any other kind of certainty.  We may think “I know this change could not possibly affect anything else in the system”, but there is nothing like seeing the tests all pass to give us complete certainty.  When we have this certainty, we will move faster because we know we have a safety net, and our energy will remain at a relatively high level throughout the day, which will also speed up the development process.

Confidence is a very rare coin in software development.  Anything that offers confidence to us is something we want to adhere to.  In TDD we are constantly offered moments of confirmation:

  • The test failing confirms the validity of the test.
  • The test passing confirms the validity of the system.
  • The other tests also passing confirms that we have no hidden coupling.
  • The entire suite passing during refactoring confirms that we are, in fact, refactoring.

Always observe the failing test before you write the code.  You’ll be glad you did, and if you don’t you will certainly, eventually, wish you had.   

And, finally, here is a critical concept that will help you remember all of this:

In TDD, success is not getting to green.  Success is the transition from red to green.

Therefore, without seeing the red we cannot succeed.


[1] And we apologize for the fact that you are now conscious about your breathing.  It’ll pass.

[2] We will dig into the various aspects of TDD and its relationship to design and code quality in another blog.  For now, we’ll just stipulate the correlation.

Posted on: February 12, 2021 02:09 AM

Mock Objects, Part 1

Narrow Specifications

When writing tests as specifications, we strive to create a very narrow focus in each individual test.  We want each test to make a single, unique distinction about the system.  This creates a clear specification when we later read the test, and also creates the maximum value when the test fails.

If a test makes a single distinction, then when it fails we know what the specific problem is.

If a test makes a unique distinction, then any particular problem will not cause multiple tests to fail at the same time.

Unfortunately, systems are created by coupling entities together (the coupling is, in effect, the system) and thus we often have various objects and systems which are present and operating (we say “in scope”) at the time a test is running, but which are not part of what the test is specifying.

We say it this way:

A given test will test everything which is in scope but which is not under the control of the test.

If we only wish to test/specify one, single and unique thing, then everything else which is in scope must somehow be brought under the control of the test.  This can turn out to be lots of things:

  • The system clock
  • Random numbers
  • The graphical user interface
  • The file system
  • The database
  • The network
  • Other objects
  • Sharable libraries
  • Other systems we depend on
  • Hardware
  • Etc…

If the behavior we are testing/specifying has dependencies on any of these “other” things then we must take control of them in the test so that the only thing we are focusing on is the one thing we are not controlling. Mocks are a big part of solving this problem.
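
Taking the first item on the list, the system clock, as an example: in Java the standard java.time.Clock class can be handed to the code under test, and Clock.fixed() lets the test decide what time it is. The ShiftChecker class and its 06:00 to 18:00 rule below are assumptions made purely for illustration:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

class ClockControlDemo {
    public static void main(String[] args) {
        // The test, not the wall clock, decides what time it is.
        Clock eightAm = Clock.fixed(Instant.parse("2021-02-12T08:00:00Z"), ZoneOffset.UTC);
        Clock eightPm = Clock.fixed(Instant.parse("2021-02-12T20:00:00Z"), ZoneOffset.UTC);
        System.out.println("08:00 is day shift? " + new ShiftChecker(eightAm).isDayShift());
        System.out.println("20:00 is day shift? " + new ShiftChecker(eightPm).isDayShift());
    }
}

// Hypothetical class under test; the day-shift hours are an assumption.
class ShiftChecker {
    private final Clock clock;

    // Production code would pass Clock.systemDefaultZone() here.
    ShiftChecker(Clock clock) {
        this.clock = clock;
    }

    boolean isDayShift() {
        int hour = LocalTime.now(clock).getHour();
        return hour >= 6 && hour < 18;
    }
}
```

Without the injected clock, this behavior could only be verified at certain times of day, which is exactly the kind of dependency the test needs to control.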

An Analogy

Sometimes the best way to understand something well, and to retain that understanding, is to find an analogy to something we already understand.

Let’s say we were not making software, but manufacturing cars.  We would have testing to do, certainly, including the crash-worthiness of the vehicles we were planning to sell to the public.  A car, one could say, has an operational dependency: a driver.  However, we don’t want to test the car’s crash-worthiness with an actual driver in the driver’s seat!  We’ll likely kill the poor fellow.  So, we replace the driver with one of these:

[Figure: A Crash Test Dummy]


This, of course, is a crash test dummy.  This is a good analogy for a mock object for two reasons.

First, we “insert” the crash test dummy into the car, because we do not want to test the driver but rather the car.   This allows the tester to “control the driver” in various ways.  Mocks are used in this way.

Second, there are different kinds of crash test dummies, of different levels of complexity, depending on what we need from them.  This is also true of mocks.

  1. Sometimes we just need something of the proper weight to be present in the driver's seat so that the test is realistic.  For this purpose, we might just use a sandbag, or a simple block of wood of the right weight.  Sometimes our mock objects are like this; simply dead “nothing” objects that act as inert placeholders.
  2. Other times we need to conduct various test scenarios with the same crash test dummy.  Perhaps one where the driver’s hands are at “10 and 2” on the steering wheel, then another where one hand is on the wheel while the other is on the stick shift, then yet another where the dummy is taking the place of a passenger with its feet up on the dash, or one sitting in the backseat, or facing backwards in a car seat, etc....  For these kinds of tests we would need an articulated dummy that can be put into different positions for these different scenarios.  We do this with mocks too, if needed, and when we do we say the mock is “conditionable”.
  3. Finally, sometimes we would need to know the lethality of a crash scenario, and thus need to measure what happened to the crash test dummy (and thus what would have happened to an actual person in the same crash).  For this, we would put various sensors in the dummy; perhaps a pressure plate in the chest, shock sensors on all the limbs, an accelerometer in the head, etc… These sensors would all measure these various effects during the crash, and record them into a central titanium-clad storage unit buried deep in the dummy.  After the crash is over the testers could plug into the storage unit and download the data to perform an analysis of the effects of the crash.  We also can do this with mocks, and when we do we say the mock is “inspectable”.

The amount of sophistication and complexity in our mocks needs to be kept at a minimum, as we are not going to test them (they are, in fact, a part of the test, and will be validated by initial failure).  If all we need is a block of wood, then that’s all we’re going to use.
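
The “block of wood” case might look like the following sketch. The Thermostat and AuditLog names are hypothetical, chosen only to show a dependency that must be present but plays no part in the behavior being specified:

```java
// Hypothetical example: AuditLog and Thermostat are illustrative assumptions.
class InertMockDemo {
    public static void main(String[] args) {
        // The log is required by the constructor but irrelevant to this behavior.
        Thermostat thermostat = new Thermostat(new InertAuditLog());
        thermostat.setTarget(21);
        System.out.println("target = " + thermostat.target());
    }
}

interface AuditLog {
    void record(String message);
}

// The "block of wood": present, the right shape, and does nothing.
class InertAuditLog implements AuditLog {
    public void record(String message) {
        // intentionally empty
    }
}

class Thermostat {
    private final AuditLog log;
    private int target;

    Thermostat(AuditLog log) {
        this.log = log;
    }

    void setTarget(int degrees) {
        target = degrees;
        log.record("target set to " + degrees);
    }

    int target() {
        return target;
    }
}
```

Nothing in InertAuditLog can be conditioned or inspected, and nothing needs to be.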

An Example of Software Mocking

Let’s say we’re writing software to automate the movement of a tractor.  Farm equipment today is often highly sophisticated, including microprocessors, touchscreens, GPS, wireless internet connections via cell towers, and so forth.  One behavior that is needed is “turning a red warning light on when the tractor gets too close to the edge of the planting area”.  In specifying this "Boundary Alarm" we would have dependencies on two aspects of the tractor hardware: the GPS unit that tells us our location at any given point in time and the physical dashboard warning light that we want to turn on.

If we had interfaces already for these two hardware points, the design would likely look something like this: a BoundaryAlarm class that depends on a GPS interface and a DashLight interface.



CheckFieldLocation() is what we want to specify/test.  If the GPS reports that our location is too close to the edge of the planting area, then the BoundaryAlarm object should call ActivateDashLight() on the DashLight interface.

We are not testing the GPS, nor are we testing the light (not that we would not ever, but we are not at this moment, in this test).  These things are in scope, however, and so we must bring them under the control of the test.  Since we are fortunate enough in this case that these dependencies are currently represented by interfaces, we can easily create mocks as implementations of these interfaces.  Also, we got lucky in that the BoundaryAlarm constructor takes implementations of GPS and DashLight, allowing us to easily inject our mocks.

We will obviously have to address situations where we don’t have these advantages, and we shall.  But for now let’s just focus on what the mocks do and how they do it, then we’ll examine various techniques for creating and injecting them.

Mock Implementations


Note that the mocks have additional methods added to them that are not defined in the interfaces they mock:

  • In the case of MockGPS, we added SetCurrentLocation().  The real GPS, of course, gets this location by measuring signals from the satellites that orbit the earth.  Our MockGPS requires no satellites and in fact does nothing other than return whatever we tell it to.  This is an example of making the mock “conditionable”.
  • In the case of MockDashLight we have added Boolean values (which start at false) to track whether the two methods ActivateDashLight() and DeactivateDashLight() were called or not.  We’ve also added two methods which simply return these values, namely DashLightActivated() and DashLightDeactivated().  This is an example of making the mock "inspectable".

These extra methods for conditioning and inspecting the mocks will not be visible to the BoundaryAlarm object because the mock instances will be implicitly up-cast when they are passed into its constructor (assuming a strongly-typed language).  This is essentially encapsulation by casting.

Let’s look at some pseudocode:

public class BoundaryAlarmTest {
    public void TestBoundaryAlarmActivatesDashLightWhenNeeded() {
        // Setup: the test holds the mocks by their concrete types so that it
        // can reach the conditioning and inspection methods
        MockGPS mockGPS = new MockGPS();
        MockDashLight mockDashLight = new MockDashLight();
        BoundaryAlarm testBoundaryAlarm =
             new BoundaryAlarm(mockGPS, mockDashLight);
        Location goodLocation = // location inside planting area
        Location badLocation = // location in danger of leaving

        // Trigger lower boundary

        // Verify lower boundary

        // Trigger upper boundary

        // Verify upper boundary
    }
}

public class MockGPS implements GPS {
    private Location testLocation;

    // Defined by the GPS interface
    public Location GetCurrentLocation() {
        return testLocation;
    }

    // Conditioning method, visible only to the test
    public void SetCurrentLocation(Location aLocation) {
        testLocation = aLocation;
    }
}

public class MockDashLight implements DashLight {
    private boolean activateGotCalled = false;
    private boolean deactivateGotCalled = false;

    // Defined by the DashLight interface
    public void ActivateDashLight() {
        activateGotCalled = true;
    }
    public void DeactivateDashLight() {
        deactivateGotCalled = true;
    }

    // Inspection methods, visible only to the test
    public boolean DashLightActivated() {
        return activateGotCalled;
    }
    public boolean DashLightDeactivated() {
        return deactivateGotCalled;
    }
}

public class BoundaryAlarm {
    public BoundaryAlarm(GPS aGPS, DashLight aDashLight) {}
    public void CheckFieldLocation() {}
}

(Obviously we have skipped over what a Location is and how that works.  One can easily imagine it might contain latitude and longitude members, something like that)

The test will obviously fail since CheckFieldLocation() does nothing at all… so we watch it fail, which validates the test, and only then we put in the logic that causes BoundaryAlarm to turn on the light when it should.  The test drives the development of the behavior.
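
To make the pseudocode concrete, here is one possible runnable Java version. It is a sketch, not the actual implementation: it assumes that a Location holds x/y coordinates in meters, that the planting area is a 100 m square, and that “too close” means within 5 m of an edge. Method names follow the pseudocode rather than Java’s usual camelCase so that they can be matched to the text:

```java
// Sketch only: Location's shape, the field size, and the margin are assumptions.
class BoundaryAlarmTest {
    static boolean lightActivatedAt(Location location) {
        MockGPS mockGPS = new MockGPS();                   // conditionable
        MockDashLight mockDashLight = new MockDashLight(); // inspectable
        BoundaryAlarm alarm = new BoundaryAlarm(mockGPS, mockDashLight);
        mockGPS.SetCurrentLocation(location);
        alarm.CheckFieldLocation();
        return mockDashLight.DashLightActivated();
    }

    public static void main(String[] args) {
        System.out.println("near edge: " + lightActivatedAt(new Location(98, 50)));
        System.out.println("mid field: " + lightActivatedAt(new Location(50, 50)));
    }
}

class Location {
    final double x, y;
    Location(double x, double y) { this.x = x; this.y = y; }
}

interface GPS { Location GetCurrentLocation(); }

interface DashLight {
    void ActivateDashLight();
    void DeactivateDashLight();
}

class MockGPS implements GPS {
    private Location testLocation;
    public Location GetCurrentLocation() { return testLocation; }
    public void SetCurrentLocation(Location aLocation) { testLocation = aLocation; }
}

class MockDashLight implements DashLight {
    private boolean activateGotCalled = false;
    private boolean deactivateGotCalled = false;
    public void ActivateDashLight() { activateGotCalled = true; }
    public void DeactivateDashLight() { deactivateGotCalled = true; }
    public boolean DashLightActivated() { return activateGotCalled; }
    public boolean DashLightDeactivated() { return deactivateGotCalled; }
}

class BoundaryAlarm {
    private static final double FIELD_SIZE = 100.0;
    private static final double MARGIN = 5.0;
    private final GPS gps;
    private final DashLight dashLight;

    BoundaryAlarm(GPS aGPS, DashLight aDashLight) {
        gps = aGPS;
        dashLight = aDashLight;
    }

    // With an empty body the "near edge" check is red; this logic turns it green.
    void CheckFieldLocation() {
        Location here = gps.GetCurrentLocation();
        double nearestEdge = Math.min(Math.min(here.x, FIELD_SIZE - here.x),
                                      Math.min(here.y, FIELD_SIZE - here.y));
        if (nearestEdge < MARGIN) {
            dashLight.ActivateDashLight();
        }
    }
}
```

Emptying the body of CheckFieldLocation() and re-running is a quick way to see the red state this test is meant to start from.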

One thing we need to point out here is that, as simple and straightforward as this is, we’ve actually gone a bit too far.  There is nothing in our test that has anything to do with when and how the dashboard light should be deactivated.  In fact, we might not (at this point) even know the rules about this.  Our mock, therefore, really should not contain any capability regarding the DeactivateDashLight() method; at least, not yet.  We never want to make mocks more complicated than necessary, and we also have no failing test to prove that this part of the mock is valid and accurate.  This is all we should do for now:

[Figure: Minimal Mocking]
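
In code, the trimmed-down mock might look like this sketch. It assumes the DashLight interface declares both methods, so the mock must still implement DeactivateDashLight(), but it does so as an inert placeholder with no tracking behind it:

```java
class MinimalMockDemo {
    public static void main(String[] args) {
        MockDashLight mock = new MockDashLight();
        System.out.println("before: " + mock.DashLightActivated());
        mock.ActivateDashLight();
        System.out.println("after:  " + mock.DashLightActivated());
    }
}

// Assumed shape of the interface, based on the two methods named in the text.
interface DashLight {
    void ActivateDashLight();
    void DeactivateDashLight();
}

class MockDashLight implements DashLight {
    private boolean activateGotCalled = false;

    public void ActivateDashLight() {
        activateGotCalled = true;
    }

    // Required by the interface but deliberately inert:
    // no test specifies deactivation yet.
    public void DeactivateDashLight() {
    }

    public boolean DashLightActivated() {
        return activateGotCalled;
    }
}
```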

Adding more capability to this mock later, or even creating another mock for the purpose of specifying the deactivation of the light, will not be hard to do.  What is hard is keeping track of capabilities we may build in anticipation of a need, when we do not know if and when that need will arrive.  Also, if this “just in case” capability is actually wrong or non-functional, we can easily fail to notice this or lose track of it.

Whenever we add anything to our test suite, whether it be a mock or a test or an assertion or whatever, we want to see a failing test as soon as possible to prove the validity of that thing.  Remember a test is only valid if it can fail and, specifically, if it fails for the reason we intend in writing it.

...To Be Continued...



Posted on: February 12, 2021 02:02 AM

Mock Objects, Part 2


There are many ways to create a mock object by hand.  You will likely come up with your own techniques, which may make use of language elements and idioms made possible by the particular languages and frameworks you work with.  It is important to know more than one technique because under various circumstances during the development process we are able and unable to change different things.  Also, we may be dependent on the work of other teams or organizations who might not have created an ideal situation for us if we seek to write the kinds of tests this book is about.

For example, let’s say the group who implemented the GPS system did not create a separate interface for their implementation, but instead just created a concrete API object that gives us access to the hardware:

[Figure: No Interface for the GPS dependency]

Perhaps the DashLight implementation is part of our responsibility, and so we’ve opted to create a separate interface which makes mocking easy.  The GPS team might not have (note the GPS object above is a concrete implementation which we are directly dependent upon).  To test our object (BoundaryAlarm) we must bring the GPS under the control of the test.

I suppose we could take our testing laptop out into the field, hook it up to the real global positioning hardware, and physically move the system around the field, running the test in different locations.  At some point, no doubt, such testing will take place.  But remember, in TDD we are not really testing, we are specifying, and the spec should be useful and runnable at any point in time, under any circumstances.  So, we must mock the GPS class.

Let’s start simply:

[Figure: Direct Inheritance]

Here we have simply sub-classed the “real” GPS class.   As was the case when GPS was an interface, our mock will up-cast and appear (to BoundaryAlarm) to be the real thing.  The extra method SetCurrentLocation() will be available to the test, for conditioning purposes, because the test will hold the reference to the mock in a down-cast to its actual type.
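
A sketch of this arrangement in Java follows. The concrete GPS class is a stand-in for the other team’s code (here it simply throws, to show that the real method is never reached), and the Location fields are an assumption. Java methods are virtual by default, so the override takes effect even through the up-cast reference:

```java
class DirectInheritanceDemo {
    public static void main(String[] args) {
        MockGPS mockGPS = new MockGPS();   // the test keeps the mock's actual type
        mockGPS.SetCurrentLocation(new Location(3, 4));

        GPS seenByAlarm = mockGPS;         // implicit up-cast, as BoundaryAlarm sees it
        Location reported = seenByAlarm.GetCurrentLocation();
        System.out.println(reported.x + ", " + reported.y);
    }
}

class Location {
    final double x, y;
    Location(double x, double y) { this.x = x; this.y = y; }
}

// Stand-in for the concrete class owned by the GPS team.
class GPS {
    public Location GetCurrentLocation() {
        throw new IllegalStateException("GPS hardware not available");
    }
}

class MockGPS extends GPS {
    private Location testLocation;

    @Override
    public Location GetCurrentLocation() {
        return testLocation;
    }

    // Conditioning method: reachable only through a MockGPS reference.
    public void SetCurrentLocation(Location aLocation) {
        testLocation = aLocation;
    }
}
```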

But this technique may not work, or may produce negative effects.

  • If GPS is not able to be sub-classed (it is final, or sealed, or whatever your language would call a class that cannot be sub-classed), then this is obviously not possible.
  • If the language you are using has both virtual and non-virtual methods, the GetCurrentLocation() method must be a virtual method, otherwise the up-cast of the mock will cause the original method to actually be called, rather than the mock’s method, and the test will not work.
  • In sub-classing, you create strong coupling between the mock and the original class.  One effect of this is that when the test creates an instance of the mock (using new MockGPS()), the base class (GPS) is initialized as well: its constructor and field initializers run as part of constructing the mock.  This must happen, as the original implementation methods are available to the sub-class (via base or super or a similar mechanism, or by direct access).  If merely constructing a GPS is a disadvantage (it slows down the test, or requires that the actual hardware must be present when the test runs, etc…) then sub-classing like this is something we don’t want.
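
The last point is easy to demonstrate. In this sketch the counter in GPS is a hypothetical stand-in for whatever expensive work the real constructor might do (opening a hardware connection, for instance):

```java
class BaseConstructorDemo {
    public static void main(String[] args) {
        new MockGPS();
        System.out.println("hardware connections opened: " + GPS.connectionsOpened);
    }
}

// Stand-in for the real class; the counter represents an expensive handshake.
class GPS {
    static int connectionsOpened = 0;

    GPS() {
        connectionsOpened++;
    }
}

class MockGPS extends GPS {
    // No explicit constructor: the compiler inserts a call to super(),
    // so the GPS constructor still runs whenever a mock is created.
}
```

The mock never asks for the base-class behavior, yet pays for it every time a test instantiates it.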

Obviously if we could change the GPS class, we’d opt to create that separate interface and eliminate all of these problems.  But what if we cannot?

We could wrap the GPS class in an adapter (we’ll leave DashLight out of the discussion from this point forward):

[Figure: Wrapping the dependency]

GPSWrapper is our class, which we created just for this purpose, so we could mock it.  Clearly we would have to change BoundaryAlarm in this case, as it no longer directly depends on GPS but rather on the GPSWrapper, and these two are not interchangeable.  But, if we can change BoundaryAlarm and cannot change GPS, then this technique is appropriate.

Note that the adapter and the “real” GPS have the same method name in this example.  This is for simplicity; we could have called the method in the adapter anything we chose.  If the team implementing the real GPS chose a method name we disliked (maybe we think it is unclear, or overly generic) this is also an opportunity to change this, and to make our code more readable.

Note also that the adapter is kept extremely simple; we are not going to be able to test it in TDD (we will do so in integration testing, but those are not run frequently) and so we really don’t want it to have any kind of complex behavior that might fail.  We’ll probably do something like this:

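public class GPSWrapper {
    private GPS myGPS;

    public GPSWrapper() {
        myGPS = new GPS();
    }

    // Must be overridable by the mock (the default in Java; declare it
    // virtual in languages such as C#)
    public Location GetCurrentLocation() {
        return myGPS.GetCurrentLocation();
    }
}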

There’s very little to this, and that’s intentional.  Whenever we create objects like this (to wrap any dependency, like the database, the UI, the network, whatever) we call them “periphery objects” because they live on the boundary between the system we are developing and are responsible for, and other systems that we seek to control by mocking.  We obviously cannot write TDD-style tests for these objects, and so we always keep them as minimal as possible.

This solves most of the problems of direct inheritance.  Even if GPS is sealed and has non-virtual methods, our wrapper does not have these problems since we can create it any way we like.  However, note that the inheritance from the mock to the wrapper still means that the real wrapper will be instantiated at test time, and since the wrapper creates an instance of the original GPS object we may still find our test is too slow, or can only be run in the presence of the hardware, etc…  If this is a problem, we can take this one step further, and create an interface for wrapping:

[Figure: Interface for Wrapping]

This eliminates the inheritance coupling between the mock and the implementation entirely, and thus we create no instance of the actual wrapper implementation (GPSWrapperImp) at test time.  Our tests will run fast, be completely repeatable, and will not require the actual GPS system to run.
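
A sketch of this final arrangement in Java is shown below. The GPS class is a stand-in for the one we cannot change (made final here to show that this no longer matters), and its instance counter is a hypothetical device used only to show that no real GPS is created when the mock is used:

```java
class InterfaceWrappingDemo {
    public static void main(String[] args) {
        MockGPSWrapper mock = new MockGPSWrapper();
        mock.SetCurrentLocation(new Location(1, 2));
        GPSWrapper seenByAlarm = mock;   // BoundaryAlarm depends only on the interface
        System.out.println("x = " + seenByAlarm.GetCurrentLocation().x);
        System.out.println("real GPS objects created: " + GPS.instancesCreated);
    }
}

class Location {
    final double x, y;
    Location(double x, double y) { this.x = x; this.y = y; }
}

// Stand-in for the class we cannot change.
final class GPS {
    static int instancesCreated = 0;
    GPS() { instancesCreated++; }
    Location GetCurrentLocation() { return new Location(0, 0); }
}

interface GPSWrapper {
    Location GetCurrentLocation();
}

// Periphery object: kept trivial because TDD-style tests cannot cover it.
class GPSWrapperImp implements GPSWrapper {
    private final GPS myGPS = new GPS();
    public Location GetCurrentLocation() { return myGPS.GetCurrentLocation(); }
}

// Shares only the interface with the real wrapper, so nothing real is built.
class MockGPSWrapper implements GPSWrapper {
    private Location testLocation;
    public Location GetCurrentLocation() { return testLocation; }
    public void SetCurrentLocation(Location aLocation) { testLocation = aLocation; }
}
```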

These are all techniques for creating mocks by hand.  Another approach is to use a mocking framework, a tool that creates these mocks for you.  We’ll examine an example of such a tool a bit later on, and also discuss what we like and dislike about the use of such tools.  In any case, whether you automate your mocks or write them by hand, every developer should know how to handcraft them, how they work, and what they do.

Next, in part 3, we’ll deal with different ways of injecting the mock into the class under test.  In these examples the issue was simple: the constructor of BoundaryAlarm takes its two dependencies as parameters, allowing the test to send in the mock instead of the actual object.  But what if we didn’t have this?  We need more techniques, and we’ll examine a few.




Posted on: February 12, 2021 01:51 AM | Permalink | Comments (0)

Mock Objects, Part 3

Dependency Injection

Imagine that the example used previously was implemented differently:

public class BoundaryAlarm {
     private GPS myGPS;
     private DashLight myDashLight;

     public BoundaryAlarm(){
           // Dependencies are built here, so a test cannot substitute mocks
           myGPS = new GPSImpl();
           myDashLight = new DashLightImpl();
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }
}

The difference is that the implementations of the GPS and DashLight dependencies (namely GPSImpl and DashLightImpl) are created in the constructor of BoundaryAlarm directly.  No matter what technique we use to create mock implementations of these dependencies, it does not matter because the mocks will never be used.

Remember, we are forced to test everything which is in scope but which is not under the control of the test.  If we write tests for BoundaryAlarm in its current state, we will also be testing the objects (and the hardware) it depends upon.  We will have a test that can fail for many different reasons, and thus when it fails we will have to investigate to find out which of those reasons is the culprit.  This will waste our time and will drastically reduce the value of the testing effort.

We need, somehow, to inject the dependencies into the class we’re testing, so that at test time we can inject the mocks instead, putting the test in control of the dependencies (leaving only BoundaryAlarm’s implementation as the thing being tested).

Let’s note that the example above really contains a bad bit of design.  The BoundaryAlarm class is really two different things: it is the consumer (Client) of the GPS and DashLight services, and the creator of them as well.  Any given entity in a system should have one of these relationships, not both: the client of a service couples to the way the service is used, but not necessarily its concrete type, whereas the creator of an object always couples to its concrete type.  Here we feel the pain of this bad design decision in the difficulty we’re having in trying to test just what we wish to specify, and no more.

Here again this is a question of technique, and here again we could probably come up with any number of ways of solving this problem.  We need to know more than one way because the context of our problems will be different at times, and a given technique that worked well in one situation may be totally inappropriate in another.  We’ll look at four different ones here.

1. Direct Injection

This is basically what we did initially: we allowed the dependency to be handed in via the constructor.  (From this point forward we’ll use a single dependency, GPS, for brevity.)  We could, alternatively, have provided a setter:

public class BoundaryAlarm {
     private GPS myGPS;

     public BoundaryAlarm(){
           myGPS = new GPSImpl();
     }

     // Lets a test replace the default implementation with a mock
     public void setGPS(GPS aGPS) {
           myGPS = aGPS;
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }
}

This makes it possible for the test to inject the mock, but note that the setter does not remove the offending new GPSImpl() code from BoundaryAlarm.  If we go completely back to the previous approach:

public class BoundaryAlarm {
     private GPS myGPS;

     // The dependency is handed in; nothing is created here
     public BoundaryAlarm(GPS aGPS){
           myGPS = aGPS;
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }
}

Then BoundaryAlarm is a client of the GPS service, but is not also the creator of it.  The problem, of course, is that any entity (not just the test) can hand in any implementation of the GPS service, and this could represent a security problem in some domains.  Some unscrupulous individual could create an implementation that did something unwanted or illegal, and we’ve left ourselves open.  It’s hard to imagine such an implementation of a GPS unit, so here direct injection might be just fine… but imagine a banking application with dependencies on a funds transfer subsystem, and the possibilities become alarming.
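To make constructor injection concrete from the test’s point of view, here is a small sketch in Java.  The getLatitude() method, the NORTH_LIMIT boundary, and the isAlarmOn() query are assumptions of ours, invented so that the example has some behavior to check; they are not part of the original BoundaryAlarm.

```java
// Hypothetical, cut-down GPS service.
interface GPS {
    double getLatitude();
}

// Hand-written mock that reports whatever latitude the test gives it.
class GPSMock implements GPS {
    private final double latitude;

    GPSMock(double latitude) {
        this.latitude = latitude;
    }

    public double getLatitude() {
        return latitude;
    }
}

class BoundaryAlarm {
    private static final double NORTH_LIMIT = 45.0; // hypothetical field boundary
    private final GPS myGPS;
    private boolean alarmOn;

    // The dependency is handed in; BoundaryAlarm never creates it.
    BoundaryAlarm(GPS aGPS) {
        myGPS = aGPS;
    }

    void checkFieldLocation() {
        alarmOn = myGPS.getLatitude() > NORTH_LIMIT;
    }

    boolean isAlarmOn() {
        return alarmOn;
    }
}

class DirectInjectionDemo {
    public static void main(String[] args) {
        // The test decides which GPS the alarm sees.
        BoundaryAlarm alarm = new BoundaryAlarm(new GPSMock(46.0));
        alarm.checkFieldLocation();
        System.out.println("alarm on: " + alarm.isAlarmOn());
    }
}
```

Note that nothing stops production code from calling the same constructor with any GPS it likes, which is exactly the openness discussed above.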

So, another way to go is to use a design pattern: Service Locator

2. The Service Locator Pattern

The service locator approach basically says that it’s better not to have a client object create its own service object(s), but rather to delegate this creation to a factory or registry object.  The implementation of such a factory could vary widely based on the nature of the service object(s) being created, whether there is variation in their types or construction, whether there are multiple clients with different application contexts, etc…

In our case, there is only one client and only one service version, and so we’d implement the locator simply:

public class BoundaryAlarm {
     private GPS myGPS;

     public BoundaryAlarm(){
           // Creation is delegated to the locator
           myGPS = GPSLocator.Getinstance().GetGPS();
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }
}

class GPSLocator {
     private static GPSLocator _instance = new GPSLocator();
     private GPS theGPS;

     private GPSLocator(){
           theGPS = new GPSImpl();
     }

     public static GPSLocator Getinstance() {
           return _instance;
     }

     public GPS GetGPS() {
           return theGPS;
     }

     // For testing
     public void setGPS(GPS aGPS) {
           theGPS = aGPS;
     }

     public void resetGPS() {
           theGPS = new GPSImpl();
     }
}

Note that the service locator (GPSLocator) is a singleton [1].  This is important because we need to ensure that the instance of the locator being used is the same one that the test will interact with.  Also note the “for testing” methods that allow the test to change the implementation that the locator returns to something beneficial for testing (a mock, in this case) and then reset the locator to work in its normal way when the test concludes.

Of course, this could represent a similar security threat as with direct dependency injection, and so most likely these additional methods would be removed when the code was shipped.  The advantage here is that client objects tend to grow in number whereas locators and other forms of factory objects are much less likely to do so, and so we’ve reduced the amount of maintenance we have to do.
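Here is a sketch, in Java, of how a test might use the “for testing” methods on the locator.  As before, getLatitude(), NORTH_LIMIT, and isAlarmOn() are hypothetical details we have added so there is something to verify.  The one point we want to show is the try/finally, which resets the locator even if the test body throws.

```java
interface GPS {
    double getLatitude();
}

// Stand-in for the real implementation, which needs the hardware.
class GPSImpl implements GPS {
    public double getLatitude() {
        throw new IllegalStateException("GPS hardware not available");
    }
}

class GPSMock implements GPS {
    private final double latitude;

    GPSMock(double latitude) {
        this.latitude = latitude;
    }

    public double getLatitude() {
        return latitude;
    }
}

class GPSLocator {
    private static final GPSLocator INSTANCE = new GPSLocator();
    private GPS theGPS = new GPSImpl();

    private GPSLocator() {
    }

    static GPSLocator getInstance() {
        return INSTANCE;
    }

    GPS getGPS() {
        return theGPS;
    }

    // For testing
    void setGPS(GPS aGPS) {
        theGPS = aGPS;
    }

    void resetGPS() {
        theGPS = new GPSImpl();
    }
}

class BoundaryAlarm {
    private static final double NORTH_LIMIT = 45.0; // hypothetical field boundary
    private final GPS myGPS = GPSLocator.getInstance().getGPS();
    private boolean alarmOn;

    void checkFieldLocation() {
        alarmOn = myGPS.getLatitude() > NORTH_LIMIT;
    }

    boolean isAlarmOn() {
        return alarmOn;
    }
}

class LocatorDemo {
    // Runs one check with the mock in place, then restores the locator.
    static boolean alarmFor(double latitude) {
        GPSLocator locator = GPSLocator.getInstance();
        locator.setGPS(new GPSMock(latitude));
        try {
            BoundaryAlarm alarm = new BoundaryAlarm();
            alarm.checkFieldLocation();
            return alarm.isAlarmOn();
        } finally {
            locator.resetGPS(); // always put the normal implementation back
        }
    }

    public static void main(String[] args) {
        System.out.println("alarm on: " + alarmFor(46.0));
    }
}
```

Since the locator is shared by every test in the run, forgetting the reset would let one test’s mock leak into the next.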

3. Dependency Injection Frameworks

As with the creation of mocks, injecting them can also be done with automation.  A dependency injection framework (or DIF) can essentially provide a service locator for you.

There are many different tools for doing this, and as we do not want to focus overmuch on tools (TDD is not about tools per se) we will simply give a single example.  We are not advocating for this particular tool; it’s just one we’ve seen used fairly frequently.  It is called StructureMap [2], and it happens to be a .NET tool.  There are plenty of such tools for Java and other languages/frameworks.

First, we change the BoundaryAlarm code:

public class BoundaryAlarm {
     private GPS myGPS;

     public BoundaryAlarm(){
           // Ask the framework for whatever is mapped to the GPS type
           myGPS = ObjectFactory.GetInstance<GPS>();
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }
}

ObjectFactory is a class that is provided by StructureMap.  For it to work, we need to bind it to a resource that maps a referenced type to a real type.  StructureMap can actually map to many different kinds of resources, but we’ll choose XML here as it makes for an easy-to-understand example.

// In StructureMap.config




     ConnectionString="...." />


All this does is ensure that every time the type “GPS” is specified, ObjectFactory will return an instance of GPSImpl.  Now, in the test we do this:


public void testBoundaryAlarm() {
     GPSMock mock = new GPSMock();

     // Tell the framework to hand out the mock wherever GPS is requested
     ObjectFactory.InjectStub(typeof(GPS), mock);

     BoundaryAlarm testAlarm = new BoundaryAlarm();

     // The body of the test, then
}


Again, this is just an example using this particular tool.  The advantage here over writing your own service locators is that these tools typically have a way of disabling the injection (disabling the InjectStub() method in this case) before the code is shipped, which further reduces code maintenance while not leaving the “door open” for miscreants in the future. 
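To take some of the mystery out of what such a tool does for you, here is a deliberately tiny, hand-rolled container sketched in Java.  It is not StructureMap and does not reproduce its API; TinyContainer, register(), injectStub(), and clearStubs() are names we invented, and the GPS details are the same hypothetical ones used for illustration elsewhere.  Real frameworks do a great deal more (configuration files, lifecycles, constructor wiring).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface GPS {
    double getLatitude();
}

class GPSImpl implements GPS {
    public double getLatitude() {
        throw new IllegalStateException("GPS hardware not available");
    }
}

class GPSMock implements GPS {
    private final double latitude;

    GPSMock(double latitude) {
        this.latitude = latitude;
    }

    public double getLatitude() {
        return latitude;
    }
}

// Hypothetical toy container: maps a requested type to a factory or a stub.
class TinyContainer {
    private static final Map<Class<?>, Supplier<?>> defaults = new HashMap<>();
    private static final Map<Class<?>, Object> stubs = new HashMap<>();

    static <T> void register(Class<T> type, Supplier<? extends T> factory) {
        defaults.put(type, factory);
    }

    static <T> void injectStub(Class<T> type, T stub) {
        stubs.put(type, stub);
    }

    static void clearStubs() {
        stubs.clear();
    }

    static <T> T getInstance(Class<T> type) {
        Object stub = stubs.get(type);
        if (stub != null) {
            return type.cast(stub); // an injected stub wins over the default
        }
        Supplier<?> factory = defaults.get(type);
        if (factory == null) {
            throw new IllegalStateException("No mapping for " + type.getName());
        }
        return type.cast(factory.get());
    }
}

class BoundaryAlarm {
    private static final double NORTH_LIMIT = 45.0; // hypothetical field boundary
    private final GPS myGPS = TinyContainer.getInstance(GPS.class);
    private boolean alarmOn;

    void checkFieldLocation() {
        alarmOn = myGPS.getLatitude() > NORTH_LIMIT;
    }

    boolean isAlarmOn() {
        return alarmOn;
    }
}

class ContainerDemo {
    public static void main(String[] args) {
        TinyContainer.register(GPS.class, GPSImpl::new);          // normal mapping
        TinyContainer.injectStub(GPS.class, new GPSMock(46.0));   // test override
        try {
            BoundaryAlarm alarm = new BoundaryAlarm();
            alarm.checkFieldLocation();
            System.out.println("alarm on: " + alarm.isAlarmOn());
        } finally {
            TinyContainer.clearStubs();
        }
    }
}
```

Seen this way, the framework really is a service locator with the type-to-implementation mapping pulled out into configuration.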

4. Endo Testing [3]

Our first three techniques all centered upon the idea of “building the dependency elsewhere”. In direct dependency injection, it is built by the client/test.  When using a service locator, it is built by the locator.  In using a DIF, the framework tool creates it.  Endo testing is a way to avoid creating this separation while still allowing the dependency to be injected.

Techniques are good to know about, but beware the overuse of any code tricks you happen to know.  Remember that good separation is probably a good idea anyway, and just because you know how to do something does not mean you should.

To start, we change BoundaryAlarm in a different way: 

public class BoundaryAlarm {
     private GPS myGPS;

     public BoundaryAlarm(){
           myGPS = MakeGPS();
     }

     public void CheckFieldLocation(){
           // Implementation unchanged
     }

     // Creation is isolated here so that a subclass can override it
     protected virtual GPS MakeGPS() {
           return new GPSImpl();
     }
}


We have not gone to the extent of creating a new entity to manage the instantiation of the GPS dependency; we’ve simply done it in its own method.  This would be a very simple refactor of the original code, something most IDEs would do for you as part of their built-in refactoring tool suite.

However, note that the method is both protected and virtual (marked that way in a language like C#, or that way by default in a language like Java).  That’s the trick: it allows the test of BoundaryAlarm to actually create a subclass to be tested.  Ideally, this will be a private, inner class of the test:


public class BoundaryAlarmTest {
     // Static because a C# nested class has no implicit reference
     // to the enclosing test instance
     private static GPSMock mock;

     public void initializeTest() {
           mock = new GPSMock();
     }

     public void testBoundaryAlarmRespondsToGPS() {
           BoundaryAlarm testAlarm = new TestableBoundaryAlarm();

           // The body of the test, conditioning the mock as needed
     }

     private class TestableBoundaryAlarm : BoundaryAlarm {
           protected override GPS MakeGPS() {
                 return mock;
           }
     }
}


The trick here is that the test creates a subclass of the class being tested (TestableBoundaryAlarm subclasses BoundaryAlarm) but the only thing it overrides is the method that builds the dependency, causing it to build the mock instead.  The test is essentially “reaching inside” the class under test to make this one single change.
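For readers working in Java, where methods are overridable by default, the same trick might be sketched like this.  The getLatitude() method, NORTH_LIMIT, and isAlarmOn() are again hypothetical additions of ours.  We keep the mock in a static field so that it is already set when the base class constructor calls makeGPS().

```java
interface GPS {
    double getLatitude();
}

class GPSImpl implements GPS {
    public double getLatitude() {
        throw new IllegalStateException("GPS hardware not available");
    }
}

class GPSMock implements GPS {
    private final double latitude;

    GPSMock(double latitude) {
        this.latitude = latitude;
    }

    public double getLatitude() {
        return latitude;
    }
}

class BoundaryAlarm {
    private static final double NORTH_LIMIT = 45.0; // hypothetical field boundary
    private final GPS myGPS;
    private boolean alarmOn;

    BoundaryAlarm() {
        myGPS = makeGPS();
    }

    // The one seam the test is allowed to reach into.
    protected GPS makeGPS() {
        return new GPSImpl();
    }

    void checkFieldLocation() {
        alarmOn = myGPS.getLatitude() > NORTH_LIMIT;
    }

    boolean isAlarmOn() {
        return alarmOn;
    }
}

class BoundaryAlarmTest {
    // Static so it is available while the superclass constructor runs.
    private static GPSMock mock;

    // Overrides only the factory method; everything else is the real class.
    private static class TestableBoundaryAlarm extends BoundaryAlarm {
        @Override
        protected GPS makeGPS() {
            return mock;
        }
    }

    static boolean alarmFor(double latitude) {
        mock = new GPSMock(latitude);
        BoundaryAlarm alarm = new TestableBoundaryAlarm();
        alarm.checkFieldLocation();
        return alarm.isAlarmOn();
    }

    public static void main(String[] args) {
        System.out.println("alarm on: " + alarmFor(46.0));
    }
}
```

Calling an overridable method from a constructor is usually something to avoid, since the subclass is not fully initialized at that point; it works here only because the override touches nothing but a static field.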

This trick can be used to solve other problems involving dependencies.  For example, we often develop code that couples directly to system/framework entities such as the date/time API, a network socket, the system console, random number generation, and the like.  We can wrap access to simple entities in local methods, and then override those methods in the test.

Let’s take the system console.  If you’re writing directly to it, it’s very difficult for the test to see what you’ve written, and specify what it should be.  You could create a mock of the class that represents the console, and perhaps you would.  But, if it was a simple issue and making a mock seemed like overkill, you could wrap the console in a local method.

public class MessageSender {
     public void SendMessage() {
           Output("Ground control to Major Tom.");
     }

     // Wraps the console so that a test subclass can override it
     protected virtual void Output(String message) {
           Console.WriteLine(message);
     }
}

Now, in our test, we simply subclass MessageSender and override the Output method to write the message into a local String, or keep a log of all writes, or whatever is most convenient to the test.  You could do the same with a network connection, or an HTTP request, or… whatever.
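A sketch of such a test subclass, in Java, might look like the following.  RecordingMessageSender and its written list are hypothetical names of our own; the log-of-all-writes option is the one shown.

```java
import java.util.ArrayList;
import java.util.List;

class MessageSender {
    void sendMessage() {
        output("Ground control to Major Tom.");
    }

    // Wraps the console so that a subclass can intercept the write.
    protected void output(String message) {
        System.out.println(message);
    }
}

// Test-side subclass: records what would have gone to the console.
class RecordingMessageSender extends MessageSender {
    final List<String> written = new ArrayList<>();

    @Override
    protected void output(String message) {
        written.add(message);
    }
}

class MessageSenderDemo {
    public static void main(String[] args) {
        RecordingMessageSender sender = new RecordingMessageSender();
        sender.sendMessage();
        System.out.println("captured " + sender.written.size() + " message(s)");
    }
}
```

The test can now assert on the contents of written instead of trying to read the console.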


All systems are built using dependencies between entities and yet these entities must be tested in isolation from one another, in order to create narrow, precise specifications of their behaviors.  The ability to create and inject mocks of other entities when testing a given entity that is dependent on them is crucial in TDD.  The more techniques we learn, and the more we innovate as technology changes, the more efficient we can be in breaking these dependencies.   

One thing we will want to examine is the role of Design Patterns in all of this.  If a pattern is (as we believe) a collection of best practices for solving problems that recur, surely the pattern should include best practices for testing it.  Once we figure out how to test, say, the Chain of Responsibility (where the mock object goes, what it does, etc…) we see no reason to ever have to figure it out again.

So, stay tuned for that discussion!

[1] Unfamiliar with the Singleton Pattern?  See:

[2] And you can get it here:

[3] This technique was originally suggested by Alex Chaffee and Bill Pietri at IBM’s Developerworks: see


Posted on: February 11, 2021 08:49 AM | Permalink | Comments (0)