
Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.


Dependency Injection

Imagine that the example used previously was implemented differently:

public class BoundaryAlarm {
    private GPS myGPS;
    private DashLight myDashLight;

    public BoundaryAlarm() {
        myGPS = new GPSImpl();
        myDashLight = new DashLightImpl();
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }
}

The difference is that the implementations of the GPS and DashLight dependencies (namely GPSImpl and DashLightImpl) are created directly in the constructor of BoundaryAlarm. No matter what technique we use to create mock implementations of these dependencies, the mocks will never be used.

Remember, we are forced to test everything that is in scope but not under the control of the test. If we write tests for BoundaryAlarm in its current state, we will also be testing the objects (and the hardware) it depends upon. We will have a test that can fail for many different reasons, and thus when it fails we will have to investigate to find out which of those reasons is the culprit. This will waste our time and will drastically reduce the value of the testing effort.

We need, somehow, to inject the dependencies into the class we’re testing, so that at test time we can inject the mocks instead, putting the test in control of the dependencies (leaving only BoundaryAlarm’s implementation as the thing being tested).

Let's note that the example above really contains a bad bit of design. The BoundaryAlarm class is really two different things: it is the consumer (Client) of the GPS and DashLight services, and the creator of them as well. Any given entity in a system should have one of these relationships, not both: the client of a service couples to the way the service is used, but not necessarily its concrete type, whereas the creator of an object always couples to its concrete type. Here we feel the pain of this bad design decision in the difficulty we're having in trying to test just what we wish to specify, and no more.

Here again this is a question of technique, and here again we could probably come up with any number of ways of solving this problem. We need to know more than one way because the context of our problems will differ at times, and a given technique that worked well in one situation may be totally inappropriate in another. We'll look at four different ones here.

1. Direct Injection

This is basically what we did initially; we allowed the dependency to be handed in via the constructor (and, by the way, from this point forward we'll just use a single dependency for brevity -- GPS). We could, alternately, have provided a setter:

public class BoundaryAlarm {
    private GPS myGPS;

    public BoundaryAlarm() {
        myGPS = new GPSImpl();
    }

    public void setGPS(GPS aGPS) {
        myGPS = aGPS;
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }
}

This makes it possible for the test to inject the mock, but note that the setter does not remove the offending new GPSImpl() code from BoundaryAlarm. If we go completely back to the previous approach:

public class BoundaryAlarm {
    private GPS myGPS;

    public BoundaryAlarm(GPS aGPS) {
        myGPS = aGPS;
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }
}

Then BoundaryAlarm is a client of the GPS service, but is not also the creator of it. The problem, of course, is that any entity (not just the test) can hand in any implementation of the GPS service, and this could represent a security problem in some domains. Some unscrupulous individual could create an implementation that did something unwanted or illegal, and we've left ourselves open. It's hard to imagine such an implementation of a GPS unit, so here direct injection might be just fine... but imagine a banking application with dependencies on a funds transfer subsystem, and the possibilities become alarming.
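To make the injection concrete, here is a sketch of the test side of this technique (GPSMock is the mock type used later in this post; the test-method name is ours):

[TestMethod]
public void testBoundaryAlarmWithInjectedMock() {
    GPSMock mock = new GPSMock();                      // conditionable, inspectable stand-in for the GPS
    BoundaryAlarm testAlarm = new BoundaryAlarm(mock); // the test, not BoundaryAlarm, creates the dependency

    // The body of the test: condition the mock, call
    // testAlarm.CheckFieldLocation(), and assert against the mock
}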

So, another way to go is to use a design pattern: the Service Locator.

2. The Service Locator Pattern

The service locator approach basically says that it’s better not to have a client object create its own service object(s), but rather to delegate this creation to a factory or registry object.  The implementation of such a factory could vary widely based on the nature of the service object(s) being created, whether there is variation in their types or construction, whether there are multiple clients with different application contexts, etc…

In our case, there is only one client and only one service version, and so we’d implement the locator simply:

public class BoundaryAlarm {
    private GPS myGPS;

    public BoundaryAlarm() {
        myGPS = GPSLocator.GetInstance().GetGPS();
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }
}

class GPSLocator {
    private static GPSLocator _instance = new GPSLocator();
    private GPS theGPS;

    private GPSLocator() {
        theGPS = new GPSImpl();
    }

    public static GPSLocator GetInstance() {
        return _instance;
    }

    public GPS GetGPS() {
        return theGPS;
    }

    // For testing
    public void setGPS(GPS aGPS) {
        theGPS = aGPS;
    }

    public void resetGPS() {
        theGPS = new GPSImpl();
    }
}

Note that the service locator is a singleton [1]. This is important because we need to ensure that the instance of the locator being used is the same one that the test will interact with. Also note the "for testing" methods that allow the test to change the implementation that the locator returns to something beneficial for testing (a mock, in this case) and then reset the locator to work in its normal way when the test concludes.

Of course, this could represent a similar security threat as with direct dependency injection, and so most likely these additional methods would be removed when the code was shipped.  The advantage here is that client objects tend to grow in number whereas locators and other forms of factory objects are much less likely to do so, and so we’ve reduced the amount of maintenance we have to do.
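One way to get that removal for free is conditional compilation. A sketch, assuming C#; most languages have some analogous mechanism:

class GPSLocator {
    // ... construction, GetInstance() and GetGPS() as above ...

#if DEBUG
    // These members exist only in debug builds, so the shipped code
    // leaves no injection "door" open.
    public void setGPS(GPS aGPS) {
        theGPS = aGPS;
    }

    public void resetGPS() {
        theGPS = new GPSImpl();
    }
#endif
}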

3. Dependency Injection Frameworks

As with the creation of mocks, injecting them can also be done with automation.  A dependency injection framework (or DIF) can essentially provide a service locator for you.

There are many different tools for doing this, and as we do not want to focus overmuch on tools (TDD is not about tools per se) we will simply give a single example.  We are not advocating for this particular tool, it’s just one we’ve seen used fairly frequently.  It is called Structure Map [2], and it happens to be a .Net tool.  There are plenty of such tools for Java and other languages/frameworks.

First, we change the BoundaryAlarm code:

public class BoundaryAlarm {
    private GPS myGPS;

    public BoundaryAlarm() {
        myGPS = ObjectFactory.GetInstance<GPS>();
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }
}

ObjectFactory is a class that is provided by Structure Map. For it to work, we need to bind to a resource that will map a referenced type to a real type. Structure Map can actually map to many different kinds of resources, but we'll choose XML here as it makes for an easy-to-understand example.

// In StructureMap.config

<DefaultInstance
    PluginType="GPS"
    PluggedType="GPSImpl"
    ConnectionString="...." />

All this does is ensure that every time the type GPS is specified, ObjectFactory will return an instance of GPSImpl. Now, in the test we do this:

[TestMethod]
public void testBoundaryAlarm() {
    GPSMock mock = new GPSMock();
    ObjectFactory.InjectStub(typeof(GPS), mock);
    BoundaryAlarm testAlarm = new BoundaryAlarm();
    //The body of the test, then
    ObjectFactory.ResetDefault();
}

Again, this is just an example using this particular tool.  The advantage here over writing your own service locators is that these tools typically have a way of disabling the injection (disabling the InjectStub() method in this case) before the code is shipped, which further reduces code maintenance while not leaving the “door open” for miscreants in the future. 

4. Endo Testing [3]

Our first three techniques all centered upon the idea of “building the dependency elsewhere”. In direct dependency injection, it is built by the client/test.  When using a service locator, it is built by the locator.  In using a DIF, the framework tool creates it.  Endo testing is a way to avoid creating this separation while still allowing the dependency to be injected.

Techniques are good to know about, but beware the overuse of any code tricks you happen to know.  Remember that good separation is probably a good idea anyway, and just because you know how to do something does not mean you should.

To start, we change BoundaryAlarm in a different way: 

public class BoundaryAlarm {
    private GPS myGPS;

    public BoundaryAlarm() {
        myGPS = MakeGPS();
    }

    public void CheckFieldLocation() {
        // Implementation unchanged
    }

    protected virtual GPS MakeGPS() {
        return new GPSImpl();
    }
}

We have not gone to the extent of creating a new entity to manage the instantiation of the GPS dependency; we've simply done it in its own method. This would be a very simple refactor of the original code, something most IDEs would do for you as part of their built-in refactoring tool suite.

However, note that the method is both protected and virtual (marked that way in a language like C#, or that way by default in a language like Java).  That’s the trick: it allows the test of BoundaryAlarm to actually create a subclass to be tested.  Ideally, this will be a private, inner class of the test:

[TestClass]
public class BoundaryAlarmTest {
    private static GPSMock mock; // static so the nested class below can reach it

    [TestInitialize]
    public void initializeTest() {
        mock = new GPSMock();
    }

    [TestMethod]
    public void testBoundaryAlarmRespondsToGPS() {
        BoundaryAlarm testAlarm = new TestableBoundaryAlarm();
        //The body of the test, conditioning the mock
        //as needed
    }

    private class TestableBoundaryAlarm : BoundaryAlarm {
        protected override GPS MakeGPS() {
            return mock;
        }
    }
}
The trick here is that the test creates a subclass of the class being tested (TestableBoundaryAlarm subclasses BoundaryAlarm), but the only thing it overrides is the method that builds the dependency, causing it to build the mock instead. The test is essentially "reaching inside" the class under test to make this one single change.

This trick can be used to solve other problems involving dependencies.  For example, we often develop code that couples directly to system/frameworks entities such as the date/time API, a network socket, the system console, random number generation, and the like.  We can wrap access to simple entities in local methods, and then override those methods in the test.

Let’s take the system console.  If you’re writing directly to it, it’s very difficult for the test to see what you’ve written, and specify what it should be.  You could create a mock of the class that represents the console, and perhaps you would.  But, if it was a simple issue and making a mock seemed like overkill, you could wrap the console in a local method.

public class MessageSender {
    public void SendMessage() {
        Output("Ground control to Major Tom.");
    }

    protected virtual void Output(string message) {
        Console.WriteLine(message);
    }
}

Now, in our test, we simply subclass MessageSender and override the Output method to write the message into a local string, keep a log of all writes, or whatever is most convenient to the test. You could do the same with a network connection, or an HTTP request, or... whatever.
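A sketch of that test-side subclass (TestableMessageSender and lastMessage are our names):

[TestClass]
public class MessageSenderTest {
    [TestMethod]
    public void testSendMessageWritesGroundControl() {
        TestableMessageSender sender = new TestableMessageSender();

        sender.SendMessage();

        Assert.AreEqual("Ground control to Major Tom.", sender.lastMessage);
    }

    // Overrides only the console access, capturing the message for inspection
    private class TestableMessageSender : MessageSender {
        public string lastMessage;

        protected override void Output(string message) {
            lastMessage = message;
        }
    }
}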

Conclusions 

All systems are built using dependencies between entities and yet these entities must be tested in isolation from one another, in order to create narrow, precise specifications of their behaviors.  The ability to create and inject mocks of other entities when testing a given entity that is dependent on them is crucial in TDD.  The more techniques we learn, and the more we innovate as technology changes, the more efficient we can be in breaking these dependencies.   

One thing we will want to examine is the role of Design Patterns in all of this.  If a pattern is (as we believe) a collection of best practices for solving problems that recur, surely the pattern should include best practices for testing it.  Once we figure out how to test, say, the Chain of Responsibility (where the mock object goes, what it does, etc…) we see no reason to ever have to figure it out again.

So, stay tuned for that discussion!


[1] Unfamiliar with the Singleton Pattern?  See: https://www.pmi.org/disciplined-agile/the-design-patterns-repository/the-singleton-pattern

[2] And you can get it here: http://docs.structuremap.net/

[3] This technique was originally suggested by Alex Chaffee and Bill Pietri at IBM’s Developerworks: see http://www.ibm.com/developerworks/library/j-mocktest/index.html



Posted on: February 11, 2021 08:49 AM

Testing the Chain of Responsibility, Part 1

The Chain of Responsibility pattern (hereafter CoR) is one of the original "Gang of Four" patterns. We're assuming you know this pattern already, but if not you might want to read about it first at the Design Patterns Repository.

Here's the UML as it appears in the Gang of Four book [1]: a Handler abstraction with a successor link to the next Handler in the chain, and ConcreteHandlers that either handle a request or pass it along to their successor.

In testing this pattern, we have a number of behaviors to specify.

Individual Handler Behavior

  1. That a given handler will choose to act (elect) when it should
  2. That a given handler will not elect when it shouldn't
  3. That upon acting, the given handler will perform its function correctly


Chain Traversal Behavior

  1. That a handler which elects itself will not delegate to the next handler
  2. That a handler which does not elect itself will delegate to the next handler, and will hand it the parameter(s)  unchanged
  3. That a handler which does not elect will “hand up” (return) any result returned to it without changing the result


Chain Composition Behavior

  1. The chain is made up of the right handlers
  2. The handlers are given “a chance” in the right order


The first two sets of behaviors can be specified using a single, simple mock object.

The same mock can be used to test each handler.  If the mock is conditionable and inspectable, it can be used to test all five scenarios.

Example mock code (pseudo-code) [2]:

class Mock: TargetAbstraction {
    private bool wasCalled = false;
    private par passedParam;
    private ret returnValue;

    public ret m(par param) {
        passedParam = param;
        wasCalled = true;
        return returnValue;
    }

      // This makes the mock inspectable.  
      // The test can tell if the mock was called.
    public bool gotCalled() {
        return wasCalled;
    }

      // This also makes the mock inspectable.  
      // The test can check to see what it received.
    public par passedParameter() {
        return passedParam;
    }

      // This makes the mock conditionable.  
      // The test can dictate what it returns to the tested handler.
    public void setReturn(ret value){
        returnValue = value;
    }
}  


The same mock can be used to test each Handler because:

  1. The mock can report whether it got called or not.  This allows us to use it for both the scenario where it should have been called and the scenario where it should not have been.
  2. The mock can report what was passed to it. This allows us to use it in the scenario when a given Handler, not electing, passes the data along "unmolested" to the next Handler (in this case, the mock).
  3. The mock can be “programmed“ to return a known value.  This allows us to use it in the scenario where a given Handler, not electing, should bubble up the return from the next Handler unmolested to the caller.
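For instance, here is how a test might use this mock to specify the delegation behaviors for one handler (pseudo-code in the same spirit; ConcreteHandler and the two conditioning values are stand-ins of ours):

public void testNonElectingHandlerDelegatesUnchanged() {
    Mock nextInChain = new Mock();
    nextInChain.setReturn(someKnownResult);          // condition the mock
    ConcreteHandler handler = new ConcreteHandler(nextInChain);

    ret result = handler.m(paramHandlerWillNotElectOn);

    Assert.True(nextInChain.gotCalled());            // it delegated...
    Assert.Equal(paramHandlerWillNotElectOn,
        nextInChain.passedParameter());              // ...with the parameter unmolested...
    Assert.Equal(someKnownResult, result);           // ...and handed the result back up unchanged
}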


However, if we started to write these tests for each and every handler, we would find that the tests were almost entirely duplications of each other.  The set of tests for each handler would do precisely the same things (albeit for a different handler implementation) except for the one test which specified “That upon acting, the given handler will perform its function correctly”.  We do not like to write redundant tests any more than we like to write redundant production code.

This, in other words, causes “pain” in the tests and, as we say frequently, all pain is diagnostic.  Perhaps the problem is in the implementation.  

The Chain of Responsibility, as classically implemented, does put two different responsibilities into each handler: to select whether it should behave, or not, and what to do in each case.

The Chain Composition, however, initially seems tricky to test.  The pattern does not specify where and how the chain is created, which is typical with patterns; we can pair the CoR with any one of a number of creational patterns, depending on how this issue should be handled, the nature of the rules of creation, how dynamic the chain needs to be, and so forth.

The short answer is that the creation of the chain should be done in a factory class.  We will deal with these issues, and the redesign they suggest, in part 2.  

However, this might be a fun challenge for you.

How would you test (really, specify) the creation of a Chain of Responsibility implementation?  Remember you need to specify in a test that the factory includes all the necessary chain objects when it builds the collection, and that the objects are in the chain in the proper order.  Give it a try, and post your thoughts in the comments section here.

---------

[1] Gamma, Helm, Johnson, Vlissides, "Design Patterns: Elements of Reusable Object-Oriented Software"

[2] Do we really need to encapsulate our mocks so completely?  Here Scott and Amir are not (yet) in agreement.  Amir says we can do this much more simply:

class Mock : TargetAbstraction {

    public bool gotCalled = false;
    public par passedParam;
    public ret returnValue;

    public ret m(par param) {
        passedParam = param;
        gotCalled = true;
        return returnValue;
    }
}


Scott feels that public state is a bad smell to be avoided in all code.  So, either Amir will beat Scott into submission, or vice-versa.  What do you think?

Posted on: February 11, 2021 08:07 AM

A Learning Process: Re-doing CoR part 1

We're writing this blog for several reasons:

  1. To work out the intellectual material for the book we're writing
  2. To provide value to our community as quickly as possible
  3. To get feedback from our readers as soon as possible (validation-centric)

However, we're also finding that the process of creating each blog and each podcast is, in and of itself, a learning process.  In recording the podcast for the Testing the Chain of Responsibility, Part 1 blog we realized that we'd actually done it wrong. [1]

In that blog we said "The same mock can be used to test each Handler."  The problem with that is the redundancy; we'll be testing the delegation issues in each handler, and those tests will all be duplicates of each other.

So, stay tuned for Testing the Chain of Responsibility Part 1, Redux.

We learn from you, but we also learn from ourselves. :) 

 

Posted on: February 11, 2021 08:00 AM

Testing Best Practices: Test Categories, Part 3

Continued from Test Categories, Part 2

Constant Specification

We often find values that are significant to a given problem domain, but are otherwise arbitrary.  For example, the sales tax rate in a state might be .08 (8 percent), but this is just the rate as currently defined by that state’s government.

Similarly, we sometimes have enumerations that are also significant to a given problem domain. In a baseball team, you have a Pitcher, Catcher, First Baseman, Second Baseman, Third Baseman, Shortstop, Left Fielder, Center Fielder, and Right Fielder. These nine positions always exist, but only because the rules of baseball say so.

Good coding practice says to avoid “magic numbers” in our code.  If we were developing a taxation application, we would not want to hard code .08 every time we needed the tax rate.  If we did, this would cause several problems:
 

  • It would be in more than one place (redundant) and thus if it changed would have to be changed in more than one place.
  • It would be meaningless to someone who did not happen to know it was a tax rate.
  • Other systems depending on the same value would likely hard code it as well.


Because of this we tend to create system constants to represent these values.  Something like System.TAX_RATE which, once established in a single place, can be used throughout the code in the place of the literal value.  Or, for an enumerated type, we create a discrete enumeration to represent it: Positions.PITCHER, Positions.CATCHER, and so forth.

This is good practice, and in the past we simply created these constants as needed.  In TDD, however, we have to look at the issue differently.

In TDD we do not create implementation code until we first have a failing test, and we always create only that code needed to make the test pass.  In TDD as we are defining it, the “test” is actually not a test at all, but a specification of the system.  Specifications must be complete.

In other words, we must create a test that specifies these constants should exist before we create the code that means they do exist.  This is extremely simple to do, but is very often left out of the process when people first start doing TDD.

For a constant:

public void specifyTaxConstant() {
 Assert.AreEqual(.08, System.TAX_RATE);
}


Not a tremendous amount of work here; it's more the realization that it needs to be done. With an enumeration, it might be a tad more involved:

public void specifyPlayerPositions() {
 string[] players = Enum.getValues(Positions);
 Assert.AreEqual(9, players.length());
 Assert.Contains(“PITCHER”, players);
 Assert.Contains(“CATCHER”, players);
 Assert.Contains(“FIRST_BASEMAN”, players);
 Assert.Contains(“SECOND_BASEMAN”, players);
 Assert.Contains(“THIRD_BASEMAN”, players);
 Assert.Contains(“SHORTSTOP”, players);
 Assert.Contains(“LEFT_FIELDER”, players);
 Assert.Contains(“CENTER_FIELDER”, players);
 Assert.Contains(“RIGHT_FIELDER”, players);
}


Why would we do this? It does seem silly at first. But, as easy as this is to do, it is actually pretty important.

  • This is a specification, remember. In a traditional specification, these values would certainly be included. In a tax application's specification, you would certainly have, somewhere, "the tax rate is currently 8 percent" or words to that effect. Or, "the positions in Baseball are:...".
  • If a value needs to be changed (if, for example, the state raises the tax rate), then we want to make the change to the specification first, watch the test fail, and then change the implementation to make the test pass again. This ensures that the spec and the code are always in sync, and that we always have tests, going forward, that can fail. It is quite easy, in test maintenance, to change a test in such a way as to make it impossible for it to fail (a bug, in other words).
  • If a value needs to be added (for example, Positions.DESIGNATED_HITTER) then, again, we have a place to change the spec first.
  • Other developers on other teams will now know about these constants, and will use them too instead of magic numbers.


So a constant specification, as simple as it is, tells you four very important things (especially later, when reviewing the spec).
 

  1. That a constant was used
  2. Where it is
  3. What it’s called
  4. What its current values are


That’s a lot of value for very little work!  It is important to note, however, that all other tests that access these constant values must use the constants rather than the literal values.  Just like any code we have to maintain, we don’t want redundancies.

Creational

Instantiating objects is the first thing that we need to do in any system. After all, before we use them, they have to be there, right?
So what is there to test (that is, specify) about the creation of objects?

There are two scenarios we need to deal with:

  1. Simple, single objects
  2. Compound objects.

Simple Objects
When it comes to simple objects we need to specify two things:

  • The type of the object
  • Its observed initial state

What is a type (as in ‘the type of the object is …’)? The type is a concept that we have identified as existing in our domain. By the word concept we mean that we identify it but we have no idea how it is implemented. The concept could be a high level concept such as shape or vehicle, or a very specific concept such as an equilateral triangle or a 2005 Honda Accord.

In order to use an object it must be instantiated, and that is done through some factory mechanism. We may use a factory object, a factory method (such as GetInstance()) or the new operator (which is, after all, the most primitive factory method).

The simplest case
Let's assume we run a furniture business and in our catalog we have a desk named Aarty.
AartyDesk is a low-level concept in our model. AartyDesk will need to be created, and here is the test for it:

// If in our language the new operator cannot fail and still return
// (as is the case with C# or Java) then the test is:
public void testCreateAartyDesk() {
 new AartyDesk();
}

// If the construction could fail and still return a value
// (as is the case with C++):
public void testCreateAartyDesk() {
 Assert.NotNull(new AartyDesk());
}


This is not too interesting, but we have just specified that the concept of an Aarty desk exists in our domain and that it can be created. The next step, to make it more interesting (and useful) is to specify the behavior associated with AartyDesk immediately after it was created -- what we call the initial behavior of the object, which, naturally, depends on the initial state of the object. Note that we should have no visibility to that initial state, only the effect it has on the initial behavior.

public void testCreateAartyDesk() {
 AartyDesk aartyDesk = new AartyDesk();
 Assert.False(aartyDesk.isLocked());
 Assert.True(aartyDesk.hasProtectiveCover());
}


Note that it is up to you to decide whether this initial behavior should be captured in a single test or multiple tests; we will discuss that in a later blog.

At this point, we can note that the actual instantiation could vary, for example using a static factory method:

public void testCreateAartyDesk() {
 AartyDesk aartyDesk = AartyDesk.GetInstance();
 Assert.False(aartyDesk.isLocked());
 Assert.True(aartyDesk.hasProtectiveCover());
}


or a factory object:

public void testCreateAartyDesk() {
 AartyDesk aartyDesk = DeskFactory.GetInstance();
 Assert.False(aartyDesk.isLocked());
 Assert.True(aartyDesk.hasProtectiveCover());
}


The latter two options are significantly better from a design perspective, as they allow change to occur more easily by encapsulating the construction.

Creating a higher-level abstraction
Our domain analysis revealed that there will be more than one type of desk in our store -- we were told that a new model is being introduced. We want to capture this knowledge through the higher-level Desk abstraction. There are several ways to capture this in the tests, depending on the creation mechanism. First, let's consider the new operator:

public void testCreateAartyDesk() {
 Desk desk = new AartyDesk();
}


This simple test tells the reader that an AartyDesk behaves like a Desk. The same holds if we use the GetInstance method.

public void testCreateAartyDesk() {
 Desk desk = AartyDesk.GetInstance();
}


And the same for a factory object:

public void testCreateAartyDesk() {
 Desk desk = DeskFactory.GetInstance();
}


But wait! How can we be sure that DeskFactory.GetInstance() actually returns an AartyDesk? For that matter, how do we know that AartyDesk.GetInstance() actually returns an AartyDesk? We need to specify that in the test. For the purpose of this code we will assume the existence of an AssertOfType assertion.

public void testCreateAartyDesk() {
 Desk desk = AartyDesk.GetInstance();
 AssertOfType<AartyDesk>(desk);
}

And the same for a factory object:

public void testCreateAartyDesk() {
 Desk desk = DeskFactory.GetInstance();
 AssertOfType<AartyDesk>(desk);
}


Choice
Factories have two responsibilities. The obvious one is the creation of objects. Before this, however, comes the responsibility of choosing which type to instantiate. In the example so far, there was only one type of desk -- Aarty -- so there was no need to make any selection. What happens when we add the next variation -- Shmaag? Now the factory needs to choose between Aarty and Shmaag.

There are different mechanisms that can be used to enable factory selection. In our case we will have a helper class that returns the type of buyer: Cheap or Refined. Someone who knows the buyer category needs to set this information before the factory is called. In the test case, that someone is the test itself. We will also assume that the test used to specify these two categories of buyer via BuyerCategory was already written, and that the buyerHelper object was likewise already tested.

public void testCreateAartyDesk() {
 BuyerHelper.getInstance().BuyerCategory = BuyerCategory.Cheap; // [4]
 Desk desk = DeskFactory.GetInstance();
 AssertOfType<AartyDesk>(desk);
}


public void testCreateShmaagDesk() {
 BuyerHelper.getInstance().BuyerCategory = BuyerCategory.Refined;
 Desk desk = DeskFactory.GetInstance();
 AssertOfType<ShmaagDesk>(desk);
}


As before, further tests, or assertions within these tests are needed to specify the initial behavior of these Desk objects.

Composition
The next level of complexity arrives with the introduction of collections. There are two things that need to be specified when collections are created:

  • The content of the collection
  • The behavior of the collection


Specifying the content of the collection is done by specifying what the expected values are and what their initial behavior is. For example, in a football team you have many players with specific roles. Consider part of the offense: Quarterback (QB), Center (C), Running Back (RB), Fullback (FB), Tight End (TE), Left and Right Tackles (LT, RT), Left and Right Guards (LG, RG) and Wide Receivers (WR).

The specific makeup of the offensive team depends on the specific play that is called.  For example, if we’re going to run the ball we may choose to have no WR and if we’re gonna pass it we may elect to have 3.

The OffenceFactory object will return the 11 players that should be on the field based on the type of play, which will be a parameter to the GetInstance method. We can have a whole playbook's worth of plays (which need to be defined in their own tests, of course). Here is one possible test. We assume here that Type is the type of the object and it can be passed around; there are many possible implementations in various languages. The code below is highly pseudocodish, but easily implementable...

public void testShotgunOffenceComposition() {
 Type[] expectedOffenseRoles = new Type[]
   {QB, C, LG, RG, LT, RT, TE, WR, WR, WR, WR};
 Player[] offense = OffenceFactory.GetInstance(Playbook.Shotgun);
 Type[] actualOffenseRoles = GetTypes(offense);
 expectedOffenseRoles.sort();
 actualOffenseRoles.sort();
 Assert.Equal(expectedOffenseRoles, actualOffenseRoles);
}


The sort method sorts in place, and Assert.Equal, when applied to collections, uses iterators to iterate and compare the individual elements. The reason the collections are sorted before the comparison is to make it clear that the order does not matter, just the content.
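By the way, GetTypes is not a library call; it is a small helper we would write ourselves, using the same metadata facilities discussed in the note below. A sketch:

Type[] GetTypes(Player[] players) {
 Type[] types = new Type[players.Length];
 for (int i = 0; i < players.Length; i++) {
   types[i] = players[i].GetType(); // the runtime type of each player
 }
 return types;
}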

Once we establish the role composition, we can iterate over the returned collection (offense in this case) and specify the initial behavior of all its contained Players.

Lastly, we need to specify the behavior of the collection as a collection. There are two ways to do that.

  • Assert that the type of the collection is some specific type. This ensures all the behavior implied by the type.
  • Specify the behavior of the iterators using ordering mocks. We will talk more about ordering mocks when we discuss mocks, but for now just remember that they are used to ensure that an iterator returns elements in the correct order.

A Note on asserting the type of objects
Asserting types requires what may seem to some an unsavory approach in some languages: the use of metadata -- for example, .NET reflection, Java introspection, or C++ RTTI. In the context of asserting the type of objects, it is OK. As we have already discussed, the type of an object is not an implementation detail, but rather derived from the problem domain. Metadata is just a means of retrieving this data and, many times, the only way.

Here is an example of implementing AssertOfType using C# and .NET reflection. It is but one of many ways that it can be done.

public void AssertOfType<Expected>(object actual)
{
    Assert.AreEqual(typeof(Expected).Name, actual.GetType().Name,
                    "type mismatch");
}


Work-Flow

Consider the following code:

class SalesOrder {
 public double CalculateCharges(TaxPolicy taxCalc) {
   double charge = // Some algorithm that totals up the sales
   return charge + taxCalc.getTax(charge); // Add the tax
 }
}

interface TaxPolicy {
 double getTax(double amount);
}

 
TaxPolicy is an interface that we can create different implementations of for the various locales where we have to calculate tax. We might have one for VAT (value added tax) for our European customers, another that calculates PST (provincial sales tax) for Canadian customers, and so forth. Each of these implementations would have its own test, of course.
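For instance, a test of a hypothetical VAT implementation (VatTaxPolicy and its 20% rate are made-up stand-ins) would pin down that object's arithmetic directly, with no SalesOrder involved:

public void testVatTaxPolicyCalculatesTax() {
 TaxPolicy policy = new VatTaxPolicy(0.20);   // hypothetical VAT implementation
 Assert.AreEqual(20.0, policy.getTax(100.0)); // 20% of a 100.00 charge
}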

But in testing SalesOrder, we would not want to use any of those “real” TaxPolicy implementations.  The reasons may be obvious, and we certainly cover this elsewhere, but just to summarize:

  1. We don't want to couple the test of SalesOrder to any particular TaxPolicy. If we just "picked one" and subsequently retired that object, we'd break the test for SalesOrder even though SalesOrder was unchanged.
  2. We could see the test of SalesOrder fail when the object is operating correctly; the failure could be caused by a bug in the particular TaxPolicy implementation we chose.
  3. When that happens, we would have two tests failing for one reason (the test of SalesOrder and the test of the particular TaxPolicy implementation).  This is a bad road to be on.


The issue is easy to deal with... we could simply create an implementation of TaxPolicy that returns zero.  This would be a mock object[3], and would actually be a part of the test.

So, let's say we did that. One morning we start our day by running our tests. They are all green. We also happen to note our code coverage measurement and see that it is 100%. Great! 100% of the code is covered, and all the tests are green!
     
We have a strong sense of confidence that nothing bad has been done to the code since we last encountered it. Nobody has introduced a bug, and nobody has added code without an accompanying test.

Unfortunately, this confidence may be faulty.  Perhaps some other developer, innocently, made the following change.        

class SalesOrder {
 public double CalculateCharges(TaxPolicy taxCalc) {
   double charge = // Some algorithm that totals up the sales
   return charge;
 }
}


Basically, SalesOrder does not call the getTax method on the TaxPolicy implementation at all. This can happen for any number of reasons... perhaps the developer felt that the comment // Add the tax was not necessary, and so backspaced it out... and simply went too far. Or perhaps he inaccurately believed that the tax was being added elsewhere. There are lots of reasons why mistakes like this are made.

Unfortunately, all the tests still pass. The real TaxPolicy implementations still work, and SalesOrder not calling the mock produces the same additional tax charge as calling it: zero. Also, removing code will not lower the code coverage percentage.

What we’re missing, therefore, is a test that ensures that SalesOrder does, in fact, make use of the TaxPolicy object it is given.  This is a question of proper workflow between these objects, and it must have a test because it is required behavior.  

This is, in our experience, a very commonly-missed test. This is not because the test is particularly difficult or laborious to write; it is because developers simply don't think of it. First, we make the mock inspectable, so it can report whether it was called:

class MockTaxPolicy : TaxPolicy {
 bool gotCalled = false;
 public double getTax(double amount) {
   gotCalled = true;
   return 0.0;
 }

 public bool didGetCalled() {
   return gotCalled;
 }
}


...and then we would add a new test:

public void testSalesOrderToTaxPolicyWorkflow() {
 MockTaxPolicy mock = new MockTaxPolicy();
 SalesOrder order = new SalesOrder();

 order.CalculateCharges(mock);

 Assert.IsTrue(mock.didGetCalled());
}


This works because the test holds the mock by its concrete type (making didGetCalled() available to it) while SalesOrder holds it in an implicit upcast to TaxPolicy. This hides the "for testing only" method from the production code.

You should take care not to overdo this. Simply because you can do something does not necessarily mean you should. If a given workflow is already covered by a test of behavior, then you don't want to test it again on its own. We only want to create workflow tests where breaking the workflow per se does not cause any existing test to fail.

Another way to think of it is this: some workflows are specified; some are simply an implementation choice that might be changed in refactoring (without, in other words, changing the resulting behavior). Only the specified (must-be-done) workflows should have tests written specifically for them. Creating tests of workflows that are only implementation details will couple your test suite too tightly to your current implementation and will impede your ability to refactor.

Conclusions

The purpose in creating categories of tests was fourfold:

  1. Ensure that we remember to include all the types of tests needed to form a complete specification, including those that are often missed (constant and workflow specification being very typical)
  2. Capture the best practices for each category
  3. Promote consistency across the team/organization by creating a shared set of test categories that all developers know to create
  4. Add to the language of testing, improving communication

The list we presented here may not be comprehensive, but it serves as a good starting point.  Once we have the notion of these categories in place, we can enhance them and add new ones as we discover more ways to fully specify a system using tests.
 


[1] In point of fact, we don’t actually completely agree with this method of categorizing the Design Patterns, but it does serve as a reasonable example of categorization in general.
[2] There are other ways to do this.  In another blog we will discuss the use of an “Any” class to make these “I don’t care” values even more obvious.
[3] We have lots to say about mocking.  For now, we’re keeping it simple.
[4] This is the Singleton pattern.  If you are unfamiliar with it:  
     https://www.pmi.org/disciplined-agile/the-design-patterns-repository/the-singleton-pattern

Posted on: February 11, 2021 07:42 AM

Testing Best Practices: Test Categories, Part 2

Continued from Test Categories, Part 1

5. Behavior with related boundaries (“and”) 

Sometimes a behavior varies based on more than one factor.  For example, let’s say that in order for the system to allow someone to be hired (returning a true from the canHire() method, perhaps), the individual in question must both be at least 18 years old and must be a US citizen.

Similar to a boundary within a range, here we need more than one assert to establish the rule. However, here two asserts are not enough.


Let's further stipulate that there is a constant, System.MINIMUM_HIRING_AGE, and an enumeration named System.Citizenship with members for various countries. The specifying test code would look like this:

class HiringRuleTest {
    public void testOnlyAppropriateCitizensOfSufficientAgeHired() {
        System.Citizenship anyOtherThanUS =
            System.Citizenship.OUTER_MONGOLIA;

        HiringRule testHiringRule = new HiringRule();

        Assert.True(testHiringRule.canHire(System.MINIMUM_HIRING_AGE,
            System.Citizenship.US));
        Assert.False(testHiringRule.canHire(System.MINIMUM_HIRING_AGE - 1,
            System.Citizenship.US));
        Assert.False(testHiringRule.canHire(System.MINIMUM_HIRING_AGE,
            anyOtherThanUS));
    }
}


A question that might occur to you on reading this:

Should we not show that a person under the minimum age and also not a citizen of the US will get a false when we call this method?  We have shown that:

  1. A US citizen of sufficient age will return a true (hire)
  2. A younger US citizen will not be hired (false)
  3. A non-US citizen “of age” will not be hired (false)

But should we not show that 2 and 3 combined will also produce a false?

This is a perfect example of the difference between testing and specification. If we were to ask an expert in testing this question, they would likely talk about the four possible conditions here as "quadrants" and would, in fact, say that the test was incomplete without the fourth case.

In TDD, we feel differently.  We don’t want to make the test suite any larger than necessary, ever, because we want the tests to run as fast as possible, and also because we don’t want to create excessive maintenance tasks when things change.  The ideas of “thoroughness” and “rigor” that naturally accompany the testing thought process become “sufficiency” in TDD.  This probably seems like a trivial point, but it becomes less so over the long haul of the development process.

Put another way, we never add a new assert or test unless doing so makes a new distinction about the system that was not already made in the existing assertions and tests.  What makes non US citizens distinct from US citizens is that they will get a false when the age is sufficient.  That a non-US citizen with insufficient age will get a false is no different than a US citizen, it is the same distinction.

Again, TDD does not replace traditional testing.  We still expect that to happen, and still respect the value of testing organizations as much as we ever have.

You may also wonder: why Outer Mongolia to represent "some other country"? Why not something else? We don't really care what country we choose here, hence we named the temporary method variable "anyOtherThanUS". Elsewhere in this book, we'll look at other options for representing "I don't care" values like this one.

6. Behavior with repeated boundaries (“or”)

Sometimes we have boundaries that are related in a different way. Sometimes it is a matter of selecting some values across a set, and specifying that some of them must be in a given condition. If we don't address this properly, we can seriously explode the size of our tests.

Let us say, for example, that we have a system of rules for assigning employees to teams. Each team consists of a Salesperson, a Customer Support Representative (CSR), and an Installation Tech. The rule is that no more than one of these people can be a probationary employee ("probie"). It's okay if none of them are probies, but if one of them is a probie the other two must not be. If we think of this in terms of the various possible cases, we might envision something like this:

Salesperson   CSR       Tech      Team acceptable?
regular       regular   regular   yes
probie        regular   regular   yes
regular       probie    regular   yes
regular       regular   probie    yes
probie        probie    regular   no
probie        regular   probie    no
regular       probie    probie    no
probie        probie    probie    no

This would indicate eight individual asserts.  However, if we take a lesson from the previous section on “and” boundaries we note that zero probies is the same behavior (acceptance) as any one probie; it is not a new distinction and we should leave that one off.  Similarly, we don’t want to show that all three being probies is unacceptable, since any two are already not acceptable and so, again, this would not be a new distinction.

That makes for six asserts. Doesn't that feel wrong somehow? Like it's a brute-force solution? Yes, and this instinct is something you should listen to. Imagine if the teams had 99 different roles and the rule was that no more than 43 of them could be probies. Would you like to work out the truth table on that? Me neither, but we can do it with math: the number of ways to choose the 44 probies that just violate the rule out of 99 roles is, alone, 99!/(44!·55!) -- a number on the order of 10^28.

That would be a lot of asserts!  Painful.

Pain is almost always diagnostic.  When something feels wrong, this can be a clue that, perhaps, we are not doing things in the best way.  Here the testing pain is suggesting that perhaps we are not thinking of this problem correctly, and that an alternate way of modeling the problem might be more advantageous, and actually more appropriate.

A clue to this lies in the way we might decide to implement the rule in the production code.  We could, for instance, do something like this:

class TeamBuilder {
    public const int MAX_PROBIES = 1;

    public bool isTeamProperlyConfigured(
        Employee aSalesPerson,
        Employee aCSR,
        Employee aTech) {

        int count = 0;

        if (aSalesPerson.type == "probationary") count++;
        if (aCSR.type == "probationary") count++;
        if (aTech.type == "probationary") count++;

        if (count > MAX_PROBIES) return false;

        return true;
    }
}

This also seems like an inelegant, brute-force approach to the problem, and we might refactor it to be something like this:

class TeamBuilder {
    public const int MAX_PROBIES = 1;

    public bool isTeamProperlyConfigured(
        Employee aSalesPerson,
        Employee aCSR,
        Employee aTech) {

        int count = 0;
        Employee[] emps = new Employee[] {aSalesPerson, aCSR, aTech};

        foreach (Employee e in emps)
            if (e.type == "probationary") count++;

        if (count > MAX_PROBIES) return false;

        return true;
    }
}

In other words, if we stuff all the team members into a collection, we can just scan it for those who are probies and count them. This suggests that perhaps we should be using a collection in the first place, in the API of the object:

class TeamBuilder {
    public const int MAX_PROBATIONARY = 1;

    public bool isTeamProperlyConfigured(Employee[] emps) {
        int count = 0;
        foreach (Employee e in emps)
            if (e.type == "probationary") count++;

        if (count > MAX_PROBATIONARY) return false;

        return true;
    }
}

Now we can write a test with two asserts: one that uses a collection with just enough probies in it and one with MAX_PROBATIONARY + 1 probies in it.  The boundary is easy to define because we’ve changed the abstraction to that of a collection.
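Such a test might look like this (a sketch; makeTeamWithProbies is an assumed helper that builds an Employee array containing the given number of probies):

public void testTeamAllowsAtMostMaxProbationaryEmployees() {
    TeamBuilder builder = new TeamBuilder();

    Employee[] justEnough = makeTeamWithProbies(TeamBuilder.MAX_PROBATIONARY);
    Employee[] oneTooMany = makeTeamWithProbies(TeamBuilder.MAX_PROBATIONARY + 1);

    Assert.True(builder.isTeamProperlyConfigured(justEnough));
    Assert.False(builder.isTeamProperlyConfigured(oneTooMany));
}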

Here is another example of a test being helpful. The fact that a collection would make the testing easier is the test essentially giving you "design advice". Making things easier for tests will tend to make things easier for client code in general since, in test-first, the test is essentially the first client that a given object ever has.

7. Technically-induced boundaries

Addition is an example of a single-behavior function:

val = op1 + op2


As such, there is little that needs to be specified about it. This, however, is the mathematical view of the problem. In reality there's another issue that presents itself in different ways -- the bit-limited internal representation of numbers in a computer, or the way a specific library that we consider using is implemented.

Let's assume, for starters, that op1, op2 and val are 32-bit integers. As such there are maximal and minimal values that they can take: 2^31 - 1 and -2^31. The tests need to specify the following:

  1. What are the largest positive and negative numbers that can be passed as arguments? The boundary in this case is on the value of the operands.
  2. What happens if the sum of the arguments is more, or less, than these technical limits? (2^31 - 1) + 1 = ?

The boundary in this case is on the sum of the operands.
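In C#, for instance, whether the sum wraps around or throws depends on the checked/unchecked context, and that decision is exactly the kind of thing the specification should capture. A sketch, assuming MSTest's Assert.ThrowsException:

[TestMethod]
public void specifyAdditionBehaviorAtUpperBound() {
    int max = int.MaxValue; // 2^31 - 1

    // In the default (unchecked) context the sum silently wraps around...
    unchecked {
        Assert.AreEqual(int.MinValue, max + 1);
    }

    // ...while a checked context surfaces the overflow as an exception.
    Assert.ThrowsException<OverflowException>(() => { int sum = checked(max + 1); });
}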

When it comes to floating point calculation the problem is further complicated by precision, both in the exponent and in the mantissa. For example:

                           100000000000.0 + 0.00000000001 = 100000000000.00000000001

In a computer, because of the way floating point processors operate, the calculation may turn out differently:

                            100000000000.0 + 0.00000000001 = 100000000000.0

Note that the boundary in this case is on the difference between the operands.
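This behavior, too, can be pinned down in the specification. A sketch using C# doubles, which carry only about 15-16 significant decimal digits -- far fewer than this sum requires:

[TestMethod]
public void specifyAdditionLosesOperandBeyondPrecision() {
    double big = 100000000000.0;  // 1e11
    double small = 0.00000000001; // 1e-11

    // The exact sum would need some 22 significant digits; a double cannot
    // represent it, so the small operand vanishes entirely.
    Assert.AreEqual(big, big + small);
}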

In many cases the hardware implementation will satisfy the needs of our customer. If it does not, this does not mean we cannot solve the problem. It means that we will not be able to rely on the system's software or hardware implementation, but would need to roll our own. Surely this is a tidbit of information worth knowing prior to embarking on implementation?

Continued in Test Categories, Part 3...

Posted on: February 11, 2021 07:19 AM