
Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.


Testing the Chain of Responsibility, Part 1

The Chain of Responsibility pattern (hereafter CoR) is one of the original “Gang of Four” patterns. We’re assuming you know this pattern already, but if not you might want to read about it first at the Design Patterns Repository.

Here’s the UML as it appears in the Gang of Four book [1]:

In testing this pattern, we have a number of behaviors to specify.

Individual Handler Behavior

  1. That a given handler will choose to act (elect) when it should
  2. That a given handler will not elect when it shouldn't
  3. That upon acting, the given handler will perform its function correctly


Chain Traversal Behavior

  1. That a handler which elects itself will not delegate to the next handler
  2. That a handler which does not elect itself will delegate to the next handler, and will hand it the parameter(s) unchanged
  3. That a handler which does not elect will “hand up” (return) any result returned to it without changing the result


Chain Composition Behavior

  1. The chain is made up of the right handlers
  2. The handlers are given “a chance” in the right order


The first two sets of behaviors can be specified using a single, simple mock object:

The same mock can be used to test each handler.  If the mock is conditionable and inspectable, it can be used to test all five scenarios.

Example mock code (pseudo-code) [2]:

class Mock : TargetAbstraction {
    private bool wasCalled = false;
    private par passedParam;
    private ret returnValue;

    public ret m(par param) {
        passedParam = param;
        wasCalled = true;
        return returnValue;
    }

    // This makes the mock inspectable.
    // The test can tell if the mock was called.
    public bool gotCalled() {
        return wasCalled;
    }

    // This also makes the mock inspectable.
    // The test can check to see what it received.
    public par passedParameter() {
        return passedParam;
    }

    // This makes the mock conditionable.
    // The test can dictate what it returns to the tested handler.
    public void setReturn(ret value) {
        returnValue = value;
    }
}


The same mock can be used to test each Handler because:

  1. The mock can report whether it got called or not.  This allows us to use it for both the scenario where it should have been called and the scenario where it should not have been.
  2. The mock can report what was passed to it.  This allows us to use it in the scenario when a given Handler, not electing, passes the data along “unmolested” to the next Handler (in this case, the mock).
  3. The mock can be “programmed“ to return a known value.  This allows us to use it in the scenario where a given Handler, not electing, should bubble up the return from the next Handler unmolested to the caller.
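
For example, here is a sketch of one such test, in the same pseudo-code as above. The handler class (ConcreteHandler) and the two values are hypothetical; we assume the handler will not elect for the given parameter:

public void testNonElectingHandlerDelegatesUnchanged() {
    Mock next = new Mock();
    next.setReturn(knownReturnValue);  // condition the mock (hypothetical value)

    // ConcreteHandler is a hypothetical handler, chained to the mock
    TargetAbstraction handler = new ConcreteHandler(next);
    ret result = handler.m(nonElectingParam);

    Assert.True(next.gotCalled());                             // it delegated
    Assert.AreEqual(nonElectingParam, next.passedParameter()); // parameter unchanged
    Assert.AreEqual(knownReturnValue, result);                 // return handed up unchanged
}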


However, if we started to write these tests for each and every handler, we would find that the tests were almost entirely duplications of each other.  The set of tests for each handler would do precisely the same things (albeit for a different handler implementation) except for the one test which specified “That upon acting, the given handler will perform its function correctly”.  We do not like to write redundant tests any more than we like to write redundant production code.

This, in other words, causes “pain” in the tests and, as we say frequently, all pain is diagnostic.  Perhaps the problem is in the implementation.  

The Chain of Responsibility, as classically implemented, does put two different responsibilities into each handler: deciding whether it should act or not, and performing its function when it does.

The Chain Composition, however, initially seems tricky to test.  The pattern does not specify where and how the chain is created, which is typical with patterns; we can pair the CoR with any one of a number of creational patterns, depending on how this issue should be handled, the nature of the rules of creation, how dynamic the chain needs to be, and so forth.

The short answer is that the creation of the chain should be done in a factory class.  We will deal with these issues, and the redesign they suggest, in part 2.  

However, this might be a fun challenge for you.

How would you test (really, specify) the creation of a Chain of Responsibility implementation?  Remember you need to specify in a test that the factory includes all the necessary chain objects when it builds the collection, and that the objects are in the chain in the proper order.  Give it a try, and post your thoughts in the comments section here.

---------

[1] “Design Patterns: Elements of Reusable Object-Oriented Software”, Gamma, Helm, Johnson, Vlissides

[2] Do we really need to encapsulate our mocks so completely?  Here Scott and Amir are not (yet) in agreement.  Amir says we can do this much more simply:

class Mock : TargetAbstraction {

    public bool gotCalled = false;
    public par passedParam;
    public ret returnValue;

    public ret m(par param) {
        passedParam = param;
        gotCalled = true;
        return returnValue;
    }
}


Scott feels that public state is a bad smell to be avoided in all code.  So, either Amir will beat Scott into submission, or vice-versa.  What do you think?

Posted on: February 11, 2021 08:07 AM

A Learning Process: Re-doing CoR part 1

We're writing this blog for several reasons:

  1. To work out the intellectual material for the book we're writing
  2. To provide value to our community as quickly as possible
  3. To get feedback from our readers as soon as possible (validation-centric)

However, we're also finding that the process of creating each blog and each podcast is, in and of itself, a learning process.  In recording the podcast for the Testing the Chain of Responsibility, Part 1 blog we realized that we'd actually done it wrong.

In that blog we said "The same mock can be used to test each Handler."  The problem with that is the redundancy; we'll be testing the delegation issues in each handler, and those tests will all be duplicates of each other.

So, stay tuned for Testing the Chain of Responsibility Part 1, Redux.

We learn from you, but we also learn from ourselves. :) 

 

Posted on: February 11, 2021 08:00 AM

Testing Best Practices: Test Categories, Part 3

Continued from Test Categories, Part 2

Constant Specification

We often find values that are significant to a given problem domain, but are otherwise arbitrary.  For example, the sales tax rate in a state might be .08 (8 percent), but this is just the rate as currently defined by that state’s government.

Similarly, we sometimes have enumerations that are also significant to a given problem domain.  On a baseball team, you have a Pitcher, Catcher, First Baseman, Second Baseman, Third Baseman, Shortstop, Left Fielder, Center Fielder, and Right Fielder.  These nine positions always exist, but only because the rules of baseball say so.

Good coding practice says to avoid “magic numbers” in our code.  If we were developing a taxation application, we would not want to hard code .08 every time we needed the tax rate.  If we did, this would cause several problems:
 

  • It would be in more than one place (redundant) and thus if it changed would have to be changed in more than one place.
  • It would be meaningless to someone who did not happen to know it was a tax rate.
  • Other systems depending on the same value would likely hard code it as well.


Because of this we tend to create system constants to represent these values.  Something like System.TAX_RATE which, once established in a single place, can be used throughout the code in the place of the literal value.  Or, for an enumerated type, we create a discrete enumeration to represent it: Positions.PITCHER, Positions.CATCHER, and so forth.

This is good practice, and in the past we simply created these constants as needed.  In TDD, however, we have to look at the issue differently.

In TDD we do not create implementation code until we first have a failing test, and we always create only that code needed to make the test pass.  In TDD as we are defining it, the “test” is actually not a test at all, but a specification of the system.  Specifications must be complete.

In other words, we must create a test that specifies these constants should exist before we create the code that means they do exist.  This is extremely simple to do, but is very often left out of the process when people first start doing TDD.

For a constant:

public void specifyTaxConstant() {
    Assert.AreEqual(.08, System.TAX_RATE);
}


Not a tremendous amount of work here, just more the realization that it needs to be done.  With an enumeration, it might be a tad more involved:

public void specifyPlayerPositions() {
    string[] players = Enum.getValues(Positions);
    Assert.AreEqual(9, players.length());
    Assert.Contains("PITCHER", players);
    Assert.Contains("CATCHER", players);
    Assert.Contains("FIRST_BASEMAN", players);
    Assert.Contains("SECOND_BASEMAN", players);
    Assert.Contains("THIRD_BASEMAN", players);
    Assert.Contains("SHORTSTOP", players);
    Assert.Contains("LEFT_FIELDER", players);
    Assert.Contains("CENTER_FIELDER", players);
    Assert.Contains("RIGHT_FIELDER", players);
}


Why would we do this?  It does seem silly at first.  But, as easy as this is to do, it is actually pretty important.

  • This is a specification, remember.  In a traditional specification, these values would certainly be included.  In a tax application’s specification, you would certainly have, somewhere, “the tax rate is currently 8 percent” or words to that effect.  Or, “the positions in baseball are:...”.
  • If a value needs to be changed (if, for example, the state raises the tax rate), then we want to make the change to the specification first, watch the test fail now, and then change the implementation to make the test pass again.  This ensures that the spec and the code are always in sync, and that we always have tests, going forward, that can fail.  It is quite easy, in test maintenance, to change a test in such a way as to make it impossible for it to fail (a bug, in other words).
  • If a value needs to be added (for example, Positions.DESIGNATED_HITTER) then, again, we have a place to change the spec first.
  • Other developers on other teams will now know about these constants, and will use them too instead of magic numbers.


So a constant specification, as simple as it is, tells you four very important things (especially later, when reviewing the spec).

  1. That a constant was used
  2. Where it is
  3. What it’s called
  4. What its current values are


That’s a lot of value for very little work!  It is important to note, however, that all other tests that access these constant values must use the constants rather than the literal values.  Just like any code we have to maintain, we don’t want redundancies.

Creational

Instantiating objects is the first thing that we need to do in any system. After all, before we use them, they have to be there, right?
So what is there to test (that is, specify) about the creation of objects?

There are two scenarios we need to deal with:

  1. Simple, single objects
  2. Compound objects

Simple Objects
When it comes to simple objects we need to specify two things:

  • The type of the object
  • Its observed initial state

What is a type (as in ‘the type of the object is …’)? The type is a concept that we have identified as existing in our domain. By the word concept we mean that we identify it but we have no idea how it is implemented. The concept could be a high level concept such as shape or vehicle, or a very specific concept such as an equilateral triangle or a 2005 Honda Accord.

In order to use an object it must be instantiated, and that is done through some factory mechanism. We may use a factory object, a factory method (such as GetInstance()) or the new operator (which is, after all, the most primitive factory method).

The simplest case
Let’s assume we run a furniture business, and in our catalog we have a desk named Aarty.
AartyDesk is a low-level concept in our model. AartyDesk will need to be created, and here is the test for it:

// If in our language the new operator cannot fail and still return
// (as is the case with C# or Java), then the test is:
public void testCreateAartyDesk() {
    new AartyDesk();
}

// If the construction could fail and still return a value
// (as is the case with C++):
public void testCreateAartyDesk() {
    Assert.NotNull(new AartyDesk());
}


This is not too interesting, but we have just specified that the concept of an Aarty desk exists in our domain and that it can be created. The next step, to make it more interesting (and useful) is to specify the behavior associated with AartyDesk immediately after it was created -- what we call the initial behavior of the object, which, naturally, depends on the initial state of the object. Note that we should have no visibility to that initial state, only the effect it has on the initial behavior.

public void testCreateAartyDesk() {
    AartyDesk aartyDesk = new AartyDesk();
    Assert.False(aartyDesk.isLocked());
    Assert.True(aartyDesk.hasProtectiveCover());
}


Note that it is up to you to decide whether this initial behavior should be captured in a single test or in multiple tests; we will discuss that in a later blog.

At this point, we can note that the actual instantiation could vary, for example using a static factory method:

public void testCreateAartyDesk() {
    AartyDesk aartyDesk = AartyDesk.GetInstance();
    Assert.False(aartyDesk.isLocked());
    Assert.True(aartyDesk.hasProtectiveCover());
}


or a factory object:

public void testCreateAartyDesk() {
    AartyDesk aartyDesk = DeskFactory.GetInstance();
    Assert.False(aartyDesk.isLocked());
    Assert.True(aartyDesk.hasProtectiveCover());
}


The latter two options are significantly better from a design perspective, as they allow change to occur more easily by encapsulating the construction.

Creating a higher-level abstraction
Our domain analysis revealed that there will be more than one type of desk in our store -- we were told that a new model is being introduced. We want to capture this knowledge through the higher level Desk abstraction. There are several ways to capture this in the tests, depending on the creation mechanism. Firstly let's consider the new operator:

public void testCreateAartyDesk() {
    Desk desk = new AartyDesk();
}


This simple test tells the reader that an AartyDesk behaves like a Desk. The same holds if we use the GetInstance method.

public void testCreateAartyDesk() {
    Desk desk = AartyDesk.GetInstance();
}


And the same for a factory object:

public void testCreateAartyDesk() {
    Desk desk = DeskFactory.GetInstance();
}


But wait! How can we be sure that DeskFactory.GetInstance() actually returns an AartyDesk? For that matter, how do we know that AartyDesk.GetInstance() actually returns an AartyDesk? We need to specify that in the test. For the purpose of this code we will assume the existence of an AssertOfType assertion that takes the expected type as a type parameter.

public void testCreateAartyDesk() {
    Desk desk = AartyDesk.GetInstance();
    AssertOfType<AartyDesk>(desk);
}

And the same for a factory object:

public void testCreateAartyDesk() {
    Desk desk = DeskFactory.GetInstance();
    AssertOfType<AartyDesk>(desk);
}


Choice
Factories have two responsibilities. The obvious one is the creation of objects. Before this, however, comes the responsibility of choosing which type to instantiate. In the example so far there was only one type of desk -- Aarty -- so there was no need to make any selection. What happens when we add the next variation -- Shmaag? Now the factory needs to choose between Aarty and Shmaag.

There are different mechanisms that can be used to enable factory selection. In our case we will have a helper class that returns the type of buyer: Cheap or Refined. Someone who knows the buyer category needs to set this information before the factory is called. In the test case, that someone is the test itself. We will assume that the test used to specify these two categories of buyer via BuyerCategory was already written, and that the BuyerHelper object was likewise already tested.

public void testCreateAartyDesk() {
    BuyerHelper.getInstance().BuyerCategory = BuyerCategory.Cheap; // [4]
    Desk desk = DeskFactory.GetInstance();
    AssertOfType<AartyDesk>(desk);
}


public void testCreateShmaagDesk() {
    BuyerHelper.getInstance().BuyerCategory = BuyerCategory.Refined;
    Desk desk = DeskFactory.GetInstance();
    AssertOfType<ShmaagDesk>(desk);
}


As before, further tests, or assertions within these tests, are needed to specify the initial behavior of these Desk objects.
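
One implementation these tests might drive out is a factory that consults the helper. This is only a sketch; the ShmaagDesk class name follows the AartyDesk convention but is not shown elsewhere:

class DeskFactory {
    public static Desk GetInstance() {
        // Choose the concrete type based on the buyer category
        // that was set before the factory was called.
        if (BuyerHelper.getInstance().BuyerCategory == BuyerCategory.Cheap)
            return new AartyDesk();
        return new ShmaagDesk();
    }
}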

Composition
The next level of complexity arrives with the introduction of collections. There are two things that need to be specified when collections are created:

  • The content of the collection
  • The behavior of the collection


Specifying the content of the collection is done by specifying what the expected values are and what their initial behavior is. For example, in a football team you have many players with specific roles. Consider part of the offense: Quarterback (QB), Center (C), Running Back (RB), Fullback (FB), Tight End (TE), Left and Right Tackles (LT, RT), Left and Right Guards (LG, RG) and Wide Receivers (WR).

The specific makeup of the offensive team depends on the specific play that is called.  For example, if we’re going to run the ball we may choose to have no WRs, and if we’re going to pass it we may elect to have three.

The OffenceFactory object will return the 11 players that should be on the field based on the type of play, which will be a parameter to the GetInstance method. We can have a whole playbook’s worth of plays (which need to be defined in their own tests, of course). Here is one possible test -- we assume here that Type is the type of the object and that it can be passed around. There are many possible implementations in various languages. The code below is highly pseudocodish, but easily implementable...

public void testShotgunOffenceComposition() {
    Type[] expectedOffenseRoles = new Type[]
        {QB, C, LG, RG, LT, RT, TE, WR, WR, WR, WR};
    Player[] offense = OffenceFactory.GetInstance(Playbook.Shotgun);
    Type[] actualOffenseRoles = GetTypes(offense);
    expectedOffenseRoles.sort();
    actualOffenseRoles.sort();
    Assert.Equal(expectedOffenseRoles, actualOffenseRoles);
}


The sort method sorts in place, and Assert.Equal, when applied to collections, uses iterators to iterate and compare the individual elements. The reason the collections are sorted before the comparison is to make it clear that the order does not matter, just the content.

Once we establish the role composition, we can iterate over the returned collection (offense in this case) and specify the initial behavior of all its contained Players.
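
For example (a sketch -- the initial-behavior assertions shown here are hypothetical, since the Player interface is not defined in this example):

public void testShotgunOffencePlayersInitialBehavior() {
    Player[] offense = OffenceFactory.GetInstance(Playbook.Shotgun);

    foreach (Player p in offense) {
        // Hypothetical initial behaviors; whatever the domain
        // says a newly created Player should exhibit.
        Assert.False(p.isInjured());
        Assert.True(p.isEligibleToPlay());
    }
}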

Lastly, we need to specify the behavior of the collection as a collection. There are two ways to do that.

  • Assert that the type of the collection is some specific type. This ensures all the behavior implied by the type.
  • Specify the behavior of the iterators using ordering mocks. We will talk more about ordering mocks when we discuss mocks, but for now just remember that they are used to ensure that an iterator returns elements in the correct order.

A Note on asserting the type of objects
Asserting types requires what may seem to some an unsavory approach in some languages: the use of metadata -- for example, .NET reflection, Java introspection, or C++ RTTI. In the context of asserting the type of objects, it is OK. As we have already discussed, the type of an object is not an implementation detail, but rather derived from the problem domain. Metadata is just a means of retrieving this information and, many times, the only way.

Here is an example of implementing AssertType using C# and .NET reflection. It is but one of the many ways that it can be done.

public void AssertType<Expected>(object actual)
{
    Assert.AreEqual(typeof(Expected).Name, actual.GetType().Name,
                    "type mismatch");
}


Work-Flow

Consider the following code:

class SalesOrder {
    public double CalculateCharges(TaxPolicy taxCalc) {
        double charge = ...; // some algorithm that totals up the sales
        return charge + taxCalc.getTax(charge); // add the tax
    }
}

interface TaxPolicy {
    double getTax(double amount);
}

 
TaxPolicy is an interface that we can create different implementations of for the various locales where we have to calculate tax.  We might have one for VAT (value added tax) for our European customers, another that calculates PST (provincial sales tax) for Canadian customers, and so forth.  Each of these implementations would have its own test, of course.
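
A concrete implementation might look like the following sketch; the class name VatTaxPolicy and the 20% rate are ours, for illustration only:

class VatTaxPolicy : TaxPolicy {
    // Illustrative rate only; not any particular locale's actual VAT
    private const double VAT_RATE = 0.20;

    public double getTax(double amount) {
        return amount * VAT_RATE;
    }
}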

But in testing SalesOrder, we would not want to use any of those “real” TaxPolicy implementations.  The reasons may be obvious, and we certainly cover this elsewhere, but just to summarize:

  1. We don’t want to couple the test of SalesOrder to any particular TaxPolicy.  If we just “picked one” and that implementation were subsequently retired, the test for SalesOrder would break even though SalesOrder was unchanged.
  2. We could see the test of SalesOrder fail when the object is operating correctly; the failure could be caused by a bug in the particular TaxPolicy implementation we chose.
  3. When that happens, we would have two tests failing for one reason (the test of SalesOrder and the test of the particular TaxPolicy implementation).  This is a bad road to be on.


The issue is easy to deal with... we could simply create an implementation of TaxPolicy that returns zero.  This would be a mock object[3], and would actually be a part of the test.

So, let’s say we did that.  One morning we start our day by running our tests.  They all are green.  We also happen to note our code coverage measurement and see that it is 100%.  Great!  100% of the code is covered, and all the tests are green!  
     
We have a strong sense of confidence that nothing bad has been done to the code since we last encountered it.  Nobody has introduced a bug, and nobody has added code without an accompanying test.

Unfortunately, this confidence may be faulty.  Perhaps some other developer, innocently, made the following change.        

class SalesOrder {
    public double CalculateCharges(TaxPolicy taxCalc) {
        double charge = ...; // some algorithm that totals up the sales
        return charge;
    }
}


Basically, SalesOrder no longer calls the getTax method on the TaxPolicy implementation at all.  This can happen for any number of reasons... perhaps the developer felt that the comment about adding the tax was not necessary, and so backspaced it out... and simply went too far.  Or, perhaps he inaccurately believed that the tax was being added elsewhere.  There are lots of reasons why mistakes like this are made.

Unfortunately, all the tests still pass. The real TaxPolicy implementations still work, and SalesOrder not calling the mock produces the same additional tax charge as calling it: zero.  Also, removing code will not lower the code coverage percentage.

What we’re missing, therefore, is a test that ensures that SalesOrder does, in fact, make use of the TaxPolicy object it is given.  This is a question of proper workflow between these objects, and it must have a test because it is required behavior.  

This is, in our experience, a very commonly-missed test.  This is not because the test is particularly difficult or laborious to write; it is because developers simply don’t think of it.  To catch the missing call, we enhance our mock so that it records whether it was called:

class MockTaxPolicy : TaxPolicy {
 bool gotCalled = false;
 public double getTax(double amount) {
   gotCalled = true;
   return 0.0;
 }

 public bool didGetCalled() {
   return gotCalled;
 }
}


…. and then we would add a new test:

public void testSalesOrderToTaxPolicyWorkflow() {
    MockTaxPolicy mock = new MockTaxPolicy();
    SalesOrder order = new SalesOrder();

    order.CalculateCharges(mock);

    Assert.IsTrue(mock.didGetCalled());
}


This works because the test holds the mock by its concrete type (making didGetCalled() available to it) while SalesOrder holds it in an implicit upcast to TaxPolicy.  This hides the “for testing only” method from the production code.

You should take care not to overdo this.  Simply because you can do something does not necessarily mean you should.  If a given workflow is already covered by a test of behavior, then you don’t want to test it again on its own.  We only want to create workflow tests where breaking the workflow per se does not cause any existing test to fail.

Another way to think of it is this: some workflows are specified; some are simply an implementation choice that might be changed in refactoring (without, in other words, changing the resulting behavior).  Only the specified (must-be-done) workflows should have tests written specifically for them.  Creating tests of workflows that are only implementation details will couple your test suite too tightly to your current implementation and will impede your ability to refactor.

Conclusions

The purpose in creating categories of tests was fourfold:

  1. Ensure that we remember to include all the types of tests needed to form a complete specification, including those that are often missed (constant and workflow specification being very typical)
  2. Capture the best practices for each category
  3. Promote consistency across the team/organization by creating a shared set of test categories that all developers know to create
  4. Add to the language of testing, improving communication

The list we presented here may not be comprehensive, but it serves as a good starting point.  Once we have the notion of these categories in place, we can enhance them and add new ones as we discover more ways to fully specify a system using tests.
 


[1] In point of fact, we don’t actually completely agree with this method of categorizing the Design Patterns, but it does serve as a reasonable example of categorization in general.
[2] There are other ways to do this.  In another blog we will discuss the use of an “Any” class to make these “I don’t care” values even more obvious.
[3] We have lots to say about mocking.  For now, we’re keeping it simple.
[4] This is the Singleton pattern.  If you are unfamiliar with it:  
     https://www.pmi.org/disciplined-agile/the-design-patterns-repository/the-singleton-pattern

Posted on: February 11, 2021 07:42 AM

Testing Best Practices: Test Categories, Part 2

Continued from Test Categories, Part 1

5. Behavior with related boundaries (“and”) 

Sometimes a behavior varies based on more than one factor.  For example, let’s say that in order for the system to allow someone to be hired (returning a true from the canHire() method, perhaps), the individual in question must both be at least 18 years old and must be a US citizen.

Similar to a boundary within a range, here we need more than one assert to establish the rule.  However, here two asserts are not enough.


Let’s further stipulate that there is a constant, System.MINIMUM_HIRING_AGE, and an enumeration named System.Citizenship with members for various countries.  The specifying test code would look like this:

class HiringRuleTest {
    public void testOnlyAppropriateCitizensOfSufficientAgeHired() {
        System.Citizenship anyOtherThanUS =
            System.Citizenship.OUTER_MONGOLIA;

        HiringRule testHiringRule = new HiringRule();

        Assert.True(testHiringRule.canHire(System.MINIMUM_HIRING_AGE,
            System.Citizenship.US));
        Assert.False(testHiringRule.canHire(System.MINIMUM_HIRING_AGE - 1,
            System.Citizenship.US));
        Assert.False(testHiringRule.canHire(System.MINIMUM_HIRING_AGE,
            anyOtherThanUS));
    }
}


A typical question that might occur to you on reading this:

Should we not show that a person under the minimum age and also not a citizen of the US will get a false when we call this method?  We have shown that:

  1. A US citizen of sufficient age will return a true (hire)
  2. A younger US citizen will not be hired (false)
  3. A non-US citizen “of age” will not be hired (false)

But should we not show that 2 and 3 combined will also produce a false?

This is a perfect example of the difference between testing and specification.  If we were to ask an expert in testing this question, they would likely talk about the four possible conditions here as “quadrants” and would, in fact, say that the test was incomplete without the fourth case.

In TDD, we feel differently.  We don’t want to make the test suite any larger than necessary, ever, because we want the tests to run as fast as possible, and also because we don’t want to create excessive maintenance tasks when things change.  The ideas of “thoroughness” and “rigor” that naturally accompany the testing thought process become “sufficiency” in TDD.  This probably seems like a trivial point, but it becomes less so over the long haul of the development process.

Put another way, we never add a new assert or test unless doing so makes a new distinction about the system that was not already made in the existing assertions and tests.  What makes non US citizens distinct from US citizens is that they will get a false when the age is sufficient.  That a non-US citizen with insufficient age will get a false is no different than a US citizen, it is the same distinction.

Again, TDD does not replace traditional testing.  We still expect that to happen, and still respect the value of testing organizations as much as we ever have.

You may also wonder “why Outer Mongolia to represent ‘some other country’?  Why not something else?”  We don’t really care what country we choose here, hence we named the temporary method variable “anyOtherThanUS”.  Elsewhere in this book, we’ll look at other options for representing “I don’t care” values like this one.

6. Behavior with repeated boundaries (“or”)

Sometimes we have boundaries that are related in a different way.  Sometimes it is a matter of selecting some values across a set and specifying that some of them must be in a given condition.  If we don’t address this properly, we can seriously explode the size of our tests.

Let us say, for example, that we have a system of rules for assigning employees to teams.  Each team consists of a Salesperson, a Customer Support Representative (CSR), and an Installation Tech.  The rule is that no more than one of these people can be a probationary employee (“probie”).  It’s okay if none of them are probies, but if one of them is a probie the other two must not be.  If we think of this in terms of the various possible cases, we might envision something like this:
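
  Salesperson   CSR       Tech      | Team acceptable?
  regular       regular   regular   | yes
  probie        regular   regular   | yes
  regular       probie    regular   | yes
  regular       regular   probie    | yes
  probie        probie    regular   | no
  probie        regular   probie    | no
  regular       probie    probie    | no
  probie        probie    probie    | no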

This would indicate eight individual asserts.  However, if we take a lesson from the previous section on “and” boundaries we note that zero probies is the same behavior (acceptance) as any one probie; it is not a new distinction and we should leave that one off.  Similarly, we don’t want to show that all three being probies is unacceptable, since any two are already not acceptable and so, again, this would not be a new distinction.

That makes for six asserts.  Doesn’t that feel wrong somehow?  Like it’s a brute-force solution?  Yes, and this instinct is something you should listen to.  Imagine if the teams had 99 different roles and the rule was that no more than 43 of them could be probies.  Would you like to work out the truth table on that?  Me neither, but we can do it with math:
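
(Sketching the arithmetic: following the same “new distinctions only” rule, we would need an assert for every way to place exactly 43 probies among the 99 roles, and another for every way to place 44 -- C(99,43) + C(99,44) cases, a number on the order of 10^28.)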

That would be a lot of asserts!  Painful.

Pain is almost always diagnostic.  When something feels wrong, this can be a clue that, perhaps, we are not doing things in the best way.  Here the testing pain is suggesting that perhaps we are not thinking of this problem correctly, and that an alternate way of modeling the problem might be more advantageous, and actually more appropriate.

A clue to this lies in the way we might decide to implement the rule in the production code.  We could, for instance, do something like this:

class TeamBuilder {
    public const int MAX_PROBIES = 1;

    public boolean isTeamProperlyConfigured(
        Employee aSalesPerson,
        Employee aCSR,
        Employee aTech) {

        int count = 0;

        if (aSalesPerson.type == "probationary") count++;
        if (aCSR.type == "probationary") count++;
        if (aTech.type == "probationary") count++;

        if (count > MAX_PROBIES) return false;

        return true;
    }
}

This also seems like an inelegant, brute-force approach to the problem, and we might refactor it to be something like this:

class TeamBuilder {
    public const int MAX_PROBIES = 1;

    public boolean isTeamProperlyConfigured(
        Employee aSalesPerson,
        Employee aCSR,
        Employee aTech) {

        int count = 0;
        Employee[] emps = new Employee[]
            {aSalesPerson, aCSR, aTech};

        foreach (Employee e in emps)
            if (e.type == "probationary") count++;

        if (count > MAX_PROBIES) return false;

        return true;
    }
}

In other words, if we stuff all the team members into a collection, we can just scan it for those who are probies and count them.  This suggests that perhaps we should be using a collection in the first place, in the API of the object:

class TeamBuilder {
    public const int MAX_PROBATIONARY = 1;

    public boolean isTeamProperlyConfigured(Employee[] emps) {
        int count = 0;
        foreach (Employee e in emps)
            if (e.type == "probationary") count++;

        if (count > MAX_PROBATIONARY) return false;

        return true;
    }
}

Now we can write a test with two asserts: one that uses a collection with just enough probies in it and one with MAX_PROBATIONARY + 1 probies in it.  The boundary is easy to define because we’ve changed the abstraction to that of a collection.
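
Here is a sketch of that test; makeTeam() is a hypothetical helper that builds a full team containing the given number of probationary employees:

public void testProbationaryLimitIsEnforced() {
    TeamBuilder builder = new TeamBuilder();

    // makeTeam(probies) is a hypothetical test helper returning a team
    // with the given number of probies and regulars in the other roles.
    Assert.True(builder.isTeamProperlyConfigured(
        makeTeam(TeamBuilder.MAX_PROBATIONARY)));
    Assert.False(builder.isTeamProperlyConfigured(
        makeTeam(TeamBuilder.MAX_PROBATIONARY + 1)));
}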

Here is another example of a test being helpful.  The fact that a collection would make the testing easier is the test essentially giving you “design advice”.  Making things easier for tests will tend to make things easier for client code in general since, in test-first, the test is essentially the first client that a given object ever has.

7. Technically-induced boundaries

Addition is an example of a single-behavior function:

val = op1 + op2


As such, there is little that needs to be specified about it. This, however, is the mathematical view of the problem. In reality there’s another issue that presents itself in different ways -- the bit-limited internal representation of numbers in a computer, or the way a specific library that we are considering using is implemented.

Let’s assume, for starters, that op1, op2, and val are 32-bit integers. As such there are maximal and minimal values that they can take: 2^31 - 1 and -2^31. The tests need to specify the following:

  1. What are the largest positive and negative numbers that can be passed as arguments? The boundary in this case is on the “value of the operand”.
  2. What happens if the sum of the arguments is more, or less, than these technical limits?  (2^31 - 1) + 1 = ?

The boundary in this case is on the “sum of the operands”.
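
A sketch of how the second specification might look in C#, where the wrap-around is explicit (the test name is ours; other languages will behave differently at this boundary):

public void specifySumBeyondTechnicalLimit() {
    int maxValue = int.MaxValue; // 2^31 - 1

    // In an unchecked context the sum wraps around to the minimum value...
    unchecked {
        Assert.AreEqual(int.MinValue, maxValue + 1);
    }

    // ...while a checked context surfaces the overflow as an exception.
    try {
        checked { int sum = maxValue + 1; }
        Assert.Fail("expected an OverflowException");
    } catch (OverflowException) {}
}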

When it comes to floating point calculation, the problem is further complicated by precision, both in the exponent and in the mantissa. For example:

                           100000000000.0 + 0.00000000001 = 100000000000.00000000001

In a computer, because of the way floating point processors operate, the calculation may turn out differently:

                            100000000000.0 + 0.00000000001 = 100000000000.0

Note that the boundary in this case is on the “difference between the operands”.
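
This absorption is easy to demonstrate; a sketch in C# (the test name is ours):

public void specifySmallOperandIsAbsorbed() {
    double big = 100000000000.0;   // 10^11
    double tiny = 0.00000000001;   // 10^-11

    // A double's 52-bit mantissa cannot represent both magnitudes
    // at once, so the tiny operand is absorbed and the sum is unchanged.
    Assert.AreEqual(big, big + tiny);
}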

In many cases the hardware implementation will satisfy the needs of our customer. If it does not, this does not mean we cannot solve the problem. It means that we will not be able to rely on the system’s software or hardware implementation, but would need to roll our own. Surely this is a tidbit of information worth knowing prior to embarking on implementation?

Continued in Test Categories, Part 3...

Posted on: February 11, 2021 07:19 AM

Testing the Chain of Responsibility, Part 1 (redux)


The Chain of Responsibility pattern (hereafter CoR) is one of the original “Gang of Four” patterns.  We’re assuming you know this pattern already, but if not you might want to read about it first at the Design Patterns Repository.

The basic idea is this: you have a series of rules (or algorithms) that are conceptually the same.  Only one of the rules will apply in a given circumstance.  You want to decouple the client objects that use the rules from:

  1. The fact that there is more than one rule
  2. How many rules there are
  3. How each rule is implemented
  4. How the correct rule is selected
  5. Which rule actually acted on any given request

All the clients should see/couple to is the common interface that all rules export, and perhaps the factory that creates them (or, from the clients’ perspective, the factory that creates “it”).

The CoR, in its classic form [1] accomplishes this by chaining the rules together, and handing a reference to the first one (in an upcast to the shared abstraction) to the client. When the client requests the action, the first rule decides “for itself” if it should act.  We call this “electing”.  If the rule elects, it performs the action and returns its result.  If it does not elect, it delegates to the next rule in the chain, and so on until some rule elects.  Regardless of which rule elects, the result is propagated back up the chain to the client. Typically only one rule will elect, and when one does we stop asking the rules that follow it; it just acts and returns, and we’re done.

Let’s examine a concrete example, and look at the design and some code.  We’ll keep the example very simple, so the pattern and testing techniques are easy to see.

Problem Statement

We have to process an integer.  There are two ways of processing it: a processing algorithm that is appropriate for “small” values (which are defined in the domain as any value in the range of 1 - 10000) and a different algorithm that is appropriate for “large” values (10001 - 20000).  Values over 20000 are not allowed.

Again, for simplicity, we’ll say that the large processor algorithm halves the value it is given, while the small processor doubles it.  If neither processing algorithm is appropriate, the system must throw an exception indicating an unsupported value was given.

Using the CoR

The classic CoR design view of this problem would look like this:

The Classic Chain of Responsibility


The Code

public abstract class Processor {
    public const int MIN_SMALL_VALUE = 1;
    public const int MAX_SMALL_VALUE = 10000;
    public const int MIN_LARGE_VALUE = 10001;
    public const int MAX_LARGE_VALUE = 20000;

    private readonly Processor nextProcessor;

    protected Processor(Processor aProcessor) {
        nextProcessor = aProcessor;
    }

    public int Process(int value) {
        int returnValue = 0;

        if (ShouldProcess(value)) {
            returnValue = ProcessThis(value);
        } else {
            returnValue = nextProcessor.Process(value);
        }
        return returnValue;
    }

    protected abstract bool ShouldProcess(int value);
    protected abstract int ProcessThis(int value);
}


Note the use of the Template Method Pattern [2] in this base class.  This eliminates the otherwise redundant part of the “decision making” that all the various processors would share, and delegates to the two abstract methods where the specific implementation in each case will be supplied in the derived classes.

Here they are:

public class LargeValueProcessor : Processor {
    public LargeValueProcessor(Processor aProcessor) :
        base(aProcessor) {}

    protected override bool ShouldProcess(int value) {
        if (value >= MIN_LARGE_VALUE &&
            value <= MAX_LARGE_VALUE)
            return true;
        return false;
    }

    protected override int ProcessThis(int value) {
        return (value / 2);
    }
}

public class SmallValueProcessor : Processor {
    public SmallValueProcessor(Processor aProcessor) :
        base(aProcessor) {}

    protected override bool ShouldProcess(int value) {
        if (value <= MAX_SMALL_VALUE &&
            value >= MIN_SMALL_VALUE)
            return true;
        return false;
    }

    protected override int ProcessThis(int value) {
        return (value * 2);
    }
}

public class TerminalProcessor : Processor {
    public TerminalProcessor() : base(null) {}

    protected override bool ShouldProcess(int value) {
        return true;
    }

    protected override int ProcessThis(int value) {
        throw new ArgumentException();
    }
}


In testing this pattern, we have a number of behaviors to specify:

Common Chain-Traversal Behaviors

  1. That a processor which elects itself will not delegate to the next processor
  2. That a processor which does not elect itself will delegate to the next processor, and will forward the parameter(s) it was given unchanged
  3. That a processor which did not elect will “hand back” (return) any result returned to it from the next processor without changing the result


Individually Varying Processor Behaviors

  1. That a given processor will choose to act (elect) when it should
  2. That a given processor will not elect when it shouldn't
  3. That upon acting, the given processor will perform its function correctly


Chain Composition Behaviors

  1. That the chain appears to be the proper abstraction to the client
  2. The chain is made up of the right processors
  3. The processors are given “a chance” in the right order (if this is important)


Common Chain-Traversal Behaviors

All these behaviors are implemented in the base class Processor, via the template method, to avoid redundancy.  We don’t want redundancy in the tests either, so the place to specify these behaviors is in a test of one entity: the base class.  Unfortunately, the base class is an abstract type and thus cannot be instantiated.  One might think “well, just pick one of the processors -- it does not matter which one -- and write the test using that.  All derived classes can access the behavior of their base class, after all.”

We don’t want to do that.  First of all, it will couple the test of the common behaviors to the existence of the particular processor we happened to choose.  What if that implementation gets retired at some point in the future?  We’ll have to do test maintenance just because we got unlucky.  Or, what if a bug is introduced in the concrete processor we picked?  This could cause the test of the base behavior to fail when the base class is working just fine, due to the inheritance coupling.  That would be a misleading failure; we never want our tests to lie to us.  Coupling should always be intentional, and should always work for us, not against us.

Here’s another good use for a mock.  If we make a mock implementation of the base class, it, like any other derived class, will have access to the common behavior.

class MockProcessor : Processor {
    public bool willElect = false;
    public bool wasAskedtoProcess = false;
    public int valueReceived = 0;
    public int returnValue = 0;

    public MockProcessor(Processor aProcessor) : 
           base(aProcessor){}

    protected override bool ShouldProcess(int value) {
           wasAskedtoProcess = true;
           valueReceived = value;
           return willElect;
    }

    protected override int ProcessThis(int value) {
           return returnValue;
    }
}


Note we keep this as simple as possible.  This is really part of the test, and will thus not be tested itself.  In fact, if we didn’t need it for two different tests, we’d probably make it an inner class of the test (which we call an inner shunt -- more on shunts later).

The tests that specify the proper chain-traversal behavior are simply conducted with two instances of the mock, chained together.  The first can be told to elect or not, and the second can be examined to see what happens to it with each scenario.

The first scenario concerns what should happen if the first processor does not elect, but delegates to the second processor:

[TestClass]
public class ProcessorDelegationTest {
    private MockProcessor firstProcessor;
    private MockProcessor secondProcessor;
    private int valueToProcess;
    private int returnedValue;

    [TestInitialize]
    public void Init() {
        // Setup
        secondProcessor = new MockProcessor(null);
        secondProcessor.willElect = true;

        firstProcessor = new MockProcessor(secondProcessor);
        firstProcessor.willElect = false;

        valueToProcess = Any.Value; // [3]
        secondProcessor.returnValue = Any.Value;

        // Common Trigger
        returnedValue = firstProcessor.Process(valueToProcess);
    }

    [TestMethod]
    public void TestDelegationHappensWhenItShould() {
        Assert.IsTrue(secondProcessor.wasAskedtoProcess);
    }

    [TestMethod]
    public void TestDelegationHappensWithUnchangedParameter() {
        Assert.AreEqual(valueToProcess, secondProcessor.valueReceived);
    }

    [TestMethod]
    public void TestDelegationHappensWithUnchangedReturn() {
        Assert.AreEqual(returnedValue, secondProcessor.returnValue);
    }
}


These tests specify the three aspects of a processor that does not elect.  Note each aspect is in its own test method.  By telling the first mock not to elect, we can inspect the second mock to ensure that it got called, that it got the parameter unchanged, and that whatever it returns to the first mock is propagated back out without being changed.

The second scenario is where the first processor does elect.  All we need to prove here is that it does not delegate to the second processor.  Whether it does the right thing, algorithmically, will be specified in the tests of the actual processors (we’ll get to that).

[TestClass]
public class ProcessorNonDelegationTest {
    [TestMethod]
    public void TestNoDelegationWhenProcessorElects() {
        MockProcessor secondProcessor =
            new MockProcessor(null);
        MockProcessor firstProcessor =
            new MockProcessor(secondProcessor);

        firstProcessor.willElect = true;

        firstProcessor.Process(Any.Value);

        Assert.IsFalse(secondProcessor.wasAskedtoProcess);
    }
}


At first this might seem odd.  We’re writing tests that only use mocks?  That seems like a snake eating itself... the test is testing the test.  But remember, when a subclass (in this case, the mock) is instantiated, the base-class portion of the object is constructed along with it, and the base class is where the behavior we’re specifying actually exists.  We’re not testing the mock, we’re testing the template method in the abstract base class through the mock.

Individually Varying Processor Behaviors

Now that we’ve specified and proven that the delegation and traversal issues are correct, we now only have two things to specify in each individual processor: that it will elect only when it should, and that it will process correctly when it does.  The exception is the Terminal processor which, of course, should simply always elect and always throw an exception.

The problem here is that the only public method of the concrete processors is the Process() method, which is established (and tested) in the base class.  It would be a mistake, and a rather easy one to make, to write the tests of the concrete processors through the Process() method.  Doing so would couple these new tests to the ones we’ve already written, and over the long haul this will dramatically reduce the maintainability of the suite.

What we need to do is write tests that directly access the protected methods ShouldProcess() and ProcessThis(), giving them different values to ensure they do what they are specified to do in the case of each concrete processor.  Normally, such methods would not be accessible to the test, but we can fix this, simply, by deriving the test from the class in each case.  For example:

[TestClass]
public class SmallValueProcessorTest : SmallValueProcessor {
    public SmallValueProcessorTest() : base(null) {}

    [TestMethod]
    public void TestSmallValueProcessorElectsCorrectly() {
        Assert.IsTrue(
            ShouldProcess(Processor.MIN_SMALL_VALUE));
        Assert.IsFalse(
            ShouldProcess(Processor.MIN_SMALL_VALUE - 1));
        Assert.IsTrue(
            ShouldProcess(Processor.MAX_SMALL_VALUE));
        Assert.IsFalse(
            ShouldProcess(Processor.MAX_SMALL_VALUE + 1));
    }

    [TestMethod]
    public void TestSmallValueProcessorProcessesCorrectly() {
        int valueToBeProcessed =
            Any.ValueBetween(Processor.MIN_SMALL_VALUE,
                             Processor.MAX_SMALL_VALUE);

        int expectedReturn = valueToBeProcessed * 2;
        Assert.AreEqual(expectedReturn,
                        this.ProcessThis(valueToBeProcessed));
    }
}


Note we have to give our test a constructor, just to satisfy the base class contract (chaining to its parameterized constructor, passing null).  If you dislike this, and/or if you dislike the direct coupling between the test and the class under test, an alternative is to use a testing adapter:

[TestClass]
public class LargeValueProcessorTest {
    private LargeValueProcessorAdapter testAdapter;

    [TestInitialize]
    public void Init() {
        testAdapter = new LargeValueProcessorAdapter();
    }

    [TestMethod]
    public void TestLargeValueProcessorElectsCorrectly() {
        Assert.IsTrue(
            testAdapter.ShouldProcess(Processor.MIN_LARGE_VALUE));
        Assert.IsFalse(
            testAdapter.ShouldProcess(Processor.MIN_LARGE_VALUE - 1));
        Assert.IsTrue(
            testAdapter.ShouldProcess(Processor.MAX_LARGE_VALUE));
        Assert.IsFalse(
            testAdapter.ShouldProcess(Processor.MAX_LARGE_VALUE + 1));
    }

    [TestMethod]
    public void TestLargeValueProcessorProcessesCorrectly() {
        int valueToBeProcessed =
            Any.ValueBetween(Processor.MIN_LARGE_VALUE,
                             Processor.MAX_LARGE_VALUE);

        int expectedReturn = valueToBeProcessed / 2;
        Assert.AreEqual(expectedReturn,
                        testAdapter.ProcessThis(valueToBeProcessed));
    }

    private class LargeValueProcessorAdapter :
            LargeValueProcessor {
        public LargeValueProcessorAdapter() : base(null) {}

        public new bool ShouldProcess(int value) {
            return base.ShouldProcess(value);
        }

        public new int ProcessThis(int value) {
            return base.ProcessThis(value);
        }
    }
}


We leave it up to you to decide which is desirable, but we’d recommend you pick one technique and stick with it.

Note that the first test method (TestLargeValueProcessorElectsCorrectly()) is a boundary (range) test, and the second test method (TestLargeValueProcessorProcessesCorrectly()) is a test of a static behavior.  Refer to our blog on Test Categories for more details, if you’ve not already read that one.

Finally, we need to specify the exception-throwing behavior of the terminal processor.  This could be done either through direct subclassing or via a testing adapter; we’ll use direct subclassing for brevity:

[TestClass]
public class TerminalProcessorTest : TerminalProcessor {
    [TestMethod]
    public void TestTerminalProcessorAlwaysElects() {
        Assert.IsTrue(ShouldProcess(Any.Value));
    }

    [TestMethod]
    public void TestTerminalProcessorThrowsExceptionWhenProcessing() {
        try {
            ProcessThis(Any.Value);
            Assert.Fail("TerminalProcessor should always throw an exception when reached");
        } catch (ArgumentException) {}
    }
}


This may look a bit odd, but we’ll talk about exceptions and testing in another entry.  For now we think you can see that this test will pass if the exception is thrown when the terminal processor is reached, and fail if it is not.
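
If your test framework provides a helper for this, the same specification can be written more directly; for example, MSTest v2 offers Assert.ThrowsException (a sketch, assuming that helper is available):

[TestMethod]
public void TestTerminalProcessorThrowsExceptionWhenProcessing() {
    // Equivalent to the try/catch form above, using the
    // framework-provided helper.
    Assert.ThrowsException<ArgumentException>(
        () => ProcessThis(Any.Value));
}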

Oh, and don’t forget to specify your public constants!

[TestClass]
public class ConstantSpecificationTest
{
    [TestMethod]
    public void SpecifyConstants()
    {
           Assert.AreEqual(1, Processor.MIN_SMALL_VALUE);
           Assert.AreEqual(10000, Processor.MAX_SMALL_VALUE);
           Assert.AreEqual(10001, Processor.MIN_LARGE_VALUE);
           Assert.AreEqual(20000, Processor.MAX_LARGE_VALUE);
    }
}


In the next part, we’ll examine the third set of issues, which have to do with the composition of the chain itself... that all the required elements are there, and that they are in the proper order (in cases where that is important).  This will present us with an opportunity to discuss object factories, and how to test/specify them.

Stay tuned!

-----

[1] It’s important to note that patterns are not implementations.  We know many other forms of this pattern, but in this section we will focus on the implementation shown in the Gang of Four.

[2] Unfamiliar with the Template Method Pattern?  We have a write up of it here:
https://www.pmi.org/disciplined-agile/the-design-patterns-repository/the-template-method-pattern
[3] Details on the use of an “Any” class are a subject unto itself.  For now, just know that Any.Value returns a random integer, while Any.ValueBetween(min, max) returns a random integer within a range.

Posted on: February 11, 2021 06:34 AM