Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.

Structure of Tests-As-Specifications

A big part of our thesis is that TDD is not really a testing activity, but rather a specifying activity that generates tests as a very useful side effect. For TDD to be a sustainable process, it is important to understand the various implications of this distinction. [1]

Here, we will discuss the way our tests are structured when we seek to use them as the functional specification of the system.

A question we hear frequently is "how does TDD relate to BDD?" BDD is "Behavior-Driven Development," a term coined by Dan North and Chris Matts in their 2006 article "Introducing BDD" [2]. Many have made various distinctions between TDD, ATDD, and BDD, but we feel these distinctions are largely unimportant. To us, TDD is BDD, except that we conduct the activity at a level very close to the code, and automation is much more critical. Also, we contend that “development” includes analysis and design, and thus what TDD enables is more accurately stated to be “behavior-based analysis and design”, or BBAD.

In BBAD, the general idea is that the "unit" of software that is being specified is a behavior. Software is behavior, after all. Software is not a noun, it is a verb. Software’s value lies entirely in what it does, in the value the user accrues as a result of its behavior. In essence, software only exists in any meaningful sense of the word when it is up and running. The job of a software development team is to take a general-purpose computer and cause it to act in specific, valuable ways. We call these behaviors.

The nomenclature that North and Matts proposed for specifying each behavior of a system is this: Given-When-Then. Here's a simple example:

Given:
     User U has a valid account on our system with username UN and password PW
     The login username is set to UN and the login password is set to PW
When:
     Login is requested
Then:
     U is logged in

Everything that software does, every behavior, can be expressed in this fashion. Each Given-When-Then expression is a specific scenario that is deemed to have business value, and that the team has taken upon itself to implement.

In TDD, when the scenario is interpreted as a test, we strive to make this scenario actionable. So we think of these three parts of the scenario a little differently: we "verbify" them to convert these conditions into activities.

Imagine that you were a manual tester who was seeking to make sure the system was behaving correctly in terms of the scenario above. You would not wait around until a user with a valid account happened to browse to the login page, enter his info, and click the "Login" button... you would create or identify an existing valid user and, as that person, browse to the page, enter the correct username and password, and then click the button yourself. Then you'd check to see if your login was successful. You would do all of these things.

So the Given wasn't given, it was done by the tester (you, in this case), the When was not when, it was now do, and the Then was not a condition but rather an action: go and see if things are correct.

"Given" becomes "Setup".
"When" becomes "Trigger".
"Then" becomes "Verify".

We want to structure our tests in such a way that these three elements of the specification are clear and, as much as possible, separate from each other. Typical programming languages can make this a bit challenging at times, but we can overcome these problems fairly easily.

For example: Let's say we have a behavior that calculates the arithmetic mean of two real numbers, accurate within 0.1. Most likely this will be a method call on some object that takes two values as parameters and returns the arithmetic mean of those values, accurate within 0.1.

Let’s start with the Given-When-Then:

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The arithmetic mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

Let's look at a typical unit test for such a behavior:
(Code samples are in C# with MSTest as the testing framework)

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        Assert.AreEqual(5.5d,
            MathUtils.GetInstance().
                ArithmeticMean(7.0d, 4.0d), .1);
    }
}

This test is simple because the behavior is simple. But this is really not great as a specification.

The Setup (creation of the MathUtils object, the creation of the example doubles 7.0d and 4.0d), the Trigger (the calling of the ArithmeticMean method with our two example doubles), and the Verify (comparing the method's return to the expectation, 5.5d, and establishing the precision as .1) are all expressed together in the assertion. If we can separate them, we can make the specification easier to read and also make it clear that some of these particular values are not special, that they were just picked as convenient examples.

This is fairly straightforward, but easy to miss:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var mathUtils = MathUtils.GetInstance();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}

Here we have included comments to make it clear that the three different aspects of this behavioral specification are now separate and distinct from each other. The "need" for comments always seems like a smell, doesn't it? It means we can still make this better.

But we've also used variable names like "anyFirstValue" to indicate that the number we chose was not a significant value, creating more clarity about what is important here. Note that tolerance and expectedMean were not named in this way, because their values are specific to the required behavior.

This, now, is using TDD to form a readable specification, which also happens to be executable as a test [2]. Obviously the value of this as a test is very high; we do not intend to trivialize this. But we write them with a different mindset when we think of them as specifications and, as we'll see, this leads to many good things.

Looking at both code examples above, however, some of you may be thinking "what is this GetInstance() stuff? I would do this:"

// Setup
var mathUtils = new MathUtils();

Perhaps. We have reasons for preferring our version, which we'll set aside for its own discussion.

But the interesting question is: what if you started creating the object one way (using “new”), and then later changed your mind and used a static GetInstance() method, or maybe even some factory pattern? If, when that change was made, you had many test methods on this class doing it the "old" way, this would require the same change in all of them.

We can do it this way instead:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator =
            GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = arithmeticMeanCalculator.
            ArithmeticMean(anyFirstValue,
                anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    // Binding: the only place that knows how the object is obtained
    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}

Now, no matter how many test methods on this test class needed to access this arithmetic mean behavior (for different scenarios), a change in terms of how you access the behavior would only involve the modification of the single "helper" method that is providing the object for all of them.

Many testing frameworks have their own mechanisms for eliminating redundant object creation, usually in the form of a Setup() or Initialize() method, and these can be used. But we prefer the helper method because we then gain the ability to decouple the specification from the fact that the behavior we’re specifying happens to be implemented in a class called MathUtils. We could also change this design detail and the impact would only be on the helper method. (The fact that C# has the var keyword for implicitly typed local variables is a real plus here; you might be limited a bit in other languages.)
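For comparison, here is a rough sketch of the framework-provided alternative, using MSTest's [TestInitialize] attribute. The MathUtils class shown here is a hypothetical stand-in (the real one is not shown in this post), and the test class name is made up for illustration.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical stand-in for the production class; assumed for illustration only.
public class MathUtils
{
    private static readonly MathUtils instance = new MathUtils();

    private MathUtils() { }

    public static MathUtils GetInstance()
    {
        return instance;
    }

    public double ArithmeticMean(double first, double second)
    {
        return (first + second) / 2;
    }
}

[TestClass]
public class MathTestsUsingInitialize
{
    // A field cannot be declared with var, so the class name MathUtils
    // appears in the test class itself.
    private MathUtils mathUtils;

    [TestInitialize]
    public void Initialize()
    {
        // MSTest runs this before each test method in the class.
        mathUtils = MathUtils.GetInstance();
    }

    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue, anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}
```

This removes the duplicated creation code just as well, but the explicitly typed field keeps the name of the implementing class visible in the specification, which is the coupling the helper method lets us avoid.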

But the spec is also not about the particular method you call to get the mean, just how the calculation works, behaviorally. Certainly an ArithmeticMean() method is logical, but what if we decided to make it more flexible, allowing any number of parameters rather than just two? The meaning of "arithmetic mean" would not change, but our spec would have to. Which seems wrong. So, we could take the idea a little bit farther:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
            arithmeticMeanCalculator,
            anyFirstValue, anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils,
        double anyFirstValue,
        double anySecondValue)
    {
        return mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}

Now if we change the ArithmeticMean() method to take a container rather than discrete parameters, or whatever, then we only change this private helper method and not all the various specification-tests that show the behavior with more parameters, etc...
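To make that concrete, here is a minimal sketch of what such a design change might look like. It assumes a hypothetical redesign in which ArithmeticMean() accepts an IEnumerable<double>; that signature and the stand-in MathUtils class are assumptions for illustration, not the actual production code.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical redesign: the method now takes a container of values.
public class MathUtils
{
    private static readonly MathUtils instance = new MathUtils();

    private MathUtils() { }

    public static MathUtils GetInstance()
    {
        return instance;
    }

    public double ArithmeticMean(IEnumerable<double> values)
    {
        return values.Average();
    }
}

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup (unchanged by the redesign)
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger (unchanged by the redesign)
        var actualMean = TriggerArithmeticMeanCalculator(
            arithmeticMeanCalculator,
            anyFirstValue, anySecondValue);

        // Verify (unchanged by the redesign)
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    // Binding: the only code that had to change.
    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils,
        double anyFirstValue,
        double anySecondValue)
    {
        return mathUtils.ArithmeticMean(new[] { anyFirstValue, anySecondValue });
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
```

The body of the test method is the same as before the redesign; only the private helper knows that the values are now packed into an array.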

The idea here is to separate the meaning of the specification from the way the production code is designed. We talk about the specification being one thing, and the "binding" being another. The specification should change only if the behavior changes. The binding (these private helpers) should only change if the design of the system changes.

Another benefit here is clarity, and readability. Let's improve it a bit more:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculation(
            anyFirstValue,
            anySecondValue);

        // Verify
        var expectedMean = (anyFirstValue + anySecondValue) / 2;
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculation(
        double anyFirstValue,
        double anySecondValue)
    {
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        return arithmeticMeanCalculator.
            ArithmeticMean(anyFirstValue,
                anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}

We have moved the call GetArithmeticMeanCalculator() to the Trigger, and expectedMean to the Verification [3]. Also we changed the notion of "trigger the calculator" to "trigger the calculation". Now, remember the original specification?

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The Arithmetic Mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

The unit test, which is our specification, very closely mirrors this Given-When-Then expression of the behavior. Do we really need the comments to make that clear? Probably not. We’ve created a unit test that is a true specification of the behavior without coupling it to the specifics of how the behavior is expressed by the system.

Can we take this even further? Of course... but that's for another entry. :)

[1] It should be acknowledged that Max prefers to say "it is a test which also serves as a specification." We'll probably beat him into submission :), but for the time being that's how he likes to think of it. We welcome discussion, as always.

[2] Better Software Magazine, March 2006.

[3] It should also be acknowledged that we're currently discussing the relative merits of using Setup/Trigger/Verify in TDD rather than just sticking with Given/When/Then throughout. See Grzegorz Gałęzowski's very interesting comment below on this (and other things).

Posted on: February 10, 2021 10:20 AM | Permalink | Comments (0)

Specifying The Negative in TDD

One of the issues that frequently comes up is "how do I write a test about a behavior that the system is specified not to have?" It's an interesting question given the nature of unit tests. Let's examine it.

The Decision Tree of Negatives

When it comes to behaviors that the system should not have, there are different ways that this can be specified and ensured for the future:

Inherently Impossible

Some things are inherently impossible, depending on the technology being used. For example, you cannot write to read-only memory. This is in the nature of the memory and thus does not require a specification (nor a test, since that would be a test that could never fail). In languages like C# and Java, there exists the concept of “private”, and we know that an attempt to read or write a private value from outside a class will not compile and so will never exist in the executable system.

Some things are inherently impossible and cannot be made possible even accidentally. Read-only memory cannot be made writable. However, other things which are impossible by nature can be made possible if desired. A good example of this is an immutable object.

Let's say there exists in our system a SaleAmount class that represents an amount of money for a given retail sale in an online environment. Such a class might exist in order to restrict, validate, or perfect the data it holds. In this case, however, there is a customer requirement that the value held must be immutable, for reasons of security and consistency in their transactions.

This brings up the question "how do I specify in a test that you cannot change the value?"
How can we test-drive such an entity when part of what we wish to specify is that the value, once established in an instance of this class, cannot be changed from the outside? A typical way this question is stated is "how can I show, in a test, that there is no SetValue() method? Any test that references such a method simply will not compile because it does not exist. Therefore, I cannot write the test."

Developers will sometimes suggest two different ideas:

1. Add the SetValue() method, but make it throw an exception if anyone ever calls it. Write a test that calls this method and fails if the exception is not thrown.[1] Sometimes other actions are suggested if the method gets called, but an exception is quite common.
2. Use reflection in the test to examine the object and, if SetValue() is found, fail the test.

The problem with option #1 is that this is not what the requirement says; it is not what was wanted. The specification should be "you cannot change the value," not "if you change the value, thing x will happen." So here, the developer is creating his own specification and ignoring the actual requirements.

The problem with option #2 is twofold: First, reflection is typically a very sluggish thing and in TDD we want our tests to be extremely fast so that we can run them frequently without this slowing down our process. But even if we overcame that somehow, what would we have the test look for? SetValue()? PutValue()? ChangeValue()? AlterValue()? The possibilities are vast and the cost of fully verifying immutability, in this case, would be enormous compared to the value.
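A rough sketch of what option #2 tends to look like shows the guessing problem directly. The SaleAmount class here is a minimal assumed stand-in, and the test class and method names are hypothetical; the reflection call used is the standard Type.GetMember(string, BindingFlags).

```csharp
using System.Reflection;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Minimal assumed stand-in for the class under discussion.
public class SaleAmount
{
    private double myValue;

    public SaleAmount(double aValue)
    {
        myValue = aValue;
    }

    public double GetValue()
    {
        return myValue;
    }
}

[TestClass]
public class SaleAmountReflectionTest
{
    [TestMethod]
    public void TestNoMutatorWithAGuessedName()
    {
        // These names are only guesses. A mutator called Adjust() or
        // Overwrite() would slip past this test unnoticed.
        var guessedMutatorNames = new[]
        {
            "SetValue", "PutValue", "ChangeValue", "AlterValue"
        };

        foreach (var name in guessedMutatorNames)
        {
            var matches = typeof(SaleAmount).GetMember(
                name, BindingFlags.Public | BindingFlags.Instance);

            Assert.AreEqual(0, matches.Length,
                name + "() should not exist on SaleAmount");
        }
    }
}
```

The test passes, but all it really establishes is that nobody used one of the four names we happened to think of.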

The key to solving this is in reminding ourselves once again that TDD is not initially about testing but about creating a specification. Developers have always worked from some form of specification; it's just that the form was usually some kind of document.

So think about the traditional specification, the one you're likely more familiar with. Ask yourself this: Does a specification indicate everything the system does not do? Obviously not, for this would create a document of infinite length. Every system does a finite set of things, and then there is an infinite set of things it does not do.

For example, here is an acceptance test for the positive requirement [2]:

Given: A SaleAmount S with value V
When: You ask for the value of S
Then: V is retrieved

This could be made into an executable specification by the following simple test:

[TestClass]
public class SaleAmountTest
{
    [TestMethod]
    public void TestSaleAmountPersistence()
    {
        var initialValue = 10.50d;
        var testSaleAmount = new SaleAmount(initialValue);

        var retrievedValue = testSaleAmount.GetValue();

        // MSTest expects (expected, actual) in that order
        Assert.AreEqual(initialValue, retrievedValue);
    }
}

Which would drive the entity and its behavior into existence:

public class SaleAmount
{
    private double myValue;

    public SaleAmount(double aValue)
    {
        myValue = aValue;
    }

    public double GetValue()
    {
        return myValue;
    }
}

Ask yourself the following question: If we were using the TDD process to create this SaleAmount object, and if the object had a method allowing the value to be changed (SetValue() or whatever), how would it have gotten there? Where is the test that drove that mechanism into existence? It's not there because there is a specific requirement that it not be there. In TDD we never add code to the system without having a failing test first, and we only add the code that is needed to make the test pass, and nothing more.

Put another way, if a developer on our team added a method that allowed such a change, and did not have a failing test written first, then he would be ignoring the rules of TDD and would be creating a bug as a result. TDD does not work if you don't do it. We don't know of any process that does.

And if we think back to the concept of a specification, there is an implicit rule here, which basically has two parts.

Everything the system does, every behavior, must be specified.
Given this, anything that is not specified is by default specified as not a behavior of the system.

If it is a behavior nonetheless, it is a defect.

Inherently Possible

We don’t have a test that shows the value being changed, so it cannot be. But this does not mean we have a “test for immutability.” Anything that comes from the customer must be retained; we never want to lose that knowledge. So if we think of this requirement in terms of acceptance testing we could express it using the ATDD nomenclature:

Given: A SaleAmount S with value V exists in the system
Then: You cannot change V

There is no “When” in this case because this is a requirement that is always true, it is not based on system state. But this, of course, implies a strongly-typed, compiled language with access-control idioms (like making things "private" and so forth). What if your technology does not provide this? What if it is an interpreted language, or one with no enforcement mechanism to prevent access to internal variables?

The first answer is: You have to ask the customer. You have to tell them that you cannot do precisely what they are asking for, and consider other alternatives in that investigation. It may well be that we are using the wrong technology.

The second answer is that there will be some occasions where the only way you can ensure that an illegal or unwanted behavior is not added to a system accidentally is through static analysis (a traditional code review, or perhaps a code analysis tool). This is still “a test” but one that either cannot or should not be automated in all cases.

On the other hand, sometimes we can make an inherently possible thing impossible by adding behaviors. Such behaviors must, of course, be test driven.

Let's add a requirement to our SaleAmount class. If the context of this object was, say, an online book store, the customer might have a maximum amount of money that he allows to be entered into a transaction.

We used a double-precision number [3] to hold the value in SaleAmount. A double can hold an incredibly large value inherently. In .NET, for example, it can hold a value as high as 1.7976931348623157E+308 [4]. It does not seem credible that any purchase made at our customer's site could total up to something like that! So the requirement is: Any SaleAmount object that is instantiated with a value greater than the customer's maximum credible value should raise a visible alarm, because this probably means the system is being hacked or has a very serious calculation bug.

As developers, we know a good way to raise an alarm is to throw an exception. We can do that, but we also need to capture the customer's view of what the maximum credible value is, so we specify it. Let's say he says "nothing over $1,000.00 makes any sense". But... how much "over"? A dollar? A cent? We have to ask, of course. Let's say the customer says "one cent".

In TDD everything must be specified, all customer rules, behaviors, values, everything. So we start with this:

Given: The system
Then: The Maximum value for a Sale Amount is $1000.00

We also have to capture the tolerance in its own specification:

Given: The system
Then: Tolerance for comparing SaleAmount to its Maximum is one cent

These tests establish bits of domain-specific language that can then be used in any number of other specifications (we won’t have to repeatedly define them whenever we make comparisons).

[TestMethod]
public void SpecifyMaximumDollarValue()
{
    Assert.AreEqual(1000d, SaleAmount.MAXIMUM);
}

[TestMethod]
public void SpecifyDollarValueTolerance()
{
    Assert.AreEqual(.01, SaleAmount.TOLERANCE);
}

In order to get these to pass we drive the Maximum and the Tolerance into the system.
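As a rough sketch, driving those two values into the system could be as small as a pair of constants. Declaring them as public const double members is an assumption of this sketch (static readonly fields would satisfy the same specifications); only the names MAXIMUM and TOLERANCE and their values come from the tests.

```csharp
public class SaleAmount
{
    // Customer-supplied values, each pinned by its own specification.
    public const double MAXIMUM = 1000d;
    public const double TOLERANCE = .01;

    private double myValue;

    public SaleAmount(double aValue)
    {
        myValue = aValue;
    }

    public double GetValue()
    {
        return myValue;
    }
}
```

Note that nothing here uses the constants yet. No test has demanded that, so no such code exists.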
Now we can write this test, which will also fail initially of course:

Given: Value S greater than or equal to Maximum + Tolerance
When: An attempt is made to create a SaleAmount with value S
Then: A warning is issued

[TestMethod]
public void TestSaleAmountThrowsSaleAmountValueTooLargeException()
{
    var saleAmountMaximum = SaleAmount.MAXIMUM;
    var tolerance = SaleAmount.TOLERANCE;
    var excessiveAmount = saleAmountMaximum + tolerance;

    try
    {
        CreateSaleAmount(excessiveAmount);
        Assert.Fail("SaleAmount created with excessive " +
            "value should have thrown an exception");
    }
    catch (SaleAmountValueTooLargeException)
    { }
}

But now the question is, what code do we write to make this test pass? The temptation would be to add something like this to the constructor of SaleAmount:
if (aValue >= MAXIMUM + TOLERANCE)
    throw new SaleAmountValueTooLargeException();

But this is a bit of a mistake. Remember, it's not just "add no code without a failing test", it is "add only the needed code to make the failing test pass."

Your spec is supposed to be your pal. He's supposed to be there at your elbow saying "don't worry. I won't let you make a mistake. I won't let you write the wrong code, I promise." He's not just your pal, he's your best pal.

Here, however, the spec is just a mediocre friend because he will let you write the wrong code and say nothing about it. He’ll let you get in your car when you are in no condition to drive. He'll let you do this, and let it pass:

throw new SaleAmountValueTooLargeException();

There is no conditional. We’re just throwing the exception all the time. That's wrong, obviously. This behavior has a boundary (as we discussed in our blog about test categories) and every boundary has two sides. We need a little more in the specification. We need something like this:

try
{
    new SaleAmount(SaleAmount.MAXIMUM);
}
catch (SaleAmountValueTooLargeException)
{
    Assert.Fail("SaleAmount created with value at the maximum " +
        "should not have thrown an exception");
}

Now the "aValue >= MAXIMUM + TOLERANCE" part must be added to the production code or your best buddy will let you know you're blowing it. Friends don’t let friends implement incorrectly.
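Putting the pieces together, a minimal sketch of where this leaves us might look like the following. The shape of SaleAmountValueTooLargeException, the const declarations, and the test class and method names are assumptions made for illustration. For brevity the tests call the constructor directly rather than going through a CreateSaleAmount() helper.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Assumed shape; the exception is named in the tests but never shown.
public class SaleAmountValueTooLargeException : Exception { }

public class SaleAmount
{
    public const double MAXIMUM = 1000d;
    public const double TOLERANCE = .01;

    private double myValue;

    public SaleAmount(double aValue)
    {
        // Both boundary tests below are needed to force this conditional.
        if (aValue >= MAXIMUM + TOLERANCE)
            throw new SaleAmountValueTooLargeException();

        myValue = aValue;
    }

    public double GetValue()
    {
        return myValue;
    }
}

[TestClass]
public class SaleAmountBoundaryTest
{
    [TestMethod]
    public void TestExcessiveValueThrows()
    {
        var excessiveAmount = SaleAmount.MAXIMUM + SaleAmount.TOLERANCE;

        try
        {
            new SaleAmount(excessiveAmount);
            Assert.Fail("SaleAmount created with excessive " +
                "value should have thrown an exception");
        }
        catch (SaleAmountValueTooLargeException)
        { }
    }

    [TestMethod]
    public void TestValueAtMaximumDoesNotThrow()
    {
        try
        {
            new SaleAmount(SaleAmount.MAXIMUM);
        }
        catch (SaleAmountValueTooLargeException)
        {
            Assert.Fail("SaleAmount created with value at the maximum " +
                "should not have thrown an exception");
        }
    }
}
```

Remove the conditional and the second test fails; remove the throw and the first test fails. Each side of the boundary has a test holding it in place.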
...
[1] There are a variety of ways to do this. We’ll show one way here a bit further on.
[2] [TODO] Link to ATDD blog
[3] If you’re thinking “you used the wrong type, a long would be better” it’s a fair point. We simply wanted to make the conceptual point that primitives do not impose domain constraints inherently, and the use of the double just makes the idea really clear.
[4] For those who dislike exponential notation, this is:
$179,769,313,486,231,520,616,720,392,992,464,536,472,240,560,432,240,240,944,616,576, 160,448,992,408,768,712,032,320,616,672,472,536,248,456,776,672,352,088,672,544,960,568, 304,616,280,032,664,704,344,880,448,832,696,664,856,832,848,208,048,648,264,984,808,584, 712,312,912,080,856,536,512,272,
952,424,048,992,064,568,952,496,632,264,936,656,128,816,232,688,512,496,536,552,712, 648,144,200,160,624,560,424,848,368
...and no cents. :)

Posted on: February 10, 2021 06:45 AM | Permalink | Comments (0)

TDD and the "6 Do's and 8 Skills" of Software Development: Pt. 1

This post is not about TDD per se, but rather a context in which TDD can demonstrate its place in and contribution to the value stream. This context has to do with the 6 things that we must accomplish (do) and the 8 skills that the team must have in order to accomplish them. We'll describe each "do", noting where and if TDD has an impact, and then do the same thing with the skills.

6 Dos:

Do the right thing
Do the thing right
Do it efficiently
Do it safely
Do it predictably
Do it sustainably

8 Skills:

Programming
Designing
Analysis
Refactoring
Testing
DevOps
Estimation
Process Improvement

DO THE RIGHT THING

Everything the team does must be traceable back to business value. This means “the right thing” is the thing that has been chosen by the business to be the next most important thing, in terms of business value, that we should work on. TDD has no contribution to make to this. Our assumption is that this decision has been made, and made correctly before we begin our work. How the business makes this decision is out of scope for us, and if they make the wrong one we will certainly build the wrong thing. This is an issue of product portfolio management and business prioritization, and we do not mean to minimize its importance; it is crucial. But it’s not a TDD activity. It is the responsibility of project/product management.

An analogy:

As a restaurant owner, the boss has determined that the next thing that should be added to the menu is strawberry cheesecake. He made this decision based on customer surveys, or the success of his competitors at selling this particular dessert, or some other form of market research that tells him this added item will sell well and increase customer satisfaction ratings. It will have great business value and, in his determination, is the most valuable thing to have the culinary staff work on.

DO THE THING RIGHT

One major source of mistakes is misunderstanding. Communication is an extremely tricky thing, and there can be extremely subtle differences in meaning with even the simplest of words. “Clip” means to attach (clip one thing to another) and to remove (clipping coupons).

A joke we like: My wife sent me to the store and said “please get a gallon of milk -- if they have eggs get six.” So I came back with 6 gallons of milk. When she asked why I did that, I replied “they had eggs.”

The best way we know to ferret out the hidden assumptions, different uses of terms, different understanding, missing information, and the all-important “why” of a requirement (which is so often simply missing) is by engaging in a richly communicative collaboration involving developers, testers, and businesspeople. The process of writing acceptance tests provides an excellent framework for this collaboration, and is the responsibility of everyone in the organization.

The analogy, continued:

You work as a chef in the restaurant, and the owner has told you to add strawberry cheesecake to the menu. You prepare a graham-cracker crust, and a standard cheesecake base to which you add strawberry syrup as a flavoring. You finish the dish and invite your boss to try it. He says “I did not ask for strawberry flavored cheesecake, I asked for a strawberry cheesecake. Cheesecake with strawberry.”

So you try again, this time making a plain cheesecake base and adding chopped up strawberries, stirring them in. The boss stops by to sample the product and says “no, no, not strawberries in the cake, I meant on the cake.”

So you try another version where the plain cheesecake is topped by sliced strawberries. Again the boss is unhappy with the result. “Not strawberries, strawberry. As in a strawberry topping.”

What he wanted was a cheesecake topped with strawberry preserves, which he has always thought of as “strawberry cheesecake.” All this waste and delay could have been avoided if the requirements had been communicated with more detail and accuracy.

DO IT EFFICIENTLY

For most organizations the primary costs of developing software are the time spent by developers and testers doing their work, and the effect of any delays caused by errors in the development process. Anything that wastes time or delays value must be rooted out and corrected.

TDD has a major role to play here.

When tests are written as the specification that guides development, they keep the team focused on what is actually needed.
The tests themselves require precision in our understanding of a requirement and thus lead to code that satisfies the exact need and nothing more. Traditionally developers have worked in an environment of considerable uncertainty, and thus have spent time writing code that ends up being unnecessary, which wastes their time.
Without TDD, defects in the code will largely be dealt with after development is over, requiring much re-investigation of the system after the fact. TDD drives the issue to one of bug prevention (much more time-efficient) rather than bug detection.

DO IT SAFELY

Software must be able to change if it is to remain valuable, because its value comes from its ability to meet a need of an organization or individual. Since these needs change, software must change.

Changing software means doing new work, and this is usually done in the context of existing work that was already completed. One of the concerns that arises when this is done is: will the new work damage the existing system? When adding a new feature, for example, we need to guard against introducing bugs in the code that existed before we started our work.

TDD has a significant role here, because all of our work proceeds from tests and thus we have test coverage protecting our code from accidental changes. Furthermore, this test coverage is known to be meaningful because of how it was achieved.

Test coverage that is added after a system is created is only guaranteed to execute the production code; it guarantees nothing about the behavior that results from that execution. In TDD the coverage is created by writing tests that drive the creation of the behavior, so if they continue to pass we can be assured that the behavior remains the same.
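The difference can be seen in a small sketch. MeanCalculator is a hypothetical class with a deliberate bug, and the test names are made up for illustration. Both tests execute every line of the production method, so a coverage tool would score them the same.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical production code with a deliberate bug: divides by 3, not 2.
public class MeanCalculator
{
    public double ArithmeticMean(double first, double second)
    {
        return (first + second) / 3;
    }
}

[TestClass]
public class CoverageExamples
{
    [TestMethod]
    public void ExecutesTheCodeButSpecifiesNothing()
    {
        // Counts toward coverage, yet passes in spite of the bug
        // because nothing about the result is checked.
        new MeanCalculator().ArithmeticMean(7.0, 4.0);
    }

    [TestMethod]
    public void SpecifiesTheBehavior()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;

        // Trigger
        var actualMean = new MeanCalculator()
            .ArithmeticMean(anyFirstValue, anySecondValue);

        // Verify: fails against the buggy code above, as it should.
        var expectedMean = (anyFirstValue + anySecondValue) / 2;
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}
```

Only the second test would have stopped the bug from being written in the first place.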

DO IT PREDICTABLY

A big part of success in business is planning effectively, and this includes the notion of predictability. Every development initiative is either about creating something new, or changing something that already exists (and, in fact, you could say that creating something new is just a form of change: from nothing to something).

One question we seek to answer when planning and prioritizing work is: how long will it take and how many resources will be required? Although we know we can never perfectly predict these things, we want to reduce the degree of error in our predictions.

TDD has a role to play here:

TDD increases design and code quality. There are many reasons for this, but the shorthand explanation is that bad designs and poor code are very hard to test. If we start from the testing perspective, we tend to create more quality. Higher quality creates clarity, and the more clarity you have the better your predictions will be.
TDD points out gaps in analysis earlier than traditional methodologies. These gaps, when discovered late, create unexpected, unplanned-for work, and this derails our predictions.
TDD provides meaningful code coverage. This reduces the creation of unexpected errors, and fewer unexpected anything increases predictability.
TDD helps us to retain knowledge, and the more you understand a thing the more accurate your predictions will be about changing it.

DO IT SUSTAINABLY

The team must work in a way that can be sustained over the long haul. Part of this is avoiding overwork and rework, and making sure the pace of work is humane. Part of this is allowing time for the team to get appropriate training, and thus to "sharpen the saw" between major development efforts. Issues like these are the responsibility of management whether the team is practicing TDD or not.

However, this work is called "Sustainable Test-Driven Development" for a reason. TDD itself can create sustainability problems if maintaining the test suite presents an increasingly significant burden for the team. Much of our focus overall has been and will continue to be avoiding this problem.

In other words, TDD will not create sustainability unless you learn how to do it right.
Next up, How TDD impacts the 8 skills of software development

Posted on: February 10, 2021 05:17 AM | Permalink | Comments (0)

TDD Mark 3, part 2

I realized recently that this had been written, but never published. Part 1 was, but never this second part. Not sure how that happened. Maybe I needed a test. :)

Anyway, here it is. Part three is still pending.

-Scott-

Expanding the thesis

Our central thesis thus far has centered on the notion that TDD is not really about testing, it is really about specification. But we must also make a distinction between what TDD is and what it does. Test-Driven Development is definitely a phrase that describes an action, if one focuses on the word “driven”.

What does TDD drive? It drives development. What is development?

Traditionally we have considered the creation of software to consist of a sequence of phases, usually something like:

Analysis
Design
Construction (coding)
Inspection (testing)

In agile methodologies we abandon the notion that these aspects of software development should be conducted in discrete phases, replacing this with incremental action. In each increment (2 weeks if you adhere to Extreme Programming, one month if you choose to do Scrum, etc…) we conduct all aspects of development, from analysis to testing.

TDD, being a distinctly agile methodology, must therefore concern itself with all aspects of development.

The analysis aspect of TDD is the reason we can consider the test suite to form a technical specification, and we can certainly say TDD drives us toward this by the simple fact that you cannot write a test about something you do not understand. Automated tests are very unforgiving, and require a level of detailed understanding in order to create them. Thus, they require rigorous analysis.

We like to say that the best specification “forces” you to write the correct code. In investigating this fully (which we will do in a future blog) we’ll see that the tests we write, if done in the proper and complete way, do exactly this. You cannot make the tests pass unless you write the right code. Thus TDD leads to construction.

Also, while we do not write our tests for testing purposes, but rather as the spec that leads to the implementation code, we do not discard the tests once the code is complete. They have, in essence, a second life where they provide a second value. They become tests once we are done using them to create the system. So TDD does apply to testing as well. There may be other tests we write, but the TDD suite does contribute to the testing needs of the team.

That leaves design. Can TDD also be said to apply to design? Could TDD also be “Test-Driven Design”, in other words? We say yes, decidedly so. Much of what will follow in future blogs will demonstrate this.

But this integration of the test-writing activity into all aspects of software development means that the test suite itself becomes essentially part of the source code. We must consider the tests to be first class citizens of the project, and thus we must also address ourselves to the design of the tests themselves. They must be well-designed in order to be maintainable, and this is a critical issue when it comes to conducting TDD in a sustainable way, which is a clear focus of this blog series.

“Good” design

How does one define a good design? This is not a trivial question. Some would say that looking to the Design Patterns can provide excellent examples of good design. Some would say that attending to a rubric like SOLID (Single responsibility, Open-closed, Liskov substitution, Interface segregation and Dependency inversion) can provide the guidance we need to produce high-quality designs. We agree with these ideas, but also with the notion of the separation of concerns.

Martin Fowler, in his book “UML Distilled”, suggested that one way to approach this is to fundamentally ensure that the abstract aspects of the system (what he called the “conceptual perspective”) are not intermixed with the specific way those concepts are executed (what he called the “implementation perspective”).

Let’s examine a counterexample, where we do not follow this advice and mix these two perspectives.

Let’s say we have an object that allows us to communicate via a USB port. We’ll call it USBConnection, and we’ll give it a send() and receive() method. Let’s furthermore say that, sometime after this object has been developed, we have a new requirement to create a similar object, but one that also ensures that any packet sent over the port is verified to be well-formed; otherwise we throw a BadPacketException. In the past, when we considered OO to be primarily focused on the notion of object reuse, we might have suggested something like this:

Figure 1: “Reusing” the USBConnection by deriving a new type from it

This can produce problems.

First, any change to USBConnection can also propagate down to VerifiedUSBConnection, whether that is appropriate/desired or not. The opposite, however, is not true. We can make changes to the verified version with complete confidence that these changes will have no effect on the original class.

Second, one can create an instance of VerifiedUSBConnection and, perhaps accidentally, cast it to the base type. It will appear, in the code, to be the simple USBConnection, which never throws an exception, but this will not be true. The reverse, however, is impossible. An instance of USBConnection cannot be implicitly converted to type VerifiedUSBConnection (in languages such as C# and Java the compiler rejects the assignment), and an explicit downcast fails at run time.

If we do this very much, we end up with a vague, muddy, confusing architecture, where changes and errors propagate in hard-to-predict ways, where we simply have to remember that certain issues are of concern where others are not, because the design does not explicitly control coupling.
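The second problem can be seen in a small sketch of the Figure 1 design. Everything specific here is an assumption made for illustration: the methods are capitalized per C# convention, a packet is modeled as a byte array, and "well-formed" is taken to mean "not empty".

```csharp
using System;

// Hypothetical exception type for a malformed packet.
public class BadPacketException : Exception
{
    public BadPacketException(string message) : base(message) { }
}

public class USBConnection
{
    public virtual void Send(byte[] packet)
    {
        // Stand-in for writing the packet to the port; never throws.
    }

    public virtual byte[] Receive()
    {
        return new byte[0];
    }
}

// "Reuse" by inheritance: the derived type changes Send()'s contract.
public class VerifiedUSBConnection : USBConnection
{
    public override void Send(byte[] packet)
    {
        // Assumed rule: a null or empty packet is malformed.
        if (packet == null || packet.Length == 0)
        {
            throw new BadPacketException("Packet is not well-formed");
        }
        base.Send(packet);
    }
}

public static class Figure1Demo
{
    // Returns true when a variable typed as the base class
    // nevertheless throws on Send().
    public static bool UpcastStillThrows()
    {
        USBConnection connection = new VerifiedUSBConnection();
        try
        {
            connection.Send(new byte[0]);
            return false;
        }
        catch (BadPacketException)
        {
            return true;
        }
    }
}
```

The variable in UpcastStillThrows() is declared as USBConnection, yet the call throws, which is exactly the surprise described above.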

But Fowler’s guidance would also lead us away from using inheritance like this, because the class USBConnection is essentially forming an interface which is implemented by VerifiedUSBConnection, while also being an implementation itself. It is both conceptual and an implementation; we have not separated these perspectives in this object. If we want to completely separate the conceptual part of the system from its implementation we would be forced to design it differently:

Figure 2: Two ways of separating concept from implementation

In the first design, USBConnection is a conceptual type (interface, abstract class, pure virtual class, something along those lines) with two different implementing versions. The conceptual type is only conceptual and the implementing types are only implementations; there is no mixing.

In the second design (which, if you are familiar with patterns, is a combination of the Strategy Pattern with the Null Object Pattern), the concept of PacketVerifier is represented by a type that is strictly conceptual, whereas the two kinds of verifiers (one which performs the verification and one which does nothing at all) are only implementations; there is no mixing.

Either way (and we will examine which of these we prefer, and why, in a later blog) we have created this separation of concerns. In the first design, a change to NonVerifiedUSBConnection will never propagate to VerifiedUSBConnection, and the same is true in the reverse. Instances of neither of the implementing types can be accidentally cast to the other. In the second design, these qualities are the same for the PacketVerifier implementations.
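As a rough sketch, the second design in Figure 2 might look like the following. It is shown only because it has more moving parts, not to suggest a preference. The interface and class names other than USBConnection, the byte-array packet, the "not empty" rule, and the PacketsSent counter are all assumptions made for illustration.

```csharp
using System;

// Hypothetical exception type for a malformed packet.
public class BadPacketException : Exception
{
    public BadPacketException(string message) : base(message) { }
}

// Conceptual type only: it carries no implementation.
public interface IPacketVerifier
{
    void Verify(byte[] packet);
}

// Implementation only: performs the (assumed) well-formedness check.
public class RealPacketVerifier : IPacketVerifier
{
    public void Verify(byte[] packet)
    {
        if (packet == null || packet.Length == 0)
        {
            throw new BadPacketException("Packet is not well-formed");
        }
    }
}

// Implementation only: the Null Object, which does nothing at all.
public class NullPacketVerifier : IPacketVerifier
{
    public void Verify(byte[] packet) { }
}

public class USBConnection
{
    private readonly IPacketVerifier verifier;

    // Stand-in for real port traffic so the sketch is observable.
    public int PacketsSent { get; private set; }

    public USBConnection(IPacketVerifier verifier)
    {
        this.verifier = verifier;
    }

    public void Send(byte[] packet)
    {
        verifier.Verify(packet); // the strategy decides whether to reject
        PacketsSent++;
    }
}
```

A caller picks the behavior at construction time, for example new USBConnection(new RealPacketVerifier()), and neither verifier can be mistaken for the other through a cast.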

Design quality is all about maintainability, about the ability to add, modify, change, scale, and extend without excessive risk and waste. If our tests are first-class citizens in our system, they must be well-designed too.
Let’s look back at a piece of code from TDD Mark 3 Introduced:

[TestMethod]
public void TestLoadingLessThanMinimalFundsThrowsException()
{
    // Loading exactly the minimum must succeed.
    LoadInitialFunds(MinimalFunds());

    // One dollar below the minimum must be rejected.
    uint insufficientFunds = MinimalFunds() - 1;
    try
    {
        LoadInitialFunds(insufficientFunds);
        Assert.Fail("Card should have thrown a " +
            typeof(Account.InsufficientFundsException).Name);
    }
    catch (Account.InsufficientFundsException exception)
    {
        Assert.AreEqual(insufficientFunds, exception.Funds());
    }
}

private uint MinimalFunds()
{
    return Account.MINIMAL_FUNDS;
}

private void LoadInitialFunds(uint funds)
{
    Account account = new Account(funds);
}

The public method (marked [TestMethod]) expresses the specification conceptually: the concept of loading funds, the notion of “minimal funds”, and the idea that a whole dollar is the epsilon of the behavioral boundary together comprise the conceptual perspective. The fact that “minimum funds” is a constant on the Account class, and the fact that the fund-loading behavior is done by the constructor of Account, are implementation details that could be changed without the concepts being affected.

For example, we may later decide to store the minimal funds in a database, to make it configurable. We may decide to validate the minimum level in a service object that Account uses, or we could build Account in a factory and allow the factory to validate that the funds are sufficient. These changes would impact, in each case, a single private method on the test, and the conceptual public method would be unchanged.
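Here is a self-contained sketch of the first of those changes, assuming the minimum moves from a constant into some configuration store. The Settings class, the simplified Account, and the top-level exception type are hypothetical stand-ins, and the conceptual method returns a bool instead of using a test framework so the sketch can run on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical configuration store standing in for a database.
public static class Settings
{
    private static readonly Dictionary<string, uint> values =
        new Dictionary<string, uint> { { "MinimalFunds", 2000u } };

    public static uint Get(string key) { return values[key]; }
}

public class InsufficientFundsException : Exception
{
    private readonly uint funds;
    public InsufficientFundsException(uint funds) { this.funds = funds; }
    public uint Funds() { return funds; }
}

// Simplified stand-in for the Account under specification.
public class Account
{
    public Account(uint funds)
    {
        if (funds < Settings.Get("MinimalFunds"))
        {
            throw new InsufficientFundsException(funds);
        }
    }
}

public class FundLoaderSpec
{
    // Conceptual perspective: untouched by the redesign.
    public bool LoadingLessThanMinimalFundsIsRejected()
    {
        LoadInitialFunds(MinimalFunds());
        uint insufficientFunds = MinimalFunds() - 1;
        try
        {
            LoadInitialFunds(insufficientFunds);
            return false;
        }
        catch (InsufficientFundsException exception)
        {
            return exception.Funds() == insufficientFunds;
        }
    }

    // Implementation perspective: the only method edited when the
    // limit stopped being a constant on Account.
    private uint MinimalFunds()
    {
        return Settings.Get("MinimalFunds");
    }

    private void LoadInitialFunds(uint funds)
    {
        new Account(funds);
    }
}
```

Only the body of MinimalFunds() differs from the constant-based version; the conceptual method reads exactly as before.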

This is the next step in sustainability, and we will be investigating many aspects of it. How will it change the way we write tests? How will it change dependency management? Should these private methods actually be extracted into a separate class? If so, when and why would we decide to do that?

We’d love to hear from you….

https://www.projectmanagement.com/blog-post/68206/TDD-Mark-3-Introduced

Posted on: February 10, 2021 04:49 AM | Permalink | Comments (0)

TDD Mark 3 Introduced

First of all, sorry for the long absence. Our training schedule has been wall-to-wall, and when one of us had a brief gap the other has always been busy.

It has given us time to think, however. Long airplane rides and such. :)

We've been playing around with an idea we're calling (for the moment) TDD Mark 3 (the notion that TDD is not about testing but rather about specification being TDD Mark 2). To give you an idea of what we're thinking, let's look at an example of TDD Mark 2 as we've been writing tests up to this point, and then refactor it to the TDD Mark 3 style.

Mark 2
So, what is the requirement? Our client is a cruise ship operator. Some of the things on the cruise are free and the rest are paid extras. On the ship, the guest can pay for extras with a standard credit card, or with the ship's debit card. Paying with the ship's debit card gives the guest a 5% discount on the purchase cost. The catch is that a guest who wants to use the ship's debit card has to load the card for the first time with at least $2,000. If the guest tries to load a card with less than that amount, the transaction should fail.

[TestClass]
public class FundLoaderTDD
{
    [TestMethod]
    public void TestMinimalFunds()
    {
        // 2000u so that both arguments are of type uint.
        Assert.AreEqual(2000u, Card.MINIMAL_FUNDS);
    }

    [TestMethod]
    public void TestLoadingLessThanMinimalFundsThrowsException()
    {
        uint minimalFunds = Card.MINIMAL_FUNDS;
        Card card = new Card(/*card holder's details*/);
        card.LoadFunds(minimalFunds);

        uint insufficientFunds = minimalFunds - 1;
        card = new Card(/**/);
        try
        {
            card.LoadFunds(insufficientFunds);
            Assert.Fail("Card should have thrown a " +
                typeof(Card.InsufficientFundsException).Name);
        }
        catch (Card.InsufficientFundsException exception)
        {
            Assert.AreEqual(insufficientFunds, exception.Funds());
        }
    }
}

The meaning here is very clear: the LoadFunds() method of Card will throw an InsufficientFundsException if you try to load an amount less than the minimal allowed value. We also show that if the minimal amount is loaded, an exception is not thrown. This constitutes a very typical specification of a behavioral boundary anchored at the value MINIMAL_FUNDS. Note also that we have specified what that value is in the first test.
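For readers who want to see the boundary in action, here is one minimal sketch of a Card that would satisfy this specification. It is not the authors' implementation: the balance field, the first-load flag, and the parameterless constructor are assumptions, and the rule that only the first load is checked is taken from the requirement as stated above.

```csharp
using System;

public class Card
{
    public const uint MINIMAL_FUNDS = 2000;

    public class InsufficientFundsException : Exception
    {
        private readonly uint funds;

        public InsufficientFundsException(uint funds)
        {
            this.funds = funds;
        }

        // Reports the rejected amount, as the test expects.
        public uint Funds() { return funds; }
    }

    private uint balance;        // assumed internal state
    private bool hasBeenLoaded;  // assumed: tracks whether a first load happened

    public uint Balance() { return balance; }

    public void LoadFunds(uint funds)
    {
        // Only the first load is subject to the minimum.
        if (!hasBeenLoaded && funds < MINIMAL_FUNDS)
        {
            throw new InsufficientFundsException(funds);
        }
        balance += funds;
        hasBeenLoaded = true;
    }
}
```

Loading exactly MINIMAL_FUNDS succeeds, while MINIMAL_FUNDS - 1 on a fresh card throws and reports the rejected amount, which is the one-dollar boundary the test pins down.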

Naturally, there are many other tests that specify the various aspects of the Card's behavior, and together they turn the user's requirement into an executable specification. That's what Mark 2 is all about.

Refactor to Mark 3
We all know the importance of good design. Good design enables proper code maintainability (more on that in a future blog), which has to do with dealing with change.

We should also acknowledge that the tests that we write are not "second class citizens". They require as much love and attention as the production code they specify. This means that after the test has been written we have an opportunity to refactor its design. This is done with respect to specific changes that may be required in the code. These can come from two sources - changing requirements and changing the domain model to reflect changing responsibilities.

A changing requirement could comprise raising the minimal limit or creating a graded discount structure. Changing the domain comprises adding or removing classes or methods on classes.

The customer's new requirement is this: there are other ways to charge the user for on-board services. It turns out that guests often do not carry the card with them (to the pool, for example) but would still like to purchase cute drinks with little pink umbrellas. To enable that, a biometric system was installed where the guest can charge the drink to their card by swiping their finger over a fingerprint scanner incorporated into the card reader held by the server.

This means that the model we created where the Card was the central object needs to be refined, and an Account class introduced. The Card is just one way of interacting with the account.

What effect will this have on our test? All references to Card must be replaced with references to Account. Considering our test code there are two redundancies that we can identify: Card.MINIMAL_FUNDS and card.LoadFunds().

[TestClass]
public class FundLoaderTDD
{
    [TestMethod]
    public void TestMinimalFunds()
    {
        Assert.AreEqual(2000u, Card.MINIMAL_FUNDS);
    }

    [TestMethod]
    public void TestLoadingLessThanMinimalFundsThrowsException()
    {
        uint minimalFunds = Card.MINIMAL_FUNDS;
        Card card = new Card(/*Any card holder's details*/);
        card.LoadFunds(minimalFunds);

        uint insufficientFunds = minimalFunds - 1;
        card = new Card(/*Any card holder's details*/);
        try
        {
            card.LoadFunds(insufficientFunds);
            Assert.Fail("Card should have thrown a " +
                typeof(Card.InsufficientFundsException).Name);
        }
        catch (Card.InsufficientFundsException exception)
        {
            Assert.AreEqual(insufficientFunds, exception.Funds());
        }
    }
}

We don't like redundancies in our tests any more than we like them in our production code. We extract the redundancies into methods:

[TestClass]
public class FundLoaderTDD
{
    [TestMethod]
    public void TestMinimalFunds()
    {
        Assert.AreEqual(2000u, MinimalFunds());
    }

    [TestMethod]
    public void TestLoadingLessThanMinimalFundsThrowsException()
    {
        uint minimalFunds = MinimalFunds();
        card = new Card(/*Any card holder's details*/);
        LoadFunds(minimalFunds);

        uint insufficientFunds = minimalFunds - 1;
        card = new Card(/*Any card holder's details*/);
        try
        {
            LoadFunds(insufficientFunds);
            Assert.Fail("Card should have thrown a " +
                typeof(Card.InsufficientFundsException).Name);
        }
        catch (Card.InsufficientFundsException exception)
        {
            Assert.AreEqual(insufficientFunds, exception.Funds());
        }
    }

    Card card;

    private uint MinimalFunds()
    {
        return Card.MINIMAL_FUNDS;
    }

    private void LoadFunds(uint funds)
    {
        card.LoadFunds(funds);
    }
}

We can inline the minimalFunds local variable, calling the MinimalFunds private function directly, and get:

[TestClass]
public class FundLoaderTDD
{
    [TestMethod]
    public void TestMinimalFunds()
    {
        Assert.AreEqual(2000u, MinimalFunds());
    }

    [TestMethod]
    public void TestLoadingLessThanMinimalFundsThrowsException()
    {
        card = new Card(/*Any card holder's details*/);
        LoadFunds(MinimalFunds());

        uint insufficientFunds = MinimalFunds() - 1;
        card = new Card(/*Any card holder's details*/);
        try
        {
            LoadFunds(insufficientFunds);
            Assert.Fail("Card should have thrown a " +
                typeof(Card.InsufficientFundsException).Name);
        }
        catch (Card.InsufficientFundsException exception)
        {
            Assert.AreEqual(insufficientFunds, exception.Funds());
        }
    }

    Card card;

    private uint MinimalFunds()
    {
        return Card.MINIMAL_FUNDS;
    }

    private void LoadFunds(uint funds)
    {
        card.LoadFunds(funds);
    }
}

Wait! There's another redundancy above:

try
{
    //...
    Assert.Fail("Card should have thrown a " +
        typeof(Card.InsufficientFundsException).Name);
}
catch (Card.InsufficientFundsException exception)
{
    //...
}

We are specifying the type of the exception twice. We'll deal with that redundancy in a bit, so we'll put it on the to-do list. In the meanwhile, back to the refactored tests. I do not like the name we gave the LoadFunds method; it's misleading. The customer does not want the exception to be thrown every time the card is loaded with a small amount -- only on the initial load. So perhaps this is better:

[TestMethod]
public void TestLoadingLessThanMinimalFundsThrowsException()
{
    LoadInitialFunds(MinimalFunds());

    uint insufficientFunds = MinimalFunds() - 1;
    try
    {
        LoadInitialFunds(insufficientFunds);
        Assert.Fail("Card should have thrown a " +
            typeof(Card.InsufficientFundsException).Name);
    }
    catch (Card.InsufficientFundsException exception)
    {
        Assert.AreEqual(insufficientFunds, exception.Funds());
    }
}

Card card;

private void LoadInitialFunds(uint funds)
{
    card = new Card(/*Any card holder's details*/);
    card.LoadFunds(funds);
}

Note the fact that card's initialization was moved into the LoadInitialFunds method.

Besides shifting the funds handling responsibility to the Account object, it was also deemed useful to shift the initial amount loading from a specific method to the constructor. So for the $64,000 question - how many places do we need to make this change in? One:

// Removed: Card card;
private void LoadInitialFunds(uint funds)
{
    // Removed: card = new Card(/*card holder's details*/);
    Account account = new Account(funds);
}

And where should the limit be defined? Account, and it will return the value in a method.

private uint MinimalFunds()
{
    return Account.MinimalFunds();
}

Finally, we can make the changes in the test, but only because we left the two references to the exception in the test.

[TestMethod]
public void TestLoadingLessThanMinimalFundsThrowsException()
{
    LoadInitialFunds(MinimalFunds());

    uint insufficientFunds = MinimalFunds() - 1;
    try
    {
        LoadInitialFunds(insufficientFunds);
        Assert.Fail("Card should have thrown a " +
            typeof(Account.InsufficientFundsException).Name);
    }
    catch (Account.InsufficientFundsException exception)
    {
        Assert.AreEqual(insufficientFunds, exception.Funds());
    }
}

The public methods are the specification, the private methods encapsulate implementation. Well, almost, with the exception of the exception handling. But why is an exception being thrown at all?

Well, if you remember, the customer wanted the user to be notified if the amount is too small. Exceptions are just one way of doing it. So we can safely say that the specific exception is an implementation detail, and based on the role we want the public method to play - specification, we really need to get that implementation detail out of here.

So, here's a question to our readers. How would you do it? Note that although we used C# right now, the refactoring principles are relevant to any language.

So without dealing with the exception, yet, this is what the test code looks like.

[TestClass]
public class FundLoaderTDD
{
    [TestMethod]
    public void TestMinimalFunds()
    {
        Assert.AreEqual(2000u, MinimalFunds());
    }

    [TestMethod]
    public void TestLoadingLessThanMinimalFundsThrowsException()
    {
        LoadInitialFunds(MinimalFunds());

        uint insufficientFunds = MinimalFunds() - 1;
        try
        {
            LoadInitialFunds(insufficientFunds);
            Assert.Fail("Card should have thrown a " +
                typeof(Account.InsufficientFundsException).Name);
        }
        catch (Account.InsufficientFundsException exception)
        {
            Assert.AreEqual(insufficientFunds, exception.Funds());
        }
    }

    private uint MinimalFunds()
    {
        return Account.MINIMAL_FUNDS;
    }

    private void LoadInitialFunds(uint funds)
    {
        Account account = new Account(funds);
    }
}

The public methods now essentially constitute an acceptance test. In fact, those familiar with acceptance testing frameworks like FIT would express what these unit test methods communicate in another form, like a table for example, and the private methods would be the fixtures written to connect the tests to the system's implementation.

This does make the test class longer and more verbose, but it also makes it easier to read just the specification part, if that's all you are interested in. Also, when design changes are made later (let's say, for example, that we decide to build the Account in a factory, or store the minimal initial value in a configuration file), only one private method will be affected by a given change, and none of the public methods at all (which makes sense, since the design has been altered but not the acceptance criteria).

Mark 3
The separation in perspectives that was created in the above code is a result of refactoring. But it actually makes sense regardless. The public test method is written by intention, and describes the conceptual behavior of the system.

We also have a separation between the specification and the implementation. We call these different perspectives, and they allow us to focus on getting the requirement right, and then getting the design right. We can change the design without affecting the requirement.

This is a major piece of making TDD sustainable, as it allows us to change the system design without affecting the public tests which specify the behavior.

So, the $1,000,000 question is: "Why not write the tests that way to begin with?"

To Be Continued....

TDD Mark 3 Part 2

Posted on: February 10, 2021 04:43 AM | Permalink | Comments (0)
