Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.

Do I Really Have to Test Everything? (part 3)

Categories: TDD

This is part 3 of a three-part posting.  If you have not read part 1, I strongly suggest you start there.

Answer #3:
TDD is not about testing in the first place.  It never was.

The idea of driving development from tests can be a bit confusing to some.  "How can you test something you haven't created yet?  The test will fail, obviously.  What's the point?"

That's a reasonable question too.  The fact is that TDD is not really a "testing" activity at all.  It produces enormously beneficial tests, but they are a side effect.  TDD is all about creating a specification - we simply use tests to do it.  It's not at all surprising that we would do that first, is it?  Realizing that TDD is actually a specifying activity also changes almost everything about the way we do it, and for the better. I could talk about that at length. But to the point I'm addressing here...

When you think this way, then the original question on the table becomes:

Do I Really Have to Specify Everything?

How about getters and setters?  They are so very trivial.  What about an object with no constructor, or only a parameter-less constructor, do I have to specify what "new" does?  It can seem like a silly idea and one that may well be a waste of time.

Well, first of all you don't have to do anything.  This is not "Test-Dictated Development", you are in charge of how you use your time.  However, you may find great value in specifying everything. I'm going to argue here that you will, and that you'll want to do it with automated tests.
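To make that concrete, here is a minimal sketch of what specifying the "trivial" parts might look like. The Account class, its fields, and the test names are all hypothetical, invented only for illustration; the point is that even a parameter-less constructor makes decisions (what is the starting balance? is there an owner yet?) that a reader will want to find written down somewhere.

```python
class Account:
    """Hypothetical class, used only to illustrate specifying trivial behavior."""

    def __init__(self):
        # Even a parameter-less constructor decides something: the defaults.
        self._owner = ""
        self._balance = 0

    @property
    def owner(self):
        return self._owner

    @owner.setter
    def owner(self, name):
        self._owner = name

    @property
    def balance(self):
        return self._balance


def test_new_account_starts_with_zero_balance_and_no_owner():
    # Specifies what "new" does.
    account = Account()
    assert account.balance == 0
    assert account.owner == ""


def test_owner_is_whatever_was_last_assigned():
    # Specifies the getter/setter pair.
    account = Account()
    account.owner = "Ada"
    assert account.owner == "Ada"
```

These tests take a minute to write, and they record the defaults in a form that can be executed later to confirm they still hold.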

When you create software you do so from an understanding you hold in your head.  That knowledge has value but it may leave your head over time.  Or your head may leave (you get another job, get promoted, win the lottery and retire, etc...).  When this happens, this represents lost enterprise knowledge, a significant problem across our industry. Many organizations have large, complex legacy systems that are mission-critical but the expertise that created them is long gone.  Updating them is dangerous, slow, and error-prone.

As an example: If you work on something to completion, then move on to other work, then perhaps months or even years later are required to work on it again, you may well have forgotten a lot of what you knew when you made it. How will you re-gather the knowledge you once had in order to become qualified to do this work now?  Will you read the specification you left behind?

Imagine such a thing done in the traditional way: a document of some kind.  Its value was that it informed the initial development effort.  But can you use it later to re-educate yourself?  Is it still accurate?  How do you know that someone has not changed the way the system works and failed to update the specifying document? That's not supposed to happen but it does all the time.  People get busy, run out of time, or don't even know that the document exists or how to find it.

The main problem is that you have no way of knowing.  You can't trust it. Its value is gone.  You may as well throw it away.

Now imagine the specification is a suite of tests.  You can execute them, and if they still run and pass you know that they are still accurate to the system as it stands today.  A big part of the way we write tests in TDD is to make them highly readable by humans, and so now you can read them and know the information you're getting is accurate.  But... what if they don't execute, or don't pass?  Then the specification has lost its value anyway.  What do we do about that?

You adopt the following strict policy: any change to the system must begin by changing the test that specifies that behavior.  The test will fail, but then the change is made to get it passing again. If you always do that, then the suite of tests will always represent a complete and accurate view of the way it worked, works, and has changed over time.  Forever.

Of course, that is only possible if everything has a test to change in the first place, right?  That's how you know the suite is always an accurate specification of the system.

So far so good. But... is it complete?

How do you know that someone has not added totally new behavior to the system and failed to add a test to the specification? It may be accurate as far as it goes, but it may be missing things, perhaps significant things.  Again, there is no way for you to know.  This is lost knowledge once again.

This is why TDD must be adopted as a team discipline.  Everyone must be on board and sufficiently trained to do it right.  If even one developer is not willing or able to adhere to the process, then TDD simply won't work.  It's got to be everyone, and it's got to be everything.

The key word to all of this is everything.  Anything less than everything isn't everything, and you lose the value of the comprehensive confidence that TDD offers in your process, your design, your institutional knowledge, and your product overall.

So, for my part, I choose to specify everything using tests no matter how trivial it may seem to be at the moment.  I think it's worth it.  Maybe you will too.

I hope you've found this series illuminating and helpful. Watch this space for more.


Posted on: November 30, 2022 11:55 AM

Do I Really Have to Test Everything? (part 2)

Categories: TDD

This is part 2 of a three-part posting.  If you have not read part 1, I strongly suggest you start there.

Answer #2:
You will not test everything. You'll want to, but you won't.  

In TDD the theory is that you always write the test first, observe its failure, then do the work to make it pass.  Consequently, in a perfect world no line of code would ever be written that was not also tested.  It is not, unfortunately, a perfect world.

So you'll miss things because you're human.  When you do, then defects may escape your notice, and some of them will likely make it into the delivered product.  I wish I could tell you I knew how to prevent that completely, but I don't.  Far fewer defects will end up in your customer's lap, but not zero.  So what is the value of TDD in this case?  

It is huge.

First, when a customer encounters a bug, they will hopefully report it back to you.  Most organizations have some mechanism for this: an 800 number or a website, something.  When such a report is made a "trouble ticket" or similar artifact is generated and assigned to a developer.  Normally the job is now to do two things: 1) Find the cause of the problem either by attempting to replicate it, or by running the system in debug mode, or using any number of techniques we have created for this purpose.  2) Once found, the issue is fixed, the fix is confirmed, and the ticket is closed.

Not in TDD.

In TDD a defect that makes it into the product is not a bug.  It is a missing test.  It is the test that should have been written but was not.  It is the test that would be failing now, but because it doesn't exist the product was released in ignorance.  Job #1 is to figure out what test was missed, and while this may involve many of the same activities as before, the goal is different. We want to write the test, run it, and watch it fail since we have not yet addressed the defect at all. That failure almost completely confirms that yes, indeed, we found and created the missing test. Almost.

Now we fix the problem and run the tests again.  The single red one should now go green, and because of the way we made that happen we now have complete confirmation that it was the right test.  But we also run all the other tests too.  Always.  Because if they stay green this also confirms that the action we took to fix the bug didn't, in turn, create another one.  Any developer will tell you what a nightmare that can be: you fix one thing and break another, you fix that and break something else, and down the rabbit hole you go.  TDD will tell you that this has not happened or, if it has, exactly where and why.  Immediately.
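Here is a small sketch of that sequence, under assumptions of my own. Suppose a customer reports a crash when averaging an empty list of values, and the team decides the right answer in that case is 0.0. The average function, that decision, and the test names are all hypothetical. The second test is the "missing test": written first, it fails against the old code (which divided by zero), and it goes green only once the guard is added.

```python
def average(values):
    """Return the arithmetic mean of values, or 0.0 when there are none."""
    if not values:
        # The fix. Before this guard existed, an empty list raised
        # ZeroDivisionError, which is the defect the customer reported.
        return 0.0
    return sum(values) / len(values)


def test_average_of_several_values():
    # A test that already existed. It stays green after the fix, which
    # tells us the fix did not break the behavior we already specified.
    assert average([2, 4, 6]) == 4.0


def test_average_of_no_values_is_zero():
    # The missing test. It was red before the fix and is green after it.
    assert average([]) == 0.0
```

Both tests are run after the fix, not just the new one. The new test stays in the suite from then on.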

One more, very important thing:

The actions we take are not all that different from tradition except that they end up producing a test, and the test is kept forever.  Normally when a bug is fixed it is quite possible that it will come back later, because someone working on the system inadvertently re-introduces it.  That's how it got there in the first place after all.  But if you follow the TDD process, the bug will never come back because we never release software with a failing test.  So whatever effort this involves produces permanent value.  I know of no other way to make that happen.  

Stay tuned for answer #3.

Posted on: November 29, 2022 11:47 AM

Do I Really Have to Test Everything?

Categories: TDD

This is quite a frequent question in my Test-Driven Development classes.  I really like it because it presents a wonderful opportunity to make some significant observations about TDD and testing in general.

I have three answers to this question, so this posting will be in three parts.  One today, one tomorrow, one the day after that.

Answer #1:
It's not the right question.  The truth is, everything will be tested; the real question is: by whom?

  1. It could be you, as you are developing the software.  That's the standard practice of TDD: the test is written as part of the development process in the first place.
  2. It could be someone else within your organization.  Perhaps another developer is tasked to test what you created, or perhaps you have a separate testing team, QA, QC, SAT testing, something along those lines.
  3. If it's neither of these, then it is your customer, when using the product.  They will test it by using it.

If it's a separate testing effort within your organization, the problem is that there may be a significant delay before you get the results.  Testing is scheduled and often lags significantly behind development.  When a report of a defect arrives, it may be days, weeks, or even longer since you wrote the code and handed it over.  It will no longer be fresh in your mind and so you'll have to spend time tracking down the cause of the problem before you can fix it.  This, of course, creates cost in terms of the hours you spend and the delay this causes.

If it's your customer, the costs go up.  First, this can damage the reputation of your product; it is seen as faulty to some degree.  It can also damage the reputation of your organization overall because you released software that was not working properly.  At long last once the defect is uncovered, it can damage the reputation of the developer who created it.  That could be you.

What may be even worse than this is that a customer, having encountered a defect, may not tell you at all.  They feel no responsibility to do so.  They may simply be living with it, annoyed with you and your product, and may well tell colleagues of theirs that you put out software that's buggy.  Your reputation is suffering and you don't even know it, let alone how to repair it.

If it's you, and you introduce a defect, you find out immediately. If you are doing TDD properly your tests are fast and so you run them frequently.  If one goes red, you know it must be something you just did, as they were all green minutes ago.  Finding it is trivial, and maybe you won't even bother looking.  Just Ctrl-Z and try again; that might be most efficient depending on the nature of the work.  In any case, nobody knows you made that mistake but you.  You save yourself the embarrassment and potential harm to your career while also safeguarding the reputation of your organization and its products.

Do you need to test everything?  The truth is you want to test everything, not to satisfy some process requirement but because it is in your own best interest to do so.  Developers who get comfortable working this way routinely exclaim that they wish they'd always been doing it, that they never want to go back to working without writing tests.  

I have much more to say.  Stay tuned for answers 2 and 3.

Posted on: November 28, 2022 12:22 PM

TDD Tests as “Karen”s

Categories: TDD

We’ve all heard the meme by now: The Karen.  Usually a blonde woman with an eccentric hairstyle who demands to see the manager when things do not go the way she wants.  Personally I think the term is a bit sexist as I’ve met plenty of men who act this way too.  Anyway, when you’re stuck behind such a person in line at a retail establishment, you suffer while they selfishly demand things until someone relents, and you have to wait until they get their way.

I don’t much care for those people.   But when it comes to TDD, I want the tests I write to be just like that.  I want them to be demanding, picky, and downright relentless.  Let me explain why.

Let’s say you are not a TDD shop.  You are tasked with this requirement:  Every night at midnight, the transactions that have been recorded throughout the day (let’s say, they are financial in nature) must be committed to a service.  Assuming we know what it means to “commit a transaction” this seems pretty simple, right?  You’d create a job running on a timer, and when midnight comes you’d activate the code that commits the collection of transactions.  Most developers could write that pretty quickly and move on.  Let’s say you did that.

The next day you come in to find the business in an uproar.  The transactions did not go through, and your customers have sustained significant financial damage as a result.  There is liability, potential loss of market share, and damaged reputation.  When you ask why this happened, they tell you that the commit process was “too late” and that the transactions were rejected as a result.

The problem is the notion of “happening at midnight”.  That seems like a perfectly clear, simple requirement with little room for misinterpretation.  Everyone knows what midnight is.

But nothing can happen “at midnight”.  Midnight is an infinitesimally short period of time… the moment it becomes midnight, it is immediately “after midnight”.  So what did the client actually want?  Did they mean “by midnight”?  If so, should you have started the commit process prior to midnight so that it would complete before the deadline?

But how much earlier should you start?  Is that based on the speed of the computer performing the action?  Is there a time that is “too early” in terms of regulations or business rules?  You now know there is a “too late”, so that seems believable.

Is it really “at” midnight, or “by” midnight, or what?

Now imagine you are a TDD team, and you received that same requirement in the same language.  The first thing you must do is convert it into some kind of executable test, probably a unit test if you are a developer.  When such a test is written, it must express three things:

  1. The condition of the system before the behavior is called for
  2. The event or action that triggers the behavior
  3. The way we can determine the success or failure of the behavior
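As a minimal sketch, here is how those three parts might appear in a unit test for the "commit the transactions" half of the requirement. Everything named here (TransactionBatch, FakeLedgerService, the string transaction ids) is hypothetical and stands in for whatever the real system uses. The fake service also means the test does not need a live service to run.

```python
class FakeLedgerService:
    """Hypothetical stand-in for the real service; it records what it receives."""

    def __init__(self):
        self.committed = []

    def commit(self, transaction):
        self.committed.append(transaction)


class TransactionBatch:
    """Hypothetical holder of the day's transactions."""

    def __init__(self, service):
        self._service = service
        self._pending = []

    def record(self, transaction):
        self._pending.append(transaction)

    @property
    def pending_count(self):
        return len(self._pending)

    def commit_all(self):
        for transaction in self._pending:
            self._service.commit(transaction)
        self._pending = []


def test_commit_all_sends_every_recorded_transaction():
    # 1. The condition of the system before the behavior is called for
    service = FakeLedgerService()
    batch = TransactionBatch(service)
    batch.record("txn-1")
    batch.record("txn-2")

    # 2. The event or action that triggers the behavior
    batch.commit_all()

    # 3. The way we determine success or failure
    assert service.committed == ["txn-1", "txn-2"]
    assert batch.pending_count == 0
```

Note that this test says nothing about midnight. Writing it forces the question of when the commit should happen into a separate conversation.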

TDD says “until you have these three things accomplished in a test, and have run that test and observed its failure, you may not begin the development work.”  It will not budge, or yield, or get out of line until you do what it wants.  It’s a Karen, or whatever the non-sexist term might be.   TDD is a discipline, and it only works if you follow it diligently.

How will you create the environment that the behavior applies to?  Do you really want to create a bunch of bogus transactions every time you need to run the test?  How can you activate the commitment if it’s based on the system clock?  Will you have to come in every night at midnight to conduct your test?  How will you know if it worked? What might make it not work?  What does “it worked” even mean?

In TDD, the team must know how to do all these things.  Some of it involves technique (dependency injection, mocking, endo-testing, etc…), some of it requires that we ask questions that we might otherwise neglect to ask (like “how will I know if it worked?”) and some of it will challenge the design of the system we are planning.  Should the “trigger at midnight” code be intertwined with the “commit the transactions” code?  Probably not, but that would be an easy mistake to make.  TDD won’t let you, because it will be too painful and arduous to write a test of such a thing.
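One way to keep the two concerns apart is to inject both the clock and the action into the triggering code. The sketch below is only an illustration and rests on loud assumptions: that the customer meant "by midnight", that a 15-minute lead time is acceptable, and that something polls the trigger periodically. None of that comes from the requirement as stated. NightlyTrigger and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


class NightlyTrigger:
    """Decides *when* to act. It knows nothing about transactions."""

    def __init__(self, clock, action, lead_time):
        self._clock = clock          # callable returning the current datetime
        self._action = action        # callable to invoke once per night
        self._lead_time = lead_time  # assumed: how long before midnight to start
        self._last_run_date = None

    def poll(self):
        now = self._clock()
        next_midnight = datetime.combine(
            now.date() + timedelta(days=1), datetime.min.time()
        )
        due = now >= next_midnight - self._lead_time
        if due and self._last_run_date != now.date():
            self._action()
            self._last_run_date = now.date()


def test_action_fires_once_inside_the_lead_window():
    # Condition: a controllable clock, so nobody has to come in at midnight.
    calls = []
    times = iter([
        datetime(2022, 5, 2, 23, 40),  # before the window opens at 23:45
        datetime(2022, 5, 2, 23, 50),  # inside the window
        datetime(2022, 5, 2, 23, 55),  # still inside, but already ran tonight
    ])
    trigger = NightlyTrigger(
        clock=lambda: next(times),
        action=lambda: calls.append("commit"),
        lead_time=timedelta(minutes=15),
    )

    # Trigger and outcome: too early, nothing happens.
    trigger.poll()
    assert calls == []

    # Trigger and outcome: inside the window, the action fires exactly once.
    trigger.poll()
    trigger.poll()
    assert calls == ["commit"]
```

Because the trigger only receives a callable, the commit logic can be specified in its own tests with no reference to time at all.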

Bad designs are hard to test.  Is your design any good?  If not, don’t you want to know?

It’s hard if not impossible to write a test for something you don’t understand.  Do you really know enough to build this feature?  Are there questions you should be asking?

TDD works because, among other things, it is demanding, picky, and downright relentless.  It is that annoying customer that won’t go away until you address them, and thus it holds your feet to the fire in a way that makes you better at your job, and more valuable to your clients.

To do this, the team must be effectively trained.  Just knowing “how to write a unit test” won’t do it, not even close.  If you do not know how to control dependencies (like “time” in the example) or how to separate behaviors in your system without over-complicating its design, or what it means to completely express a requirement as a test, then you’ll struggle and probably give up.  But if the team is empowered with this knowledge they can move swiftly, confidently, and aggressively to ensure that the organization they serve remains competitive in an ever-changing world.  TDD makes you do it right, because doing it wrong simply won’t work.

And isn’t that great?

Posted on: May 02, 2022 02:18 PM


A question that we are often asked is: “What is the difference between Acceptance Test Driven Development (ATDD) and Test Driven Development (TDD)?” These two activities are related by name but otherwise seem to have little to do with each other. 

ATDD is a whole-team practice where the team members discuss a requirement and come to an agreement about the acceptance criteria for that requirement. Through the process of accurately specifying the acceptance criteria -- the acceptance test -- the team fleshes out the requirement, discovering and corroborating the various assumptions made by the team members and identifying and answering the various questions that, unanswered, would prevent the team from implementing or testing the system correctly.

The word acceptance is used in a wide sense here:

  • The customer agrees that if the system, which the team is about to implement, fulfills the acceptance criteria then the work was done properly
  • The developers accept the responsibility for implementing the system
  • The testers accept the responsibility for testing the system

This is a human-oriented interaction that focuses on the customer, identifying their needs. These needs are specified using the external, public interfaces of the system. 

TDD, on the other hand, is a developer-oriented activity designed to assist the developers in writing the code by strict analysis of the requirements and the establishment of functional boundaries, work-flows, significant values, and initial states. TDD tests are written in the developer’s language and are not designed to be read by the customers. These tests can use the public interfaces of the system, but are also used to test internal design elements. 

We often see the developers take the tests written through the ATDD process and implement them with a unit testing framework.

Requirements from the customer

Before we continue, we need to ask ourselves -- what is a requirement? It is something that the customer needs the system to do. But who is the customer? 

In truth, every system has more than one customer... dozens at times:

  • Stakeholders
  • End users, of different types
  • Operators
  • Administrators (DB, network, user, storage)
  • Support (field, customer, technical)
  • Sales, marketing, legal, training
  • QA and developers (e.g., traces and logs, simulators for QA)
  • etc...

All requirements coming from all of these different customers must be addressed, identified and expressed through the ATDD process. For example:

  • The legal department needs an End User Legal Agreement (EULA) to be displayed when the software is first run, and for the end user to check off the agreement before the system can be used.  This is of no interest to the end users (who we sometimes think of as ‘the customers’), in fact might be an annoyance to them, but is required for the system to be acceptable to the lawyers.
  • The production support team needs all error messages in the system to be accompanied by error codes that can be reported along with the condition that caused the error.  Here again, end users are not interested in these codes, but they can be crucial for the system to be acceptably supported.
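As a rough illustration of how a requirement like the second one could become an executable acceptance criterion, consider the sketch below. The error catalog, the "[E1001]" code format, and the format_error function are all hypothetical; a real team would agree on these details with the support group during the ATDD conversation.

```python
import re

# Hypothetical catalog of user-facing errors, keyed by support code.
ERROR_CATALOG = {
    "E1001": "Could not connect to the payment service.",
    "E1002": "The transaction was rejected as too late.",
}

# Assumed format agreed with support: "[E####] message text"
CODE_PATTERN = re.compile(r"^\[E\d{4}\] \S")


def format_error(code):
    """Build the message shown to the user, always prefixed with its code."""
    return f"[{code}] {ERROR_CATALOG[code]}"


def test_every_error_message_carries_its_code():
    for code in ERROR_CATALOG:
        message = format_error(code)
        assert CODE_PATTERN.match(message), message
        assert message.startswith(f"[{code}]")
```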

And let us not forget that the developers are customers too; who else do we build tracers and loggers for? This is an obvious, publicly visible facet of the developer’s work. But when do we need these facilities? When we try to fix bugs. When we want to understand how the system works. When we work on the system for any reason.  

In other words, when we do maintenance to the system.

Maintainability is a requirement

We need our maintenance to be as easy as possible. No car owner would like to disassemble the car’s engine just to change a windshield wiper; nor would they want to worry that by changing a tire they have damaged the car’s entertainment system. 

Maintainability is a crucial requirement for any software system. Software system maintenance should be fast, safe and predictable. You should be able to make a change fast, without breaking anything, and you need to be able to tell me reliably how long it will take. We expect this of our car mechanic as well as our software developer. So although maintainability is primarily the concern of the developer it definitely affects the non-technical customers. 

The way maintainability manifests itself in software is through design. Design principles are to developers as mathematics is to physicists. It’s the basis of everything that we do. If we do not pay attention to the system’s design as it is developed, it will quickly become unmanageable. 

How often, however, have you seen “maintainability” as a requirement? We’ve never seen it. We call it the “hidden requirement.” It’s always there but no one talks about it. And because we don't talk about it, we forget about it; we focus on fulfilling the written requirements thinking that we will be done when we complete them. And very quickly, the system becomes very hard and unsafe to change.  We are accumulating technical debt, which we could just call “the silent killer.” 

If maintainability is such a crucial requirement, where are the acceptance criteria for it? Who is the customer for this requirement? The development team.  

We need to prove to the customer that the design as was perceived was implemented, and that this design is in fact maintainable, that the correct abstractions exist, that object factories do what they are supposed to, that the functional units operate the way they should, etc... 

Indeed there is something that we can do, in the developers’ own language -- code -- that does precisely these things. It’s TDD. 

One key purpose of TDD is to prove and document the design of the system, hence proving and documenting its maintainability.
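A small sketch of what such a design-facing test might look like follows. The Notifier abstraction, its two implementations, and the make_notifier factory are hypothetical, chosen only to show tests that document design decisions: which abstraction exists, what the factory builds, and what it refuses to build.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """Hypothetical abstraction that client code depends on."""

    @abstractmethod
    def send(self, message):
        ...


class EmailNotifier(Notifier):
    def send(self, message):
        return f"email: {message}"


class SmsNotifier(Notifier):
    def send(self, message):
        return f"sms: {message}"


def make_notifier(channel):
    """Hypothetical factory: the one place that knows the concrete types."""
    notifiers = {"email": EmailNotifier, "sms": SmsNotifier}
    try:
        return notifiers[channel]()
    except KeyError:
        raise ValueError(f"unknown channel: {channel}") from None


def test_factory_builds_the_implementation_for_each_channel():
    assert isinstance(make_notifier("email"), EmailNotifier)
    assert isinstance(make_notifier("sms"), SmsNotifier)


def test_factory_products_all_honor_the_abstraction():
    for channel in ("email", "sms"):
        assert isinstance(make_notifier(channel), Notifier)


def test_factory_rejects_unknown_channels():
    try:
        make_notifier("fax")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an unknown channel")
```

A customer would never read these tests, but the next developer will, and they tell that developer where a new channel has to be added.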

TDD is developer-facing ATDD

ATDD is about the acceptability of the system to its various customers.  When the specific customer is the development team then the tests are about the acceptability of the system’s design and resulting maintainability.  Our focus in this work is acceptability in this sense: is the design acceptable?  Is our domain understanding sufficient and correct?  Have we asked enough questions, and were they the right ones?  A system that fails to meet these acceptance criteria will quickly become too expensive to maintain and thus will fail to meet the needs of those who use it. 

Software that fails to meet a need is worthless.  It dies.  So, here again, failing to pass the “maintainability” acceptance criteria is the silent killer.  TDD is the answer to this ailment. 

Note to readers: This was a philosophical treatise. Specific, practical examples abound and will constitute much of our work here, so, read on.

Posted on: February 12, 2021 02:21 AM