Sustainable Test-Driven Development

Test-driven development is a very powerful technique for analyzing, designing, and testing quality software. However, if done incorrectly, TDD can incur massive maintenance costs as the test suite grows large. This is such a common problem that it has led some to conclude that TDD is not sustainable over the long haul. This does not have to be true. It's all about what you think TDD is, and how you do it. This blog is all about the issues that arise when TDD is done poorly—and how to avoid them.

Acceptance Tests: Why Bother?

Categories: TDD, PMI Training

Acceptance Test-Driven Development (ATDD) is a collaborative activity where individuals from different parts of an organization, both technical and non-technical people, get together to create acceptance tests before development begins.  It is not without cost, of course, and so many might ask if it is worth it and why.

A story will help to illustrate why ATDD is so important.  This is a real-world example.

It involves a relatively small software development company in Montana. Its main client was a statewide provider of satellite dish hardware and video services in rural areas of the state. In these areas it is too expensive to lay cable or fiber optics, and with so few customers the big companies like Comcast and Frontier were not interested in the business.  So a local company served this customer base.

This boutique software shop had written custom applications for all of the company's activities including general ledger, payroll, installation and maintenance scheduling, and so forth. One of these applications had to do with the documents that their customers had signed when they contracted for satellite dish services. These contracts were digitized and stored in a database so they would be available if there were any disputes or conflicts about level of service agreements, payment schedules, or anything that the customers had agreed to when they signed up.

A new requirement was issued. A feature was to be created that could be activated daily to scan all of the documents stored in the database and delete those that were more than 7 years old. The developers accepted this requirement as stated and considered it to be a fairly trivial task. Accordingly they gave a very low bid and promised completion in a short period of time.

They created a plugin component that could be installed in the database server's workflow that would activate at a convenient time during the day, when business was slow, scan all the documents in the database, comparing their date field with the system clock, and erasing those that were more than 7 years old.  As promised they got this done very quickly.

What they did not do was ask why. Why did these documents need to be deleted?

Why didn't they ask? Because they thought they knew the answer already, that it was obvious. They assumed that the reason the old contracts should be deleted was that they were out of date anyway, had little to no value, and were filling up the database which would consume resources and impede the performance of the system. 

This requirement was one of many that had been issued and when everything was complete a meeting was called so that they could deliver the products, review them, arrange for installation, and prepare for the next iteration of work. At that meeting were several individuals that had not been present when the requirements were accepted.

When they got to the component in question, the developers explained that it would serve to free up space in the database and improve the performance and responsiveness of the applications. One person at the meeting, a legal expert, spoke up and said “that's not the reason we wanted the documents to be removed.” The reason the documents had to be erased was a new regulation from the state of Montana that said that you may not keep a person's signature for longer than 7 years, in either physical or digital form.  It was about privacy and identity protection, and reflected the view of that conservative legislature about individual rights.  This had caused banks all across the state to discard their signature cards when they got sufficiently old, and have their customers come in to create new ones.  But this law did not apply only to banks.

The development lead said “well maybe we did it for the wrong reason but we did the right thing. Those documents will be erased and therefore you will be in compliance with this regulation.”

The database administrator was also at the meeting. They sighed and shook their head and said “no, that won't work at all.  We’ll still be in violation.”

If you are responsible for a database one thing you will do if you want to keep your job is back it up. So the database administrator did incremental backups every day of all of the information in the database including the digitized contracts. Those incremental backups were stored off site at another location which was also in Montana. Those backups would contain the documents and the signatures spread throughout the incremental history of the database. That would violate the law.

Removing data from incremental backups is extremely complex. It involves creating a kind of recursive worm that can make its way incrementally through all the different versions of the backups, seeking out the old documents and erasing them. It would take much more time and be far more expensive than the simple component that the developers had created. They gave an estimate for this and the company said that it was far too costly. Also it would be very complex code, might not work properly at first, would probably require several iterations to get right and the deadline was looming.

They decided instead to re-architect the entire system. They would have two databases: one for their general ledger and other business records that was backed up normally, and another, separate database for all of their scanned documents. That second database was not backed up. If you consider the “why” of backing something up, which of course is to ensure that information is never lost, there is another way to accomplish that, and they decided to use data replication in this case. Another server at another location would store the same document data and would synchronize with the main server throughout the day. When you delete something from a server that is replicated, the deletion gets replicated automatically, so there is nothing left to do.

This was still a fair amount of work of course and it was work that no one anticipated in the estimate because no one had spoken to the legal expert who understood the “why” of the contract deletion, and the DBA had made the same assumption that the developers had.

Communication is difficult, to say the least.  But it is also crucial when we create anything collaboratively, which almost all software is today.  Accepting a requirement involves critical thinking from many different points of view or important details will be missed or misinterpreted, resulting in wasteful work and costly errors.

ATDD is a framework for structured conversations that can reveal the real nature of requirements in a way that can be understood and confirmed by all involved.  It also records that nature in a way that can be verified against the system at any time, because it produces executable tests.  It is not difficult to do once everyone is properly trained, and it addresses a major cause of failure in software development: rampant misunderstandings.
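As an illustration only, the contract-deletion requirement from the story might have been captured as an executable acceptance test along the following lines. This is a minimal sketch in Python. The names `DocumentStore`, `purge_expired`, `is_expired`, and `RETENTION_YEARS`, and the in-memory stores standing in for the primary and replicated servers, are assumptions made for the example and do not describe the actual system.

```python
from datetime import date

RETENTION_YEARS = 7  # taken from the regulation described in the story


class DocumentStore:
    """Hypothetical in-memory stand-in for a server holding signed contracts."""

    def __init__(self):
        self.documents = {}  # document id -> date signed

    def add(self, doc_id, signed_on):
        self.documents[doc_id] = signed_on

    def delete(self, doc_id):
        self.documents.pop(doc_id, None)


def is_expired(signed_on, today):
    """True once a document is more than RETENTION_YEARS old."""
    try:
        cutoff = signed_on.replace(year=signed_on.year + RETENTION_YEARS)
    except ValueError:  # signed on Feb 29 and the cutoff year is not a leap year
        cutoff = signed_on.replace(year=signed_on.year + RETENTION_YEARS, day=28)
    return today > cutoff


def purge_expired(stores, today):
    """Delete expired documents from every place a copy is kept."""
    for store in stores:
        expired = [doc_id for doc_id, signed in store.documents.items()
                   if is_expired(signed, today)]
        for doc_id in expired:
            store.delete(doc_id)


def test_no_signature_older_than_seven_years_survives_anywhere():
    primary, replica = DocumentStore(), DocumentStore()
    for store in (primary, replica):
        store.add("old-contract", date(2015, 3, 1))
        store.add("recent-contract", date(2020, 3, 1))

    purge_expired([primary, replica], today=date(2023, 3, 29))

    for store in (primary, replica):
        assert "old-contract" not in store.documents
        assert "recent-contract" in store.documents


test_no_signature_older_than_seven_years_survives_anywhere()
```

The test name states the "why" (no old signature survives anywhere a copy is kept) rather than the "what" (delete rows from the database). Writing it forces the question of where copies live, which is the conversation the team in the story skipped.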

See this page for information on our ATDD course:


Posted on: March 29, 2023 02:34 PM

Do I Really Have to Test Everything? (part 3)

Categories: TDD

This is part 3 of a three-part posting.  If you have not read part 1, I strongly suggest you start there:

Answer #3:
TDD is not about testing in the first place.  It never was.

The idea of driving development from tests can be a bit confusing to some.  "How can you test something you haven't created yet?  The test will fail, obviously.  What's the point?"

That's a reasonable question too.  The fact is that TDD is not really a "testing" activity at all.  It produces enormously beneficial tests, but they are a side effect.  TDD is all about creating a specification - we simply use tests to do it.  It's not at all surprising that we would do that first, is it?  Realizing that TDD is actually a specifying activity also changes almost everything about the way we do it, and for the better. I could talk about that at length. But to the point I'm addressing here...

When you think this way, then the original question on the table becomes:

Do I Really Have to Specify Everything?

How about getters and setters?  They are so very trivial.  What about an object with no constructor, or only a parameter-less constructor, do I have to specify what "new" does?  It can seem like a silly idea and one that may well be a waste of time.

Well, first of all you don't have to do anything.  This is not "Test-Dictated Development", you are in charge of how you use your time.  However, you may find great value in specifying everything. I'm going to argue here that you will, and that you'll want to do it with automated tests.
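To make this concrete, here is a minimal sketch of what specifying "trivial" behavior can look like. The `Account` class and its `nickname` property are hypothetical, invented for this illustration. The point is only that each test records a decision (the default state, what a setter does with its input) that would otherwise live in someone's head.

```python
class Account:
    """Hypothetical class with a parameter-less constructor and one property."""

    def __init__(self):
        self._nickname = ""

    @property
    def nickname(self):
        return self._nickname

    @nickname.setter
    def nickname(self, value):
        # Assumed rule for this example: surrounding whitespace is dropped.
        self._nickname = value.strip()


def test_new_account_has_empty_nickname():
    # Specifies what "new" does: the default state is a decision, not an accident.
    assert Account().nickname == ""


def test_nickname_is_stored_without_surrounding_whitespace():
    account = Account()
    account.nickname = "  savings  "
    assert account.nickname == "savings"


test_new_account_has_empty_nickname()
test_nickname_is_stored_without_surrounding_whitespace()
```

Neither test is likely to catch a bug today. Their value is that a later reader can run them and learn, with confidence, what a new `Account` looks like.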

When you create software you do so from an understanding you hold in your head.  That knowledge has value but it may leave your head over time.  Or your head may leave (you get another job, get promoted, win the lottery and retire, etc...).  When this happens, this represents lost enterprise knowledge, a significant problem across our industry. Many organizations have large, complex legacy systems that are mission-critical but the expertise that created them is long gone.  Updating them is dangerous, slow, and error-prone.

As an example: If you work on something to completion, move on to other work, and then months or even years later are required to work on it again, you may well have forgotten a lot of what you knew when you made it. How will you re-gather the knowledge you once had in order to become qualified to do this work now?  Will you read the specification you left behind?

Imagine such a thing done in the traditional way: a document of some kind.  Its value was that it informed the initial development effort.  But can you use it later to re-educate yourself?  Is it still accurate?  How do you know that someone has not changed the way the system works and failed to update the specifying document? That's not supposed to happen but it does all the time.  People get busy, run out of time, or don't even know that the document exists or how to find it.

The main problem is that you have no way of knowing.  You can't trust it. Its value is gone.  You may as well throw it away.

Now imagine the specification is a suite of tests.  You can execute them, and if they still run and pass you know that they are still accurate to the system as it stands today.  A big part of the way we write tests in TDD is to make them highly readable by humans, and so now you can read them and know the information you're getting is accurate.  But... what if they don't execute, or don't pass?  Then the specification has lost its value anyway.  What do we do about that?

You adopt the following strict policy: any change to the system must begin by changing the test that specifies that behavior.  The test will fail, but then the change is made to get it passing again. If you always do that, then the suite of tests will always represent a complete and accurate view of the way it worked, works, and has changed over time.  Forever.

Of course, that is only possible if everything has a test to change in the first place, right?  That's how you know the suite is always an accurate specification of the system.

So far so good. But... is it complete?

How do you know that someone has not added totally new behavior to the system and failed to add a test to the specification? It may be accurate as far as it goes, but it may be missing things, perhaps significant things.  Again, there is no way for you to know.  This is lost knowledge once again.

This is why TDD must be adopted as a team discipline.  Everyone must be on board and sufficiently trained to do it right.  If even one developer is not willing or able to adhere to the process, then TDD simply won't work.  It's got to be everyone, and it's got to be everything.

The key word in all of this is everything.  Anything less than everything isn't everything, and you lose the value of the comprehensive confidence that TDD offers in your process, your design, your institutional knowledge, and your product overall.

So, for my part, I choose to specify everything using tests no matter how trivial it may seem to be at the moment.  I think it's worth it.  Maybe you will too.

I hope you've found this series illuminating and helpful. Watch this space for more.


Posted on: November 30, 2022 11:55 AM

Do I Really Have to Test Everything? (part 2)

Categories: TDD

This is part 2 of a three-part posting.  If you have not read part 1, I strongly suggest you start there:

Answer #2:
You will not test everything. You'll want to, but you won't.  

In TDD the theory is that you always write the test first, observe its failure, then do the work to make it pass.  Consequently, in a perfect world no line of code would ever be written that was not also tested.  It is not, unfortunately, a perfect world.

So you'll miss things because you're human.  When you do, then defects may escape your notice, and some of them will likely make it into the delivered product.  I wish I could tell you I knew how to prevent that completely, but I don't.  Far fewer defects will end up in your customer's lap, but not zero.  So what is the value of TDD in this case?  

It is huge.

First, when a customer encounters a bug, they will hopefully report it back to you.  Most organizations have some mechanism for this: an 800 number or a website, something.  When such a report is made a "trouble ticket" or similar artifact is generated and assigned to a developer.  Normally the job is now to do two things: 1) Find the cause of the problem, either by attempting to replicate it, by running the system in debug mode, or by using any number of techniques we have created for this purpose.  2) Once the cause is found, fix the issue, confirm the fix, and close the ticket.

Not in TDD.

In TDD a defect that makes it into the product is not a bug.  It is a missing test.  It is the test that should have been written but was not.  It is the test that would be failing now, but because it doesn't exist the product was released in ignorance.  Job #1 is to figure out what test was missed, and while this may involve many of the same activities as before, the goal is different. We want to write the test, run it, and watch it fail since we have not yet addressed the defect at all. That failure almost completely confirms that yes, indeed, we found and created the missing test. Almost.

Now we fix the problem and run the tests again.  The single red one should now go green, and because of the way we made that happen we now have complete confirmation that it was the right test.  But we also run all the other tests too.  Always.  Because if they stay green this also confirms that the action we took to fix the bug didn't, in turn, create another one.  Any developer will tell you what a nightmare that can be: you fix one thing and break another, you fix that and break something else, and down the rabbit hole you go.  TDD will tell you that this has not happened or, if it has, exactly where and why.  Immediately.
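The sequence just described can be sketched in Python. The `average` function and the reported defect (a crash on an empty list) are invented for illustration, and the choice that an empty list should average to 0.0 is an assumed specification decision. `average_as_released` stands in for the code as shipped so that both the red run and the green run can be shown in one file.

```python
def average_as_released(values):
    """The code as shipped (hypothetical): crashes when values is empty."""
    return sum(values) / len(values)


def average(values):
    """The code after the fix: an empty list averages to 0.0 (assumed rule)."""
    if not values:
        return 0.0
    return sum(values) / len(values)


def missing_test(average_fn):
    """The test that should have existed before release."""
    assert average_fn([]) == 0.0


def existing_test(average_fn):
    """A test that was already in the suite and must stay green."""
    assert average_fn([2, 4, 6]) == 4.0


# Step 1: run the new test against the released code and watch it fail.
try:
    missing_test(average_as_released)
    went_red = False
except ZeroDivisionError:
    went_red = True

# Step 2: apply the fix, then run the whole suite, not just the new test.
missing_test(average)
existing_test(average)
```

In a real project the fix would replace the released function rather than sit beside it, and the missing test would join the suite permanently.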

One more, very important thing:

The actions we take are not all that different from tradition except that they end up producing a test, and the test is kept forever.  Normally when a bug is fixed it is quite possible that it will come back later, because someone working on the system inadvertently re-introduces it.  That's how it got there in the first place after all.  But if you follow the TDD process, the bug will never come back because we never release software with a failing test.  So whatever effort this involves produces permanent value.  I know of no other way to make that happen.  

Stay tuned for answer #3.

Posted on: November 29, 2022 11:47 AM

Do I Really Have to Test Everything?

Categories: TDD

This is quite a frequent question in my Test-Driven Development classes.  I really like it because it presents a wonderful opportunity to make some significant observations about TDD and testing in general.

I have three answers to this question, so this posting will be in three parts.  One today, one tomorrow, one the day after that.

Answer #1:
It's not the right question.  The truth is, everything will be tested, the real question is: by whom?

  1. It could be you, as you are developing the software.  That's the standard practice of TDD: the test is written as part of the development process in the first place.
  2. It could be someone else within your organization.  Perhaps another developer is tasked to test what you created, or perhaps you have a separate testing team, QA, QC, SAT testing, something along those lines.
  3. If it's neither of these, then it is your customer, when using the product.  They will test it by using it.

If it's a separate testing effort within your organization, the problem is that there may be a significant delay before you get the results.  Testing is scheduled and often lags significantly behind development.  When a report of a defect arrives, it may be days, weeks, or even longer since you wrote the code and handed it over.  It will no longer be fresh in your mind and so you'll have to spend time tracking down the cause of the problem before you can fix it.  This, of course, creates cost in terms of the hours you spend and the delay this causes.

If it's your customer, the costs go up.  First, this can damage the reputation of your product; it is seen as faulty to some degree.  It can also damage the reputation of your organization overall because you released software that was not working properly.  Finally, once the source of the defect is uncovered, it can damage the reputation of the developer who created it.  That could be you.

What may be even worse than this is that a customer, having encountered a defect, may not tell you at all.  They feel no responsibility to do so.  They may simply be living with it, annoyed with you and your product, and may well tell colleagues of theirs that you put out software that's buggy.  Your reputation is suffering and you don't even know it, let alone how to repair it.

If it's you, and you introduce a defect, you find out immediately. If you are doing TDD properly your tests are fast and so you run them frequently.  If one goes red, you know it must be something you just did, as they were all green minutes ago.  Finding it is trivial, and maybe you won't even bother looking.  Just Ctrl-Z and try again; that might be most efficient depending on the nature of the work.  In any case, nobody knows you made that mistake but you. You save yourself the embarrassment and potential harm to your career while also safeguarding the reputation of your organization and its products.

Do you need to test everything?  The truth is you want to test everything, not to satisfy some process requirement but because it is in your own best interest to do so.  Developers who get comfortable working this way routinely exclaim that they wish they'd always been doing it, that they never want to go back to working without writing tests.  

I have much more to say.  Stay tuned for answers 2 and 3.

Posted on: November 28, 2022 12:22 PM

TDD Tests as “Karen”s

Categories: TDD

We’ve all heard the meme by now: The Karen.  Usually a blonde woman with an eccentric hairstyle who demands to see the manager when things do not go the way she wants.  Personally I think the term is a bit sexist as I’ve met plenty of men who act this way too.  Anyway, when you’re stuck behind such a person in line at a retail establishment, you suffer while they selfishly demand things until someone relents, and you have to wait until they get their way.

I don’t much care for those people.   But when it comes to TDD, I want the tests I write to be just like that.  I want them to be demanding, picky, and downright relentless.  Let me explain why.

Let’s say you are not a TDD shop.  You are tasked with this requirement:  Every night at midnight, the transactions that have been recorded throughout the day (let’s say, they are financial in nature) must be committed to a service.  Assuming we know what it means to “commit a transaction” this seems pretty simple, right?  You’d create a job running on a timer, and when midnight comes you’d activate the code that commits the collection of transactions.  Most developers could write that pretty quickly and move on.  Let’s say you did that.

The next day you come in to find the business in an uproar.  The transactions did not go through, and your customers have sustained significant financial damage as a result.  There is liability, potential loss of market share, and damaged reputation.  When you ask why this happened, they tell you that the commit process was “too late” and that the transactions were rejected as a result.

The problem is the notion of “happening at midnight”.  That seems like a perfectly clear, simple requirement with little room for misinterpretation.  Everyone knows what midnight is.

But nothing can happen “at midnight”.  Midnight is an infinitesimally short period of time… the moment it becomes midnight, it is immediately “after midnight”.  So what did the client actually want?  Did they mean “by midnight”?  If so, then you should have started the commit process prior to midnight so that it would complete before the deadline.

But how much earlier should you start?  Is that based on the speed of the computer performing the action?  Is there a time that is “too early” in terms of regulations or business rules?  You now know there is a “too late”, so that seems believable.

Is it really “at” midnight, or “by” midnight, or what?

Now imagine you are a TDD team, and you received that same requirement in the same language.  The first thing you must do is convert it into some kind of executable test, probably a unit test if you are a developer.  When such a test is written, it must express three things:

  1. The condition of the system before the behavior is called for
  2. The event or action that triggers the behavior
  3. The way we can determine the success or failure of the behavior
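In code, these three parts often appear as three short sections of a single test. The sketch below assumes a hypothetical `TransactionLog` and a hand-written `FakeService` standing in for the real commit service; neither comes from the original requirement.

```python
class FakeService:
    """Hypothetical stand-in for the external service; records what it receives."""

    def __init__(self):
        self.committed = []

    def commit(self, transactions):
        self.committed.extend(transactions)


class TransactionLog:
    """Hypothetical holder of the day's recorded transactions."""

    def __init__(self, service):
        self._service = service
        self._pending = []

    def record(self, transaction):
        self._pending.append(transaction)

    def commit_pending(self):
        self._service.commit(self._pending)
        self._pending = []


def test_recorded_transactions_are_committed_to_the_service():
    # 1. The condition of the system before the behavior is called for
    service = FakeService()
    log = TransactionLog(service)
    log.record("deposit 100")
    log.record("withdraw 40")

    # 2. The event or action that triggers the behavior
    log.commit_pending()

    # 3. The way we determine success or failure
    assert service.committed == ["deposit 100", "withdraw 40"]


test_recorded_transactions_are_committed_to_the_service()
```

Note that this test says nothing about midnight. It specifies only the commit behavior, which hints that the timing is a separate behavior deserving its own test.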

TDD says “until you have these three things accomplished in a test, and have run that test and observed its failure, you may not begin the development work.”  It will not budge, or yield, or get out of line until you do what it wants.  It’s a Karen, or whatever the non-sexist term might be.   TDD is a discipline, and it only works if you follow it diligently.

How will you create the environment that the behavior applies to?  Do you really want to create a bunch of bogus transactions every time you need to run the test?  How can you activate the commitment if it’s based on the system clock?  Will you have to come in every night at midnight to conduct your test?  How will you know if it worked? What might make it not work?  What does “it worked” even mean?

In TDD, the team must know how to do all these things.  Some of it involves technique (dependency injection, mocking, endo-testing, etc…), some of it requires that we ask questions that we might otherwise neglect to ask (like “how will I know if it worked?”) and some of it will challenge the design of the system we are planning.  Should the “trigger at midnight” code be intertwined with the “commit the transactions” code?  Probably not, but that would be an easy mistake to make.  TDD won’t let you, because it will be too painful and arduous to write a test of such a thing.
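One way to get control of the clock, sketched here under stated assumptions, is to pass the current time in rather than reading it inside the code. The `CommitScheduler` class, the ten-minute lead time, and the reading of the requirement as "by midnight" are all hypothetical choices made for this example; the real answers would have to come from the customer.

```python
from datetime import datetime, time, timedelta


class CommitScheduler:
    """Decides *when* to commit; knows nothing about *how* to commit."""

    def __init__(self, commit_action, lead_time):
        self._commit_action = commit_action  # injected: the "commit" behavior
        self._lead_time = lead_time          # assumed time the commit needs
        self._done_for = None                # the deadline already handled

    def tick(self, now):
        """Called periodically with the current time (injected, not read here)."""
        next_midnight = datetime.combine(now.date() + timedelta(days=1), time.min)
        due = now >= next_midnight - self._lead_time
        if due and self._done_for != next_midnight:
            self._commit_action()
            self._done_for = next_midnight


def test_commit_starts_early_enough_to_finish_by_midnight():
    calls = []
    scheduler = CommitScheduler(lambda: calls.append("commit"), timedelta(minutes=10))

    scheduler.tick(datetime(2022, 5, 2, 23, 49))  # too early: nothing happens
    assert calls == []

    scheduler.tick(datetime(2022, 5, 2, 23, 50))  # lead time reached: commit
    scheduler.tick(datetime(2022, 5, 2, 23, 55))  # same night: no second commit
    assert calls == ["commit"]


test_commit_starts_early_enough_to_finish_by_midnight()
```

Because the trigger receives both the time and the commit action from outside, the test runs in milliseconds at any hour of the day, and nobody has to come in at midnight to see whether it worked.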

Bad designs are hard to test.  Is your design any good?  If not, don’t you want to know?

It’s hard if not impossible to write a test for something you don’t understand.  Do you really know enough to build this feature?  Are there questions you should be asking?

TDD works because, among other things, it is demanding, picky, and downright relentless.  It is that annoying customer that won’t go away until you address them, and thus it holds your feet to the fire in a way that makes you better at your job, and more valuable to your clients.

To do this, the team must be effectively trained.  Just knowing “how to write a unit test” won’t do it, not even close.  If you do not know how to control dependencies (like “time” in the example) or how to separate behaviors in your system without over-complicating its design, or what it means to completely express a requirement as a test, then you’ll struggle and probably give up.  But if the team is empowered with this knowledge they can move swiftly, confidently, and aggressively to ensure that the organization they serve remains competitive in an ever-changing world.  TDD makes you do it right, because doing it wrong simply won’t work.

And isn’t that great?

Posted on: May 02, 2022 02:18 PM