Do I Really Have to Test Everything? (part 3)
This is part 3 of a three-part posting. If you have not read part 1, I strongly suggest you start there:
The idea of driving development from tests can be a bit confusing to some. "How can you test something you haven't created yet? The test will fail, obviously. What's the point?"
That's a reasonable question too. The fact is that TDD is not really a "testing" activity at all. It produces enormously beneficial tests, but they are a side effect. TDD is all about creating a specification - we simply use tests to do it. It's not at all surprising that we would do that first, is it? Realizing that TDD is actually a specifying activity also changes almost everything about the way we do it, and for the better. I could talk about that at length. But to the point I'm addressing here...
When you think this way, then the original question on the table becomes:
Do I Really Have to Specify Everything?
How about getters and setters? They are so very trivial. What about an object with no constructor, or only a parameter-less constructor, do I have to specify what "new" does? It can seem like a silly idea and one that may well be a waste of time.
Well, first of all you don't have to do anything. This is not "Test-Dictated Development", you are in charge of how you use your time. However, you may find great value in specifying everything. I'm going to argue here that you will, and that you'll want to do it with automated tests.
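To make that concrete, here is a minimal sketch of what specifying the "trivial" might look like. The Customer class and its name property are hypothetical, invented only for this illustration, and Python's built-in unittest module is just one of many frameworks that would serve.

```python
import unittest


class Customer:
    """Hypothetical class, invented for this illustration."""

    def __init__(self):
        self._name = ""

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value


class CustomerSpecification(unittest.TestCase):
    def test_new_customer_starts_with_an_empty_name(self):
        # Specifies what "new" does, even with a parameter-less constructor.
        self.assertEqual("", Customer().name)

    def test_name_is_whatever_was_last_set(self):
        # Specifies the getter and setter as a pair.
        customer = Customer()
        customer.name = "Ada"
        self.assertEqual("Ada", customer.name)
```

Each test takes a moment to write, and each one records a decision (a new customer has an empty name, not None) that would otherwise live only in someone's head.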
When you create software you do so from an understanding you hold in your head. That knowledge has value but it may leave your head over time. Or your head may leave (you get another job, get promoted, win the lottery and retire, etc...). When this happens, this represents lost enterprise knowledge, a significant problem across our industry. Many organizations have large, complex legacy systems that are mission-critical but the expertise that created them is long gone. Updating them is dangerous, slow, and error-prone.
As an example: If you work on something to completion, then move on to other work, then perhaps months or even years later are required to work on it again, you may well have forgotten a lot of what you knew when you made it. How will you re-gather the knowledge you once had in order to become qualified to do this work now? Will you read the specification you left behind?
Imagine such a thing done in the traditional way: a document of some kind. Its value was that it informed the initial development effort. But can you use it later to re-educate yourself? Is it still accurate? How do you know that someone has not changed the way the system works and failed to update the specifying document? That's not supposed to happen but it does all the time. People get busy, run out of time, or don't even know that the document exists or how to find it.
The main problem is that you have no way of knowing. You can't trust it. Its value is gone. You may as well throw it away.
Now imagine the specification is a suite of tests. You can execute them, and if they still run and pass you know that they are still accurate to the system as it stands today. A big part of the way we write tests in TDD is to make them highly readable by humans, and so now you can read them and know the information you're getting is accurate. But... what if they don't execute, or don't pass? Then the specification has lost its value anyway. What do we do about that?
You adopt the following strict policy: any change to the system must begin by changing the test that specifies that behavior. The test will fail, but then the change is made to get it passing again. If you always do that, then the suite of tests will always represent a complete and accurate view of the way it worked, works, and has changed over time. Forever.
Of course, that is only possible if everything has a test to change in the first place, right? That's how you know the suite is always an accurate specification of the system.
So far so good. But... is it complete?
How do you know that someone has not added totally new behavior to the system and failed to add a test to the specification? It may be accurate as far as it goes, but it may be missing things, perhaps significant things. Again, there is no way for you to know. This is lost knowledge once again.
This is why TDD must be adopted as a team discipline. Everyone must be on board and sufficiently trained to do it right. If even one developer is not willing or able to adhere to the process, then TDD simply won't work. It's got to be everyone, and it's got to be everything.
The key word to all of this is everything. Anything less than everything isn't everything, and you lose the value of the comprehensive confidence that TDD offers in your process, your design, your institutional knowledge, and your product overall.
So, for my part, I choose to specify everything using tests no matter how trivial it may seem to be at the moment. I think it's worth it. Maybe you will too.
I hope you've found this series illuminating and helpful. Watch this space for more.
Do I Really Have to Test Everything? (part 2)
This is part 2 of a three-part posting. If you have not read part 1, I strongly suggest you start there:
In TDD the theory is that you always write the test first, observe its failure, then do the work to make it pass. Consequently, in a perfect world no line of code would ever be written that was not also tested. It is not, unfortunately, a perfect world.
So you'll miss things because you're human. When you do, then defects may escape your notice, and some of them will likely make it into the delivered product. I wish I could tell you I knew how to prevent that completely, but I don't. Far fewer defects will end up in your customer's lap, but not zero. So what is the value of TDD in this case?
It is huge.
First, when a customer encounters a bug, they will hopefully report it back to you. Most organizations have some mechanism for this: an 800 number or a website, something. When such a report is made, a "trouble ticket" or similar artifact is generated and assigned to a developer. Normally the job is now to do two things: 1) Find the cause of the problem either by attempting to replicate it, or by running the system in debug mode, or using any number of techniques we have created for this purpose. 2) Once found, the issue is fixed, the fix is confirmed, and the ticket is closed.
Not in TDD.
In TDD a defect that makes it into the product is not a bug. It is a missing test. It is the test that should have been written but was not. It is the test that would be failing now, but because it doesn't exist the product was released in ignorance. Job #1 is to figure out what test was missed, and while this may involve many of the same activities as before, the goal is different. We want to write the test, run it, and watch it fail since we have not yet addressed the defect at all. That failure almost completely confirms that yes, indeed, we found and created the missing test. Almost.
Now we fix the problem and run the tests again. The single red one should now go green, and because of the way we made that happen we now have complete confirmation that it was the right test. But we also run all the other tests too. Always. Because if they stay green this also confirms that the action we took to fix the bug didn't, in turn, create another one. Any developer will tell you what a nightmare that can be: you fix one thing and break another, you fix that and break something else, and down the rabbit hole you go. TDD will tell you that this has not happened or, if it has, exactly where and why. Immediately.
One more, very important thing:
The actions we take are not all that different from tradition except that they end up producing a test, and the test is kept forever. Normally when a bug is fixed it is quite possible that it will come back later, because someone working on the system inadvertently re-introduces it. That's how it got there in the first place after all. But if you follow the TDD process, the bug will never come back because we never release software with a failing test. So whatever effort this involves produces permanent value. I know of no other way to make that happen.
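Here is a small sketch of a defect treated as a missing test. Everything in it is hypothetical: imagine a split_evenly function whose earlier version returned the same rounded-down share for everyone, so a customer reported that a cent went missing. The first test below is the one that should have existed. It would have failed against the old code, it passes against the fix, and it stays in the suite.

```python
import unittest


def split_evenly(total_cents, people):
    """Hypothetical function: split an amount so that no cent is lost."""
    base, remainder = divmod(total_cents, people)
    # The reported defect: an earlier version returned [base] * people,
    # silently dropping the remainder. The fix hands out the leftover
    # cents one at a time to the first few shares.
    return [base + 1 if i < remainder else base for i in range(people)]


class SplitEvenlySpecification(unittest.TestCase):
    def test_shares_always_add_up_to_the_total(self):
        # The missing test. Written first, observed failing, then made green.
        self.assertEqual(100, sum(split_evenly(100, 3)))

    def test_exact_division_gives_equal_shares(self):
        # A test that already existed; it must stay green after the fix.
        self.assertEqual([50, 50], split_evenly(100, 2))
```

If someone later "simplifies" the function back to the broken form, the first test goes red before the change ever ships.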
Stay tuned for answer #3.
Do I Really Have to Test Everything?
This is quite a frequent question in my Test-Driven Development classes. I really like it because it presents a wonderful opportunity to make some significant observations about TDD and testing in general.
I have three answers to this question, so this posting will be in three parts. One today, one tomorrow, one the day after that.
If it's a separate testing effort within your organization, the problem is that there may be a significant delay before you get the results. Testing is scheduled and often lags significantly behind development. When a report of a defect arrives, it may be days, weeks, or even longer since you wrote the code and handed it over. It will no longer be fresh in your mind and so you'll have to spend time tracking down the cause of the problem before you can fix it. This, of course, creates cost in terms of the hours you spend and the delay this causes.
If it's your customer, the costs go up. First, this can damage the reputation of your product; it is seen as faulty to some degree. It can also damage the reputation of your organization overall because you released software that was not working properly. At long last once the defect is uncovered, it can damage the reputation of the developer who created it. That could be you.
What may be even worse than this is that a customer, having encountered a defect, may not tell you at all. They feel no responsibility to do so. They may simply be living with it, annoyed with you and your product, and may well tell colleagues of theirs that you put out software that's buggy. Your reputation is suffering and you don't even know it, or how to repair it.
If it's you, and you introduce a defect, you find out immediately. If you are doing TDD properly your tests are fast and so you run them frequently. If one goes red, you know it must be something you just did, as they were all green minutes ago. Finding it is trivial, and maybe you won't even bother looking. Just Ctrl-Z and try again; that might be most efficient depending on the nature of the work. In any case, nobody knows you made that mistake but you. You save yourself the embarrassment and potential harm to your career while also safeguarding the reputation of your organization and its products.
Do you need to test everything? The truth is you want to test everything, not to satisfy some process requirement but because it is in your own best interest to do so. Developers who get comfortable working this way routinely exclaim that they wish they'd always been doing it, that they never want to go back to working without writing tests.
I have much more to say. Stay tuned for answers 2 and 3.
TDD Tests as “Karen”s
We’ve all heard the meme by now: The Karen. Usually a blonde woman with an eccentric hairstyle who demands to see the manager when things do not go the way she wants. Personally I think the term is a bit sexist as I’ve met plenty of men who act this way too. Anyway, when you’re stuck behind such a person in line at a retail establishment, you suffer while they selfishly demand things until someone relents, and you have to wait until they get their way.
I don’t much care for those people. But when it comes to TDD, I want the tests I write to be just like that. I want them to be demanding, picky, and downright relentless. Let me explain why.
Let’s say you are not a TDD shop. You are tasked with this requirement: Every night at midnight, the transactions that have been recorded throughout the day (let’s say, they are financial in nature) must be committed to a service. Assuming we know what it means to “commit a transaction” this seems pretty simple, right? You’d create a job running on a timer, and when midnight comes you’d activate the code that commits the collection of transactions. Most developers could write that pretty quickly and move on. Let’s say you did that.
The next day you come in to find the business in an uproar. The transactions did not go through, and your customers have sustained significant financial damage as a result. There is liability, potential loss of market share, and damaged reputation. When you ask why this happened, they tell you that the commit process was “too late” and that the transactions were rejected as a result.
The problem is the notion of “happening at midnight”. That seems like a perfectly clear, simple requirement with little room for misinterpretation. Everyone knows what midnight is.
But nothing can happen “at midnight”. Midnight is an infinitely short period of time… the moment it becomes midnight, it is immediately “after midnight”. So what did the client actually want? Did they mean “by midnight”? If so, then you should have started the commit process prior to midnight so that it would complete before the deadline.
But how much earlier should you start? Is that based on the speed of the computer performing the action? Is there a time that is “too early” in terms of regulations or business rules? You now know there is a “too late”, so that seems believable.
Is it really “at” midnight, or “by” midnight, or what?
Now imagine you are a TDD team, and you received that same requirement in the same language. The first thing you must do is convert it into some kind of executable test, probably a unit test if you are a developer. When such a test is written, it must express three things:
TDD says “until you have these three things accomplished in a test, and have run that test and observed its failure, you may not begin the development work.” It will not budge, or yield, or get out of line until you do what it wants. It’s a Karen, or whatever the non-sexist term might be. TDD is a discipline, and it only works if you follow it diligently.
How will you create the environment that the behavior applies to? Do you really want to create a bunch of bogus transactions every time you need to run the test? How can you activate the commitment if it’s based on the system clock? Will you have to come in every night at midnight to conduct your test? How will you know if it worked? What might make it not work? What does “it worked” even mean?
In TDD, the team must know how to do all these things. Some of it involves technique (dependency injection, mocking, endo-testing, etc…), some of it requires that we ask questions that we might otherwise neglect to ask (like “how will I know if it worked?”) and some of it will challenge the design of the system we are planning. Should the “trigger at midnight” code be intertwined with the “commit the transactions” code? Probably not, but that would be an easy mistake to make. TDD won’t let you, because it will be too painful and arduous to write a test of such a thing.
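As a rough sketch of how that separation might look, consider the following. All of the names here are hypothetical, and the 23:55 start time is purely an assumption made for the sake of the example; the real value is exactly the question the business has to answer. The trigger receives its clock and its commit action from outside (dependency injection), so the test can control time with a fake clock and never has to wait for midnight.

```python
import unittest
from datetime import datetime, time


class NightlyCommitTrigger:
    """Decides when to commit. It knows nothing about how committing works."""

    def __init__(self, clock, commit_action, start_time):
        self._clock = clock                  # any callable returning a datetime
        self._commit_action = commit_action  # any callable taking no arguments
        self._start_time = start_time        # a business decision, assumed here
        self._last_run_date = None

    def tick(self):
        """Called periodically; fires the commit at most once per calendar day."""
        now = self._clock()
        already_ran_today = self._last_run_date == now.date()
        if now.time() >= self._start_time and not already_ran_today:
            self._commit_action()
            self._last_run_date = now.date()


class FakeClock:
    """Test double standing in for the system clock."""

    def __init__(self, now):
        self.now = now

    def __call__(self):
        return self.now


class NightlyCommitTriggerSpecification(unittest.TestCase):
    def setUp(self):
        self.commits = 0
        self.clock = FakeClock(datetime(2024, 1, 1, 23, 54))
        self.trigger = NightlyCommitTrigger(
            self.clock, self.record_commit, time(23, 55))

    def record_commit(self):
        self.commits += 1

    def test_does_not_commit_before_the_start_time(self):
        self.trigger.tick()
        self.assertEqual(0, self.commits)

    def test_commits_once_the_start_time_is_reached(self):
        self.clock.now = datetime(2024, 1, 1, 23, 55)
        self.trigger.tick()
        self.assertEqual(1, self.commits)

    def test_commits_only_once_per_day(self):
        self.clock.now = datetime(2024, 1, 1, 23, 55)
        self.trigger.tick()
        self.clock.now = datetime(2024, 1, 1, 23, 58)
        self.trigger.tick()
        self.assertEqual(1, self.commits)
```

In production the clock argument could simply be datetime.now, and the commit action would be whatever code actually commits the transactions. That code gets its own tests, with a fake service standing in for the real one, so no bogus transactions ever have to reach a live system.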
Bad designs are hard to test. Is your design any good? If not, don’t you want to know?
It’s hard if not impossible to write a test for something you don’t understand. Do you really know enough to build this feature? Are there questions you should be asking?
TDD works because, among other things, it is demanding, picky, and downright relentless. It is that annoying customer that won’t go away until you address them, and thus it holds your feet to the fire in a way that makes you better at your job, and more valuable to your clients.
To do this, the team must be effectively trained. Just knowing “how to write a unit test” won’t do it, not even close. If you do not know how to control dependencies (like “time” in the example) or how to separate behaviors in your system without over-complicating its design, or what it means to completely express a requirement as a test, then you’ll struggle and probably give up. But if the team is empowered with this knowledge they can move swiftly, confidently, and aggressively to ensure that the organization they serve remains competitive in an ever-changing world. TDD makes you do it right, because doing it wrong simply won’t work.
And isn’t that great?