Project Management

Disciplined Agile


Recovery Testing

Categories: quality, Testing, testing

by Danial Schwartz

In Disciplined Agile Delivery (DAD), testing is so important that we do it all the way through the lifecycle. One approach that your team will need to consider is recovery testing, which assesses the ability of a system to handle faults. If a fault occurs, does the system keep working? Can the system recover within a specified period of time? In the event of a critical failure, will physical, economic, health-related, or other damage result?

Recovery testing consists of making the system fail and then observing how it recovers. Both the system's ability to return to normal operation and the time it takes to do so are examined. The disturbances that can result in failure, and that therefore need to be checked, vary from product to product and from industry to industry.

Consider the healthcare industry and medical devices. When products are developed for the health care industry they have to be in strict accordance with FDA guidelines. They also have to adhere to the guidelines provided by the company for which the product is being made. When recovery tests are made they naturally have to comply with these strict rules. The tests require validation and so does the environment in which they are to be carried out.

The defense industry consists of complex systems embedded within one another. Because these systems are interlinked, recovery testing has to take into account how different systems affect one another. Since the industry has to deal with harsh environmental conditions, these have to be replicated for recovery testing. Doing so is no easy task.

Cloud applications are increasing in popularity. They are part of cloud systems. The cloud systems, in turn, are made up of commodity machines. This allows taking advantage of economies of scale. But this results in needing to use complex software which makes recovery testing quite a challenge.

Before a recovery test can be carried out, the tester has to make sure that recovery analysis has been undertaken. A fail-over test is then designed. The fail-over test determines whether the system can allocate extra resources when a given threshold is reached. It also shows whether, in case of critical failure, the system can redistribute resources and either continue to operate or recover within a specified time.

Consider the example of a server which is reachable but it is not responding as one would expect it to. This is the fail-over cause. The result of this, known as the possible impact, could be a crash. The severity of the impact is medium to high. To simulate this one could initiate wrong responses on the server side.

Another example of a fail-over cause is a power supply failure. If the failure was in the auxiliary power source its possible impact could be a complete shutdown. This is critical. To simulate this the system could be subjected to a change in power strength or the power cord could simply be unplugged.

A low impact severity example includes a DB overload. This could result in slow response time. It could also result in information not being fetched from the DB leading to an error. Using appropriate tools a load test could be created to simulate this scenario.

At times a service might stop, posing a low to high impact severity depending on which service stopped. There might be no impact at all, or an application might stop working. To simulate this one could stop the service manually and observe the impact.
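To make the "stop the service and see what happens" scenario concrete, here is a minimal sketch of a recovery test harness. It injects a fault, polls a health check, and reports how long recovery took, or that the time limit was exceeded. The `FakeService` class, the `measure_recovery` function, and the time limits are all hypothetical names and values chosen for illustration; a real test would stop an actual service and call its real health check.

```python
import time
from typing import Callable, Optional


class FakeService:
    """Hypothetical stand-in for a real service that recovers after a fixed delay."""

    def __init__(self, recovery_delay_s: float) -> None:
        self.recovery_delay_s = recovery_delay_s
        self._down_since: Optional[float] = None

    def stop(self) -> None:
        # Inject the fault: the service is considered down from this moment.
        self._down_since = time.monotonic()

    def is_healthy(self) -> bool:
        if self._down_since is None:
            return True
        # Simulate automatic recovery once the delay has elapsed.
        if time.monotonic() - self._down_since >= self.recovery_delay_s:
            self._down_since = None
            return True
        return False


def measure_recovery(
    inject_fault: Callable[[], None],
    is_healthy: Callable[[], bool],
    limit_s: float,
    poll_s: float = 0.01,
) -> Optional[float]:
    """Inject a fault, then poll until the system is healthy again.

    Returns the observed recovery time in seconds, or None if the system
    did not recover within limit_s (the specified recovery period).
    """
    inject_fault()
    start = time.monotonic()
    while time.monotonic() - start <= limit_s:
        if is_healthy():
            return time.monotonic() - start
        time.sleep(poll_s)
    return None


if __name__ == "__main__":
    service = FakeService(recovery_delay_s=0.05)
    elapsed = measure_recovery(service.stop, service.is_healthy, limit_s=1.0)
    print("recovered" if elapsed is not None else "did not recover in time")
```

Passing the fault injection and the health check in as functions keeps the harness independent of how a given failure, such as a stopped service or an overloaded database, is actually simulated.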

The tester also has to ensure that the test plan and test environment are prepared, information is backed up, the recovery panel has been trained, and a record is kept of the techniques used for recovery.

The resources it consumes and the unpredictable possibilities it must deal with make recovery testing a daunting task, but its benefits are worth the trouble.  First, recovery testing improves system quality. It reduces risk, since one has evidence of how the system behaves in case of a failure.  Second, recovery testing results in staff who are trained to perform recovery when the need arises.  Third, recovery testing surfaces problems and mistakes in a system so they can be fixed before it goes live.  Finally, recovery testing shows how important recovery is and raises awareness of the fact that long term business continuity relies heavily on recovery management.

In conclusion, recovery testing is used to see how a system behaves when failure occurs. Recovery testing can be a tedious process but shows the efficiency of a recovery plan, educates the staff on how to deal with faults and failures which occur in systems, highlights the importance of recovery at times of crisis to members of the IT and business organizations, and shows how important it is to the long term success of a business to have a recovery strategy in case of a disaster.


About the Author

Danial Schwartz is a content strategist who sheds light on various engaging and informative topics related to the health IT and Q&A industry. His belief in technology, compliance and cost reduction has opened new horizons for people in the health care industry. He is passionate about topics such as the Affordable Care Act, EHR, testing, test automation, and privacy and security of data.


Posted by Scott Ambler on: February 22, 2016 11:56 AM | Permalink | Comments (0)

DevOps Strategies: Development

DevOps Practices - Development

In addition to the general strategies described in a previous posting, there are several common development practices that support Disciplined DevOps:

  • Canary tests.  A canary test is a small experiment where new functionality is deployed to a subset of end users so you can determine whether that functionality is of interest to them.  This in turn provides insight to the development team as to the true potential value of the functionality (if any).  For example, an e-commerce company might believe that a new feature where people can buy two related items at a discount will help to increase sales.  At the same time they fear this could decrease overall revenue.  So they decide to run a canary test where 5% of their customers are provided this functionality for a two-week period.  Sales and revenue are tracked and compared against customers not given access to this new functionality. If a new feature successfully passes a canary test it is then made available to a wider range of end users (you may choose to run several rounds of canary tests before finally deploying the functionality to all users).  You can think of canary testing as an extreme form of pilot testing.
  • Split tests.  A split test, also known as an A/B test, is an experiment where two or more options are run in parallel so that their effectiveness can be compared.  For example, a bank may identify three different screen design strategies to transfer funds between two accounts via an automated teller machine (ATM).  Instead of holding endless meetings, focus groups, or modelling sessions the bank instead decides to implement all three strategies and put them into production in parallel.  When I use an ATM I’m always presented with strategy A, when you login you always get strategy B, and so on. Because the ATM solution is instrumented to track important usage metrics the bank is able to determine which of the three strategies is most effective.  After the split test is completed the winning strategy is made available to all users of ATMs.
  • Automated regression testing.   Agile software developers are said to be “quality infected” because of their focus on writing quality code and their desire to test as often and early as possible. As a result, automated regression testing is a common practice adopted by agile teams, which is sometimes extended to test-first approaches such as test-driven development (TDD) and behavior-driven development (BDD).  The regression test suite(s) may address function testing, performance testing, system integration testing (SIT), acceptance testing, and many more categories of tests.  Because agile teams commonly run their automated test suites many times a day, and because they fix any problems they find right away, they enjoy higher levels of quality than teams that don’t. Because some tests can take a long time to run, in particular load/stress tests and performance tests, a team will choose to have several test suites running at different cadences (i.e. some tests run at every code check in, some tests run at scheduled times each day, some once every evening, some over the weekend, and so on).  This greater focus on quality is good news for operations staff who insist that a solution must be of sufficient quality before approving its release into production.
  • Continuous integration (CI).  Continuous integration (CI) is the discipline of building and validating a project automatically whenever a file is checked into your configuration management (CM) system.  As you see in the following diagram, validation can occur via several strategies such as automated regression testing and even static or dynamic code and schema analysis. CI enables developers to develop a high-quality working solution safely in small, regular steps by providing immediate feedback on code defects.

Continuous integration process

  • Continuous deployment (CD).  Continuous deployment extends the practice of continuous integration. With continuous deployment, when your integration is successful in one sandbox your changes are automatically promoted to the next sandbox.  The CI strategy running in that environment automatically integrates your solution there because of the updated source files. As you can see in the following diagram this automatic promotion continues until the point where any changes must be verified by a person, typically at the transition point between development and operations. Having said that, advanced teams are now automatically deploying into production as well.  Continuous deployment enables development teams to reduce the time between a new feature being identified and being deployed into production. It enables the business to be more responsive. However, when development teams aren’t sufficiently disciplined continuous deployment can increase operational risk by increasing the potential for defects to be introduced into production. Successful continuous deployment in an enterprise environment requires an effective continuous integration strategy in place in all sandboxes.

Continuous deployment process
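The canary and split tests described above both depend on assigning each user to a group consistently, so that the same person always sees the same behavior. One common way to do this is to hash the user identifier. The following is a minimal sketch under that assumption; the function names, the experiment names, and the choice of SHA-256 are illustrative rather than part of any particular framework.

```python
import hashlib
from typing import Sequence


def bucket(user_id: str, experiment: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) for one experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % buckets


def in_canary(user_id: str, experiment: str, percent: int) -> bool:
    """True if this user belongs to the canary group of the given size."""
    return bucket(user_id, experiment) < percent


def split_variant(user_id: str, experiment: str, variants: Sequence[str]) -> str:
    """Pick one of several variants; the same user always gets the same one."""
    return variants[bucket(user_id, experiment, len(variants))]


if __name__ == "__main__":
    # Roughly 5% of customers see the hypothetical bundle discount feature.
    users = [f"customer-{i}" for i in range(10_000)]
    canary = [u for u in users if in_canary(u, "bundle-discount", 5)]
    print(f"{len(canary)} of {len(users)} users are in the canary group")

    # Three hypothetical screen designs for the funds transfer split test.
    print(split_variant("customer-42", "transfer-screen", ["A", "B", "C"]))
```

Including the experiment name in the hash means that a user who lands in the canary group for one experiment is not automatically in the canary group for every other experiment.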

There are also several common operations-friendly features that developers with a Disciplined DevOps mindset will choose to build into their solutions:

  • Feature access control.   To support experimentation strategies such as canary tests and split tests it must be possible to limit end user access to certain features. This strategy must be easy to configure and deploy.  A common approach is to have XML-based configuration files, read into memory, that contain the meta-data required to drive an access control framework.
  • Monitoring instrumentation.  Developers with a Disciplined DevOps mindset will build instrumentation functionality – logging and better yet real-time alerts – into their solutions.  The purpose is to enable monitoring, in (near) real-time, of their systems when they are operating in production.  This is important to the people responsible for keeping the solution running, to people supporting the solution, to people responsible for debugging and fixing any problems, and to your operational intelligence efforts.  Monitoring instrumentation enables canary tests and split tests in that it provides the data required to determine the effectiveness of the feature or strategy under test.
  • Feature toggles. A feature toggle is effectively a software switch that allows you to turn features on (and off) when appropriate.  A common strategy is to turn on a collection of related functionality that provide a value stream, often described by an epic or use case, all at once when end users are ready to accept it.  Feature toggles are also used to turn off individual features when it’s discovered that the feature isn’t performing well (perhaps the new functionality isn’t found to be useful by end users, perhaps it results in lower sales, …).  Another benefit of feature toggles is that they enable you to test and deploy functionality into production on an incremental basis.
  • Self-testing.  One strategy to make a solution more robust, and thus easier to operate, is to make it self testing.  The basic idea is that each component of a solution includes basic tests to validate that it can properly run while in production.  For example, an application server may run basic tests at startup such as verifying the version of the operating system or of frameworks that it relies on.  While the server is running it might regularly check to see if other components that it relies on, such as data sources and middleware services, are available.  When a problem is detected it minimally should be logged, better yet an alert should be posted if intervention by a person is required, and even better yet the solution should try to recover from the problem.
  • Self-recovery.  When a system runs into a problem it should do its best to automatically recover and continue on as before.  For example, if the system detects that a data source is no longer available it should try to restart that data service.  If that fails, it should record change transactions where possible and then process them once the data service becomes available again.  A good example of this is an ATM.  When ATMs lose their connection to a bank’s financial processing system they will continue on for a period of time independently albeit with limited functionality.  They will allow people to withdraw money from their accounts, perhaps putting a limit on the amount withdrawn to limit potential problems with overdrawn accounts.  People will still be able to deposit money but will not be able to get a current balance or see a statement of recent transactions.  Self-recovery functionality provides a better experience to end users and reduces the operational burden on your organization.
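As an illustration of feature access control and feature toggles driven by an XML configuration file, consider the sketch below. The XML layout, the element and attribute names, and the feature and group names are all assumptions made up for this example; your own access control framework will have its own format.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; in practice this would be read from a file.
CONFIG_XML = """
<features>
  <feature name="bundle-discount" enabled="true">
    <allow group="canary"/>
  </feature>
  <feature name="new-transfer-screen" enabled="false"/>
  <feature name="statements" enabled="true"/>
</features>
"""


def load_toggles(xml_text):
    """Parse the configuration into an in-memory dictionary of toggles."""
    root = ET.fromstring(xml_text)
    toggles = {}
    for feature in root.findall("feature"):
        toggles[feature.get("name")] = {
            "enabled": feature.get("enabled") == "true",
            "groups": {allow.get("group") for allow in feature.findall("allow")},
        }
    return toggles


def is_enabled(toggles, name, user_groups=frozenset()):
    """Check whether a feature is switched on for a user in the given groups."""
    toggle = toggles.get(name)
    if toggle is None or not toggle["enabled"]:
        # Unknown or switched-off features are unavailable to everyone.
        return False
    # A feature with no <allow> elements is open to all users.
    return not toggle["groups"] or bool(toggle["groups"] & set(user_groups))


if __name__ == "__main__":
    toggles = load_toggles(CONFIG_XML)
    print(is_enabled(toggles, "bundle-discount", {"canary"}))
    print(is_enabled(toggles, "bundle-discount", {"regular"}))
```

Because the toggles live in configuration rather than in code, operations staff can turn a poorly performing feature off without requiring a new build.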

Now that we have overviewed a collection of development practices and implementation features, in the next blog posting in this series we will explore strategies that streamline your operations efforts.

Posted by Scott Ambler on: February 08, 2015 06:15 AM | Permalink | Comments (0)

Accelerate Value Delivery

One of the process goals that a disciplined agile team will want to address during construction is Accelerate Value Delivery.  Ideally, in each construction iteration a team will move closer to having a version of their solution that provides sufficient functionality to its stakeholders.  This implies that the solution is a minimally viable product (MVP) that adds greater business value than its cost to develop and deploy.  Realistically it isn’t a perfect world and sometimes a team will run into a bit of trouble resulting in an iteration where they may not have moved closer to something deployable but hopefully they’ve at least learned from their experiences.

This is an important process goal for several reasons. First, it encompasses the packaging aspects of solution development (other important development aspects are addressed by its sister goal Produce a Potentially Consumable Solution).  This includes artifact/asset management options such as version control and configuration management as well as your team’s deployment strategy.  Second, it provides deployment planning options, from not planning at all (yikes!) to planning late in the lifecycle to the more DevOps-friendly strategies of continuous planning and active stakeholder participation. Third, this goal covers critical validation and verification (V&V) strategies, many of which push testing and quality assurance “left in the lifecycle” so that they’re performed earlier, thereby reducing the average cost of fixing any defects.

The process goal diagram for Accelerate Value Delivery is shown below. The rounded rectangle indicates the goal, the squared rectangles indicate issues or process factors that you may need to consider, and the lists in the right hand column represent potential strategies or practices that you may choose to adopt to address those issues. The lists with an arrow to the left are ordered, indicating that in general the options at the top of the list are more preferable from an agile point of view than the options towards the bottom. The highlighted options (bolded and italicized) indicate default starting points for teams looking for a good place to start but who don’t want to invest a lot of time in process tailoring right now. Each of these practices/strategies has advantages and disadvantages, and none are perfect in all situations, which is why it is important to understand the options you have available to you.

Accelerate Value Delivery process goal

Let’s consider each process factor:

  • Choose a Deployment Strategy.  Deployment can be a struggle for teams new to agile.  Many teams will start by aiming to deploy their solution into production more regularly, perhaps every few months instead of every six to twelve months.  Then they will start deploying working builds internally at the end of each iteration, perhaps into demo or testing environments.  Finally they will hopefully evolve into a continuous deployment (CD) strategy.
  • Manage Assets.  The issue here is how your team will manage the various artifacts, including code, which they use and create while building a solution.  Sadly we’ve found that some teams still struggle with basic configuration management control.  Although they may have their source code under fairly sophisticated control other artifacts such as supporting documentation often aren’t.
  • Document.  Supporting documentation, such as user guides and system overviews, are part of the overall solution that the team is working on.  The DA toolkit leverages strategies from Agile Modeling to address this process factor.  Your team may choose to leave such documentation to the end of the lifecycle or to write documentation throughout the lifecycle.
  • Plan Deployment.  There are several techniques that you may want to consider for deployment planning, an important aspect of your overall DevOps strategy.  Although some teams may begin such planning with their operations/release engineers late in the lifecycle, many will instead plan throughout the lifecycle.  The DA toolkit recognizes operations staff as key stakeholders, people whom you will actively work with to plan and then deploy your solution.
  • Maintain Traceability.  Traceability from requirements through your design to your code to your tests, or a subset thereof, may be required by your team.  This is common for some, but not all, regulations.  Traceability is often perceived as critical to enable impact analysis, although in practice this is questionable as manually maintained traceability matrices are rarely kept up to date.
  • Validate.  DAD captures the fact that there are many ways that you can choose to validate your work, including a range of agile quality techniques (TDD, CI, ATDD/BDD) and even a few that are not-so-agile (end-of-lifecycle testing, manual testing, parallel independent testing).  The DA toolkit purposely includes these not-so-agile strategies because in some situations, particularly at scale, they may in fact be your best options.  Furthermore, your team may be in the early stages of becoming agile and as a result the not-so-agile strategies may be the best they are currently capable of doing.
  • Verify.  DAD also recommends that you consider verification strategies to help increase the quality of your work. These strategies include reviews, non-solo development strategies such as pair programming and modeling with others (which are effectively continuous reviews), and including code analysis tools in your CI strategy.
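Regarding the Maintain Traceability factor above, one way to avoid a manually maintained matrix is to derive it from the tests themselves. The sketch below is only an illustration of that idea: the `covers` decorator, the requirement identifiers, and the sample tests are hypothetical and not part of the DA toolkit or of any testing library.

```python
def covers(*requirement_ids):
    """Decorator that records which requirement IDs a test function covers."""
    def mark(test_func):
        test_func.covers = tuple(requirement_ids)
        return test_func
    return mark


@covers("REQ-1")
def test_withdraw_reduces_balance():
    assert 100 - 30 == 70


@covers("REQ-1", "REQ-2")
def test_withdraw_rejects_overdraft():
    assert not (30 <= 20)


def build_trace_matrix(tests, requirement_ids):
    """Map each requirement ID to the names of the tests that cover it."""
    matrix = {req: [] for req in requirement_ids}
    for test in tests:
        for req in getattr(test, "covers", ()):
            matrix.setdefault(req, []).append(test.__name__)
    return matrix


def untested(matrix):
    """List the requirements that no test claims to cover."""
    return sorted(req for req, tests in matrix.items() if not tests)


if __name__ == "__main__":
    matrix = build_trace_matrix(
        [test_withdraw_reduces_balance, test_withdraw_rejects_overdraft],
        ["REQ-1", "REQ-2", "REQ-3"],
    )
    for req, tests in matrix.items():
        print(req, "->", ", ".join(tests) or "(no tests)")
    print("Untested:", untested(matrix))
```

Generating the matrix as part of your build keeps it in step with the tests, although it only traces requirements to tests, not through design and code.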

We want to share two important observations about this goal.  First, this goal, along with Explore Initial Scope, Coordinate Activities, and Identify Initial Technical Strategy, seems to take the brunt of your process tailoring efforts when working at scale.  It really does seem to be one of those Pareto situations where 20% addresses 80% of the work; more on this in a future blog posting.  As you saw in the discussion of the process issues, the process tailoring decisions that you make regarding this goal will vary greatly based on the various scaling factors.  Second, as with all process goal diagrams, the one above doesn’t provide an exhaustive list of options although it does provide a pretty good start.

We’re firm believers that a team should tailor their strategy, including their team structure, their work environment, and their process, to reflect the situation that they find themselves in.  When it comes to process tailoring, process goal diagrams not only help teams to identify the issues they need to consider, they also summarize potential options available to them.  Agile teams with a minimal bit of process guidance such as this are in a much better situation to tailor their approach than teams that are trying to figure it out on their own.  The DA toolkit provides this guidance.

Posted by Scott Ambler on: February 22, 2014 05:11 AM | Permalink | Comments (0)

Strategies for Verifying Quality/Non-Functional Requirements

Early in the lifecycle, during the Inception phase, disciplined agile teams will invest some time in initial requirements envisioning and initial architecture envisioning. One of the issues to be considered as part of requirements envisioning is to identify non-functional requirements (NFRs), also called quality of service (QoS) or simply quality requirements. The NFRs will drive many of the technical decisions that you make when envisioning your initial architectural strategy. These NFRs should be captured somehow and implemented during Construction. It isn’t sufficient to simply implement the NFRs; you must also validate that you have done so appropriately. In this blog posting I overview a collection of agile strategies that you can apply to validate NFRs.

A mainstay of agile validation is the philosophy of whole team testing. The basic idea is that the team itself is responsible for validating its own work, they don’t simply write some code and then throw it over the wall to testers to validate. For organizations new to agile this means that testers sit side-by-side with developers, working together and learning from one another in a collaborative manner. Eventually people become generalizing specialists, T-skilled people, who have sufficient testing skills (and other skills).

Minimally your developers should be performing regression testing to the best of their ability, adopting a continuous integration (CI) strategy in which the regression test suite(s) are run automatically many times a day.  Advanced agile teams will take a test-driven development (TDD) approach where a single test is written just before writing sufficient production code to fulfill that test.  Regardless of when tests are written by the development team, either before or after the writing of the production code, some tests will validate functional requirements and some will validate non-functional requirements.
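To show what it means for one regression suite to validate both kinds of requirements, here is a small sketch using Python's standard `unittest` module. The `find_customer` function, the data, and the 0.5 second budget are hypothetical; a real performance NFR would be stated by your stakeholders and measured in a representative environment.

```python
import time
import unittest


def find_customer(customers, customer_id):
    """Production code under test: look up a customer by ID."""
    return customers.get(customer_id)


class FindCustomerTests(unittest.TestCase):
    def setUp(self):
        self.customers = {i: f"customer-{i}" for i in range(100_000)}

    def test_returns_matching_customer(self):
        # Validates a functional requirement: the right customer is returned.
        self.assertEqual(find_customer(self.customers, 42), "customer-42")

    def test_unknown_customer_returns_none(self):
        # Validates a functional requirement: unknown IDs yield no customer.
        self.assertIsNone(find_customer(self.customers, -1))

    def test_lookups_meet_response_time_budget(self):
        # Validates a non-functional requirement: 1,000 lookups must finish
        # within an assumed budget of 0.5 seconds.
        start = time.perf_counter()
        for customer_id in range(1_000):
            find_customer(self.customers, customer_id)
        self.assertLess(time.perf_counter() - start, 0.5)


def run_suite():
    """Run the suite programmatically, as a CI script might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FindCustomerTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("all tests passed" if result.wasSuccessful() else "failures found")
```

Timing-based tests like the last one can be sensitive to the machine they run on, which is one reason slower performance and load tests are often moved into a separate suite.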

Whole team testing is great in theory, and it is a strategy that I wholeheartedly recommend, but in some situations it proves insufficient.  It is wonderful to strive to have teams with sufficient skills to get the job done, but sometimes the situation is too complex to allow that.  There are some types of NFRs which require significant expertise to address properly: NFRs pertaining to security, usability, and reliability for example.  To validate these types of requirements, worse yet even to identify them, requires skill and sometimes even specialized (read expensive) tooling.  It would be a stretch to assume that all of your delivery teams will have this expertise and access to these tools.

Recognizing that whole team testing may not sufficiently address validating NFRs, many organizations will supplement their whole team testing efforts with parallel independent testing.  With this approach a delivery team makes their working builds available to a test team on a regular basis, minimally at the end of each iteration, and the testers perform the types of testing on it that the delivery team is either unable or unlikely to perform.  Knowing that some classes of NFRs may be missed by the team, independent test teams will look for those types of defects.  They will also perform pre-production system integration testing and exploratory testing to name a few.  Parallel independent testing is also common in regulatory compliance environments.

From a verification point of view some agile teams will perform either formal or informal reviews.  Experienced agilists prefer to avoid reviews due to their inherently long feedback cycle, which increases the average cost of addressing found defects, in favor of non-solo development strategies such as pair programming and modeling with others.  The challenge with non-solo strategies is that managers unfamiliar with agile techniques, or perhaps the real problem is that they’re still overly influenced by disproved traditional theories of yesteryear, believe that non-solo strategies reduce team productivity.  When done right non-solo strategies increase overall productivity, but the political battle required to convince management to allow your team to succeed often isn’t worth the trouble.

Another strategy for validating NFRs is code analysis, both dynamic and static.  There is a range of analysis tools available to you that can address NFR types such as security, performance, and more.  These tools will not only identify potential problems with your code, many of them will also provide summaries of what they found, metrics that you can leverage in your automated project dashboards.   Leveraging tool-generated metrics such as these is a technique which IBM calls Development Intelligence and is highly suggested as an enabler of agile governance in DAD. Disciplined agile teams will include invocation of code analysis tools from your CI scripts to support continuous validation throughout the lifecycle.

Your least effective validation option is end-of-lifecycle testing; in the traditional development world this would be referred to as a testing phase.  The problem with this strategy is that you in effect push significant risk, and significant costs, to the end of the lifecycle.  It has been known for several decades now that the average cost of fixing defects rises the longer it takes you to identify them, motivating you to adopt the more agile forms of testing that I described earlier.  Having said that, I still run into organizations in the process of adopting agile techniques that haven’t really embraced agile and as a result still leave most of their testing effort to the least effective time to do such work.  If you find yourself in that situation you will need to validate NFRs in addition to functional requirements.

To summarize, you have many options for validating NFRs on agile delivery teams.  The secret is to pick the right one(s) for the situation that you find yourself in.  The DA toolkit helps to guide you through these important process decisions, describing your options and the trade-offs associated with each one.


Posted by Scott Ambler on: October 23, 2012 07:49 AM | Permalink | Comments (0)