At the Agile 2019 conference in DC I facilitated a workshop with about 70 people where we explored how to coach an agile data warehousing (DW)/business intelligence (BI) team. To do this we worked through four issues:
The basic strategy was to introduce the issues to the class one at a time, then at their tables they would discuss the issue and write up to five ideas on sticky notes, then we’d share the ideas. Pictures of the flipcharts for each issue follow below. After the groups shared their ideas I then shared my thoughts with the class.
Issue #1: What Challenges Do You Face Coaching DW/BI Teams?
As you can see the class identified a lot of the classic issues that agile coaches face in general, such as trust issues, the teams being management-driven instead of self-organizing, lack of agile skills within the team, cross-team dependencies, and being overwhelmed with work. Certainly there were DW/BI flavours of common problems, such as how to do vertical slices of DW/BI functionality and which lifecycle (agile, lean, CD, …) to follow. But there were also DW/BI specific issues, such as lack of access to data sources, knowing the actual data, and DW/BI architecture and design strategies. These DW/BI specific issues are where agile coaches tend to get hung up.
In my discussion of the challenges that we face when coaching agile DW/BI teams I shared my thoughts on the cultural impedance mismatch that exists between the agile and data communities. This mismatch makes it a bit more difficult to engage with data teams as opposed to application development teams. I also shared results of studies (2009, 2013, 2016, 2018) around data quality challenges and practices. It is certainly common for teams to have to deal with technical debt, but data technical debt is different in nature than code quality debt, and the traditional data culture has led data teams down a very questionable (read dysfunctional) path regarding quality practices.
Issue #2: What Skills/Knowledge Does an Agile DW/BI Coach Require?
The second issue that we explored was what skills/knowledge an agile DW/BI coach needs. Once again the groups identified both classic agile coaching ideas as well as DW/BI specific ideas. Clearly you need coaching skills in order to coach a DW/BI team. But you also need to be knowledgeable about critical skills such as data modeling, data analysis, database testing, database refactoring, and others. You might not be an expert at these things, but you need to know of them and be able to guide the team in their adoption. You’ll also need to be able to speak intelligently about why some of the traditional strategies that they likely hold near and dear to their hearts (remember the cultural impedance mismatch) need to be abandoned for better, more effective strategies.
In my discussion I overviewed the “agile database techniques stack,” a collection of agile strategies and practices for database development. The stack is:
As you can see, the techniques in this list are fairly common from an agile point of view, albeit in their data(base) versions. The point is that techniques exist that enable data professionals to work in an agile, and far more effective, manner. As a coach you will need to be aware of these strategies and be able to help your DW/BI team adopt them. And of course there are agile data management strategies that you need to be aware of as well.
Issue #3: What Strategies Should You Use To Engage Successfully With An Agile DW/BI Team?
The groups identified a collection of great strategies for engaging with DW/BI teams. Once again there were a lot of standard coaching strategies (a DW/BI team is still a group of people after all), but there was also a focus on strategies to address the DW/BI challenges identified earlier.
In the discussion that followed the sharing of the stickies, a very interesting point was brought up. I had earlier stated that my experience with coaching DW/BI teams was that it was different than coaching other types of teams, mostly because of the cultural impedance mismatch. A handful of agile DW/BI coaches in the audience disagreed with that, pointing out that the critical issue was gaining the trust and respect of the team at the start. This is true of any team, and certainly true of DW/BI teams. To do this you need to understand and appreciate the issues that they deal with and be able to show that you know how to guide them through addressing their issues. You might not be an expert in the techniques of the agile database technique stack, or other important agile data techniques, but you do know of them and can help the team learn them. So yes, engaging with an agile DW/BI team is no different on the surface, but it does require the coach to have different skills and knowledge than what your typical agile coach has.
Issue #4: What Are The Qualities You Should Look For In An Agile DW/BI Coach?
For this exercise I pretty much asked the groups to put together a list of qualities for a job ad for an Agile DW/BI coach. This is what they came up with.
Here are our main learnings regarding coaching an agile DW/BI team:
On Tuesday, August 7 I facilitated a workshop about Database DevOps at the Agile 2018 conference in San Diego. I promised the group that I would write up the results here in the blog. This was an easy promise to make because I knew that we’d get some good information out of the participants and sure enough we did. The workshop was organized into three major sections:
Overview of Disciplined DevOps
We started with a brief overview of Disciplined DevOps to set the foundation for the discussion. The workflow for Disciplined DevOps is shown below. The main message was that we need to look at the overall DevOps picture to be successful in modern enterprises, that it was more than Dev+Ops. Having said that, our focus was on Database DevOps.
Challenges around Database DevOps
We then ran a From/To exercise where we asked people to identify which aspects of their current situation they’d like to move away from and what they’d like to move their organization towards. The following two pictures (I’d like to thank Klaus Boedker for taking all of the following pics) show what we’d like to move from/to respectively.
I then shared my observations about the challenges with Database DevOps, in particular the cultural impedance mismatch between developers and data professionals, the quality challenges we face regarding data, the lack of testing culture and knowledge within the data community, and the mistaken belief that it’s difficult to evolve production data sources.
Techniques Supporting Database DevOps
The heart of the workshop was to explore techniques that support database DevOps. I gave an overview of several Agile Data techniques so as to give people an understanding of how Database DevOps works, then we ran an exercise. In the exercise each table worked through one of six techniques (there are several supporting techniques that the groups didn’t work through), exploring:
Each team was limited to their top three answers to each of those questions, and each technique was covered by several teams. Each of the following sections has a paragraph describing the technique, a picture of the Strategy Canvas the participants created, and my thoughts on what the group produced. It’s important to note that some of the answers in the canvases contradict each other because each canvas is the amalgam of work performed by a few teams, and each of the teams may have included people completely new to the practice/strategy they were working through.
Just like you can vertically slice the functional aspects of what you’re building, and release those slices if appropriate, you can do the same for the data aspects of your solution. Many traditional data professionals don’t know how to do this, in large part because traditional data techniques are based on waterfall-style development where they’ve been told to think everything through up front in detail. The article Implementing a Data Warehouse via Vertical Slicing goes into this topic in detail.
The advantage of vertical slicing is that it enables you to get something built and into the hands of stakeholders quickly, thereby reducing the feedback cycle. The challenge with it is that you can lose sight of the bigger picture (therefore you want to do some high-level modeling during Inception to get a handle on the bigger picture). To be successful at vertically slicing your work, you need to be able to incrementally model, or better yet agilely model, and implement that functionality.
Agile Data Modeling
There’s nothing special about data modelling: you can perform it in an agile manner just like you can model other things in an agile manner. Once again, this is a critical skill to learn and can be challenging for traditional data professionals due to their culture around heavy “big design up front (BDUF)”. The article Agile Data Modelling goes into detail and, more importantly, works through an example of how to do this.
A database refactoring is a simple change to your database that improves the quality of its design without changing the semantics of the design (in a practical manner). This is a key technique because it enables you to safely evolve your database schema, just like you can safely evolve your application code. Many traditional data professionals believe that it is very difficult and risky to refactor a database, hence their penchant for heavy up-front modeling, but this isn’t actually true in practice. To understand this, see the article The Process of Database Refactoring which summarizes material from the award-winning book Refactoring Databases.
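To make the idea of safely evolving a schema a little more concrete, here is a minimal Python sketch of a column rename with a transition period, using the standard library's sqlite3 module. It is an illustration under stated assumptions rather than a definitive implementation: the customer table, the fname and first_name columns, and the rename_column_with_transition helper are all hypothetical, and the sketch only synchronizes updates (inserts made during the transition would need a similar trigger).

```python
import sqlite3


def rename_column_with_transition(conn, table, old, new, col_type="TEXT"):
    """Introduce `new` next to `old` and keep the two in sync on UPDATE.

    Hypothetical helper for illustration. Identifiers are interpolated
    directly into SQL, so they must come from trusted migration scripts.
    """
    # Step 1: add the new column and copy the existing values across.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {new} {col_type}")
    conn.execute(f"UPDATE {table} SET {new} = {old}")

    # Step 2: scaffolding triggers so that applications still writing to the
    # old column and applications already writing to the new one see the
    # same data. The WHEN clause stops the two triggers from ping-ponging.
    for src, dst in ((old, new), (new, old)):
        conn.execute(
            f"""
            CREATE TRIGGER {table}_{src}_to_{dst}
            AFTER UPDATE OF {src} ON {table}
            WHEN NEW.{src} IS NOT NEW.{dst}
            BEGIN
                UPDATE {table} SET {dst} = NEW.{src} WHERE rowid = NEW.rowid;
            END
            """
        )
    conn.commit()


def demo():
    """Run the refactoring against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")
    conn.execute("INSERT INTO customer (id, fname) VALUES (1, 'Ada')")
    rename_column_with_transition(conn, "customer", "fname", "first_name")
    # A legacy application updates the old column...
    conn.execute("UPDATE customer SET fname = 'Grace' WHERE id = 1")
    # ...and the new column reflects the change.
    return conn.execute("SELECT fname, first_name FROM customer").fetchone()
```

Once every application has moved over to the new column, the transition period ends and the old column and the scaffolding triggers can be dropped.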
Automated Database Regression Testing
If data is a corporate asset then it should be treated as such. Having an automated regression test suite for a data source helps to ensure that the functionality and the data within a database conforms to the shared business rules and semantics for it. For more information, see the article Database Testing.
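One possible shape for such a regression suite is a set of named business rules, each expressed as a query that returns the rows violating it. The Python sketch below uses sqlite3 from the standard library; the schema, the three rules, and the check_rules and make_test_database helpers are assumptions invented for this example, not part of any particular tool.

```python
import sqlite3

# Hypothetical schema, used only for this example.
SCHEMA = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    balance_cents INTEGER NOT NULL
);
"""

# Each rule maps a readable name to a query returning the violating rows.
RULES = {
    "every account belongs to an existing customer": """
        SELECT a.id FROM account a
        LEFT JOIN customer c ON c.id = a.customer_id
        WHERE c.id IS NULL
    """,
    "customer emails are unique": """
        SELECT email FROM customer GROUP BY email HAVING COUNT(*) > 1
    """,
    "account balances are never negative": """
        SELECT id FROM account WHERE balance_cents < 0
    """,
}


def check_rules(conn, rules=RULES):
    """Return {rule name: offending rows} for every rule that is violated."""
    failures = {}
    for name, query in rules.items():
        rows = conn.execute(query).fetchall()
        if rows:
            failures[name] = rows
    return failures


def make_test_database():
    """Build a small in-memory database that satisfies every rule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer VALUES (1, 'ada@example.com')")
    conn.execute("INSERT INTO account VALUES (10, 1, 5000)")
    return conn
```

A test runner would then fail the build whenever check_rules returns a non-empty result. The same pattern can be pointed at a restored copy of production data to check the data itself, and not just the schema.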
Continuous Database Integration
Database changes, just like application code changes, should be brought into your continuous integration (CI) strategy. It is a bit harder to include a data source because of the data. The issue is side effects from tests. In theory a database test should put the database into a known state, do something, check to see if you get the expected results, then put the database back into the original state. It’s that last part that’s the problem because all it takes is one test to forget to do so and there’s the potential for side effects across tests. So, a common strategy is to rebuild (or restore, or a combination thereof) your dev and test databases every so often so as to decrease the chance of this. You might choose to do this in your nightly CI run for example. For more information, see the book Recipes for Continuous Database Integration.
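The known-state, act, check, restore cycle described above, together with the periodic rebuild, can be sketched in Python with the standard library's sqlite3 module. This is only one way to do it, under stated assumptions: the orders table, the baseline rows, and the rebuild_database and isolated_test helpers are hypothetical names for this example, and the restore step here relies on rolling back a transaction rather than on each test cleaning up after itself.

```python
import sqlite3
from contextlib import contextmanager

# Hypothetical schema and baseline data, used only for this example.
SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);"
BASELINE = [(1, "open"), (2, "shipped")]


def rebuild_database(path=":memory:"):
    """Recreate the test database from scratch, e.g. in a nightly CI run."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # only transactions are the ones opened explicitly with BEGIN.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("DROP TABLE IF EXISTS orders")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", BASELINE)
    return conn


@contextmanager
def isolated_test(conn):
    """Run a test inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Put the database back into its original state, pass or fail.
        conn.execute("ROLLBACK")
```

A test that changes data inside `with isolated_test(conn):` leaves no trace afterwards. A test that bypasses the wrapper (or commits on its own) can still leak side effects, which is why the periodic rebuild_database call remains useful as a safety net.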
Operational Data Monitoring
An important part of Operations is to monitor the running infrastructure, including databases. This information can and should be available via real-time dashboards as well as through ad-hoc reporting. Sadly, I still need to write an article on this, but if you poke around the web you’ll find a fair bit of information.
This was a really interesting workshop. We did it in 75 minutes but it really should have been done in a half day to allow for more detailed discussions about each of the techniques. Having said that, I had several very good conversations with people following the workshop about how valuable and enlightening they found it.
Lynn will soon be blogging about the results so I’m not going to dive into that here. I suspect that her blog post will be very interesting.
What I’d like to do here is share a few thoughts about what I observed:
I’m very happy to see that Lynn is actively trying to bridge the agile and data communities, helping us to learn from each other. This is something she’s been doing for years and doing it quite well. My experience is that both communities would benefit greatly from more collaboration with each other.
Key tenets of agile and lean are to work collaboratively and to streamline your workflow respectively. This includes all aspects of your workflow, not just the fun software development stuff that we all like to focus on. This blog posting explores how Data Management activities fit into your overall process.
In the process flow diagram below we see that Data Management is a collaborative effort that has interdependencies with other DA process blades and the solution delivery teams that Data Management is meant to support. Due to the shortened feedback cycles and collaborative nature of the work this can be very different than the current traditional strategies. For example, with a DA approach, the Data Management team works collaboratively with the delivery teams, Operations, and Release Management to evolve data sources. The delivery teams do the majority of the work to develop and evolve the data sources, with support and guidance coming from Data Management. The delivery teams follow guidance from Release Management to add the database changes into their automated deployment scripts, getting help from Operations if needed to resolve any operational challenges. Evolution of data sources is a key aspect of Disciplined DevOps.
This highly collaborative strategy is very different than the typical traditional strategy that requires delivery teams to first document potential database updates, have the updates reviewed by Data Management, then do the work to implement the updates, then have this work reviewed and accepted, then work through your organization’s Release Management process to deploy into production.
In the next blog posting in this series we will explore the internal workflow of a Disciplined Agile approach to Data Management. Stay tuned!
There are several values that are key to your success when transforming to a leaner, more agile approach to Data Management. Taking a cue from the Disciplined Agile Manifesto, we’ve captured these values in the form of X over Y. While both X and Y are important, X proves to be far more important than Y in practice. These values are:
As you can see, we’re not talking about your grandfather’s approach to Data Management. Organizations are now shifting from the slow and documentation-heavy bureaucratic strategies of traditional Data Management towards the collaborative, streamlined, and quality-driven agile/lean strategies that focus on enabling others rather than controlling them.