Data Resource Quality: Turning Bad Habits into Good Practices

Author: Michael H. Brackett

ISBN: 0201713063

Clean Data Is Happy Data
by Alan Zeichick 

Your enterprise’s data resources are among its most important assets. But if the data isn’t clean, if schemas are confused, if definitions aren’t consistent, if data structures aren’t oriented toward your line of business, or if the resources aren’t available to the right people at the right time, then you’re in trouble. Reports and query results will be inaccurate, and so will the decisions based on them. Employees will avoid using data resources that their employers have spent hundreds of thousands or even millions of dollars constructing, and might develop alternative resources of their own, which will only exacerbate the problem.

So, if the information’s going to be useful, the data needs to be clean and properly organized. But what does “clean” mean? According to Michael H. Brackett in “Data Resource Quality: Turning Bad Habits into Good Practices,” clean data means more than double-checking to ensure that a social security number has nine digits, or that the ZIP code a customer provides actually matches the customer’s city and state. It’s also imperative to ensure data resource quality at many levels, from creating strict naming conventions used for database tables, to documenting data integrity rules and dependencies, to making sure that data resources are designed with a business perspective foremost.
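The two field-level checks mentioned above can be sketched in a few lines of Python. This is only an illustration of that baseline, not anything taken from Brackett's book: the function names, the record layout and the tiny `ZIP_REFERENCE` table are all assumptions made for the example, and a real system would draw on a maintained postal data source.

```python
import re

# Hypothetical reference table mapping ZIP codes to (city, state).
# The rows are illustrative only; they stand in for a real postal data source.
ZIP_REFERENCE = {
    "94105": ("San Francisco", "CA"),
    "10001": ("New York", "NY"),
}


def has_nine_digits(ssn):
    """Return True if the value holds exactly nine digits, ignoring hyphens and spaces."""
    digits = re.sub(r"[-\s]", "", ssn)
    return re.fullmatch(r"\d{9}", digits) is not None


def zip_matches_city_state(zip_code, city, state):
    """Return True if the ZIP code is known and agrees with the given city and state."""
    expected = ZIP_REFERENCE.get(zip_code.strip())
    if expected is None:
        return False  # unknown ZIP: the match cannot be confirmed
    expected_city, expected_state = expected
    return (city.strip().lower() == expected_city.lower()
            and state.strip().upper() == expected_state)


def check_customer(record):
    """Return a list of problems found in one customer record (a plain dict)."""
    problems = []
    if not has_nine_digits(record.get("ssn", "")):
        problems.append("ssn: expected nine digits")
    if not zip_matches_city_state(record.get("zip", ""),
                                  record.get("city", ""),
                                  record.get("state", "")):
        problems.append("zip: does not match city and state")
    return problems
```

Checks like these catch individual bad values. They say nothing about naming conventions, documented integrity rules or business orientation, which is where the review's argument goes next.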

Brackett has turned his four decades of experience in data processing into a set of blueprints for recognizing problems with data. This book doesn’t have all the answers; it’s not a guide to creating an ideal enterprise data resource or to re-engineering a dysfunctional data center. But it will help managers ask the right questions when they evaluate their current data resources.

The book is organized in three main sections. The first is a chapter describing what Brackett sees as the state of the enterprise data resource. His contention, which I accept, is that most enterprise data isn’t well organized, structured or validated. It’s not made widely available to those who need it, in the form that they need it. Not only that, but data quality degrades over time. Considering that data resources are expensive, and essential to the business or other organization that created them, data managers face real challenges.

The second section of the book comprises 10 chapters, one dedicated to each of Brackett’s 10 bad habits. More about those shortly.

The third section of the book offers advice on what can be done to overcome those bad habits: not the technical fixes, but the broader organizational, cultural and financial steps that must be taken. Brackett stresses, over and over again, that there’s no silver bullet. Creating and maintaining data resources is hard work, and must be pursued relentlessly.

BAD HABITS, BEST PRACTICES
The meat of “Data Resource Quality” lies in chapters 2 through 11, where Brackett describes each of his 10 bad habits. Each chapter follows the same five-part pattern. It lists the unacceptable or unreasonable practices that make up the habit. It discusses the business impact of those practices. It then suggests corresponding good habits to replace the bad ones, and describes the business impact of those good habits. It concludes with a collection of best practices for turning the bad habits into good ones.

The problem is that the prose in those five parts of each chapter is repetitive, formulaic and downright tiresome to read.

Each of the chapters contains dozens of items, each describing a different attribute of the bad habit. Unfortunately, those items overlap, refer both forward and backward to other material in the book (without page numbers), and, most annoying of all, have one-sentence summaries stuck in the middle of each item, surrounded by a thick-bordered box. This swiftly becomes tiresome and distracting. The author really seems to think in PowerPoint bullets, and I’d wager that this book was written from a set of them.

Patient digging, however, reveals pure gold. The first five bad habits that Brackett describes involve the structure of the data resources themselves. The good practices he offers in their place are formal data names, formal data definitions, proper data structure, precise data integrity rules and robust data documentation. Those are the relatively easy habits to spot and overcome.

The next five are the hard ones, because they’re often deeply entrenched throughout a data center. Overcoming them means having a reasonable data orientation, providing acceptable levels of data availability, assigning adequate responsibility for the data, having an expanded data vision that fits the business, and ensuring that the value of the data is recognized appropriately.

Like I said, pure gold. Many organizations claim that their data is a strategic asset. It’s time to treat it as such. If your responsibility encompasses the creation or maintenance of such data resources, or if your teams are building new systems that will interface with enterprise data, this book will help you evaluate the strengths and weaknesses of those data resources. It’s hard to read, but it’s worth the effort. 

Reprinted with permission from SDTimes. Originally appeared in Issue 18, November 15, 2000.
