Data Resource Quality: Turning Bad Habits into Good Practices
ISBN: 0201713063
Your enterprise's data resources are considered to be among its most important assets. But if the data isn't clean, if schemas are confused, if definitions aren't consistent, if data structures aren't oriented toward your line of business, and if the resources aren't available to the right people at the right time, then you're in trouble. Reports and query results will be inaccurate, as will be decisions based on unreliable information. Employees will avoid using data resources that their employers have spent hundreds of thousands or even millions of dollars constructing, and might choose to develop alternative resources, which will only exacerbate the problem.
So, if the information's going to be useful, the data needs to be clean and properly organized. But what does "clean" mean? According to Michael H. Brackett in Data Resource Quality: Turning Bad Habits into Good Practices, clean data means more than double-checking to ensure that a social security number has nine digits, or that the ZIP code a customer provides actually matches the customer's city and state. It's also imperative to ensure data resource quality at many levels, from creating strict naming conventions used for database tables, to documenting data integrity rules and dependencies, to making sure that data resources are designed with a business perspective foremost.
Reprinted with permission from SDTimes. Originally appeared in Issue 18, November 15, 2000.
Clean Data Is Happy Data
by Alan Zeichick
Brackett has turned his four decades of experience in data processing into a set of blueprints for recognizing problems with data. This book doesn't have all the answers; it's not a guide to creating an ideal enterprise data resource, or for re-engineering a dysfunctional data center. But it will help managers ask the right questions when they evaluate their current data resources.
The book is organized in three main sections. The first is a chapter describing what Brackett sees as the state of the enterprise data resource. His contention, which I accept, is that most enterprise data isn't well organized, structured or validated. It's not made widely available to those who need it, in the form that they need it. Not only that, but data quality degrades over time. Considering that data resources are expensive, and essential to the business or other organization that created them, data managers face real challenges.
The second section of the book comprises 10 chapters, one dedicated to each of Brackett's 10 bad habits. More about those shortly.
The third section of the book provides advice as to what can be done to overcome those bad habits: not the technical fixes, but the broader organizational, cultural and financial steps that must be taken. He stresses, over and over again, that there's no silver bullet. Creating and maintaining data resources is hard work, and must be pursued relentlessly.
BAD HABITS, BEST PRACTICES
The meat of Data Resource Quality lies in chapters 2 through 11, where Brackett describes each of his 10 bad habits. Each chapter describes a list of unacceptable or unreasonable items. It discusses the business impact of those habits. It then suggests corresponding good habits to replace the bad habits, and the business impact of those good habits. It concludes with a collection of best practices for turning those bad habits into good ones.
The problem is that the prose in those five parts of each chapter is repetitive, formulaic and downright tiresome to read.
Each of the chapters contains dozens of items, each of which describes a different attribute of the bad habit. Unfortunately, those items overlap, refer both forward and backward to other material in the book (without page numbers), and, most annoying of all, have one-sentence summaries stuck in the middle of each item, surrounded by a thick-bordered box. This swiftly became tiresome and distracting. The author really seems to think in PowerPoint bullets, from which I'd wager this book was written.
Patient digging, however, reveals pure gold. The first five bad habits that Brackett describes involve the structure of the data resources themselves, and cover formal data names, formal data definitions, proper data structure, precise data integrity rules and robust data documentation. Those are the relatively easy habits to spot and overcome.
The next five are the hard ones, because they're often deeply entrenched throughout a data center: having a reasonable data orientation, providing acceptable levels of data availability, assigning adequate responsibility for the data, having an expanded data vision that fits the business, and ensuring that the value of the data is recognized appropriately.
Like I said, pure gold. Many organizations claim that their data is a strategic asset. It's time to treat it as such. If your responsibility encompasses the creation or maintenance of such data resources, or if your teams are building new systems that will interface with enterprise data, this book will help you evaluate the strengths and weaknesses of those data resources. It's hard to read, but it's worth the effort.