A general data management reading list

This handful of books is a good starting library for anyone seeking a broad understanding of data management.

DAMA Data Management Body of Knowledge (DAMA DMBOK)

DAMA International sponsors the Data Management Body of Knowledge (DAMA DMBOK®) and the Certified Data Management Professional (CDMP®) qualification.
The DMBOK was substantially revised for its second edition, published in 2017.
For clarity and comprehensiveness this is a hard act to beat. It’s well organised and most sections are well written. It’s worth acquiring even if you have no plans to sit a CDMP exam.

The book itself can be purchased from multiple retailers, in physical or digital form. There are links to related material on the DAMA website.

Data Modeling for the Business – Steve Hoberman, Donna Burbank, Christopher Bradley

Many ‘data professionals’ don’t love this book. But that’s OK: it’s not aimed at them. This is an attempt to explain how the discipline of data modelling, particularly the core concepts stored in a high-level data model, can be used by an organisation. You don’t need any technical knowledge to absorb the material. The first chapter draws parallels between a data model and a house blueprint in a way that can feel a bit laboured, but it picks up from there.

The three authors are all influential educators in the data management world – they write clearly and can address a broad audience.

Getting in Front on Data: who does what – Thomas C Redman

A few Harvard Business Review articles have strongly influenced how data management challenges are talked about. Thomas Redman’s 2013 article “Data’s Credibility Problem” was one of these. Getting in Front on Data expands on the themes of that article while making the same overriding point: organisations need to find and eliminate the root causes of data error.

A quote from Redman himself is the best introduction to his work:

“It is clear to me that data is becoming the key asset of our times. Yet most data is in pretty bad shape, most companies are not very good at putting their data to work and their organizations are singularly ill-suited for data. Solving these problems is the management challenge of the 21st century.”

Redman frames data quality issues in terms of business processes and the choices people make within them. He’s definitely someone to read if you have a job that involves making decisions about organisational investment priorities.

Achieving Buzzword Compliance: Data Architecture Language and Vocabulary – David C Hay

It’s a shame about the title, but we agree with John Zachman’s preface: the time David Hay has invested in understanding, summarizing and contextualizing seminal pieces of work is invaluable. If you work with data or with technology you should read this book.

You may question how learning about the history and vocabulary of the data modelling industry can help you, because the problems you’re trying to solve are specific and concrete. Stay with it, though, and you’ll have a stronger foundation for understanding those problems, and will be able to ask more informed and pointed questions about the solutions that others (your technical staff? vendors?) propose.

There are some typographical and grammatical flaws that we hope will be remedied in a future edition. There are also some authorial idiosyncrasies that a strong editor would have challenged.

But the book is an astonishing achievement. One of our data architects read passages aloud at us for two weeks, and claims that the section on The Semantic Data Model is the best summary of the topic that exists.

Of course, you may consider that to be a powerful reason to avoid both Hay’s writing and our workplace. If so, we suggest you move on to Danette McGilvray’s Executing Data Quality Projects – which contains no references to ontology, and only two to semantics.

Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information – Danette McGilvray

Danette McGilvray sees organisational data quality goals from a project manager’s perspective, as the results of structured and repeatable sequences of planned tasks. Executing Data Quality Projects captures that viewpoint, providing a practical and concrete toolkit for implementing a data quality project. The book includes many guidelines, tips and templates. Along with David Loshin, McGilvray has a claim to the title of most influential data quality management practitioner in North America. A web search will turn up opportunities to hear her speak about her ten-step approach, and she also offers formal training.