Many organizations are focusing their data management and “data as an asset” governance programs on improving data for analytical purposes. Several of my present clients are following this trend. One problem they are facing is that their data has been unattended for many years, leading to lower-than-desired levels of data quality and an inability to integrate the data in a cost-effective manner.
The quality of data available to provide analytical capabilities is a sore subject for many organizations. Every organization wants to be able to predict customer behavior, improve the efficiency and effectiveness of its supply chain, reduce production costs or achieve whatever its business use of “good data” may be.
“Improved analytical capabilities begin with quality data that is defined, produced and used effectively.”
This column is the first in a three-part series that addresses what it takes to achieve “good data.” The activities required to achieve “good data” break down simply and logically into three areas, which I have mentioned in past columns. This column focuses on the first of them. The three areas are:
1) Improving Data Definition
2) Improving Data Production
3) Improving Data Usage
Improving data definition is a vital part of improving overall data discipline. I am starting this series with data definition because it becomes the underlying determinant of quality in the production and usage phases of data. I will explain as I go along. Stay tuned for the columns on data production and data usage in the next few issues of TEQ Magazine.
Improving Data Definition
Start with Business Glossaries, Data Dictionaries and Metadata Management
These three items are past, present and future industry buzzwords. Organizations create these metadata (data about data) records of business terminology, database descriptions and end-to-end knowledge of the data that their information systems contain to provide improved knowledge and understanding of data and data-related assets. There are many webinars, white papers and articles on what to include in these resources.
Taking advantage of these resources requires that people have the responsibility to consistently and methodically capture, validate, share and maintain information about the data that our organizations use to operate the business, report accurately, make good decisions and provide improved analytical capabilities based on knowledge and understanding of the data being analyzed.
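To make the "capture, validate, share and maintain" responsibility concrete, here is a minimal sketch of a business glossary entry and a check that flags entries nobody is looking after. The field names (`steward`, `last_reviewed`), the sample terms and the one-year review window are my own assumptions for illustration, not an industry standard or any particular tool's format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class GlossaryTerm:
    """One business glossary entry (all field names are illustrative)."""
    name: str
    definition: str
    steward: Optional[str] = None         # person or group accountable for the definition
    last_reviewed: Optional[date] = None  # when the definition was last validated


def find_gaps(terms, today, max_age_days=365):
    """Return {term name: [problems]} for entries that need attention."""
    gaps = {}
    for term in terms:
        problems = []
        if not term.definition.strip():
            problems.append("missing definition")
        if term.steward is None:
            problems.append("no steward assigned")
        if term.last_reviewed is None:
            problems.append("never reviewed")
        elif (today - term.last_reviewed).days > max_age_days:
            problems.append("review overdue")
        if problems:
            gaps[term.name] = problems
    return gaps


# Hypothetical sample entries.
terms = [
    GlossaryTerm("Customer", "A party that has purchased at least one product.",
                 steward="Sales Operations", last_reviewed=date(2024, 1, 15)),
    GlossaryTerm("Active Account", ""),
]
gaps = find_gaps(terms, today=date(2024, 6, 1))
```

With these sample entries, only "Active Account" is flagged: it has no definition, no steward and no review date. The point of the sketch is that accountability can be recorded alongside the definition itself, so gaps are visible rather than assumed away.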
Follow Data Modeling Best Practices
Wikipedia tells us that data modeling is a process used to define and analyze data requirements needed to support the business processes within the scope of corresponding information systems in organizations. Data modeling is also a process of relating data to other data for purposes of bringing data together to solve business problems and address business operational needs. Data modeling tools provide the ability to capture these requirements through data and turn the requirements into physical data stores that become the backbone of the business.
Taking advantage of what data models can provide requires that people have the responsibility to consistently and methodically model the organization’s data. Organizations that follow the Agile approach to system development, software package implementation and enterprise transformation must become convinced that good data is a result of a disciplined approach to defining the data in a way that satisfies their Agile methods. There is work to be done on this.
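As a small illustration of turning modeled requirements into a physical data store, the sketch below relates two entities and lets the database enforce that relationship. The `customer` and `customer_order` tables and their columns are hypothetical, and I am using Python's built-in SQLite module purely for convenience, not because any modeling tool generates this exact output.

```python
import sqlite3

# An in-memory database stands in for the physical data store.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Two modeled entities and the relationship between them:
# every order must belong to a known customer.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_total REAL NOT NULL CHECK (order_total >= 0)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Corp')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, 250.0)")

# An order for a customer that does not exist violates the model.
try:
    conn.execute("INSERT INTO customer_order VALUES (101, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

Because the relationship was defined up front, the orphaned order is rejected at the moment it is produced rather than discovered later during analysis. That is the sense in which definition becomes a determinant of quality downstream.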
Build Out Data Catalogs
Data catalogs are not as well defined, industry-wide, as glossaries, dictionaries and metadata. Data catalogs often contain information about how data is being used across the organization. Simply stated, a data catalog focuses on the data that is readily available to your business communities. This data can reside in data reporting tools (lists of available reports) or be described in documentation about the data warehouse, data marts, data lakes or basically wherever you have been storing your data for people to consume.
Taking advantage of data catalogs requires that people have the responsibility to consistently and methodically document the data that is being made available for business consumption. Data catalogs can become a valuable resource to people who repeatedly say that they need better access to the data. I am guessing this is said in your organization.
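A data catalog can start very small. The sketch below assumes a catalog is simply a list of documented data sets that business users can search by keyword. The entry fields, the sample data sets and the `search_catalog` helper are all hypothetical and do not reflect any specific catalog product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One data set made available for business consumption (illustrative fields)."""
    name: str
    location: str     # e.g. "reporting tool", "data warehouse", "data lake"
    description: str
    owner: str        # who documents and maintains this entry


def search_catalog(entries, keyword):
    """Return entries whose name or description mentions the keyword."""
    keyword = keyword.lower()
    return [
        entry for entry in entries
        if keyword in entry.name.lower() or keyword in entry.description.lower()
    ]


# Hypothetical catalog contents.
catalog = [
    CatalogEntry("monthly_sales_report", "reporting tool",
                 "Sales totals by region and month", "Finance"),
    CatalogEntry("customer_orders", "data warehouse",
                 "One row per customer order line", "Sales Operations"),
    CatalogEntry("web_clickstream", "data lake",
                 "Raw website click events", "Marketing"),
]

matches = search_catalog(catalog, "sales")
```

Here a search for "sales" returns only the monthly sales report, because the search looks at names and descriptions and not at owners. Even a list this simple gives people who ask for "better access to the data" a place to begin, provided someone is responsible for keeping it current.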
In my last TEQ column, I wrote about organizations having the Data Flu and about the symptoms of data that is “sick” or not fit for purpose. In this column, I have started to simplify some of the steps required to turn average data into “good data” for analytical purposes. The list I have included here, as part of the activities associated with good data definition, only scratches the surface of activities that will lead to improved analytical capabilities.
There are two consistent thoughts carried by each item in the data definition list. The first is the need to execute and enforce authority over the management of data, meaning that the activities to build and provide these data resources must be built into how organizations operate. The second is the need to formalize accountability for the actions of governing data. Data Governance, or, as many organizations are now stating, managing “data as an asset,” is becoming widely accepted as the only way to achieve “good data.”