How to Achieve Data Management Maturity

8 September 2020 

A Q&A with Melanie A. Mecca, Chief Executive Officer, DataWise

The scope of information that the UN defines, produces and uses is vast and complex. In June, the UN published the Secretary-General’s "Data Strategy for Action by Everyone, Everywhere" – a roadmap for data-driven change in the Organization.

In this Q&A, Melanie A. Mecca, CEO of DataWise, explores fundamental best practices for designing and implementing a mature data ecosystem that can advance strategic data and information goals. Throughout the article, the term Enterprise Data Management (EDM) is used to refer to the essential program foundations for success.

Should organizations care about data management?

Since data is essential to every business, organizations depend on access to the right data, available at the right time, and in a satisfactory condition for fact-based analysis and conclusions. There are many disciplines, processes and practices [1] required to accelerate and sustain maximum value from data assets. The core pillars of EDM, however, can be summarized as:

  • Data Architecture – what you build or buy to capture, store and deliver data, and how the data is organized, shared and provisioned
  • Data Governance – how you define, control, and make collaborative decisions about shared data
  • Data Quality – how you define, evaluate, and improve the condition of data according to dimensions (measurements), such as completeness, accuracy, timeliness, etc.
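One way to make a quality dimension concrete is to compute it over a small set of records. The Python sketch below scores completeness and timeliness. The record layout, the field names, and the 7-day freshness window are illustrative assumptions, not part of the UN Data Strategy or any DataWise standard.

```python
from datetime import date

# Hypothetical records; field names and values are invented for illustration.
RECORDS = [
    {"country": "A", "cases": 120, "reported_on": date(2020, 9, 1)},
    {"country": "B", "cases": None, "reported_on": date(2020, 9, 2)},
    {"country": "C", "cases": 45, "reported_on": date(2020, 8, 1)},
    {"country": None, "cases": 10, "reported_on": date(2020, 9, 3)},
]

# Assumption: every one of these fields must be populated for a record to count as complete.
REQUIRED_FIELDS = ("country", "cases", "reported_on")


def completeness(records, fields=REQUIRED_FIELDS):
    """Share of records in which every required field is populated."""
    if not records:
        return 0.0
    complete = sum(all(r.get(f) is not None for f in fields) for r in records)
    return complete / len(records)


def timeliness(records, as_of, max_age_days=7):
    """Share of records reported no more than max_age_days before as_of."""
    if not records:
        return 0.0
    fresh = sum(
        r.get("reported_on") is not None
        and 0 <= (as_of - r["reported_on"]).days <= max_age_days
        for r in records
    )
    return fresh / len(records)


if __name__ == "__main__":
    as_of = date(2020, 9, 8)
    print(f"completeness: {completeness(RECORDS):.0%}")
    print(f"timeliness:   {timeliness(RECORDS, as_of):.0%}")
```

In practice, the required fields and the acceptable age of a record would be set through data governance rather than hard-coded by an analyst.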

To meet the UN’s full data potential, the Data Strategy roadmap includes data actions and workstreams to build these essential pillars. We’ll return to them in the context of decision-making, but let’s set a marker – no organization that I’ve encountered currently excels in all three areas, because the concept of ‘data as a strategic asset’ is relatively recent.

In “Business at the Speed of Thought,” Bill Gates stated: “How you gather, manage and use information determines whether you win or lose.” I think we can all agree that the UN must win. However, for many years, data was an afterthought. This is counter-intuitive; data is the POINT – a critical asset.

With the advent of data warehousing for better reporting, master data management to integrate highly shared data, new technologies emphasizing user empowerment, and the explosion of data volumes in the last decade, today organizations realize that data management is a vital and permanent function. Therefore, much like the management of other core business areas such as Finance and Human Resources, organizations need to manage data, practically speaking, forever.

Since most organizations are evolving their technologies, staff education, and culture to manage data well, we can examine three successive broad accomplishments crucial for success – Capability, Maturity, and Deployment.

Capability consists of doing the right things. To build capabilities, an organization should first assess the current state of its data management processes and identify major gaps in capabilities, starting with persistent problems. For example, a major gap may be the lack of consistent business terms for shared data across an organization, which prevents that data from being easily aggregated (this capability is frequently required for Master Data Management solutions).

Many important capabilities and strengths have already been defined and established – some Member States and organizations have implemented proven, effective data management programs, policies, processes and education over the years. Successful approaches and corresponding work products can be leveraged through the Centers of Excellence envisioned in the UN Data Strategy to benefit the UN family and help the UN standardize core data management processes, over time, for all Member States and organizations. Implementing pilot projects can also be an efficient way to establish best practices, and high priority data use case actions are excellent starting points for improving data management processes. If approached strategically, gains will be cumulative, and work products can be baselined and reused for each successive data action.

Maturity consists of formalizing and embedding improvements in capabilities, such that they become 'business as usual'. Formalization consists of mandating a well-defined set of capabilities and processes, promulgating them through policies, supporting them with staff resources, and implementing a compliance program to ensure stability, resiliency, and implementation across the enterprise. Maturity practices are commonly employed elsewhere in an organization, such as in planning, resourcing, leading, standardizing, etc. – they should also be applied to data management processes. With increased maturity, the need for compliance processes is recognized (especially for highly shared data), and that is dependent upon robust, operationalized data governance.

Maturity practices ensure that the capabilities are followed, even during stressful conditions. An organization can evolve through a ‘minimum viable set’ approach, by determining what capabilities are most critical right now for high priority data actions and then extending the scope over time.

Deployment measures how widely capability and maturity practices have been implemented across the organization, according to a planned sequence. For instance, the UN COVID-19 Data Hub initiative has developed ready-to-use templates, a standardization mechanism for capturing and reporting COVID-19 data, to shorten the time national statistical offices need to identify insights. This approach and its corresponding work products can be leveraged for other medical data collection and reporting important for UN Member States and organizations. As capabilities are defined and become mature, they can eventually be implemented across all relevant entities of an organization.

What is the relationship between data maturity and improved decision-making within an organization?

Better data, better decisions – it’s as simple as that.

Let’s consider the worldwide COVID-19 pandemic. The crisis has spotlighted the criticality of timely and accurate data that conforms to agreed-upon standards for an array of metrics, from case counts to the percentage of survivors who experience lasting health challenges. If Member States do not agree on key concepts, such as whether the death of a COVID-infected individual is recorded as a heart attack or as caused by the virus, a full picture of the pandemic’s spread and effects is not possible, decisions will be less trustworthy, and guidance issued to combat the disease and protect life may be late or insufficiently informed.

As the Data Strategy notes, one of the UN’s top priorities is the enablement of expanded analytics capabilities, and user empowerment for self-service statistical analysis, data visualization, and cutting-edge technologies such as artificial intelligence and machine learning. The diagram below is illustrative of the obstacles that organizations must overcome to leverage advanced capabilities and solutions, based on a Forbes 2016 survey of data scientists.

Note that 'cleaning and organizing data' accounts for 60% of analytics teams' time, spent preparing for their modeling and mining activities. Adding the 19% spent 'collecting data sets' brings data preparation to 79% of their time, and only 16% is left for mining data, refining algorithms, and building training sets. These time-consuming preparation tasks need to be minimized to empower the skilled, creative and timely analysis that an organization counts on.
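The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the three percentages quoted above. The 40-hour working week is an assumption added purely to translate shares into hours, and the text does not itemize the remaining share.

```python
# Shares of a data scientist's time quoted in the text (percent).
CLEANING_AND_ORGANIZING = 60
COLLECTING_DATA_SETS = 19
MINING_REFINING_TRAINING = 16  # mining data, refining algorithms, building training sets

# Data preparation is the sum of the first two activities.
PREPARATION = CLEANING_AND_ORGANIZING + COLLECTING_DATA_SETS

# Whatever is left after preparation and modeling is not itemized in the text.
NOT_ITEMIZED = 100 - PREPARATION - MINING_REFINING_TRAINING


def hours_per_week(share_percent, week_hours=40):
    """Convert a percentage share into hours, assuming a week of week_hours."""
    return week_hours * share_percent / 100


if __name__ == "__main__":
    print(f"preparation:  {PREPARATION}% = {hours_per_week(PREPARATION):.1f} h/week")
    print(f"modeling:     {MINING_REFINING_TRAINING}% = "
          f"{hours_per_week(MINING_REFINING_TRAINING):.1f} h/week")
    print(f"not itemized: {NOT_ITEMIZED}%")
```

Under that assumption, roughly 31.6 of 40 hours go to preparation and about 6.4 hours to modeling work.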

Therefore, in addition to implementing enabling technologies, hiring and training analysts and data scientists, and educating users to ask the right questions and employ self-service analytics tools, the organization must manage its data assets well to accelerate the speed of insight and empower informed decisions.

What are three best practices or tips for achieving data maturity?

To achieve data maturity, organizations should consider recognizing, establishing, and growing two permanent functions – centralized data management and federated data governance, illustrated in the diagram below:

The data management function, recommended to be established as a centralized organization, serves as the backbone of anchoring capabilities and persistent work products (i.e., strategies, policies, processes, standards, and templates for the EDM program), which the organization needs to define, implement, and expand. It’s recommended that the UN treat the data action initiatives envisioned in the Data Strategy as incubators for policies, processes, standards and products (for example: a data catalogue, a glossary of shared terms, and an evolving set of sound data management processes), through Centers of Excellence and a dedicated core organization.

The data management organization can assist the Strategy Oversight function in measuring progress and ensuring that accomplishments in the data action initiatives are baselined and leveraged across the UN. We can consider this organization to be the ‘collective data management memory’ of the organization.

The data governance function is, in essence, mutual decision-making about shared data. Shared data can be defined as: (a) within the scope of ‘enterprise data’ defined by the Data Strategy; (b) produced by one or more organizations or business areas; and (c) consumed by multiple stakeholders. Reference data and master data – examples of ‘highly shared data’ – are especially important, as those data sets need to be timely, accurate, and highly available to multiple applications and user groups. Key responsibilities of governance groups include:

  • Data definition – achieving agreements on key concepts (business terms) is a primary responsibility of governance groups and representatives. The more important a shared concept is to the organization, the more engaged governance participants need to be in creating a consensus definition.
  • Data improvement – governance engagement in improving data quality is vital. Governance representatives are needed to determine what level of quality is desired, what level of quality is acceptable and what quality rules should be applied to improve the data. 
  • Issues escalation – when different stakeholders cannot agree on a decision due to conflicting requirements, governance groups need to determine when issues should be escalated to a more senior decision body. 
  • Access control – the parties who control a data source have accountability for determining how access will be granted and to whom; the parties who maintain a data source (‘data custodians’ or ‘technical data stewards’) are accountable for how access will be managed and executed.
  • Approvals – governance participants need to review and provide their organization’s input for core work products – strategies, policies, processes, standards, and templates for the EDM program. This is critical, as the Data Strategy aims to increase enterprise-wide understanding of shared data assets and build consistent management practices.

Here are three tips that can assist an organization in rapidly and resolutely progressing towards data maturity:

  • To evaluate the current state on the Capability-Maturity-Deployment wheel, a comprehensive assessment against a framework for best practices is highly recommended, to measure where an organization is performing well and what gaps are evident. For analytics, such an assessment can pinpoint the capability deficiencies that result in almost 80% of a data scientist’s time spent on laborious manual data preparation tasks.
  • The selected best practices framework should be industry and technology neutral and focus on clearly stated functional practices, with consensus decisions made by key stakeholder representatives, and it should include a path to gauge further improvement in the processes that contribute to timely, accessible, high quality data. Through this effort, the organization can also quickly discover and mine exemplary work products and save considerable time and redundant work efforts by not reinventing the wheel.
  • Identification of high priority data initiatives requires scoping and definition of critical data sets (also called ‘domains’ or ‘subject areas’). By defining, documenting, publishing, and educating relevant staff about these data sets and the overall data scope [2] encompassed by the initiative, the organization can more quickly determine what data is shared, what data needs to be reconciled, what data is redundant, what data can be integrated, what sources are more complete, what systems can be retired, etc.
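The scoping exercise in the last tip can be sketched as a simple inventory of which systems hold which data sets. In the Python example below, the system names and data set names are hypothetical. A real inventory would come from a data catalogue or from interviews with system owners.

```python
from collections import defaultdict

# Hypothetical inventory: system name -> data sets (domains) it holds.
INVENTORY = {
    "hr_system": {"staff", "organizational_units"},
    "finance_system": {"vendors", "organizational_units", "staff"},
    "field_reporting": {"locations", "case_counts"},
    "statistics_portal": {"case_counts", "locations", "countries"},
}


def sources_by_data_set(inventory):
    """Invert the inventory: data set -> sorted list of systems holding it."""
    index = defaultdict(list)
    for system, data_sets in inventory.items():
        for name in data_sets:
            index[name].append(system)
    return {name: sorted(systems) for name, systems in index.items()}


def shared_data_sets(inventory):
    """Data sets held by more than one system: candidates for reconciliation."""
    return {
        name: systems
        for name, systems in sources_by_data_set(inventory).items()
        if len(systems) > 1
    }


if __name__ == "__main__":
    for name, systems in sorted(shared_data_sets(INVENTORY).items()):
        print(f"{name}: {', '.join(systems)}")
```

Data sets that appear in more than one system are the ones to examine first for reconciliation, redundancy, or integration.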

The importance of providing education for all staff levels according to their job role is paramount. [3] Several levels of education and training are useful, at a minimum:

  • All staff who use a computer in their work should be educated in data awareness: the key concepts supporting the Data Strategy’s vision.
  • Staff responsible for defining, organizing, or aggregating data need education in data analysis skills, such as defining data, data quality, metadata, etc., as well as specific training in selected platforms.
  • Staff responsible for developing statistics to support critical decisions should be educated in data literacy approaches and concepts that help to extract meaningful information from data, as well as training in the selected analytics tools.
  • As processes are implemented, role training for process actors should be offered. For instance, as data governance is established, new data stewards need to know their responsibilities, the types of tasks that require their engagement, how to escalate issues, etc.
  • Senior staff need to learn how to lead evolution to a data-aware culture, how to navigate major decisions about shared data, how to effectively interact with governance groups, and how to champion data actions and data improvement initiatives.

Organizations are advised to develop an enterprise-wide education plan by staff role type, including a rollout schedule, and identify organizational units interested in piloting the educational offerings. Computer-based training is the most effective method to deliver education at scale, supplemented by instructor-led training, focused workshops and internal knowledge sharing presentations.

The UN Data Strategy’s publication has launched a major data transformation initiative, which has the power to transform the Organization, increase its agility, and sharpen its ability to predict and respond to worldwide challenges. Since a high-functioning UN is vital for the stability and safety of all nations, this is an exciting development for everyone on the planet.

Note: The views expressed herein are those of the author and do not necessarily reflect the views of the United Nations.

[1] For example, the Data Management Maturity (DMM) Model has 25 Process Areas and 414 functional practices; the Data Management Body of Knowledge (DMBoK) has 16 knowledge areas and numerous constituent topics within each.

[2] Data scope, in this context, is the answer to the question “What data do I need to accomplish these results (or implement these features)?”

[3] The following article provides further information: EDM Education Part 1 - Why, What and Who