Trust your data: why you need a governed data catalog

In 1661, Robert Boyle published his groundbreaking book, The Sceptical Chymist, which rejected Aristotle’s four elemental bodies and instead claimed that every phenomenon was the result of particles colliding in motion. Boyle is often regarded as the founder of modern chemistry. Two centuries later, George E. Davis coined the term “chemical engineering,” sparking a new discipline that took the ideas and concepts of chemistry and turned them into useful and attainable products. After World War I, chemical engineering took off as a reputable career and scientific discipline. But the real difference between chemistry and chemical engineering is not their history; it is their scale. Chemists run experiments to develop novel materials and processes, whereas chemical engineers take these materials and processes and turn them into something larger and more efficient.

Data catalogs can be like chemistry, small in scale and working well in silos, or they can be like chemical engineering: true enterprise data catalogs that are deployed across an organization and used for larger-scale projects and company-wide initiatives. However, not every data catalog can serve as an enterprise data catalog. To be used enterprise-wide, a data catalog requires native and comprehensive data governance, data privacy and security capabilities that promote understanding of and trust in data. That is, it needs to be a governed data catalog.

What is a governed data catalog?

A governed data catalog enables access to trusted and compliant data at scale across the whole enterprise. It breaks down legacy data silos and provides comprehensive visibility into all data across the data ecosystem with full context, accelerating business outcomes. It goes beyond indexing and business glossary functionality and supports data classification, data standards, data certification, data quality, data lineage and data policy management, resulting in better understanding and trust in data, while protecting data from misuse. 

While many data catalogs help users find data, data is only valuable if it is trusted. If you do not trust your data, you cannot confidently use it to make business decisions. And for data to be trusted, it must be made accessible with the appropriate controls and in accordance with data policies. That’s why organizations need an enterprise data catalog integrated with true data governance capabilities. Such a governed data catalog improves business agility and reduces risk by combining easy access to trusted data with security and privacy controls, leading to broader use of data in new and innovative ways. 

Governed data catalog must-have capabilities

At Collibra, we believe that data governance should be embedded in the foundation of a data catalog; bolt-on data governance is not enough. Collibra Data Catalog provides native and comprehensive data governance capabilities that ensure trust in the data, as well as proper and compliant use of data across the enterprise. These capabilities include:

  • Business Glossary: establish standard business definitions to enable a common understanding of data across the whole organization
  • ML-powered automation capabilities: assist data stewards in adding business context to technical metadata at scale so users can more easily search for and understand data
  • Roles and responsibilities: assign data stewards, data owners and experts to data assets to make sure assets are maintained and managed consistently
  • Intuitive workflow engine: automate business processes with customizable workflows that facilitate a collaborative approach to data governance and data curation
  • Native, automated business and technical lineage: see how data transforms and flows as it moves from system to system, providing additional understanding and trust in data
  • Data certification: enable data stewards to certify datasets, metrics/KPIs and reports to promote the highest quality data in the data catalog
  • Policy Manager: create policies and apply them to datasets automatically using data classification, ensuring data is only accessed and used in a compliant manner
  • Data Privacy module: provide policies, guidelines and purpose limitations to guarantee compliance with regulations
  • Granular security controls: apply role-based and asset-level permissions and access controls for secure, enterprise-wide deployment
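
To make the Policy Manager and security-control items above more concrete, here is a minimal Python sketch of how classification-driven policies and role-based access might fit together. It is an illustration only and does not use or represent Collibra's API: the keyword rules, classification labels, role names, and the `Dataset`, `classify`, and `readable_columns` names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical classification rules: column-name keyword -> classification label.
CLASSIFICATION_RULES = {
    "email": "PII",
    "ssn": "PII",
    "salary": "Confidential",
}

# Hypothetical policies: classification label -> roles allowed to read it.
POLICIES = {
    "PII": {"data_steward", "privacy_officer"},
    "Confidential": {"data_steward", "finance_analyst"},
    "Public": {"data_steward", "privacy_officer", "finance_analyst", "analyst"},
}


@dataclass
class Dataset:
    name: str
    columns: List[str]
    classifications: Dict[str, str] = field(default_factory=dict)


def classify(dataset: Dataset) -> None:
    """Label each column by the first matching keyword; default to Public."""
    for column in dataset.columns:
        label = "Public"
        for keyword, candidate in CLASSIFICATION_RULES.items():
            if keyword in column.lower():
                label = candidate
                break
        dataset.classifications[column] = label


def readable_columns(dataset: Dataset, role: str) -> List[str]:
    """Return the columns whose classification policy admits the given role."""
    return [
        column
        for column in dataset.columns
        if role in POLICIES[dataset.classifications[column]]
    ]


# Policies attach to the dataset through its classifications, not by hand.
customers = Dataset("customers", ["customer_id", "email_address", "annual_salary"])
classify(customers)
```

In this sketch, calling `readable_columns(customers, "analyst")` returns only the column classified as Public, while a data steward can read all three. The point of the design is that a policy is written once per classification and then follows the data wherever that classification is applied.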

Why is a governed data catalog important? 

A governed data catalog is important because it operationalizes trust in data. It creates a trusted marketplace by providing visibility into data owners and experts, surfacing certified data assets, making data lineage transparent from source to usage, and collecting crowdsourced feedback. Furthermore, a governed data catalog is necessary for enterprises that have large-scale data initiatives and goals. Without governance, a data catalog can only operate safely in a silo: you cannot restrict usage across your enterprise, so a wider deployment opens you up to regulatory violations and increases business risk. With a governed data catalog, however, you can deploy across the enterprise and trust that your data is protected and secure across departments and data silos.

At Collibra, we believe a governed data catalog is crucial to unlocking the value of data and scaling data initiatives across an enterprise. This governance foundation ensures trust in data and protects data from non-compliant use. It allows large enterprises to move from being chemists to chemical engineers, so they can use data across their enterprise to make impactful business decisions, innovate and grow.

We have been named to the Q3 2020 Constellation ShortList for Metadata Management, Data Cataloging and Data Governance! Read the ShortList to learn how Collibra Data Catalog ensures access to secure and well-governed data.
