Improve data literacy and data sharing in government

Data literacy is foundational to successful digital transformation initiatives in the public sector. Congress formally recognized how much the government could benefit from using data as evidence to create policies and inform programs when the Foundations for Evidence-Based Policymaking Act was signed into law in January 2019.

The Act entails the development and implementation of a comprehensive data governance, cataloging, data quality and privacy strategy that accounts for all data assets and metadata (data about data) created by, collected by, under the control of, and maintained by each agency.  It further requires that agencies ensure the protection of classified information while making non-classified data available to the public.

In addition, the Federal Data Strategy provides a 10-year vision and plan for improving data literacy and data sharing at the intra-agency and cross-agency levels. It outlines a series of milestones that agencies must achieve, including democratization of fully governed data and greater use of self-service analytics.

Moreover, the plan stipulates that all agencies should document use cases for artificial intelligence (AI) and ramp up on data science expertise.  This will further require agencies to implement a robust AI governance and data management strategy that enables them to pivot to explainable AI.

Both the Act and the Federal Data Strategy provide a framework for agencies to work towards a common goal: becoming data-driven entities that rely on trustworthy, high-quality data. Ultimately, this enables them to improve efficiency, reduce waste and deliver optimal citizen experiences and services in a timely manner.

Navigating a complex data landscape

Across all levels of government, the data landscape is highly complex, with petabyte-scale data spread across hundreds if not thousands of disparate data sources – from data lakes, data warehouses and databases to enterprise applications, legacy systems, and more. With this much data sprawl, it is virtually impossible for government employees to find and understand the data they need in a timely manner.

Additionally, agencies are grappling with poor-quality data (a combination of outdated, duplicate, incomplete, erroneous and missing data). This has a ripple effect within and across agencies, leading to waste and errors – from delayed reimbursements and needlessly duplicated services to tax dollars that may go uncollected, and more.

For instance, a major federal agency discovered that a location code inaccurately documented across some of its systems, together with data validation errors, left nearly $12 million USD in fees stuck in the pipeline. It took weeks to rectify the problem, resulting in delayed payments.
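
To make these kinds of defects concrete, here is a minimal Python sketch of rule-based checks for missing values, duplicate records and codes that do not appear in a reference list. It is illustrative only: the record layout, the field names (`id`, `location_code`, `fee`) and the `VALID_LOCATION_CODES` set are hypothetical and are not taken from any agency system or Collibra product.

```python
from collections import Counter

# Hypothetical reference list of valid location codes (illustrative only).
VALID_LOCATION_CODES = {"DC01", "VA02", "MD03"}


def find_quality_issues(records, required_fields=("id", "location_code", "fee")):
    """Return (record_index, issue) tuples for a few basic quality checks."""
    issues = []
    # Count how often each id occurs so duplicates can be flagged.
    id_counts = Counter(rec.get("id") for rec in records)

    for index, rec in enumerate(records):
        # Incomplete data: a required field is absent or empty.
        for field in required_fields:
            if rec.get(field) in (None, ""):
                issues.append((index, f"missing {field}"))

        # Duplicate data: the same id appears on more than one record.
        if rec.get("id") is not None and id_counts[rec["id"]] > 1:
            issues.append((index, "duplicate id"))

        # Erroneous data: the code is present but not in the reference list.
        code = rec.get("location_code")
        if code not in (None, "") and code not in VALID_LOCATION_CODES:
            issues.append((index, "unknown location_code"))

    return issues
```

Run against each source system before data is shared, a check like this could surface a mismatched location code at ingestion time rather than weeks later; a production setup would normally draw the reference list from a governed catalog instead of a hard-coded set.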

Solving the data conundrum with Collibra

At Collibra, we work with many agencies across all levels of the public sector – federal, state and local – as well as educational institutions, to help them derive value from their data.

For instance, a major federal agency is using Collibra Data Intelligence Cloud to gain full visibility into their data ecosystem and to put the right controls in place to improve data literacy and sharing:

  • They have automated the policies, guidelines, data standards and reporting protocols outlined by the agency, in adherence with the Federal Data Strategy and the Foundations for Evidence-Based Policymaking Act.
  • With Collibra, they can easily discover the data they have and ensure it is fully governed for use, of high quality and standardized. This has allowed the agency to mitigate risks associated with the data that flows into, out of and through their infrastructure.
  • Moreover, they have improved collaboration on data by creating responsibilities and ownership through the introduction of data stewardship. For instance, data stewards and data governance leads can easily validate, rate and prioritize data sets for use.
  • The agency has institutionalized data governance across their entire domain. They have published more than 80 data reference standards that the entire agency and its partners can access.

The Collibra Data Intelligence Cloud Advantage

Collibra Data Intelligence Cloud is an intelligent data platform that leverages the power of machine learning and provides a holistic and integrated approach to cataloging, governing, protecting, managing and collaborating on data at scale across on-premises, hybrid and multi-cloud environments.

Government agencies and educational institutions worldwide rely on Collibra Data Intelligence Cloud to:

  • Rapidly catalog all their data regardless of where it resides to create a single, trusted repository of metadata. The catalog offers advanced capabilities that are purpose-built to empower users to easily find and understand the data they need with rich business and technical context at a granular level.
  • Enact a comprehensive data governance strategy and develop a consistent data taxonomy, data usage registry, physical and/or logical data dictionary and glossary of terms. With Collibra Governance, agencies can speed the automation of policies and rules that allow them to meet the guidelines set in the Federal Data Strategy and the Foundations for Evidence-Based Policymaking Act. Pre-built templates, out-of-the-box data domains and intuitive workflows provide a framework for cross-functional teams to establish a common, consistent understanding of data.
  • Address data quality and privacy issues at scale to ensure data integrity. For instance, Collibra predictive data quality and observability uses machine learning (ML)-enabled capabilities to proactively detect anomalies in data, such as missing records or values and broken relationships across tables or systems, so they can be resolved rapidly. Moreover, advanced capabilities in data protection help ensure classified data is easily identified and fully protected with role-based access control.
  • Obtain end-to-end visibility with an active metadata graph and data lineage, enabling users to easily visualize the flow of data from source to destination as well as understand all the data dependencies at a granular level.
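
As a rough illustration of two ideas from the list above, the Python sketch below shows a simple check for broken relationships across tables and a traversal of a lineage graph from a destination back to its sources. It is a plain rule-based sketch, not Collibra's ML-based detection or its API; the function names, table layouts and dataset names are all hypothetical.

```python
def find_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


def upstream_sources(lineage, target):
    """Return every dataset that `target` depends on, directly or indirectly.

    `lineage` maps each dataset name to the list of datasets it is derived from.
    """
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in lineage.get(node, []):
            if parent not in seen:  # guard against revisiting shared or cyclic nodes
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical example data.
agencies = [{"agency_id": "A1"}, {"agency_id": "A2"}]
payments = [
    {"payment_id": 1, "agency_id": "A1"},
    {"payment_id": 2, "agency_id": "A9"},  # no such agency: a broken relationship
]
lineage = {
    "report": ["warehouse.payments"],
    "warehouse.payments": ["legacy.fees", "crm.accounts"],
}

orphans = find_orphans(payments, agencies, fk="agency_id", pk="agency_id")
sources = upstream_sources(lineage, "report")
```

Here `orphans` holds the payment that points at a non-existent agency, and `sources` holds every dataset the report ultimately depends on, which is the kind of dependency view a lineage diagram presents visually.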

Next Steps

To learn more, I would like to recommend the following Collibra webinars and reading material:
