How much is data worth? And how much more could it be worth to your enterprise if you were getting the most out of it?
To begin to answer these questions, it’s instructive to look back at the history of data and data analysis.
A short history of data
Data and the analysis of data have a long history that dates back centuries, even millennia. The Egyptians recorded data on papyrus to track commercial transactions. In the 17th century, pioneering statistician John Graunt collected data on death rates in London during the bubonic plague.
However, the explosion of data — and its dominant role in today’s economy — stems from the introduction and widespread adoption of two primary technologies: the Internet and the smartphone.
With the rise of the Internet at the turn of the 21st century, companies like Yahoo, Amazon, and Google started to analyze customer behavior via clickthrough rates, IP-specific location data, and search logs. HTTP-based web traffic drove a massive increase in structured and unstructured data. Innovative companies spun up BI (aka business intelligence) teams to analyze the massive new volumes of data. In 2004, Facebook launched a social media platform that has since grown to more than 2 billion users. And in the summer of 2007, Apple introduced the iPhone.
By 2008, the global datasphere had reached 14.7 exabytes (one exabyte is 10 to the eighteenth power of bytes, or roughly 2 to the sixtieth power). Today, the datasphere equals approximately 97 zettabytes (ZB). One ZB is equal to 1,000 exabytes, or a trillion gigabytes.
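The unit arithmetic behind these figures can be checked in a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1,000. The 14.7 exabyte and 97 zettabyte figures come from the text above, and the constant and function names are illustrative only.

```python
from fractions import Fraction

# Decimal (SI) byte units: an assumption, since the article does not say
# whether it means decimal or binary units.
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21


def growth_factor(earlier_bytes, later_bytes):
    """Return how many times larger later_bytes is than earlier_bytes."""
    return Fraction(later_bytes) / Fraction(earlier_bytes)


# Figures quoted in the article.
datasphere_2008 = Fraction("14.7") * EXABYTE
datasphere_today = 97 * ZETTABYTE

exabytes_per_zettabyte = ZETTABYTE // EXABYTE
gigabytes_per_zettabyte = ZETTABYTE // GIGABYTE
factor = growth_factor(datasphere_2008, datasphere_today)

print(f"1 ZB = {exabytes_per_zettabyte:,} EB")
print(f"1 ZB = {gigabytes_per_zettabyte:,} GB")
print(f"Growth since 2008: about {float(factor):,.0f}x")
```

On these numbers, the datasphere is roughly 6,600 times larger today than it was in 2008.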
The world we live in is truly driven by data.
The problem with data and technology
Data in the broadest sense of the word (including reference data and metadata) includes all the metrics, reports, APIs, systems, policies, and even business processes that produce or consume data.
Although our technologies have progressed to a sufficiently advanced state to handle today’s insanely high volumes of data, most enterprises haven’t created a data-driven culture. They lack consistent, comprehensive data intelligence that could be driving better ROI. They lack a culture of data maturity. (As a recent IDC report makes clear, data maturity drives 3X better business outcomes.)
Today’s innovative enterprises understand that data is their most valuable asset. The rise of big data catalyzed a crowded market overrun with solutions, frameworks, and communities to address data challenges. However, most companies still struggle to understand how to extract the full value from their investments in the latest technologies.
The good news is that many of the technologies have matured. The next frontier in expanding and activating the role of data is driving more widespread adoption and use of data.
Today, companies are looking to ideas like the data mesh — and its vision of a decentralized, domain-driven data management culture — as the next step in creating value from data.
Today’s challenge: Trusted data
The challenge of data management in a data-driven enterprise led to the creation of companies like Collibra, which was founded in 2008 as a data governance solution.
Historically, data governance and data quality initiatives were back-office functions largely confined to regulated industries that had to comply with legal requirements.
But as the cloud went mainstream, the potential value of data increased dramatically. The value proposition for data quality and trust evolved from a primarily compliance-driven, ancillary concern into a linchpin of competitive advantage.
In the 2010s, we saw widespread adoption of databases, Hadoop, data lakes, and data warehouses. These innovations were largely about getting the technology to work. Highly specialized technical teams developed data architectures and processes to serve a myriad of organizational data needs. The result was a lot of frustration with data initiatives at most organizations, which didn’t have the resources of the cloud and social media giants to maximize the value of enterprise data.
This is why today, for example, there is momentum around rethinking monolithic data architectures. Data mesh is gaining traction as a way to serve the data needs of decentralized business users. We hear a lot about data democratization. These decentralization efforts around data are great. But they create a new set of problems.
Specifically, how do you deliver a self-service infrastructure for business users and domain experts? How do you automate governance? How do you automate policy and process mandates? In a world of GDPR and CCPA, these concerns become more and more critical to enterprises.
No one is going to use data that isn’t trusted.
As a result, data governance must evolve to become more automated and intelligent. Governance and data lineage are still fundamental to ensuring trust as data moves through an organization.
As data flows, metadata has become increasingly important for data discovery and data classification. Continuous checks for data flaws and automated data quality have become functional requirements of any modern data management platform.
And finally, privacy has become a critical component of cybersecurity.
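To make the idea of automated data quality checks concrete, here is a minimal Python sketch. It does not represent any particular product or platform. The rule names, record fields, and sample records are all hypothetical, chosen only to show how declared rules can be run against every record without manual review.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named data quality rule; check returns True when a record passes."""
    name: str
    check: Callable[[dict], bool]


def run_checks(records, rules):
    """Return a mapping of rule name to the indexes of records that fail it."""
    failures = {rule.name: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name].append(index)
    return failures


# Hypothetical rules for a hypothetical customer dataset.
RULES = [
    Rule("email_present", lambda r: bool(r.get("email"))),
    Rule(
        "age_in_range",
        lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    ),
]

# Hypothetical sample records: the second and third each break one rule.
SAMPLE = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 29},
    {"email": "c@example.com", "age": 250},
]

report = run_checks(SAMPLE, RULES)
print(report)  # {'email_present': [1], 'age_in_range': [2]}
```

Because the rules are plain data, the same set could in principle be run on a schedule or whenever new data arrives, which is one way to read "continuous" checking.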
It’s time for trusted data
Despite all the innovations of the last decade or more in data (the Hadoop movement, data lakes, Spark, the ascendancy of programming languages like Python, the introduction of frameworks like TensorFlow, the rise of AI, low-code and no-code applications, and so on), businesses still find it difficult to get more value from their data initiatives.
While a lot of energy has been focused on more efficiently storing and processing data, more energy should go into thinking about the people and process-side questions. We should be making it easy for data professionals to trust data, gain insights, and leverage it in innovative ways to fuel the creation of valuable data products.
Rising complexity and fragmentation in the broader data landscape are creating massive challenges for enterprise data management. The need for trusted data is now critical. And the level of scrutiny driven by privacy, security, and regulatory concerns has only made the demand for trusted data more acute.
When you consider today’s economic environment, trusted data has become an existential challenge for businesses focused on cost control, productivity, and efficiency. Getting more value from data at scale, not just from a technology perspective but also through people and processes, is the next frontier for enterprises that want to see better ROI from data.
We think the current economic environment will be a catalyst for enterprises that take data maturity seriously. As more enterprises do so, Collibra will be there to deliver trusted data for every user across every source.
Are you ready to get more out of your data?