12 Steps to Data Intelligence: Part 3

Data registry and Idea matching

Connections and Credibility: Building a Data Registry and a Demand Economy for the Data Intelligence Journey

We are tasked with assessing why a company experiences a high rate of customer churn. How do we drive for an outcome that is accurate, actionable, and low effort so the company can prevent future churn? We’re on a Data Intelligence journey in 12 steps. In Part 3 of this 5-part series, we cover the next 2 steps in the Data Intelligence Journey:

  • Step 7: Data Matching. A de-identified method of linking and matching the most important data in your enterprise without the need for integration, complex matching oversight, or data stewardship, and that can be used as a service to automatically assemble 360° views when you need them
  • Step 8: Idea Management. A communication method between Consumers and Suppliers of the metadata, relationships, and attribution that uses gamification techniques to creatively establish priority and importance

Our journey continues… 

Cliff is a business analyst who’s been tasked with finding out why his company is experiencing a concerning trend of high customer churn. Given the large numbers at stake, the company must move quickly, but it must also accurately uncover the root cause and prescribe an action plan that works. Inaction is bad. The wrong action is worse. The answer is in the data. But what data?

This is the foundation of Data Intelligence. Trusted data belongs to every knowledge worker and should flow through the organizational ecosystem in such a way as to let business professionals connect, communicate and collaborate in every way they need and choose. We’re with Cliff as he tries to find solutions to this real-world problem.

First, we built the foundation for a strategic Data Intelligence program. That led us to stock the shelves in a highly organized way and with easy-to-use context. This is where professionals can use trusted data to:

  • Conduct research
  • Analyze patterns
  • Identify problems and opportunities
  • Collaborate with colleagues and partners 
  • Ensure security and compliance

But let’s remember our target audience: Cliff. The shopping experience metaphor that underpins this journey must respect the skills and capabilities of its users. Let’s continue our journey.

Step 7: Data Matching

Cliff is ready to put items in his shopping cart, but something is nagging at him. The analysis he wishes to run to best understand churn requires meaningful segmentation of the customer base. Cliff has determined that Customer Age, Location (Postal Code), and Gender are the most important segments to understand, predict, and prevent churn. But when Cliff compared and contrasted data sets during Step 6: Profiling & Scoring, he noticed that the best data set for Customer Age is the SFA application, for Location the best data set comes from Fulfillment, and for Gender it is the Web Portal Profile & Preferences data set. Not one of these three data sets is good for all three segmentation attributes. Cliff is faced with a dilemma: what is he willing to sacrifice to get at least a partial answer?

What if the answer to this dilemma was, “You don’t need to sacrifice! We can give you the best possible data for any Customer in your ecosystem and you don’t have to do anything extra for this reward.” But how?

Taking you back to Steps 2 & 4: Establishing Data Domain Models and Cataloguing, we discover, register, and classify your physical data sets to the logical domain models using machine learning and guided stewardship. For each applicable domain model (e.g., Customer, Employee, Product, Location, etc.), we know in advance the logical and physical attributes that help assert the identity of any Domain. For example, we know in advance that attributes such as Name, Address, Email, DOB, etc. are useful in uniquely identifying a Customer.

Having this knowledge at hand, it is quite possible to auto-generate the instructions to extract the personal identity information from each data set with absolute precision and with no need to build a custom integration. And much like you discover metadata, lineage, and profile information, you can extract this information at the Edge so you can protect privacy and manage risk.
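To make the idea of auto-generated extraction instructions a little more concrete, here is a minimal sketch. It assumes the catalog already maps each logical identity attribute to a physical column, and it builds one SELECT statement per data set. The table names, column names, and the `build_extract_query` helper are hypothetical; a real platform would generate and run its instructions at the Edge in its own way.

```python
# Hypothetical catalog entries: logical identity attribute -> physical column.
CATALOG = {
    "sfa.contacts": {
        "first_name": "FNAME",
        "last_name": "LNAME",
        "dob": "BIRTH_DT",
        "email": "EMAIL_ADDR",
    },
    "fulfillment.ship_to": {
        "first_name": "first_nm",
        "last_name": "last_nm",
        "postal_code": "zip",
    },
}


def build_extract_query(table: str, mapping: dict[str, str]) -> str:
    """Build a SELECT that returns identity columns under their logical names."""
    columns = ", ".join(
        f"{physical} AS {logical}" for logical, physical in sorted(mapping.items())
    )
    return f"SELECT {columns} FROM {table}"


# One generated instruction per registered data set.
queries = {table: build_extract_query(table, cols) for table, cols in CATALOG.items()}
```

Because every query returns the same logical column names, the downstream standardization step does not need to know which system a record came from.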

With this personal identity information extracted from its host database and in memory at the Edge, you now have to address the most common issues that make data matching a challenge:

  • Inconsistency
  • Missing data
  • Mistakes
  • Obfuscation

As an example, it is not uncommon to find First Name and Last Name transpositions or the use of a nickname (e.g., Bob vs Robert) in a data set. It is also common to find the month and the day switched for DOB. These types of challenges trip up deterministic matching systems and lead to false negatives (undetected matches). You might be inclined to loosen your matching rules, which then might lead to false positives (over-matching). A thoughtful approach to standardizing the extracted data to account for these types of real-world issues will make a big difference in your ability to control false negatives and false positives.
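A minimal sketch of what such standardization might look like for the three examples above (nicknames, name transpositions, and swapped month and day). The nickname table and the function names are illustrative assumptions, not a description of any particular product.

```python
import unicodedata
from datetime import date

# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def clean(text: str) -> str:
    """Lowercase the text and drop accents, spaces, and punctuation."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if ch.isalpha()).lower()


def standardize_names(first: str, last: str) -> tuple[str, ...]:
    """Map nicknames to formal names, then sort so a First/Last swap gives the same result."""
    names = [NICKNAMES.get(clean(n), clean(n)) for n in (first, last)]
    return tuple(sorted(names))


def dob_candidates(dob: date) -> set[str]:
    """Return the DOB plus its month/day-swapped variant when the swap is a valid date."""
    candidates = {dob.isoformat()}
    if dob.day <= 12:
        candidates.add(date(dob.year, dob.day, dob.month).isoformat())
    return candidates
```

With this sketch, `standardize_names("Bob", "Smith")` and `standardize_names("Smith", "Robert")` produce the same value, so a transposition or nickname no longer causes a false negative. The trade-off is that sorting the names also equates genuinely different people such as "James Thomas" and "Thomas James", which is one reason a match should rest on several attributes rather than on name alone.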

With our goal of Privacy and Risk by design, we simply cannot copy this data to a central location for processing like Master Data Management solutions require. But to achieve our objective of a Customer 360 for Cliff, we need the ability to compare and match records from all four corners of the data landscape. A really clever method to achieve these competing objectives is to de-identify (anonymize) the personal identity information (PII) that has been standardized at the Edge before transmitting it to a central location. This de-identified information must not be re-identifiable or it would fail to support the Privacy and Risk objectives. But if enough of the de-identified attributes for a given Customer record match those of a record from another database or system, we might be able to assert with confidence that the two or more records from different databases are in fact the same Customer.
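One way to picture this is with keyed hashing, shown in the sketch below. This is an assumption chosen for illustration, since the de-identification technique is not specified here: each standardized value is replaced at the Edge by an HMAC-SHA256 token, and only the tokens travel to the central location, where records are compared by counting agreeing attributes. The secret key, the attribute names, and the threshold of three shared attributes are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret key provisioned to every Edge site and never sent to the center.
SECRET_KEY = b"replace-with-a-managed-secret"


def deidentify(record: dict[str, str]) -> dict[str, str]:
    """Replace each standardized value with a keyed hash (HMAC-SHA256) token."""
    return {
        attr: hmac.new(SECRET_KEY, f"{attr}:{value}".encode(), hashlib.sha256).hexdigest()
        for attr, value in record.items()
    }


def is_match(a: dict[str, str], b: dict[str, str], min_shared: int = 3) -> bool:
    """Treat two token sets as the same Customer if enough attributes agree."""
    shared = sum(1 for attr in a.keys() & b.keys() if a[attr] == b[attr])
    return shared >= min_shared


# Standardized records from three systems, tokenized before leaving the Edge.
sfa = deidentify({"name": "robert smith", "dob": "1980-03-04",
                  "email": "bob@example.com", "postal": "10001"})
web = deidentify({"name": "robert smith", "dob": "1980-03-04",
                  "email": "bob@example.com", "postal": "94105"})
other = deidentify({"name": "robert smith", "dob": "1975-11-30",
                    "email": "rs@example.com", "postal": "10001"})
```

Here `is_match(sfa, web)` is true because name, DOB, and email agree, while `is_match(sfa, other)` is false because only two attributes agree. Equal inputs always yield equal tokens, which is what makes central comparison possible. Whether the tokens are truly non-re-identifiable depends on how well the key is protected.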

In Cliff’s situation, if we are able to match and link Customer records from SFA, Fulfillment, and Web Portal, we make it possible to use Age, Postal Code, and Gender from any of these 3 databases without highly complex and expensive projects like MDM or Data Quality enhancement projects. We have unlocked the door(s) for Cliff so he can operate without compromise or sacrifice. 
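Once records are linked, assembling Cliff's view can be as simple as picking each attribute from its best source. The sketch below assumes the matching step has already linked three records to one Customer; the sample values, the `BEST_SOURCE` table, and the `assemble_view` helper are hypothetical.

```python
# Assumption: matching has already linked these three source records to one Customer.
linked_records = {
    "SFA": {"age": 42, "postal_code": None, "gender": "U"},
    "Fulfillment": {"age": None, "postal_code": "10001", "gender": None},
    "Web Portal": {"age": 40, "postal_code": "10001", "gender": "F"},
}

# Preferred source per attribute, as Cliff found during Profiling & Scoring.
BEST_SOURCE = {"age": "SFA", "postal_code": "Fulfillment", "gender": "Web Portal"}


def assemble_view(records: dict, best_source: dict) -> dict:
    """Take each attribute from its preferred source, falling back to any source with a value."""
    view = {}
    for attr, preferred in best_source.items():
        view[attr] = None
        # Try the preferred source first, then the remaining sources in order.
        ordered = [preferred] + [s for s in records if s != preferred]
        for source in ordered:
            value = records.get(source, {}).get(attr)
            if value is not None:
                view[attr] = value
                break
    return view


customer_360 = assemble_view(linked_records, BEST_SOURCE)
```

In this example the view takes Age from SFA, Postal Code from Fulfillment, and Gender from the Web Portal, even though no single system holds the best value for all three.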

The benefits are undeniable: This incarnation of Data Matching goes beyond even advanced Master Data Management solutions because it requires zero integration, zero algorithm development and testing, and zero Data Stewardship. Cliff is feeling pretty good about the richness and trustworthiness of the data available and accessible to him.

Step 8: Idea Management

During the shopping experience, it is quite likely that Cliff will find things that are missing from the shelves, that lack the quality needed for his analysis, or that carry insufficient information to guide objective decision making. It would be a shame if Cliff’s observation were lost and subsequent users encountered the very same limitation or frustration. Cliff can certainly share his observations or suggestions from within the working environment, similar to the way customers of software might use a solution like Aha! to submit an enhancement request. But there are likely hundreds of Cliffs (the Consumers) for every one of those who supply the content. You could be overrun by suggestions very quickly.

What if there was a way for Cliff to communicate the relative importance of his suggestion that could be normalized across the company? Imagine Cliff having a finite number of virtual coins for a given time period. 

Let’s say that Cliff has 1,000 virtual coins for the year. When Cliff enters a suggestion, he can apply some percentage of his remaining coins towards it. The more coins he spends, the more important the suggestion is to him. And imagine the suppliers are monitoring these suggestions. When a suggestion that appears to be relatively straightforward and easy to resolve carries a large number of virtual coins, it is more likely that a supplier will take on that work than a complex and time-consuming task with a small number of coins.
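A small sketch of the coin mechanics described above. The class names, the effort estimate in days, and the ranking by coins per day of effort are illustrative assumptions about how a supplier might weigh value against difficulty.

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    coins: int = 1000  # annual budget from the example above


@dataclass
class Suggestion:
    title: str
    effort_days: float  # hypothetical supplier estimate
    coins: int = 0


def back(consumer: Consumer, suggestion: Suggestion, percent: float) -> int:
    """Spend a percentage of the consumer's remaining coins on a suggestion."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    spent = int(consumer.coins * percent / 100)
    consumer.coins -= spent
    suggestion.coins += spent
    return spent


def rank(suggestions: list[Suggestion]) -> list[Suggestion]:
    """Order suggestions by coins per day of effort, highest first."""
    return sorted(suggestions, key=lambda s: s.coins / s.effort_days, reverse=True)


cliff = Consumer("Cliff")
easy = Suggestion("Add Postal Code definition to the glossary", effort_days=2)
hard = Suggestion("Backfill five years of churn history", effort_days=30)
back(cliff, easy, 20)  # 20% of 1,000 remaining coins
back(cliff, hard, 25)  # 25% of the 800 coins left
ranked = rank([hard, easy])
```

Both suggestions end up with 200 coins, but the easy one ranks first because it offers far more coins per day of effort. Because the budget is finite, every coin Cliff spends on one suggestion is a coin he cannot spend on another, which is what keeps the signal honest.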

In this way, you are incorporating the effect of a Market Economy and using a very natural means of establishing priority and importance. Leader boards (of Suppliers) can be used to feature those who completed the largest amount of high-value work. Virtual coins can be traded in for something of value within your organization: recognition, money, tangible goods, vacation, etc.
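A Supplier leader board can be little more than a running total of the coins attached to completed suggestions. In the sketch below, the supplier names and the completion log are made up for illustration.

```python
from collections import Counter

# Hypothetical log of completed suggestions: (supplier, coins attached).
completed = [("Dana", 200), ("Eli", 50), ("Dana", 120), ("Eli", 300), ("Fay", 75)]


def leaderboard(entries: list[tuple[str, int]], top: int = 3) -> list[tuple[str, int]]:
    """Total the coins each supplier earned and return the top earners."""
    totals: Counter = Counter()
    for supplier, coins in entries:
        totals[supplier] += coins
    return totals.most_common(top)


board = leaderboard(completed)
```

Ranking by coins earned rather than by tasks closed rewards Suppliers for doing the work Consumers value most.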

We are clearly pivoting from a supply-centric model (build the store and stock the shelves) to a demand-centric model (shoppers explicitly or implicitly driving priorities). When Cliff finds value in engaging with your Data Intelligence platform, he will come back often. Cliff will tell his colleagues, and word of mouth and other promotional activities will drive more demand and adoption. The market economy dynamics will propel your organization into a virtuous cycle of improvement where insights beget more insights. Before you know it, your organization will be making data-driven decisions as a habit and productivity will soar.

But before we get too far ahead of ourselves, let’s remember that Cliff does not write code. All Cliff has at this point is some well-conceived instructions. If we really want to democratize the data in a governed and secure way, we need to give Cliff an easy way to request and receive the items in his shopping cart. Come back for Part 4 of the series to see how we make that a reality. Stay tuned.
