Tamr

Version Control for Data Clustering

The Tamr Unify product utilizes machine learning and AI to identify and merge entities from vast datasets and records, creating actionable business insights that were previously unattainable.

The new Version Control feature allows users to track and analyze record clusters over time, enabling them to review all changes to clusters before sharing them with other systems.

How can we build an experience to store and track data cluster changes over time?

Tamr’s entity clusters were previously non-deterministic, making it hard for users to track cluster statistics or review their history. With new stable cluster IDs and a fresh publishing concept, the product requires an updated user experience to view metrics and monitor changes, ensuring high data quality in each new version.

Company

Tamr

Awards

U.S. Patent 11,321,359 B2

My Role

UX Product Design Lead

User Research

Concept Development

Prototyping and User Testing

Discover the core user problems and technological developments

To design the upcoming features, I first wanted to understand how people used the product while the AI and machine learning algorithms were still non-deterministic. I also aimed to uncover users’ issues with the system and how we could better meet their needs.

To conduct this research, I primarily used qualitative research methods via in-person engagements to connect with users and empathize with their experiences.

Engineering partnership and machine learning improvements

Because these features are highly technical, I spent a good deal of time collaborating with the machine learning engineers to understand the technological implications and the capabilities that were on the way. Below are some pictures of our whiteboarding discussions, which often included members of the product and client-success teams as well.

Cluster Differential view

The Cluster Differential View helps users see which records have entered or left a cluster since the last published update. Clusters are listed on the left side of the page, and the selected cluster’s records appear in the table on the right. Blue records indicate additions, while red records indicate removals. Users can click on a record for detailed information about its current and previous entity clusters. They can also access more activity and metadata about each cluster in the right-hand sidebar.
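To make the idea behind the view concrete, here is a minimal sketch of how additions and removals could be computed between the last published version and the current one. It is not Tamr’s implementation: the function name `cluster_diff`, the record and cluster IDs, and the dictionary-based data model are all hypothetical, and the sketch assumes stable cluster IDs so that the same ID refers to the same cluster in both versions.

```python
from collections import defaultdict


def cluster_diff(published, current):
    """Compare two record -> cluster assignments (hypothetical data model).

    Both arguments map a record ID to a stable cluster ID. Returns a dict
    keyed by cluster ID with the records added to and removed from that
    cluster since the published version. Unchanged clusters are omitted.
    """
    diff = defaultdict(lambda: {"added": set(), "removed": set()})
    for record_id in published.keys() | current.keys():
        old_cluster = published.get(record_id)
        new_cluster = current.get(record_id)
        if old_cluster == new_cluster:
            continue  # record did not move
        if old_cluster is not None:
            diff[old_cluster]["removed"].add(record_id)  # shown in red
        if new_cluster is not None:
            diff[new_cluster]["added"].add(record_id)  # shown in blue
    return dict(diff)


# Illustrative data only: r2 moves from c1 to c2, and r4 is a new record.
published = {"r1": "c1", "r2": "c1", "r3": "c2"}
current = {"r1": "c1", "r2": "c2", "r3": "c2", "r4": "c1"}
changes = cluster_diff(published, current)
```

In this example, cluster `c1` gains `r4` and loses `r2`, while `c2` gains `r2`. A record that moved between clusters shows up twice, once as a removal and once as an addition, which is what lets the view show both its previous and its current cluster.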

After reviewing individual cluster changes, users can get a bird’s-eye visualization of all cluster changes over time, with specific details comparing the current version of the clusters against the last published version. This allows users to confidently manage changes that affect many downstream business applications.

Experience Enhancements

In addition to the main cluster publishing experience, I also designed some general upgrades for the page to better suit users’ needs, based on earlier research and my own intuition.

Below are some screenshots of features like dual-panel review (referred to internally as “2—PANES!”), an enhanced filtering experience, and an enhanced searching experience.

U.S. Patent 11,321,359 B2

The team and I applied for a design patent (in addition to a technology patent) for this set of features, and the patent was granted on May 3, 2022. As the lead designer on the project, I was listed as the first author.