
Datavant Announces Unstructured De-Identification Solution

Publish Date: October 23, 2023

Today, we are announcing an Unstructured De-Identification solution that enables our partners to unlock new opportunities for unstructured free text health data.

Unstructured free text represents a critical source of patient insights, but challenges associated with protecting patient privacy often limit data access and lead to lost value. Datavant now provides a solution that combines accurate and specific data redaction with an accelerated Expert Determination, allowing partners to realize the value in their unstructured text data more easily and in a compliant manner.

Unlock New Opportunities for Unstructured Text Data

Datavant’s de-identification solution empowers our partners to share unstructured text data internally and externally, creating new commercial and research opportunities while protecting patient privacy and addressing HIPAA compliance.

The new product is integrated with Datavant’s Privacy Hub, the leader in privacy preservation of health data, which offers advanced privacy solutions powered by an industry-leading team of HIPAA experts. It unlocks the potential of unstructured health data by upholding stringent privacy standards, maintaining the clinical utility of the data, and decreasing the time from data origination to de-identification and usage.

With Datavant’s Unstructured De-Identification Solution, partners can:

  • Access new commercial opportunities for unstructured text:
      • Create new HIPAA-compliant data assets built around unstructured text
      • Differentiate existing data assets with end customers by preserving rich and specific data
  • Enable faster, easier, and wider data handling across their business:
      • Leverage data in internal analysis projects while minimizing privacy risk
      • Unlock new opportunities for “data hungry” AI research and product development

Made possible by:

  • Removing risky identifiers with high accuracy and specificity (more than 99% recall and more than 95% specificity) to maximize privacy protection and preserve clinical utility
  • Accelerating the Expert Determination experience through integration with Privacy Hub’s technology and industry-leading team of HIPAA experts
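
For readers less familiar with the recall and specificity figures quoted above, here is a minimal sketch of how the two metrics are conventionally computed for a redaction system. The token counts and function names are hypothetical and chosen only for illustration; they are not Datavant’s evaluation data or tooling.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of real identifiers that the system redacted."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of non-identifier tokens that the system left untouched."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical token-level counts from a hand-labeled evaluation set.
tp, fn = 995, 5      # identifiers redacted vs. identifiers missed
tn, fp = 9600, 400   # clinical tokens preserved vs. wrongly redacted

print(f"recall      = {recall(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```

High recall means few identifiers slip through, which protects privacy. High specificity means little clinical text is redacted by mistake, which preserves the utility of the data.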

The best way to unlock new opportunities for your free text data is to address HIPAA compliance with Datavant’s Unstructured De-Identification solution.

The Value of Unstructured Data in Healthcare

Today, the majority of patient insights come from structured datasets because, compared to unstructured records, structured data is easier to de-identify, organize, and analyze. However, the majority of patient data is unstructured (some estimates put its share as high as 80%). The result is that unstructured data remains an untapped resource full of potential insights that, if unlocked, could ultimately drive better patient outcomes.

Unstructured text—such as EHR notes, pathology reports, and clinical trial narratives—is a particularly valuable example because it provides contextual insight into patient experience that is not captured in structured data alone. By illuminating the deeper context of specific events in a patient’s history, unstructured text helps accelerate important healthcare use cases, including:

  • Clinical Research & Patient Care: Unstructured text illuminates details crucial to understanding the impact of care interventions. Where a repeat structured diagnosis may indicate that a disease persists despite treatment, for example, visit summaries may show that treatment is working – the patient’s symptoms and quality of life are improving.
  • Artificial Intelligence Model Development: Unstructured text can be used to train better predictive AI models. Models can learn from transcripts, for example, to predict the behaviors or characteristics most likely to influence a patient’s response to therapy.
  • Operational & Administrative Efficiency: Insights from unstructured text can inform resource allocation and enhance billing accuracy. Medical coding, for instance, can be improved by mining unstructured data to better understand patient symptoms and determine the right diagnosis for payer reimbursement.

Using unstructured data to unlock any of these applications requires successfully navigating several challenges. First and foremost among them is ensuring that patient privacy is protected.

That’s why we developed our Unstructured De-Identification solution: to protect patient privacy while unlocking unstructured data value.

Tackling the De-Identification Challenge

Given the sensitivity of healthcare information, protecting patient privacy is imperative. Patient data often must be de-identified to a legal standard before it can be used in research and analysis, but de-identification of free text data under HIPAA’s Expert Determination methodology is uniquely challenging. This is primarily because identifiers in free text are difficult to recognize and expert review is a lengthy process.

Datavant’s Unstructured De-Identification solution addresses the two primary challenges of de-identifying unstructured data:

  • Datavant locates and redacts high-risk identifiers while preserving crucial clinical information. Datavant’s redaction employs an ensemble large language model, fine-tuned on a wide breadth of medical records, to go beyond superficial text replacement, take into account context, and preserve data quality.
  • Datavant accelerates Expert Determination by applying a proprietary, cutting-edge statistical similarity assessment. This statistical comparison enables Privacy Hub experts to rigorously evaluate representativeness between new and prior datasets, expediting their review of the redacted data.
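
To make the idea of a statistical similarity comparison between a new and a prior dataset concrete, here is a minimal sketch. It is not Datavant’s proprietary assessment, which this post does not describe. It substitutes a generic, well-known measure, the Jensen-Shannon divergence between word-frequency distributions, and every function name and sample note in it is a hypothetical chosen for illustration.

```python
import math
from collections import Counter


def token_distribution(texts):
    """Relative frequency of each lowercase whitespace-separated token."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 = identical, 1 = disjoint."""
    divergence = 0.0
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution
        if pi > 0:
            divergence += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            divergence += 0.5 * qi * math.log2(qi / mi)
    return divergence


# Hypothetical redacted notes: a previously reviewed batch and a new batch.
prior_notes = ["patient reports improved symptoms", "no fever reported today"]
new_notes = ["patient reports no fever", "symptoms improved today"]

score = js_divergence(token_distribution(prior_notes),
                      token_distribution(new_notes))
print(f"Jensen-Shannon divergence: {score:.3f}")
```

A score close to 0 suggests the new batch resembles data that has already been reviewed, while a score close to 1 suggests it differs substantially. In any real workflow, that judgment would rest with the reviewing expert rather than with a single number.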

Datavant’s Unstructured De-Identification solution empowers healthcare organizations to unlock the full potential of their unstructured data. Whether partners are using unstructured data to facilitate cutting-edge research and product development or to create new opportunities for commercial data sharing, we’re working across our ecosystem to protect patient privacy and move healthcare towards a more data-driven future.

Privacy Hub by Datavant

Privacy Hub by Datavant is the leader in privacy preservation of health data, offering independent HIPAA de-identification analyses and Expert Determinations as well as advanced technologies and solutions to improve the quality, speed, and verifiability of the compliance process.

Learn about how Privacy Hub can address all your evolving privacy needs.

