
As digital health outpaces regulation, tokenization offers a scalable path to privacy and trust, enabling advancements that support regulatory, patient, and consumer priorities.
Healthcare organizations, life sciences innovators, and many others are working toward a health information ecosystem that weaves together clinical data, insurance claims, and information from wearable devices into a unified, patient-centered network. These initiatives often aim to accelerate interoperability in order to enable breakthroughs in clinical care and research or to reduce healthcare costs.
There is a shared concern that this rapidly expanding digital health landscape, which now includes AI developers, large technology companies, third-party app developers, and other non-traditional health industry participants, is outpacing the regulatory guardrails designed to protect patients and their data.
We’re confronted with a central policy challenge: how do we unlock the immense potential value of health data while addressing valid concerns over patient privacy and data misuse?
Meeting this challenge requires pairing sound regulation with technical innovation.
The first step is to integrate proven Privacy-Enhancing Technologies (PETs) such as tokenization into the nation’s health data infrastructure. In doing so, policymakers can align privacy goals with operational reality, ultimately creating systems that are secure, trustworthy, and interoperable. (Briefly: tokenization replaces direct patient identifiers with irreversible tokens so that records can be linked for care and research without exposing the underlying identifiers; more on this below.)
The current policy conversation around health data privacy is driven by several well-known challenges that undermine consumer trust and create regulatory uncertainty.
Privacy-Enhancing Technologies (PETs) offer a class of technical solutions designed to minimize data exposure while preserving its value for care, research, and public health. Among these, tokenization stands out as a proven and widely adopted method to support privacy-preserving record linkage (PPRL), and is already in use across health systems, research networks, and regulated data exchanges.
Tokenization is a process that replaces sensitive identifiers (such as names, dates of birth, or Social Security numbers) with encrypted, irreversible tokens. These tokens can be used to link patient records across systems without ever exposing the underlying direct identifiers (under HIPAA, these are identifiers of Protected Health Information, or PHI; outside HIPAA, similar identifiers are often referred to as personally identifiable information, or PII). In HIPAA terms, tokenization facilitates creation of data sets that can meet de-identification standards under 45 C.F.R. § 164.514 where appropriate.
This process creates site-specific tokens, meaning a single patient has a unique, encrypted token in each data holder's system. This is achieved by applying a site-specific encryption key to a "Master Token," which is itself created through an irreversible hash of the patient's direct identifiers. The software can be installed and run locally behind an organization's firewall, ensuring sensitive patient information never leaves its secure environment. When cross-organizational data linkage is needed, a controlled token transformation under bilateral approval with both audit logging and least-privilege access enables connection of patients across datasets.
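To make these mechanics concrete, here is a minimal, illustrative sketch in Python. It is not Datavant's patented algorithm: the hashing scheme, the key names, and the idea of deriving a pair-specific "transit token" under a bilaterally approved key are assumptions chosen for illustration, and the controlled transformation step is simplified to a direct derivation from the master token.

```python
import hashlib
import hmac

def master_token(first: str, last: str, dob: str, pepper: bytes) -> bytes:
    """Irreversibly hash normalized direct identifiers into a master token.
    The pepper is a secret held by the tokenization software, so the hash
    cannot be rebuilt from public identifier lists alone."""
    normalized = f"{first.strip().lower()}|{last.strip().lower()}|{dob}".encode()
    return hashlib.sha256(pepper + normalized).digest()

def site_token(master: bytes, site_key: bytes) -> str:
    """Derive a site-specific token: the same patient yields a different
    token at each data holder, so raw tokens cannot be joined across sites."""
    return hmac.new(site_key, master, hashlib.sha256).hexdigest()

def transit_token(master: bytes, pair_key: bytes) -> str:
    """Derive a pair-specific token for one approved exchange between two
    organizations; only records sharing this value can be linked."""
    return hmac.new(pair_key, master, hashlib.sha256).hexdigest()

# Hypothetical keys: each organization runs the software behind its own
# firewall; the pair key exists only for a bilaterally approved exchange.
pepper = b"system-wide-secret"
hospital_key, pharmacy_key = b"key-hospital", b"key-pharmacy"
pair_key = b"key-hospital-pharmacy"

# Each organization tokenizes its own copy of the patient's identifiers
# locally; normalization absorbs differences in case and whitespace.
m_hospital = master_token("Ada", "Lovelace", "1815-12-10", pepper)
m_pharmacy = master_token("ADA ", "lovelace", "1815-12-10", pepper)

print(site_token(m_hospital, hospital_key) == site_token(m_pharmacy, pharmacy_key))  # False: site tokens differ
print(transit_token(m_hospital, pair_key) == transit_token(m_pharmacy, pair_key))    # True: linkable for this exchange
```

In this sketch, neither organization ever shares names or dates of birth; only the derived tokens travel, and the pair-specific token is useful solely within the approved exchange it was created for.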
Datavant’s patented implementation operationalizes these principles by generating consistent tokens for the same individual, enabling high-quality record linkage across disparate systems (e.g., connecting a hospital encounter with subsequent pharmacy fills) without exposing direct identifiers.
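As a simplified illustration of the linkage step (hypothetical data, tokens, and field names, not Datavant's interface), records from two systems that carry the same exchange token can be joined without either side handling the other's direct identifiers:

```python
# Hypothetical rows from two systems, already stripped of direct identifiers
# and keyed by a shared exchange token (see the sketch above).
hospital_encounters = [
    {"token": "tok_a1f3", "encounter_date": "2024-03-02", "diagnosis": "I10"},
    {"token": "tok_9b7c", "encounter_date": "2024-03-05", "diagnosis": "E11.9"},
]
pharmacy_fills = [
    {"token": "tok_a1f3", "fill_date": "2024-03-09", "drug": "example-drug"},
]

# Index fills by token, then join: each hospital encounter is connected to
# subsequent pharmacy fills without revealing who the patient is.
fills_by_token = {}
for fill in pharmacy_fills:
    fills_by_token.setdefault(fill["token"], []).append(fill)

linked = [
    (encounter, fill)
    for encounter in hospital_encounters
    for fill in fills_by_token.get(encounter["token"], [])
]
print(linked)  # one linked encounter/fill pair for the shared token
```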
Datavant is tokenizing data in more than 270 clinical trials, including Phase I and Phase II studies, enabling long-term follow-up, real-world evidence generation, and regulatory applications without exposing PHI/PII.
To build a digital health system that is both trusted and resilient, federal policy must establish clear expectations for the use of PETs.
The goal is not simply to promise privacy, but to deliberately engineer it into the systems we build. U.S. policymakers should collaborate with technical experts and standards bodies to formally integrate tokenization and other proven PETs into federal health data frameworks. Implementation guidance (e.g., procurement criteria, grant conditions, and technical profiles) should specify required tokenization capabilities and controls. This will translate the nation’s privacy principles into enforceable, operational safeguards.
By doing so, we can ensure the United States builds a system that is secure, trustworthy, and prepared for the future.
Learn more about Datavant’s approach to maximizing data utility while protecting patient privacy.
Looking to map the full patient journey, optimize commercial data spend, boost adherence, or reduce never starts?
Our experts partner with life sciences organizations to compliantly connect disparate datasets and unlock actionable insights.
We'll tailor a session to your goals and explore how connected data drives better patient outcomes and stronger commercial performance for you.
Let’s talk about how to connect your data and unlock its full potential.
Explore how Datavant can be your health data logistics partner.
Contact Us