Real-World Data Overview
Both real-world data (RWD) and real-world evidence (RWE) have many valuable uses in healthcare, including recruiting patients for clinical trials, comparing drug efficacy, and monitoring drug safety. However, it is important to understand the key differences between these two concepts.
What Is Real-World Data?
Clinical trials seek to answer specific questions in a controlled environment to earn regulatory approval of an investigational drug or device. Typically, the information gathered from these trials lacks data from a real-world environment. Therefore, post-market studies are often critical to understanding patient adherence and clinical efficacy in the real world, outside of a controlled clinical study.
The pharmaceutical industry has traditionally used randomized controlled trials when seeking approval of therapies, but the U.S. Food and Drug Administration (FDA) has been developing a framework and issuing guidance to support the use of real-world evidence—enabled by RWD—in regulatory decision-making. Additionally, technological innovations that protect patient privacy have expanded the possible sources of real-world data available to researchers immensely.
RWD comes from a variety of sources outside of traditional clinical trials. Researchers routinely collect patient data from sources such as:
- Claims and billing activity
- Electronic health records (EHRs)
- Patient-reported outcomes
- Disease or product registries
- Biometric monitoring sources such as pedometers and smartwatches
Health data from these sources can provide a more comprehensive picture of the patient journey and experience, and can even provide an overview of population health. Although RWD can enable clinical evidence, it’s essential that the data is fit-for-purpose (i.e., relevant, valid, and reliable), which is determined based on the research question.
How Does Real-World Data Differ from Real-World Evidence?
The terms real-world data and real-world evidence are often used interchangeably, but they describe two different concepts. RWE is derived from the analysis of RWD and can provide valuable information about the risks, benefits, and use of a therapy. Real-world evidence can help accelerate the approval of new therapies, especially in oncology.
What Is the Value of Real-World Data and Real-World Evidence?
By using sources of patient health data, researchers can evaluate therapies in a larger population, in real-world conditions, and at a lower cost than with typical clinical trials.
RWD has the potential to provide information about a more diverse population than the typical clinical trial participants. Therefore, researchers can get valuable efficacy and safety information on a more representative population than they can from a randomized clinical trial.
RWE provides a more comprehensive view of how a therapy will work in a real-world setting. Researchers can evaluate the therapy while factoring in other variables such as comorbidities, demographic groups, and age groups, among other parameters. Most importantly, RWE helps researchers develop a better understanding of the long-term use of the therapy beyond the clinical trial period.
Example Use Cases of RWD and RWE
Two common use cases for RWD and RWE are supporting regulatory submissions and informing patient treatment plans.
RWE can help support regulatory requirements to expand on a therapy’s indication without performing a full additional clinical trial. For example, if a product is often prescribed for off-label conditions, companies may use RWD to study patient outcomes and therapy safety and then submit this information to regulators for market authorization.
Healthcare providers can use RWD and RWE to better inform a patient’s treatment plan, procedures, tests, and prescriptions, and these data may help develop practice guidelines. For instance, during the beginning of the COVID-19 pandemic, public health officials needed to rapidly evaluate and share information on the prevention and treatment of COVID-19. Much of the information gathered during this time leveraged RWD.
Real-World Data Ecosystem
Numerous types of patient data gathered from multiple sources can be useful for generating RWE. To develop a deeper understanding of how RWD can be used—and how it can add value in healthcare—examining key data types and sources can be instructive.
Real-World Data Types
Health data can be pulled from numerous sources to provide valuable insights into the patient journey.
Claims data results from the processing of a healthcare claim. There are two types of claims data: open and closed. Open claims datasets come from claims clearinghouses or providers’ revenue cycle management systems. They cover a large number of patient lives, but may not represent complete claims coverage for a given patient.
Closed claims come from health insurance plans or self-insured employer groups. They tend to cover a smaller scale of patient lives, but represent complete claims coverage for a given patient during the time that patient was on the insurance plan or worked at the employer. Claims data is longitudinal in nature and captures a long period of the patient journey, but it does not have as much depth of clinical detail about a particular medical encounter as other data types.
Because closed claims datasets are very comprehensive, they prove ideal for health economics and outcomes research (HEOR) that considers a patient’s journey, resource utilization, and the economic burden of their condition. Open claims datasets prove less useful for HEOR, due to their incompleteness for a given patient. Given the large scale of patients covered as well as lower data latency, open claims datasets can prove useful for marketing use cases.
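The completeness property of closed claims can be illustrated with a small sketch. It assumes a closed claims dataset that ships with enrollment spans for each patient; the `Claim` and `Enrollment` types, their field names, and the sample values are all hypothetical and not taken from any particular data provider. The idea is to keep only claims that fall inside a period of known coverage before computing resource utilization, since those are the periods in which the claims history can be treated as complete.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    patient_id: str
    service_date: date
    allowed_amount: float


@dataclass(frozen=True)
class Enrollment:
    patient_id: str
    start: date
    end: date  # inclusive end of the coverage span


def claims_in_enrollment(claims, enrollments):
    """Keep claims whose service date falls inside one of the patient's enrollment spans."""
    spans = {}
    for e in enrollments:
        spans.setdefault(e.patient_id, []).append((e.start, e.end))
    return [
        c for c in claims
        if any(start <= c.service_date <= end for start, end in spans.get(c.patient_id, []))
    ]


def total_cost_by_patient(claims):
    """Sum allowed amounts per patient as a simple utilization measure."""
    totals = {}
    for c in claims:
        totals[c.patient_id] = totals.get(c.patient_id, 0.0) + c.allowed_amount
    return totals


# Hypothetical sample data
claims = [
    Claim("p1", date(2022, 3, 1), 120.0),
    Claim("p1", date(2023, 6, 15), 80.0),  # dated after p1's coverage ended
    Claim("p2", date(2022, 7, 9), 200.0),
]
enrollments = [
    Enrollment("p1", date(2022, 1, 1), date(2022, 12, 31)),
    Enrollment("p2", date(2022, 1, 1), date(2022, 12, 31)),
]

covered = claims_in_enrollment(claims, enrollments)
print(total_cost_by_patient(covered))
```

An open claims dataset typically has no equivalent enrollment table, which is one way to see why it is harder to use for studies that need a complete per-patient history.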
Claims data is even more powerful when used in tandem with clinical metrics such as lab data, EHR data, or patient-reported outcomes. The combination of these data sets can provide deeper insight into symptoms, disease progression, and clinical outcomes.
Laboratory and Genomics Data
Lab testing data proves valuable for a variety of use cases in healthcare analytics, from market sizing to monitoring disease progression to finding biomarker signals of patients eligible for certain therapeutics. Lab data can provide a deep point-in-time clinical and biochemistry profile of a patient, but isn’t as longitudinal as claims data.
Genomics data is a specialty area of lab testing that is growing in popularity within healthcare analytics, given the increase in biomarker-targeted therapeutics. It is useful both in clinical development, where scientists may use it to inform biomarker selection, and in commercial settings, where genomic results may feed a predictive model that finds patients eligible for a biomarker-targeted therapy.
Pharmacy data provides information about which therapies patients have been prescribed and filled at a pharmacy. It can give insight into how therapies change over time. Pharmacy data proves extremely useful for specialty drugs, which now account for approximately 75 percent of prescription drugs in development. A network of specialty pharmacies, contracted by pharmaceutical manufacturers, typically distributes specialty drugs. The manufacturer will then aggregate real-world data from specialty pharmacies to understand real-world prescribing, dispensing, and medication adherence patterns.
Electronic Health Records (EHR)
As patients move throughout the health system, valuable real-world data is collected as part of their electronic health records (EHRs). EHR data contains richer clinical detail than claims data, but a patient may visit many providers across different care settings, each using a different EHR system. As a result, a single EHR source rarely captures a patient’s complete record. EHRs contain data on appointments, medical history, diagnoses, symptoms, medications prescribed, labs, and chart notes. These data are important for gaining a more granular understanding of clinical patient outcomes.
Even though EHR data is valuable, it requires significant curation and cleaning because much of the valuable information may reside in unstructured physician notes.
Generally, the information recorded as part of a patient’s EHR—whether they’re in an inpatient setting, outpatient setting, or a specific therapeutic area—includes:
- Procedures performed
- Vital signs
- Laboratory results
- Medication orders
- Medications administered
- Patient surveys or questionnaires
- Surgical care information
- Social history, such as smoking status
Note that electronic health records on their own may not contain all of the necessary RWD, so researchers may be required to seek additional sources of data.
RWD plays an important role in answering a variety of research questions surrounding cancer. One of the top priorities in research is generating accurate evidence on the efficacy of cancer prevention, diagnosis, and treatment in a real-world setting.
Researchers often study cancer treatments in a select population in a clinical trial setting. However, researchers can also collect and analyze real-world oncology data to provide RWE on the efficacy and tolerability of new treatment methods in the real world. The main sources of real-world oncology data include:
- Specialty data providers and networks
Each state legally mandates a central cancer registry, which provides a census of all patients who have cancer within a defined geographic area. Because of this population-wide coverage, and because they capture detailed exposure information (such as diet or physical activity) and patient-reported outcomes, these registries provide unique information that does not depend on a selected sample of patients.
Limitations of cancer registry data include a lack of information on long-term treatment and on outcomes other than survival. Addressing these limitations requires new initiatives such as linking registry data with data from other organizations. These initiatives, as well as real-time access to pathology reports, provide opportunities to supplement the understanding of therapeutic advances and impact outside of clinical trials.
Recently, researchers have increased the demand for consumer data. This information can provide additional context about a patient population, such as:
- Socioeconomic status
- Languages spoken
These data come from consumer data companies and have traditionally been used for targeted marketing. Note that consumer data companies only have data on adult consumers.
Social Determinants of Health (SDOH)
Social determinants of health (SDOH) are the conditions in which people are born, work, live, play, age, and worship. These have a large impact on people’s health, quality of life, and functioning. Some examples of social determinants of health include:
- Polluted water and air
- Access to healthy, nutritious foods
- Physical activity opportunities
Social determinants of health contribute to health inequities and disparities. For example, those who don’t have access to grocery stores that carry healthy foods may not have good nutrition, which can lead to obesity, diabetes, and heart disease. Data on SDOH can provide insights to address health disparities and health equity.
Real-World Data Sources
Different types of data providers are relevant for different situations. Three categories of data providers exist: data platforms, data aggregators, and data originators.
Data platforms provide a technology platform with intuitive user interfaces (UI) for analyzing data within the platform. These companies have data science and data engineering teams that clean and standardize the continuous streams of data coming into the platform. Combined with the UI layer, the resulting data can be considered user-ready.
In many cases, the platform provides limited ability to export data for use. Working with a data platform is best for companies without data analytics or data engineering capabilities.
Data aggregators offer cleaned and standardized data that have been aggregated from many underlying sources. Typically, a technology platform or user interface overlay doesn’t exist, and the data are available to license as a one-time or continuous data feed.
These data are analytics-ready. Companies working with a data aggregator need data analysts or business intelligence analysts who can turn the data into analyses, but they do not need sophisticated data engineering capabilities to clean and standardize the data.
Data originators are closest to the source. They have the most granular and detailed data, but they do not clean it. These data require sophisticated data engineering before they can be analytics-ready.
Real-World Data Solutions Providers
The RWD ecosystem includes both real-world sources, as described above, as well as solutions providers that have built analytic and workflow solutions on top of real-world data. Many platform companies are also solutions companies, having built specific data views and analytic tools that provide solutions for specific use cases.
Common commercial analytics and clinical development solutions built on top of real-world data include:
- Specialty pharmacy aggregation: These companies aggregate specialty pharmacy data on behalf of pharmaceutical manufacturers to monitor therapy launches. Specialty drug data is proprietary data of pharma companies that may link to other real-world data such as claims for a longitudinal view of the patient journey.
- Outcomes and patient journey: These companies enable outcome studies and patient journey research. Many of these companies build their solution on top of aggregated and linked claims data to enable a comprehensive view of patients as they move through the healthcare system.
- Commercial triggers: These companies provide triggers to commercial teams at pharmaceutical companies to alert them when a patient eligible for a specific therapy sees their provider, so sales teams can be deployed to the provider’s office for education on the relevant disease or therapy. Use of this solution is especially common in rare diseases since providers are often unaware of the rare disease and patients can be hard to diagnose.
- Digital marketing: These companies identify relevant patients and providers and then serve up digital advertising to educate them on a disease or therapy.
- Commercial analytics and insights: These companies provide aggregated data, usually claims or EHR, to help commercial teams with strategy and insights before and post-launch.
- Trial recruitment: These companies use aggregated data—typically EHR, lab, and claims data—to identify the ideal clinical trial sites that have sizable populations of patients who would meet inclusion/exclusion criteria for a trial.
- Synthetic control arms: Synthetic control arms, also known as external control arms, are studies in which real-world data is utilized as the control arm rather than enrolling actual patients into a control arm where a placebo or standard of care (SOC) is utilized.
This is popular in disease states where patient populations are increasingly sub-stratified by biomarker status (e.g., oncology, rare disease), given the challenges of recruiting enough patients as well as the ethical considerations of placing patients on placebo or standard of care (SOC). Companies that provide these solutions often have deep and highly curated clinical and genomic data to conduct synthetic control arms. Synthetic control arms lower trial costs, increase efficiency, and increase the speed of therapies to market.
- Decentralized clinical trials: These companies provide technology infrastructure to collect data and support decentralized clinical trials (DCTs). DCTs are trials in which patient communication and data collection have been moved away from a traditional clinical trial site. Instead, remote and digital technologies are used to communicate with study participants and collect their data.
Major Use Cases for Real-World Data in the Healthcare Industry
RWD brings a lot of value to different organizations in the healthcare industry, from life sciences to payers to public health agencies.
Biopharmaceutical organizations can use RWD across the entire drug development lifecycle, from pre-clinical to clinical development to commercial planning and post-marketing monitoring.
In the pre-clinical and clinical development settings, organizations can use RWD for:
- Biomarker selection (pre-clinical)
- External control arms (clinical development)
- Long-term follow up
- Confirmation of patient medical history for clinical trial enrollment
- Additional analysis of patients’ social determinants of health
During the commercial phase, organizations can use RWD for:
- Market access strategy
- Salesforce planning
- Monitoring launch effectiveness
- Drug efficacy comparisons
- Evidence generation to support reimbursement
- Commercial targeting
Payers can use RWD to:
- Assess and validate value-based contracts
- Improve risk adjustment calculations
- Develop a holistic, longitudinal view of applicants
Payers often use RWE to inform comparative efficacy in a real-world setting after a drug has launched to validate coverage. According to recent research, approximately 85 percent of pharmacy administrators reported using RWE to make formulary decisions in oncology for comparative efficacy when clinical trial data wasn’t available.
At the individual patient level, real-world data and real-world evidence are often used for decisions on:
- Test orders
- Prescriptions for patients
RWD and RWE can help healthcare providers create targeted treatment plans for patients. Within the larger hospital system, providers may use them to inform the creation of practice guidelines and the further adoption of these guidelines.
Data and Analytics
Healthcare is becoming more digital due to innovative technologies and the demand for real-world data. Organizations increasingly depend on RWD and RWE to develop analytics, machine learning, and artificial intelligence (AI) applications.
For example, RWD can:
- Train AI models and predict populations at risk of a particular disease
- Identify better treatments
- Understand patient prioritization
- Improve marketing precision
- Understand patient behavior
Clinical Research Networks
Clinical research is the process of determining the efficacy and safety of new treatments. It should directly impact patient care by bringing insights from “bench to bedside.” Clinical research networks aim to have greater impact by collaborating across many health institutions, either regionally or nationally, to promote clinical research. For example, they can build large, diverse patient pools with RWD to enhance patient recruitment.
RWD has proven instrumental for understanding the efficacy and safety of treatments for COVID-19 as well as post-acute sequelae of SARS-CoV-2 infection (PASC), known colloquially as “long COVID.” Creating a national network for COVID research has produced scientific and operational efficiencies and has led to faster discoveries and improvements that make a difference in people’s lives.
In government applications, RWD and RWE provide benefits for regulatory agencies such as the FDA and the European Medicines Agency (EMA). RWD and RWE can be employed alongside randomized clinical trial evidence for post-market safety monitoring, adverse event signal detection, and marketing authorization.
In 2008, the FDA adopted the Sentinel Initiative to assess approved product safety by integrating nationwide claims and EHR data. The FDA is the primary user of this system, but the system also provides valuable information to researchers and biopharmaceutical companies.
RWD also provides value in the regulatory approval process, in addition to monitoring side effects. For example, physicians can prescribe therapies off-label in the U.S., but the regulatory label of the therapy in question can determine coverage decisions and even how many patients will be able to receive treatment.
Due to RWD, the FDA is increasingly expanding regulatory labels to allow more patients to receive treatment. For example, a therapy initially approved only for women with ER+/HER2- breast cancer was later approved for use in men because of patient outcome data reported in that patient population’s EHRs.
Challenges of Real-World Data
As important as RWD is, it also presents a number of challenges. A plethora of patient data exists, but before researchers can use and analyze it, the data must be de-identified for patient privacy. Additionally, because RWD must be fit-for-purpose, finding the right, relevant data for the applicable use case can prove challenging.
Navigation of the Expanding Data Landscape
The expanding availability of data is creating the demand for additional data, especially specialized data. As more data becomes available, it opens up the possibility of more comprehensive analyses.
The availability of more specialized data doesn’t mean it’s the right data. The right data has become increasingly difficult to find, and finding the right data partner presents a bottleneck to data-sharing.
It’s essential to provide partners in the healthcare ecosystem with the necessary data-sharing tools. Then, it’s essential to show data users where to find the necessary data for their particular use. Assessment tools that better facilitate data exploration, segmentation, and overlap comparison help with analyzing and sharing data.
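An overlap comparison of the kind mentioned above can be sketched in a few lines. The example assumes both datasets already carry a shared, de-identified patient token, which is an assumption of this sketch and not something every dataset provides. The function name `overlap_report` and the token values are hypothetical.

```python
def overlap_report(tokens_a, tokens_b):
    """Summarize how many de-identified patient tokens two datasets share."""
    a, b = set(tokens_a), set(tokens_b)
    shared = a & b
    return {
        "only_a": len(a - b),
        "only_b": len(b - a),
        "shared": len(shared),
        # Share of each dataset's patients that also appear in the other one
        "share_of_a": len(shared) / len(a) if a else 0.0,
        "share_of_b": len(shared) / len(b) if b else 0.0,
    }


# Hypothetical tokens from a claims dataset and a lab dataset
claims_tokens = ["t1", "t2", "t3", "t4"]
lab_tokens = ["t3", "t4", "t5"]

report = overlap_report(claims_tokens, lab_tokens)
print(report)
```

A report like this lets a prospective data user judge, before licensing anything, whether a second dataset adds enough coverage of the patients they already have.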
As data continues to be generated, it also introduces new patient privacy risks, resulting in demands for data that protects patient privacy while maintaining transparency and accountability. The key? An ecosystem of technology companies that allow data management, governance, and data application.
Data-sharing technologies are an unmet need in real-world data. Ideally, every data partner should have control over their records as well as confidence in the integrity of their data. Data providers and users each want something different. Providers want to keep their competitive advantage and keep data independent of their competitors and peers, while users want to easily analyze data without being tied to a specific provider. Both providers and users want transparency.
The solution is a trusted third-party data enclave that doesn’t buy or sell data and has security and privacy as the highest priorities. Collaboration is key to seamless partnership across the RWD ecosystem.
Data Standardization and Quality
RWD is often incomplete and non-standard. Data is collected in many different formats. Many standard data models exist, but they apply to different types of data. This results in data recipients spending many resources to standardize the datasets before they can be analyzed.
Gaps in real-world data affect the accuracy of downstream analyses because recipients can’t know whether the data reflects the entire patient journey. When an outcome is absent from the data, it’s unclear whether it didn’t occur at all or whether the data simply didn’t capture it.
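A simple first check on incompleteness is a per-field completeness profile. The sketch below assumes records arrive as dictionaries and treats `None` and empty strings as missing. The field names and values are hypothetical. Note what it cannot do: a missing value still does not say whether the event never happened or was never recorded.

```python
def field_completeness(records, fields):
    """Return, for each field, the share of records with a non-empty value."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


# Hypothetical records with gaps in diagnosis codes and lab values
records = [
    {"patient_id": "p1", "diagnosis_code": "E11.9", "hba1c": 7.2},
    {"patient_id": "p2", "diagnosis_code": "E11.9", "hba1c": None},
    {"patient_id": "p3", "diagnosis_code": "", "hba1c": None},
    {"patient_id": "p4", "diagnosis_code": "I10", "hba1c": 6.1},
]

profile = field_completeness(records, ["diagnosis_code", "hba1c"])
print(profile)
```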
However, as more and more data is generated, the opportunity exists for companies to collaborate on data cleaning, harmonization, and imputation.
Most data used for analysis today is structured data, which has a standardized format and follows a defined order. As the health data ecosystem expands, however, much of the data coming online is unstructured data.
Unstructured data has enormous potential. For example, clinician notes may better describe a patient’s medical history or quality of life. Genomic sequencing may provide better insights into the benefits of precision medicine.
Acquiring the right data proves challenging because of the difficulty in determining which information is relevant to the use case. Additionally, deriving insights from unstructured data can prove difficult because it requires complex programs to process. An unmet need exists for technologies with the capability to apply data inputs to unstructured data.
Patient Privacy and Data Utility
Patient privacy is essential to the use of RWD, but ensuring data utility may prove challenging. The key? Finding a balance between not compromising privacy and maintaining utility. The choice between de-identifying patient data through safe harbor versus expert determination depends on research objectives, patient privacy, and business needs.
Expert determination is generally ideal when striking a balance between utility and privacy because of the flexibility it offers. For example, experts may recommend redaction, removal, or modification of specific identifying data elements in the data set, whereas safe harbor requires removing a fixed set of 18 categories of identifiers.
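To make the fixed-rule nature of safe harbor concrete, here is a minimal sketch that applies a few of its rules to a single record: dropping direct identifiers, truncating ZIP codes to three digits, reducing dates to the year, and grouping ages over 89. It covers only a handful of the 18 categories and is not a compliant implementation. The field names, the `DIRECT_IDENTIFIER_FIELDS` set, and the `restricted_zip3` input (three-digit ZIP prefixes for sparsely populated areas, which must be replaced with 000) are assumptions made for illustration.

```python
from datetime import date

# Hypothetical field names for direct identifiers; real schemas will differ
DIRECT_IDENTIFIER_FIELDS = {"name", "ssn", "email", "phone", "medical_record_number"}


def safe_harbor_sketch(record, restricted_zip3=frozenset()):
    """Apply a small, illustrative subset of safe harbor rules to one record."""
    # Drop fields that directly identify the patient
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}
    # Keep only the first three ZIP digits; low-population prefixes become "000"
    if "zip" in out:
        zip3 = str(out.pop("zip"))[:3]
        out["zip3"] = "000" if zip3 in restricted_zip3 else zip3
    # Keep only the year of a date
    if "service_date" in out:
        out["service_year"] = out.pop("service_date").year
    # Group all ages over 89 into a single category
    if "age" in out and out["age"] > 89:
        out["age"] = "90+"
    return out


# Hypothetical record
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "02139",
    "service_date": date(2023, 5, 17),
    "age": 93,
    "diagnosis_code": "C50.911",
}

deidentified = safe_harbor_sketch(record)
print(deidentified)
```

Every record is treated the same way regardless of the research question, which is the trade-off against expert determination: the rules are predictable, but detail such as exact service dates is lost even when a study needs it.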
De-identification’s primary disadvantage is its time-intensive nature. It can often take months to complete, but the right expertise and technology can greatly accelerate the process while ensuring the data is fit-for-purpose.
The Future of Real-World Data
Expanding technology and changes in regulations provide plenty of opportunities for the expansion of RWD and quicker de-identification for patient privacy.
New Real-World Data Use Cases
A future use case of real-world health data involves advancements in genomics testing. In oncology, radiographic features may help predict the genomic profile of a tumor. Clinicians could use this insight in real time to reach a correct diagnosis and start the patient on the appropriate treatment.
This data advancement improves precision medicine and helps deliver more meaningful insights. However, the need to maintain patient privacy remains. Using advanced methods such as deploying synthetic data could meet patient privacy needs. Genomics data doesn’t just benefit oncology—it could lead to advancements in treatments for rare diseases and other therapeutic areas as well.
New Types of Data Available
New policies, scientific discoveries, and innovative healthcare technologies have increased the variety and volume of available health data.
Genomic sequencing combined with the increase of biomarker-specific drugs has led to increased genetic testing. Wearable technology and health apps collect data about users’ heart rates, steps taken, geo-locations, and more. Air quality, climate, and weather are also becoming more influential data points. For example, weather data can help predict the severity of an allergy season, the spread of a pandemic, and even flu prevalence.
All of these factors have led to the rapid growth of RWD. In fact, healthcare data is growing faster than data from any other industry.
Recent Trends with RWD
The increase in health data has brought about the increased use of RWD and RWE in healthcare. Some recent health data trends include:
Demographics and SDOH
Government, health systems, and life science researchers are striving to understand the reasons for disparities and worse patient outcomes in vulnerable populations.
Growing Use of Genomics Data
Data science, machine learning, and artificial intelligence are empowering scientists to tackle questions once left unanswered—such as whether genetic alterations cause health conditions and whether lifestyle or demographic considerations are relevant to a disease. The ability to derive meaningful insights and patterns from genetic sequencing data and other health data is opening new paths for future progress in the field.
Growing Acceptance of Real-World Data
FDA real-world data guidance documents continue to be released, pointing to the growing acceptance of RWD for regulatory decision-making.
Registries capture specific variables related to various conditions, and these variables are validated to a higher standard than EHR data. This makes registry data well suited to clinical trials and to preparing data for regulatory submissions.
Patient-Reported Outcomes Data
Patient-reported outcomes data offers a direct view of the patient experience.
Data on COVID-19 Vaccinations and Variants
Since the beginning of the pandemic, information on long COVID and the lingering impacts of COVID-19 on a person’s health has been in high demand.
Rare Disease Data
Patients with rare conditions often see numerous specialists and receive specialty drugs, so the patient journey may be fragmented and data spread amongst many partners.
Linking Proprietary Data with Real-World Data
To gain a comprehensive understanding of patient health, a recent trend in healthcare is to link proprietary pharma company data—such as clinical trials, aggregated specialty drug data, and disease or device registries—with real-world data offered by commercial data providers.
Tokenizing every clinical trial and health economics and outcomes research study can enable expedited partner identification for multiple studies at the same time.
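The general idea behind tokenization can be sketched as a keyed hash over normalized identifying fields, so that the same patient yields the same token in two datasets without the identifying fields being shared. This is an illustrative assumption about how such linkage can work, not a description of any vendor’s actual method. Production systems handle key management, name variants, and re-identification risk far more carefully. The function `make_token`, the choice of input fields, and the demo key are hypothetical.

```python
import hashlib
import hmac


def make_token(first_name, last_name, birth_date, secret_key):
    """Derive a stable, non-reversible token from normalized identifying fields."""
    # Normalize so that spacing and letter case do not change the token
    normalized = "|".join(
        part.strip().lower() for part in (first_name, last_name, birth_date)
    )
    # A keyed hash (HMAC-SHA256) means the token cannot be recomputed without the key
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"demo-key-not-for-production"  # hypothetical; a real key would be kept secret

# The same patient as recorded in a trial dataset and in a claims dataset
trial_token = make_token("Ana", "Silva", "1980-02-14", key)
claims_token = make_token(" ana ", "SILVA", "1980-02-14", key)

# A different patient (different birth year) gets a different token
other_token = make_token("Ana", "Silva", "1981-02-14", key)

print(trial_token == claims_token)
print(trial_token == other_token)
```

Once both datasets carry tokens produced the same way, they can be joined on the token column alone.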
Partner with the Largest Health Data Ecosystem
RWD helps accelerate and inform decisions that improve patient outcomes. New data, advancing technologies, and novel RWD use cases are leading healthcare to new frontiers.