Elevating Patient-Centered Research with Registry Data: Insights from Angela Dobes of Crohn's & Colitis Foundation

Publish Date: February 29, 2024
Angela Dobes, Senior Vice President of IBD Plexus at Crohn's & Colitis Foundation

In our Ecosystem Explorer Series, we interview leaders from organizations that are advancing access to health data. Today’s interview is with Angela Dobes, Senior Vice President of IBD Plexus at Crohn's & Colitis Foundation.

Angela is the Senior Vice President of IBD Plexus, a revolutionary platform that redefines how researchers study inflammatory bowel disease to advance precision-medicine strategies and cut years off the R&D timeline. She strives to elevate the important role registries and biorepositories play in the pursuit of high-value research, and her work has been recognized nationally by organizations such as the National Academies. Prior to her experience at the Foundation, Angela worked for clinical technology and pharmaceutical organizations, where she led implementation of various technology solutions focused on R&D optimization and accelerating the delivery of new therapies to patients safely.

Introduction to registry data

Welcome to the Ecosystem Explorer Series! To set the stage, what are patient registries, and how is registry data used in healthcare research?

A registry is a specialized research infrastructure that enables the collection of data about a defined population in a standardized format for scientific, clinical, or policy purposes, fostering new knowledge and innovations. A population can be defined by a particular disease, by an exposure to a health care service or product, or by a community, for example, a community that has been marginalized from health care.

Registries fulfill a broad range of purposes across public health and medicine. As registries evolved from siloed, stand-alone efforts to connectors of prospective research and real-world data, I have been fortunate to see how the acquisition of longitudinal, multimodal data fused with broad applicability gives registries a unique vantage point to accelerate answering many key questions across the research and development continuum. When built collaboratively, registries unite key stakeholders such as researchers, health care providers, and importantly, patients and caregivers who are all motivated by their common drive towards better care and treatments.

I’m excited to share more about the work the Crohn’s & Colitis Foundation does in the registry space and want to thank Datavant for giving me the opportunity to elevate the important role registries play in the pursuit of high-value and high-quality patient-centered research.

It’s our pleasure, and thanks for the overview! Can you tell us a little more about the Crohn's & Colitis Foundation and your involvement with registries?

The Crohn’s & Colitis Foundation is the leading nonprofit organization focused on both research and patient support for inflammatory bowel disease (IBD), with the mission of curing Crohn’s disease and ulcerative colitis and improving the quality of life for the millions of Americans living with IBD.

As the hub of IBD for over 50 years, we have a deep understanding of the current and emerging state of IBD. We believe the medicine of the future will be personalized, evidence-based, comprehensive and accessible. To accelerate this vision, the Foundation built IBD Plexus®, a data platform and research ecosystem, to collect longitudinal, multimodal data from a representative, diverse patient population to capture a comprehensive picture of disease and facilitate next-generation discoveries that can scale to reach patients' everyday lives.

IBD Plexus efficiently connects datasets across medical settings (clinics, hospitals, and labs) and the home through longitudinal research and quality studies, creating important opportunities to connect not only research and care but also the patient experience. These data points, integrated across decades of patient journeys and linked to biosamples, are transforming research into life-changing precision-medicine strategies.

That’s a bold mission and certainly not an easy task. There are many different types of health data – clinical trial data, real-world data from EHRs and labs, consumer data and social determinants of health, and so on. What makes registry data uniquely valuable for healthcare research?

In a world where gold standard clinical trial infrastructure does not equally fit the needs of patients, it is important to expand the evidence generation toolkit. To get better patient outcomes, researchers need better data. Registries are often established to fill a data void and unlock answers to multiple research areas of interest such as unmet medical needs, hard-to-study sub-populations, and long-term drug safety and efficacy.

Patients and clinicians need a better playbook to identify the most effective, appropriate treatment – which is then also accessible. For chronic diseases like IBD that follow a lifelong, unpredictable course of flare-ups (periods of temporary, intense symptoms), advanced data collection methods are required to close the feedback loop between treatment and its effect. Flares do not only show up at the clinic. It is critical to collect data that reflects the patient experience to add depth and context to clinical and functional outcomes.

In addition, since the FDA does not require an understanding of the mechanism by which the drug or biologic acts, registries that collect multimodal molecular data are critical to better understanding the etiology of disease. This biological understanding is the first step to breaking the current therapeutic ceiling through developing new products and better using our current treatment options.

The 21st Century Cures Act marked a transformational shift toward patient-centered research to rethink evidentiary requirements to serve the needs of patients and better inform clinical practice. There’s never been a more exciting time for registries!

The applications and impact of registry data

What are some applications of both de-identified and identified registry data?

When sharing patient-level data with researchers, data must meet HIPAA de-identification requirements. However, when done in a transparent and secure way, there are benefits for registries to have access to both de-identified and identified patient-level data. When designing a registry, it is important to think about its purpose and the questions the registry is designed to answer to select the right infrastructure to safeguard data while developing the necessary consent language to permit authorized use of identified and de-identified information.

For IBD Plexus, participants’ consent allows the Foundation to house both identified and de-identified data within IBD Plexus. Release of de-identified data to qualified researchers across academia and industry is governed by a data use agreement which authorizes what can and cannot be done with the data. Access to identifiable data allows the Foundation to link data across multiple data sources and research studies that the Foundation manages and to recontact patients for additional activities such as participation in a future research study, including connecting patients to a clinical site that is participating in a clinical trial of interest.

Importantly, we believe the future of research data is longitudinal and multimodal. Patients whose data is housed in IBD Plexus also consent to allow the Foundation to link to external datasets and to create tokens that link to additional data, such as claims data available in the Datavant ecosystem. The value of data linkage through mechanisms such as tokens is more than additive; it is compounding. The longer and deeper the data is collected, the more valuable the data becomes for research.
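To make token-based linkage concrete, here is a minimal sketch, assuming a keyed hash (HMAC-SHA256) over normalized identifiers. It is an illustration only and does not represent Datavant's or the Foundation's actual tokenization method. The function names `make_token` and `link_records`, the choice of identifier fields, the secret key, and the sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret shared by the parties creating tokens (assumption).
SECRET_KEY = b"demo-key-not-for-production"


def make_token(first_name: str, last_name: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Derive a de-identified token from normalized identifiers via HMAC-SHA256."""
    # Normalize so that trivial formatting differences yield the same token.
    normalized = "|".join(part.strip().lower() for part in (first_name, last_name, dob))
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(registry_rows, claims_rows):
    """Join two already-tokenized datasets on their token, without names or dates of birth."""
    claims_by_token = {}
    for row in claims_rows:
        claims_by_token.setdefault(row["token"], []).append(row)
    linked = []
    for row in registry_rows:
        for claim in claims_by_token.get(row["token"], []):
            linked.append({**row, **claim})
    return linked


# Each data holder tokenizes its own records and drops the raw identifiers.
# All values below are made-up sample data.
registry = [{"token": make_token("Ana", "Lopez", "1980-02-01"), "diagnosis": "CD"}]
claims = [
    {"token": make_token(" ana ", "LOPEZ", "1980-02-01"), "claim_code": "A1"},
    {"token": make_token("Bo", "Chen", "1975-07-09"), "claim_code": "B2"},
]
linked = link_records(registry, claims)
print(linked)
```

In this sketch, only the token travels between datasets, so records for the same person can be matched while the names and dates of birth stay with each data holder. Real-world tokenization involves additional safeguards, such as key management and site-specific encryption, that are outside the scope of this example.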

Are there notable examples of research breakthroughs in which registry data played an important role? For instance, the Framingham Heart Study revolutionized our understanding of cardiovascular disease risk factors; have there been similar landmark studies in other areas of healthcare that were enabled by your registry program?

There are many similarities between registries and longitudinal cohort studies as they both observe patient populations over a long period of time to gain a deeper understanding of a disease within a population. For example, one of the Foundation’s research initiatives that contributes data to IBD Plexus, SPARC IBD (Study of a Prospective Adult Research Cohort with IBD), modeled itself after the Framingham Heart Study. SPARC IBD began enrolling and following adult IBD patients in 2016. We now not only have over 7 years of prospective data but, due to patients’ interest in linking study data to real-world data from medical records, we have decades of clinical information on many of the enrolled patients. This longitudinal data linked with molecular multimodal data is helping researchers answer some of the most pressing patient questions, including “which treatment will work best for me?”

While the available treatment options for IBD have expanded rapidly over the past 2 decades, there continue to be significant rates of primary non-response, secondary loss of response, and adverse reactions to drugs. For IBD, there are no known biomarkers to predict if someone will respond to a treatment. One of the biggest challenges in diagnostic development and identification of new drugs is the lack of high-quality, well-annotated biosamples matched with deep clinical data. The data and biosamples housed in IBD Plexus are making great strides to address this hurdle and develop a better treatment decision playbook for patients. Let me share two exciting examples.

Despite aminosalicylates (5-ASAs) being one of the most frequently prescribed therapies available for ulcerative colitis, treatment failure is common, and many patients would be better off initiating a different treatment. Researchers leveraged IBD Plexus data to validate a microbiome signature that could be used to predict which patients are likely to fail 5-ASA treatment. This research, which was published in Nature Medicine, advances the possibility of microbiome-based personalized medicine.

IBD Plexus is also reducing the barrier for scientists to enter the IBD field. For example, a neuroscientist funded through the Foundation’s IBD Ventures program accessed IBD Plexus data and biosamples to generate evidence that a well-established therapeutic target in neurological diseases, GCPII, is also implicated in gut inflammation. With funding from Blackbird Lab, the novel drug target is moving forward into clinical trials.

It’s great to see new technologies reducing barriers to scientific research. How does the Crohn’s & Colitis Foundation approach data and research collaborations with sponsors and/or other collaborators?

Scientific advances take time, but patient organizations are impatient. Powered by the Foundation’s urgency to find cures, we built IBD Plexus to dramatically accelerate scientific innovation by giving researchers expedited access to groundbreaking information, mined from electronic health records, clinician case-report forms, images, genetic and molecular data, and patient experiences.

As the hub of IBD, we have the unique ability to connect patients, researchers, clinicians across academic and community centers, and technology partners to redefine how the research community studies IBD. This data depth and scientific know-how fuel research excellence. Today, over 200 projects are being conducted by world-leading scientists from academic medical centers, pharmaceutical companies, and many of the most promising biotechnology companies with the same goal: to improve the lives of patients living with IBD.

The Foundation has also leveraged IBD Plexus data and biosamples to address unmet needs such as the development of a diagnostic tool to predict pediatric patients at risk of developing disease complications and development of a disease progression model for adult patients.

IBD Plexus is the first real-world evidence solution to tackle problems holistically across the product lifecycle from discovery to clinical trials, through post-authorization safety studies to care guidelines and access. Our approach provides researchers with unparalleled data for a holistic understanding of IBD in a rigorous, reproducible way to get new discoveries into the clinic faster. By uniting patients and their data with best-in-class technology partners and the IBD research community, IBD Plexus has not only become an integral resource for research but has ignited a continuous cycle of innovation to advance life-changing precision medicine strategies and facilitate the development of safer and more effective treatments.

How does the Foundation's registry data contribute to research efforts in understanding and treating IBD?

One area where the Foundation made a large investment is disease progression in Crohn’s disease (CD), or, simply put, disease that continues to irreversibly worsen despite available therapies. Today, all CD treatments on the market are approved and deemed safe and effective for moderately to severely active disease. The FDA evaluates the effectiveness of these treatments based on outcome measures focused on short-term (8-12 weeks) and intermediate-term (1 year) measures of symptoms and endoscopic findings. While these are important measures to evaluate a treatment’s effect on CD symptoms and on underlying mucosal inflammation, no correlation has been established between 12-month remission rates and the likelihood of disease progression. While disease progression is not a new concept, lack of access to a longitudinal cohort of people with over ten years of data has stalled work.

The IBD Plexus program is driving a paradigm shift by leveraging longitudinal, multimodal data to validate a definition and determine factors that may predict the aggressive course. The Foundation along with a multidisciplinary working group and leading RWE solution provider, Aetion, developed a disease progression score. The score can be used for downstream research such as drug target discovery and clinical trial enrichment. The disease progression score will be published and made available later this year.

That sounds promising! Looking forward to learning more once it’s published. Speaking of clinical trials, do you partner directly with biopharma organizations to support clinical research & development?

An exciting aspect of IBD Plexus is how we are inspiring the research community to rethink how patients are matched with clinical trial opportunities using a real-world data approach. A few examples: a pharmaceutical company implemented a biomarker strategy in its phase 2 clinical trial. It leveraged IBD Plexus data and biosamples to better understand the prevalence of the biomarker in a real-world population and then make a go/no-go decision on including the biomarker in its phase 3 trial. Industry members also took advantage of our clinical trial site identification offering, using our data querying and site analytics capabilities to identify sites for clinical trials. Our newest offering in the clinical trial space is post-authorization safety studies to address regulatory needs.

The constraints and challenges with registry data

What are the biggest challenges when it comes to building a robust registry suitable for research? How has the Crohn's & Colitis Foundation overcome some of these challenges?

Many registries want to collect and generate a lot of data to answer not only today's questions but tomorrow's, and to do so in a way that keeps the data manageable and trusted. Here’s how IBD Plexus approaches amassing a large dataset that is research-ready, fit-for-purpose, trustworthy, and reproducible.

The heroic data collection and generation effort is not possible without our network of patients and clinicians. It is of utmost importance that the research generated from IBD Plexus is representative of the entire patient population. We know that patients have different needs, priorities, and capacities to engage in research. We built different front doors to ease participation and make research fit as seamlessly as possible into everyday life by providing opportunities to participate in research at home and at time of care such as during an office visit and at key medical events such as endoscopy. Making it easier for patients and clinicians to participate in research is key to collecting longitudinal, comprehensive, multimodal data.

To ease the experience of participating clinicians, we built a Smart Form into the electronic medical record and only ask to collect information that is absent from the medical record, such as disease activity. For everything else, we pull information from real-world data like electronic health records, including medical notes. Some sites have even qualified the Smart Form to be used to generate clinical notes! It’s been great to see how our strategy reinforces how complementary traditional real-world data sets are with prospective research programs to collect well-characterized information that reflects the lived patient experience. Fostering relationships with our patients and clinicians is of utmost importance, and we are working to continue to improve our data collection process and lower the burden on participants. For example, we’re exploring the use of advanced large language model tools to extract critical information from clinicians' notes in a smart and semantic manner to ease use of data for research.

Sustainability is always top of mind for registries. Funding infrastructure is not sexy and hence infrastructure investments are hard to get. We believe our approach is the most cost-effective approach to building registries.

In addition to collecting data, we also collect finite biosamples. Striking a good balance between generating data from high-impact assays and “saving” biosamples in freezers until the next big assay comes around is important. (Leaving samples sitting in freezers long-term is not patient-centric!) One of the critical success factors of IBD Plexus was to optimize the use of samples across the research community, turning finite biosamples into a reusable asset by converting samples to data.

During the design of IBD Plexus, it was critical to figure out how to not only optimize the use of biosamples but also ensure researchers trust that the derived biosample data is of the highest value and is reusable. Our stakeholders told us that selecting a central biobank was crucial for them to trust that the derived biosample data was high quality. The central biobank allows for standardized sample processing and provides the necessary sophisticated laboratory information management technology to link the biosample-related data with the derived molecular data to advance science. In addition, the Foundation selected best-in-class clinical and molecular labs that specialize in different areas to ensure the different lab and molecular analyses are conducted in a standardized way. A crucial component to being able to easily share data derived from biosamples is creating uniform datasets by ensuring the same assays and protocols are used.

Real-world, multimodal data is complex. It was important for the Foundation to offer data science services and support. Our data science team works symbiotically with researchers and acts as data navigators to help researchers understand disease and data collection context and maximize usage of data for research. In addition, we have created data products to help researchers piece together the patient journey across clinical, patient-reported, molecular, and imaging data, as well as data scripts to help with key analyses such as identification of accurate medication from EHR and patient-reported data. Data without context is noise. Putting a large focus on the growth of our data science core has accelerated researchers' understanding of the data to better harness the power of IBD Plexus.

Maximizing data utility while maintaining privacy is another key challenge for any health data type. How does the Crohn's & Colitis Foundation approach the balance between patient privacy and data accessibility for researchers?

Science is rooted in data. If the data used to generate evidence does not reflect a person’s biology, environment, and preferences, then medical innovations will not reach their full potential. The only way to ensure “you” are reflected in the treatment label and clinical guideline is to contribute data to research. Today, many patients’ digital health footprints are being built passively. It was important for the Foundation to give patients a voice and the opportunity to actively participate in data generation and sharing.

To maintain a balance between privacy and advancing life-changing precision medicine strategies, we have built out a robust information governance strategy to safeguard IBD data and patient interests that drive successful outcomes. Our governance includes:

  1. being transparent in how we use and share data – patient consent and authorization drive everything we do
  2. developing IRB-approved protocols to collect and ingest patient data both actively and passively
  3. safeguarding data by implementing and maintaining a comprehensive data security policy, including cybersecurity, in a regulatory-compliant manner
  4. establishing data integrity principles to ensure data consistency and accuracy throughout its life cycle (ingestion, storage, and transit) to promote interoperability and reproducibility
  5. requiring Foundation approval for use of de-identified data in downstream research, including an executed data use agreement that specifies exactly how the data can and cannot be used

Future opportunities/visions for registry data

In the years ahead, what trends or advancements do you hope to see with patient registries?

We hope to see patient registries continue to evolve and improve so that they’re able to have the biggest impact possible on patients’ lives. This means registries that are:

  1. Co-created with patients. De-identified patient-level data and biosamples are critical to advancing research innovations, but we can only achieve a critical mass by co-creating with patients.
  2. Longitudinal and multimodal. Understanding the lived patient experience requires access to long-term, prospective, and comprehensive data. It’s important to meet patients where they are in that journey (past, present, and future) including at pediatric and adult clinics, hospitals, diagnostic and research labs and the home.
  3. Pragmatic. Closing the gap between research and care can only be achieved by putting the lived patient experience first. Registries are central to a future where medicine is personalized, evidence-based, comprehensive, and accessible.
  4. Promoting the use of common data elements (CDEs). Registries need to elevate their role in the development of CDEs, including common outcomes definitions. CDEs not only enable standardized and consistent use of data to help address research reproducibility issues but also help close the research and care gap. For example, different instruments are used to measure treatment value in clinical trials, care guidelines, and formulary decisions. Registries should play a larger role in helping to align patient value across the R&D, care, and access continuum. By the way, it was great to see Congress recognize the increasing importance of CDEs!
  5. Ecosystem-driven. To achieve data depth, breadth, and reach, it is important that registries evolve towards an ecosystem model. Ecosystems allow registries to achieve more, faster, and to address unmet needs of the patient community that are too big or complex to be solved by one company alone.
  6. Solving for real-world data complexities. Next-generation registries will provide scientists with not the largest dataset but the right dataset, in a format that is analysis-ready and curated based on the disease type, disease state, patient characteristics, and research need (e.g., the type of biomarker desired).

Are there any technologies that get you excited about the future of patient registries and the research they enable?

With the convergence of multimodal data, emerging biotechnologies, and artificial intelligence, the Foundation’s quest to find cures is within reach. It’s this convergence that has us excited for the future. Here are a few use cases that are top of mind:

  • Use of AI to extract complex and multimodal data to build a more comprehensive longitudinal patient journey that not only includes clinical and patient-reported data but can handle discrete and continuous molecular data
  • Use of AI to interpret medical images and videos beyond human capability
  • AI-facilitated target discovery and treatment modeling
  • Lastly, the use of AI workflows and cloud computing to enhance virtual research workspaces to drive increased collaboration and productivity at scale

As the Foundation looks to partner with companies to capitalize on innovative technologies to enhance IBD Plexus core offerings and create complementary IBD Plexus products and services, these use cases are top of mind.

Thanks so much for sharing your insights and vision on registry data! Any recommendations for our readers if they want to learn more?

Absolutely. Here is a great guide to learn about registries in general. Additionally, this FDA guide offers some advice for anyone looking to set up a new registry or use an existing one to help make regulatory decisions regarding a drug’s effectiveness or safety.

Visit our website to learn more about IBD Plexus.

For more information, visit, call 888-694-8872, or email

