The All of Us Research Program’s vastly diverse and harmonized datasets set a new standard for precision health research. The pairing of EHR and genomics data is increasingly seen in other research programs. An American Heart Association (AHA) advisory discussed the promise of combining EHR data with biobanks in cardiovascular research. Similar efforts across diabetes, cancer, and other fields have indicated a growing trend.
And that trend has taken off for obvious reasons. When diverse populations are properly studied, the resulting insights can promote more inclusive research and clinical trials by showing who is predisposed to which risks. Researchers can also combine EHR and ‘omics data to create an even bigger universe of precisely matched longitudinal data.
Researchers can now create a fuller picture of real people. They can couple data from their medical history via EHR with everyday life via wearables and surveys, and with the building blocks of their very biology via genomics data. In turn, each dataset informs and complements the others: You not only understand who might need treatment, for example, but who might benefit from one treatment over another.
Still, challenges persist in the areas of consent, data quality, interoperability, and privacy. As researchers look to tap into this growing ecosystem of clinical plus genomic insights, they’ll need to address those concerns ahead of time.
If data is comprehensive, consent will be too. Researchers looking to use EHR, genomics, wearables, or other data may need to address those individual sources with participants in any instance where consent is required.
But informed consent requires that studies balance IRB requirements with the needs of research participants, who may not understand highly technical and scientific explanations provided in traditional consent materials. Confusion can prevent an individual from giving consent, especially when the consent process is lengthy, occurring across multiple data inputs. eConsent platforms have helped redefine what’s possible by making informed consent more comprehensive without making it more burdensome.
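To make the idea of consent spanning multiple data inputs concrete, here is a minimal sketch of tracking consent per data source for one participant. The source labels, the `ConsentRecord` class, and its methods are hypothetical names chosen for illustration; they do not describe any particular eConsent platform.

```python
from dataclasses import dataclass, field

# Hypothetical data-source labels; a real study would define its own.
DATA_SOURCES = ("ehr", "genomics", "wearables", "surveys")


@dataclass
class ConsentRecord:
    """Tracks which data sources one participant has agreed to share."""
    participant_id: str
    granted: set = field(default_factory=set)

    def grant(self, source: str) -> None:
        if source not in DATA_SOURCES:
            raise ValueError(f"unknown data source: {source}")
        self.granted.add(source)

    def revoke(self, source: str) -> None:
        self.granted.discard(source)

    def allows(self, source: str) -> bool:
        return source in self.granted

    def missing(self) -> list:
        """Sources the participant has not consented to, in a stable order."""
        return [s for s in DATA_SOURCES if s not in self.granted]


record = ConsentRecord("P-001")
record.grant("ehr")
record.grant("genomics")
print(record.allows("wearables"))  # False
print(record.missing())            # ['wearables', 'surveys']
```

Keeping consent at the level of each source, rather than as a single yes or no, lets a study collect only what a participant has actually agreed to share.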
Developing patient-centric consent can help cut through the complexity and build trust, which should be at the center of the consent process. One example is using video to inform prospective participants about the study more easily. In addition to video, there are several other best practices to make the consent process easier for patients.
Depending on the type of research your program will perform, you may need a thorough understanding of participant rights and of the regulations that govern the EHR data you are collecting. Regulations like the Food and Drug Administration’s 21 CFR Part 11 (Part 11 of Title 21 of the Code of Federal Regulations) govern electronic data collection and demand a high degree of rigor in data management and security.
While informed consent is considered a federal requirement, state laws also influence the process. Understanding the compliance requirements for collecting EHR and genomic data is essential, particularly for research programs that span multi-state health systems or that have restrictive inclusion criteria, such as rare disease studies.
The nuances for each state are complicated, and understanding the individual requirements for each state creates a heavy burden for research teams. California, for example, has a notoriously high degree of protections in place for privacy and health data.
Don’t risk running afoul of government regulations. Ensure that your model and data collection tools are fully compliant with state and federal policy.
“Consumer-mediated data exchange” has become an issue. As a paper in JMIR Medical Informatics notes, asking patients to share their medical data with researchers can help bypass the many challenges of getting more expansive longitudinal data.
And yet, accessing data post-consent is not without its own challenges, particularly for EHR data exchanges routed through a health provider system. IT departments at health systems with a lack of integration experience or time for projects may hold up access to EHR data due to data privacy concerns, or fears of penalties.
Luckily, technology partners like Vibrent can work directly with Cerner and Epic to adhere to federally mandated EHR interoperability guidelines and provide access in accordance with the law.
EHR data, as useful as it is, can be imperfect. Charts can miss details—not just unstructured data like clinical notes but the more important stuff, too: fewer than 10% of patients’ EHRs even list the ICD code for many conditions. Patients can always discontinue or transfer care elsewhere, too—creating further gaps in their records.
Because this data can’t be counted on to be complete or correct, you can’t make assumptions about its quality. Doing so risks queries that exclude participants or even a whole population of them. For example, some people in ethnic, geographic, or other underrepresented groups may have a lower rate of completion or availability of clinical data. Implicit provider bias has also been shown to affect how patients in underrepresented groups are described in their medical charts.
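One way to avoid assuming quality is to measure completeness per subgroup before writing cohort queries. The sketch below is illustrative only: the records are made-up sample data, and the field names (`group`, `icd_code`, `hba1c`) and the `completeness_by_group` helper are assumptions, not part of any specific EHR export.

```python
from collections import defaultdict

# Made-up sample records; None marks a value missing from the chart.
records = [
    {"group": "A", "icd_code": "E11.9", "hba1c": 7.2},
    {"group": "A", "icd_code": "I10", "hba1c": 6.1},
    {"group": "A", "icd_code": None, "hba1c": 5.9},
    {"group": "B", "icd_code": None, "hba1c": None},
    {"group": "B", "icd_code": "E11.9", "hba1c": None},
]


def completeness_by_group(rows, fields, group_key="group"):
    """Return {group: {field: fraction of rows where the field is present}}."""
    totals = defaultdict(int)
    present = defaultdict(lambda: defaultdict(int))
    for row in rows:
        g = row[group_key]
        totals[g] += 1
        for f in fields:
            if row.get(f) is not None:
                present[g][f] += 1
    return {g: {f: present[g][f] / totals[g] for f in fields} for g in totals}


report = completeness_by_group(records, ["icd_code", "hba1c"])
for group, rates in sorted(report.items()):
    print(group, rates)
```

In this sample, a query that requires a recorded `hba1c` value would silently drop everyone in group B, which is the kind of exclusion described above.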
Quality is important. So important, in fact, that the NIH Health Care Systems Research Collaboratory requires that data validation take place to ensure quality.
But it’s a problem that doesn’t have a neatly packaged solution. EHRs are, and at least for the foreseeable future will be, imperfect things. You and your teams are strongly encouraged to understand the limitations of clinical data in research and to refer to the Collaboratory’s handbook on assessing data quality before relying on EHR data outright.
EHRs weren’t built for research. They’re organized to serve an in-clinic setting to inform patient care, but not in a way most conducive to investigators’ needs.
Add to that the inconsistencies in EHR quality across systems, and it’s clear why data harmonization has proven difficult. Roughly 55 percent of surveyed projects from the NIH’s Collaboratory have reported having to integrate data from multiple systems. To make genomics and clinical data more interoperable, researchers need easier ways to synthesize data from different sources for downstream analysis.
This raises many questions: How do you store combined data? How do you access it? How do you make EHR data more purpose-built for research? How do you combine datasets when they have different structures and limitations?
Well, normalize. Data normalization makes incompatible data more compatible through standardization, infrastructural support, and rules. It also paves the way for more efficiencies in data science, such as automated search queries, a Collaboratory-supported idea.
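As a rough illustration of what normalization rules can look like, the sketch below maps records from two sources with different field names and units onto one common schema. The site names, field names, and the `FIELD_MAP` data dictionary are all hypothetical; real projects would typically map to an established standard rather than an ad hoc schema like this one.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound

# Hypothetical data dictionary: source field -> (common field, converter).
FIELD_MAP = {
    "site_a": {
        "pt_id": ("participant_id", str),
        "wt_lb": ("weight_kg", lambda v: round(float(v) * LB_TO_KG, 1)),
        "sex": ("sex", lambda v: {"M": "male", "F": "female"}.get(v, "unknown")),
    },
    "site_b": {
        "patientId": ("participant_id", str),
        "weightKg": ("weight_kg", lambda v: round(float(v), 1)),
        "gender": ("sex", lambda v: str(v).lower()),
    },
}


def normalize(source, raw):
    """Map one raw record from `source` onto the common schema."""
    mapping = FIELD_MAP[source]
    out = {"source": source}
    for raw_field, value in raw.items():
        if raw_field in mapping:  # fields without a rule are dropped
            name, convert = mapping[raw_field]
            out[name] = convert(value)
    return out


a = normalize("site_a", {"pt_id": 17, "wt_lb": "165", "sex": "F"})
b = normalize("site_b", {"patientId": "B-9", "weightKg": 74.84, "gender": "Female"})
print(a)
print(b)
```

Both records end up with the same keys, and both weights come out as 74.8 kg, so a single query can run across the two sources without knowing how each one stored its data.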
Multiple initiatives exist to tackle this problem industry-wide, but on a user level, it’s best to partner up: Vendors like Vibrent help with out-of-the-box harmonization with major EHRs as well as data dictionary mapping to integrate dispersed datasets according to HL7 FHIR standards.
Combining EHR with genomic data means that researchers can quickly tap into a larger, more thorough universe of actionable insights. But as data gets more complex, challenges can hold back research. Researchers should consider consent best practices, post-consent collection, and data harmonization as they wrestle with these modern-day barriers of heterogeneous data.
Technology can help. Research teams, including the NIH, are seeing success by using digital platforms that simplify these processes, engage participants, and overcome logistical concerns.
In combination, EHRs and genomics can do great things. As an industry, we need technological infrastructure and capable partners that bring out the best of both.
Ready to see how this would work for your study? Request a pilot to learn more.