As the National Institutes of Health (NIH) continues to release new funding opportunities in support of COVID-19 research, the Indiana Clinical and Translational Sciences Institute (CTSI) wants to make you aware of three data resources available to you for conducting your critical COVID-19-related research projects.
COVID-19 Research Data Commons (CoRDaCo)
The COVID-19 Research Data Commons (CoRDaCo) was established by a partnership between Indiana University and the Regenstrief Institute. Through CoRDaCo, Regenstrief and IU’s University Information Technology Services’ (UITS) Scalable Compute Archive (RT-SCA) use curated datasets of COVID-19 patient data to generate synthetic medical data. Synthetic data reflects the characteristics of real patient data but does not include real patient information. Because it is statistically similar to the real data, it can be used in the same way without compromising privacy. This allows for quicker access to the information.
The original patient data comes from the Indiana Network of Patient Care (INPC), which is managed by the Indiana Health Information Exchange. The INPC is one of the largest health information exchanges in the nation and contains records from health systems across the state. CoRDaCo currently has data on more than 424,000 COVID-19-positive patients and granular hospitalization data on more than 6,000 patients.
To access this data, please complete the following request form and indicate “CoRDaCo” when asked how to define your population.
National COVID Cohort Collaborative (N3C)
The National COVID Cohort Collaborative (N3C) is a partnership among the National Center for Advancing Translational Sciences (NCATS)-supported Clinical and Translational Science Awards (CTSA) Program hubs and the National Center for Data to Health (CD2H). The N3C systematically collects data derived from the electronic health records of people who were tested for the novel coronavirus or who had related symptoms. The data are harmonized and managed in a way that maintains their validity while protecting patient privacy.
The N3C Data Enclave represents one of the largest, most secure clinical data resources for accelerating research on COVID-19. It also includes a powerful analytics platform and tool set for online discovery, visualization and collaboration. The data set and analytics capabilities will grow over time, with over 200 billion rows of patient data already available. Learn more about accessing N3C data here.
N3C is also a collaborative that provides opportunities for researchers to contribute and network with others working in informatics and data science. Those interested in collaborating should contact Andrew Neumann, Project Manager, at neumaand@oregonstate.edu.
For those interested in learning more, N3C is hosting orientation sessions beginning on February 2, 2021. N3C also offers weekly office hours for general questions every Tuesday: https://uw-phi.zoom.us/meeting/register/tJ0tfu2rqzMsEt3lffVPIs2cXWlb0s4-3e-I
Participating sites like ours can also submit a question at any time through the ticketing system at: https://covid.cd2h.org/support.
N3C Orientation Session A
Starting 2/2: https://uw-phi.zoom.us/meeting/register/tJAtdOCqqzwrE9eYbehVnqL3WcRbaX2SLQq9
• Provides a general overview of N3C, including goals, organization, and community resources.
• Introduces the 3 data tiers available and important considerations for research driven by the data harmonization process.
• Discusses resources for training and help, as well as the Data Use Request (DUR) process required for researcher access.
N3C Orientation Session B
Starting 2/9: https://uw-phi.zoom.us/meeting/register/tJMocO2gpjguE9fT5wgpl-eEINtKBvw_3MXa
• Focuses on technical aspects of working with N3C data in the secure Enclave, including OMOP concept sets and N3C-specific tooling such as the Concept Set Browser.
• Introduces commonly used analysis tools, such as Contour and Code Workbooks, and the corresponding workflows for simple analyses.
• Introduces the Knowledge Store, a mechanism for sharing and using community-developed code and data across projects.
Indiana Biobank
The Indiana Biobank, a program of the Indiana Clinical and Translational Sciences Institute (CTSI), has data on more than 100 COVID-19 positive subjects and more than 200 COVID-19 positive blood samples.
Specifically, the Indiana CTSI has supported a uniform set of omics analyses on the COVID-19-positive patients, including DNA whole exome sequencing (WES), RNA transcriptomics, chemokine/cytokine multiplex analyses, and PBMC analyses (cell type, activation status, and function). These preliminary data are useful for extramural applications and are available to researchers through the Indiana CTSI’s Project Development Team (PDT) approval process. The application is available at this link.
The Indiana Biobank also has COVID-negative WES and genome-wide association studies available on more than 6,000 subjects. Access to these data can be requested by emailing Brooke Patz.