Value and Vision
Computational technologies and tools vital to clinical and translational research are sometimes developed, deployed, and managed independently, which can render these processes tedious, costly, heterogeneous, and less secure. The Tools & Cloud Infrastructure Core aims to establish a common tool and cloud computing architecture to provide CTSA hubs with an affordable, easy-to-use, scalable deployment paradigm that can remove boundaries and help translational researchers promote and deploy their own tools as well as adopt others.
Much has been written in the contemporary scientific literature and general media concerning the promise of leveraging advanced computational technologies and methods to enable new paradigms for clinical and translational research. Ultimately, this research can and should generate health benefits at both the patient and population levels, informed by the knowledge generated and disseminated via these efforts. We believe these types of emergent clinical and translational research paradigms can and should be predicated on the collection, analysis, and dissemination of relevant, timely, and comprehensive data and knowledge by a variety of end-users in a highly liquid and democratic manner.
The pursuit of clinical and translational research at a national level represents an exciting inflection point in the history of health and life sciences. Capitalizing on this opportunity requires the democratization and widespread use of computational technologies by a broad spectrum of researchers with varying degrees of technical capability and training. To that end, we must:
- Enable effective end-user adoption and utilization of computational platforms and tools in a variety of settings
- Ensure technology deployment and user experience are compatible with “real world” workflows and environments
- Overcome limitations in vendor-specific technologies that make it difficult to leverage systems for integrating and interacting with diverse and complex data types across traditional organizational boundaries
- Ensure such platforms are elastic, scalable, and sustainable from both a technology and resource perspective
Community Core Objectives
- Create a common cloud computing architecture that can enable the rapid deployment and sharing of reusable software components by CTSA hubs
- Demonstrate the use of shared tools and platforms for the collaborative analysis of clinical data in a manner that transcends individual CTSA hub “boundaries”
- Disseminate a common set of tools that can be employed for both local and collaborative querying of common data warehousing platforms and underlying data models
- Pilot the “cloudification” of software artifacts that can be shared across CTSA hubs to address common and recurring information needs
Presentations and Other Materials
- iEC Leads Meeting Update: CD2H Show & Tell: Feb 5, 2021
- All Hands Meeting: Nov 11, 2019
- iDTF Lead Meeting Update: CD2H Show and Tell: Nov 1, 2019
- EAB slides: August 1, 2019
- All Hands Report Outs: June 14, 2019
- Informatics Maturity and Best Practices Project Reports: June 2019
Tools & Cloud Infrastructure Core community meetings have been repurposed to meet the needs of the N3C Collaborative Analytics workstream. See the N3C website for more information.
A sandbox project designed to create a best practices platform for deploying and evaluating clinical machine learning tools and algorithms. Goals include provisioning community-vetted solutions to common clinical machine learning challenges, including data preparation, analysis of bias sources, and evaluation/validation of algorithms.
A continuation of Phase II collaborative work with the Informatics Enterprise Committee (iEC) working group, this project aims to deploy a suite of natural language processing (NLP) tools and to develop evaluation measures, evaluation tools, and best practices.
A sandbox project designed to develop, evaluate, and share tools and methods for data quality assessment. This sandbox project will include a pilot that leverages the Accrual to Clinical Trials (ACT) Network data to understand the quantity and completeness of ACT data and differences in coding practices across institutions.
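As a rough illustration of what a completeness check across institutions might look like, the sketch below computes, for each site, the fraction of records with a non-empty value in each field. The function name, field names, and toy records are all hypothetical and are not drawn from the ACT Network or from this project's actual tooling.

```python
from collections import defaultdict


def completeness_by_site(records, fields):
    """Return {site: {field: fraction of records with a non-empty value}}.

    Hypothetical helper for illustration only; None and "" count as missing.
    """
    present = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        site = rec["site"]
        totals[site] += 1
        for name in fields:
            if rec.get(name) not in (None, ""):
                present[site][name] += 1
    return {
        site: {name: present[site][name] / totals[site] for name in fields}
        for site in totals
    }


# Made-up toy records; real data would come from each site's warehouse.
records = [
    {"site": "A", "birth_year": 1980, "dx_code": "E11.9"},
    {"site": "A", "birth_year": None, "dx_code": "E11.9"},
    {"site": "B", "birth_year": 1975, "dx_code": ""},
    {"site": "B", "birth_year": 1990, "dx_code": "250.00"},
]

result = completeness_by_site(records, ["birth_year", "dx_code"])
for site, scores in sorted(result.items()):
    print(site, scores)
```

Comparing such per-site fractions side by side is one simple way to surface differences in completeness; differences in coding practices (for example, which code system a site uses) would need a separate check.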
This project is based on a pilot with the FDA and will create a cloud-based data use agreement toolkit to support the entry of de-identified EHR data from partner institutions into the sandboxes. As a demonstration, the project will leverage a preconfigured FHIR repository maintained on the CD2H/NCATS cloud or behind the partner institution’s firewall. The team will work with the community to write governance documents, SOPs, and policies for CTSA informatics community collaboration. A pan-sandbox governance group will include CD2H and community representatives who contribute subject matter expertise for specific domains.
The Tool Registry is a centralized, curated library of software resources developed by and for the NCATS research community. Records will combine descriptive metadata about a piece of software’s origin and purpose with semantic context to enable discovery and reuse. Application prototypes will be created for potential use cases, including Natural Language Processing tools, EHR DREAM Challenge models, and National COVID Cohort Collaborative (N3C) workflows. Research and design outputs that predate the N3C work will be incorporated: existing tool registry solutions, existing tool ontologies and other relevant ontologies, and software quality models.
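To make the idea of a registry record more concrete, here is a minimal sketch of a record that pairs descriptive metadata with semantic tags, plus a simple lookup by tag. The class, its field names, the example entries, and the tag strings are assumptions made for illustration; they do not reflect the Tool Registry's actual schema or ontologies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolRecord:
    """Hypothetical registry record: descriptive metadata plus semantic tags."""
    name: str
    origin: str    # who developed the tool
    purpose: str   # short free-text description
    ontology_terms: List[str] = field(default_factory=list)


def find_by_term(records, term):
    """Return the names of records tagged with the given term (case-insensitive)."""
    wanted = term.lower()
    return [
        rec.name
        for rec in records
        if wanted in (t.lower() for t in rec.ontology_terms)
    ]


# Made-up entries for illustration only.
registry = [
    ToolRecord("example-nlp-tool", "Hub A", "Extracts concepts from clinical notes",
               ["NLP", "clinical text"]),
    ToolRecord("example-cohort-tool", "Hub B", "Builds patient cohorts from an EDW",
               ["cohort discovery"]),
]

matches = find_by_term(registry, "nlp")
print(matches)
```

A production registry would more likely link records to identifiers in a shared ontology rather than to free-text tags, which is what would allow discovery across differently worded descriptions.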
A continuation of a Phase II project, Competitions provides a platform for robust peer review across CTSAs with a cloud-based, single sign-on software tool for investigators, reviewers, and administrators.
Competitions is an open source tool for running NIH-style peer review of competitions, pilot projects, and research proposals on a cloud-based, consortium-wide, single sign-on platform.
This project created an open source clinical Enterprise Data Warehouse (EDW) Data Browser to enable querying by data dictionaries, or ontologies, and allow for access to both de-identified and identifiable patient data in a compliant manner.