CDKP transitions to new HuGeAMP framework

The Cerebrovascular Disease Knowledge Portal (CDKP; cerebrovascularportal.org) has now transitioned to a new data and software platform, the Human Genetics Amplifier (HuGeAMP). The platform has three main components: a semi-automated pipeline (LoamStream) that performs quality control and association analysis of individual-level data and efficiently ingests summary-statistic datasets into cloud-based storage; a system (the Aggregator) that applies bioinformatic methods to genetic and genomic datasets; and a tool (the BioIndex) that indexes all results in order to serve them rapidly to the Knowledge Portals. The HuGeAMP platform allows us to aggregate a much larger volume of data and results than before, to analyze them with many more methods, and to add new methods quickly, while maintaining an intuitive and responsive user interface.

The updated CDKP includes new genetic association datasets relative to the previous version. Most genetic associations displayed in the CDKP are the results of meta-analysis across all available datasets for each phenotype, performed with the METAL software developed at the University of Michigan, to generate "bottom-line" p-values. As a result, researchers see a single p-value for each variant-phenotype combination rather than associations from multiple discrete datasets, which simplifies and streamlines the information displayed. However, since it is often useful to view results from individual datasets, the CDKP continues to summarize dataset-specific associations on "Dataset Inspector" pages (see an example) and also displays these associations on Variant pages (see an example; click the blue "Datasets" button in the table under the PheWAS plot).
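
To illustrate how several dataset-level results can collapse into one "bottom-line" p-value, here is a minimal sketch of a fixed-effects, inverse-variance-weighted meta-analysis for a single variant-phenotype pair. Inverse-variance weighting is one of the standard schemes that METAL offers; this post does not state which scheme or settings the CDKP pipeline uses, so treat the function name, the input layout, and the choice of scheme as assumptions for illustration only, not as the portal's actual implementation.

```python
import math


def inverse_variance_meta(estimates):
    """Combine per-dataset effect estimates for one variant-phenotype pair.

    `estimates` is a list of (beta, standard_error) tuples, one per dataset.
    This layout is a hypothetical choice for illustration.
    Returns (pooled_beta, pooled_se, z_score, two_sided_p_value).
    """
    # Each dataset is weighted by the inverse of its variance, so more
    # precise estimates contribute more to the pooled effect.
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)

    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z_score = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, p_value


# Illustrative (made-up) inputs: two datasets reporting the same effect.
beta, se, z, p = inverse_variance_meta([(0.1, 0.05), (0.1, 0.05)])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

With two equally precise datasets that agree on the effect, the pooled standard error shrinks by a factor of the square root of two, so the combined p-value is smaller than either dataset's p-value alone.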

The updated CDKP also includes many more epigenomic and computational datasets than the previous version, visible on Region pages, Phenotype pages, and Variant pages. Epigenomic results are incorporated into the CDKP through collaboration with the Diabetes Epigenome Atlas, where the data are initially loaded and processed.

The new CDKP also offers interactive interfaces that were not in the previous version. The GEnomic region Miner (GEM), on Region pages (see an example), visualizes genetic associations and lets the user add custom tracks to view and filter credible sets and epigenomic annotations. The Gene Finder tool searches gene-level associations generated using the MAGMA method, allowing the user to find genes associated with multiple phenotypes.
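
The kind of query the Gene Finder answers can be sketched as a set intersection over gene-level p-values. This is only an illustration of the idea, not the tool's code: the function name, the nested-dictionary input, the placeholder gene and phenotype labels, and the default threshold of 2.5e-6 (a commonly used gene-level cutoff, roughly 0.05 divided by 20,000 genes) are all assumptions.

```python
def genes_shared_across_phenotypes(gene_pvalues, phenotypes, threshold=2.5e-6):
    """Return genes whose gene-level p-value passes `threshold` in every phenotype.

    `gene_pvalues` maps phenotype -> {gene: p_value}; this layout is hypothetical.
    """
    shared = None
    for phenotype in phenotypes:
        scores = gene_pvalues.get(phenotype, {})
        passing = {gene for gene, p in scores.items() if p <= threshold}
        # Keep only genes that have passed in every phenotype seen so far.
        shared = passing if shared is None else shared & passing
    return sorted(shared or [])


# Illustrative (made-up) gene-level results with placeholder labels.
example = {
    "phenotype_1": {"GENE_A": 1e-8, "GENE_B": 3e-7, "GENE_C": 0.2},
    "phenotype_2": {"GENE_A": 4e-9, "GENE_B": 0.01, "GENE_C": 1e-10},
}
print(genes_shared_across_phenotypes(example, ["phenotype_1", "phenotype_2"]))
```

Only GENE_A passes the cutoff in both placeholder phenotypes, so it is the only gene returned.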

The previous version of the CDKP will remain available temporarily. One tool, the Variant Finder, is not yet available in the new framework but may still be accessed in the previous version.

The new CDKP is embedded within the Common Metabolic Diseases Knowledge Portal (CMDKP), which offers access to all of the data in the CDKP plus additional results relevant to cardiovascular disease, type 1 and type 2 diabetes, and sleep disorders. Click the icon at the top center of a CDKP page to navigate to the analogous page in the CMDKP. The HuGeAMP software platform underlying the CMDKP was developed at the Broad Institute with support from the Accelerating Medicines Partnership in Type 2 Diabetes.

The CDKP is under continuous development, and we are interested in your feedback! Please contact us with suggestions or questions.
