README
### “CAD_GWAS_primary_discovery_meta.tsv” ###
Aragam KG, Jiang T, Goel A et al (2022). Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nature Genetics.
This file comprises the GWAS summary statistics for the primary discovery meta-analysis. Technically, these come from 3 separate meta-analyses, each combining de novo datasets with previously published CARDIoGRAMplusC4D summary statistics, based on (a) 1000Genomes imputed genome-wide genotypes, (b) CardioMetabochip, and (c) Exomechip. For each variant, only the results from the meta-analysis with the largest number of effective cases were kept. The summary statistics have been passed through a post-meta-analysis QC filter, which required a variant with p_value<1e-5 from the meta-analysis to have at least 2 studies with point estimates in the same direction of effect as the overall meta-analysis, and the 2nd largest p_value to be less than 0.2 (i.e. 2 studies have p<0.2). This filter removed some clearly erroneous variants driven by single studies.
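The post-meta-analysis QC filter described above could be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the function name, its arguments and the use of None for a missing study are hypothetical, and the p-value condition is read according to the parenthetical (at least 2 studies with p<0.2).

```python
from typing import Optional, Sequence


def passes_post_meta_qc(
    meta_beta: float,
    meta_p: float,
    study_betas: Sequence[Optional[float]],
    study_ps: Sequence[Optional[float]],
    p_trigger: float = 1e-5,
    p_study: float = 0.2,
    min_studies: int = 2,
) -> bool:
    """Return True if a variant would survive the QC filter (hypothetical helper).

    Assumption: the filter only applies to variants with meta_p < p_trigger;
    all other variants are kept unchanged.
    """
    if meta_p >= p_trigger:
        return True

    # Keep only studies that actually contributed an estimate for this variant.
    observed = [
        (b, p) for b, p in zip(study_betas, study_ps)
        if b is not None and p is not None
    ]

    # Studies whose point estimate has the same sign as the meta-analysis effect.
    same_direction = sum(1 for b, _ in observed if b * meta_beta > 0)

    # Studies with at least weak evidence (p < 0.2), per the parenthetical reading.
    nominal = sum(1 for _, p in observed if p < p_study)

    return same_direction >= min_studies and nominal >= min_studies
```

For example, a variant whose signal comes from a single study, with the remaining studies pointing the other way, would be dropped under this reading.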
Columns in the file include:
# MarkerName = a unique variant identifier comprising chromosome, base_pair_position, first allele, second allele. Positions are based on GRCh37.
# Allele1 = effect allele.
# Allele2 = non-effect allele.
# Freq1 = frequency of the effect allele across the studies (as estimated by METAL).
# FreqSE = the standard error of the frequency of the effect allele across the studies (as estimated by METAL).
# MinFreq = the minimum frequency of the effect allele across the studies (as estimated by METAL).
# MaxFreq = the maximum frequency of the effect allele across the studies (as estimated by METAL).
# Direction = the direction column from the METAL output, which includes the studies in the following order according to the three different meta-analyses (see column descriptor for 'Meta_analysis' below):
'Cardiogram' (n=11): UK Biobank, CARDIoGRAMplusC4D-1000G, EPIC-CVD, GerMIFs5, GerMIFs6, GerMIFs7, deCODE, Greek Coronary Disease cohort, HUNT, Mass General Brigham Biobank, TIMI
'Exome' (n=11): UK Biobank, CARDIoGRAMplusC4D-Exome, GerMIFs1, GerMIFS2, GerMIFs5, GerMIFs6, GerMIFs7, Greek Coronary Disease cohort, deCODE, Mass General Brigham Biobank, TIMI
'Metabo' (n=10): UK Biobank, CARDIoGRAMplusC4D-Metabo, GerMIFs4, GerMIFs5, GerMIFs6, GerMIFs7, Greek Coronary Disease cohort, HUNT, Mass General Brigham Biobank, TIMI
# HetISq = the estimated I2 value representing the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# HetChiSq = the estimated chi-squared ('Q') value representing the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# HetDf = the number of degrees of freedom (i.e. non-missing studies minus one) in the meta-analysis for each variant.
# HetPVal = the p-value for the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# Cases = the total number of coronary artery disease (CAD) cases included in the meta-analysis for each variant.
# Effective_Cases = the sum of the effective number of cases (calculated within each study as the variant-specific INFO score multiplied by the number of cases, with INFO score=1 for genotyped variants) across studies.
# N = the total sample size (CAD cases and controls) included in the meta-analysis for each variant.
# Meta_analysis = denotes whether the summary statistics for each variant are from the CARDIoGRAMplusC4D 1000Genomes imputed GWAS ('Cardiogram'), Cardiometabochip ('Metabo') or Exomechip ('Exome') meta-analysis (based on the maximum number of effective cases for variants that were available in more than one meta-analysis).
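Because the study order in the Direction column depends on the value of Meta_analysis, a small lookup can make the string easier to read. The sketch below uses the study orders listed above; the function names are hypothetical, and it assumes that "?" marks a study with no data for the variant (the usual METAL convention).

```python
from typing import Dict, List

# Study order per meta-analysis, copied from the Direction column description.
STUDY_ORDER: Dict[str, List[str]] = {
    "Cardiogram": [
        "UK Biobank", "CARDIoGRAMplusC4D-1000G", "EPIC-CVD", "GerMIFs5",
        "GerMIFs6", "GerMIFs7", "deCODE", "Greek Coronary Disease cohort",
        "HUNT", "Mass General Brigham Biobank", "TIMI",
    ],
    "Exome": [
        "UK Biobank", "CARDIoGRAMplusC4D-Exome", "GerMIFs1", "GerMIFS2",
        "GerMIFs5", "GerMIFs6", "GerMIFs7", "Greek Coronary Disease cohort",
        "deCODE", "Mass General Brigham Biobank", "TIMI",
    ],
    "Metabo": [
        "UK Biobank", "CARDIoGRAMplusC4D-Metabo", "GerMIFs4", "GerMIFs5",
        "GerMIFs6", "GerMIFs7", "Greek Coronary Disease cohort", "HUNT",
        "Mass General Brigham Biobank", "TIMI",
    ],
}


def decode_direction(direction: str, meta_analysis: str) -> Dict[str, str]:
    """Map each character of a Direction string to its study name."""
    studies = STUDY_ORDER[meta_analysis]
    if len(direction) != len(studies):
        raise ValueError(
            f"Direction has {len(direction)} characters, "
            f"expected {len(studies)} for {meta_analysis!r}"
        )
    return dict(zip(studies, direction))


def expected_het_df(direction: str) -> int:
    """Non-missing studies minus one, assuming '?' denotes a missing study."""
    return sum(1 for c in direction if c != "?") - 1
```

Comparing expected_het_df(Direction) with the HetDf column is one possible sanity check when loading the file.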
### “CAD_GWAS_BBJ_meta.tsv” ###
Aragam KG, Jiang T, Goel A et al (2022). Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nature Genetics.
This file comprises the GWAS summary statistics for the combined cross-ancestry meta-analysis that includes the primary discovery meta-analysis (predominantly comprising European ancestry participants) as well as the Biobank Japan study (East Asian participants).
Columns in the file include:
# MarkerName = a unique variant identifier comprising chromosome, base_pair_position, first allele, second allele. Positions are based on GRCh37.
# Allele1 = effect allele.
# Allele2 = non-effect allele.
# Freq1 = frequency of the effect allele across the studies (as estimated by METAL).
# FreqSE = the standard error of the frequency of the effect allele across the studies (as estimated by METAL).
# MinFreq = the minimum frequency of the effect allele across the studies (as estimated by METAL).
# MaxFreq = the maximum frequency of the effect allele across the studies (as estimated by METAL).
# Direction = the direction column from the METAL output, which includes the studies in the following order:
UK Biobank, CARDIoGRAMplusC4D-1000G, EPIC-CVD, GerMIFs5, GerMIFs6, GerMIFs7, deCODE, Greek Coronary Disease cohort, HUNT, Mass General Brigham Biobank, TIMI, Biobank Japan
# HetISq = the estimated I2 value representing the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# HetChiSq = the estimated chi-squared ('Q') value representing the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# HetDf = the number of degrees of freedom (i.e. non-missing studies minus one) in the meta-analysis for each variant.
# HetPVal = the p-value for the between-study heterogeneity in the meta-analysis for each variant (as estimated by METAL).
# Cases = the total number of coronary artery disease (CAD) cases included in the meta-analysis for each variant.
# Effective_Cases = the sum of the effective number of cases (calculated within each study as the variant-specific INFO score multiplied by the number of cases, with INFO score=1 for genotyped variants) across studies.
# N = the total sample size (CAD cases and controls) included in the meta-analysis for each variant.
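Two of the derived columns above can be illustrated with a short sketch. The first function follows the Effective_Cases definition (INFO score multiplied by the number of cases, summed across studies, with INFO=1 for genotyped variants). The second uses the standard relationship between Cochran's Q and I2, assuming HetISq is reported on a 0-100 scale. Both function names and the input layout are hypothetical, and the values in the file were produced by the authors' own pipeline and METAL, not by this code.

```python
from typing import Iterable, Optional, Tuple


def effective_cases(per_study: Iterable[Tuple[int, Optional[float]]]) -> float:
    """Sum INFO * n_cases over studies.

    Each item is (n_cases, info); info=None marks a genotyped variant,
    which is treated as INFO=1 as described in the column definition.
    """
    total = 0.0
    for n_cases, info in per_study:
        total += n_cases * (1.0 if info is None else info)
    return total


def i_squared_percent(het_chisq: float, het_df: int) -> float:
    """I2 = max(0, (Q - df) / Q) * 100, the standard definition."""
    if het_chisq <= 0 or het_df <= 0:
        return 0.0
    return max(0.0, 100.0 * (het_chisq - het_df) / het_chisq)
```

For instance, a variant imputed with INFO=0.9 in a study of 1000 cases and directly genotyped in a study of 500 cases would contribute 900 + 500 effective cases.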
### “CAD_GWAS_SEX_STRATIFIED.txt.gz” ###
Aragam KG, Jiang T, Goel A et al (2022). Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nature Genetics.
This file comprises the results of the sex-differentiated and sex-heterogeneity GWAS meta-analysis, implemented in GWAMA, which included 17 studies with sex-stratified GWAS results.
Columns in the file include:
#rs_number: Marker ID for the variants as chr:pos_a1_a2
#reference_allele: effect allele
#other_allele: non-effect allele
#eaf: frequency of the effect allele in the sex-combined meta-analysis
#beta: beta value for the sex-combined meta-analysis
#se: se value for the sex-combined meta-analysis
#beta_95L: Lower 95% CI for beta for the sex-combined meta-analysis
#beta_95U: Upper 95% CI for beta for the sex-combined meta-analysis
#z: z-score for the sex-combined meta-analysis
#p_value: p-value for the sex-combined meta-analysis
#log10_p_value: Absolute value of the base-10 logarithm of the sex-combined meta-analysis p-value (i.e. -log10(p))
#q_statistic: Cochran’s heterogeneity statistic
#q_p_value: Cochran’s heterogeneity statistic’s p-value
#i2: heterogeneity index I2
#n_studies: N studies for the sex-combined meta-analysis
#n_samples: N samples in the sex-combined meta-analysis
#effects: summary of effect directions (“+” positive effect of reference allele, “-” negative effect of reference allele, “0” no effect, “?” missing data). The order of the effects is: UKBIOBANK.MALES, UKBIOBANK.FEMALES, ARIC.FEMALE, ARIC.MALE, EPIC.MALE, EPIC.FEMALE, FGENTCARD.FEMALE, FGENTCARD.MALE, GerMIFSI.FEMALE, GerMIFSII.FEMALE, GerMIFSIII.FEMALE, GerMIFSIII.MALE, GerMIFSII.MALE, GerMIFSI.MALE, GerMIFSIV.FEMALE, GerMIFSIV.MALE, GerMIFSV.FEMALE, GerMIFSVI.FEMALE, GerMIFSVI.MALE, GerMIFSVII.FEMALE, GerMIFSVII.MALE, GerMIFSV.MALE, HPS.FEMALE, HPS.MALE, HUNT.FEMALE, HUNT.MALE, PROCARDIS.FEMALE, PROCARDIS.MALE, PARTNERS.MEGA.EUR.MALE, PARTNERS.MEGA.EUR.FEMALE, PARTNERS.MEG.EUR.MALE, PARTNERS.MEG.EUR.FEMALE, PARTNERS.MEGEX.EUR.MALE, PARTNERS.MEGEX.EUR.FEMALE
#male_eaf: effect allele frequency for the male-only meta-analysis
#male_beta: beta value for the male-only meta-analysis
#male_se: SE for the male-only meta-analysis
#male_beta_95L: Lower 95% CI for beta in the male-only meta-analysis
#male_beta_95U: Upper 95% CI for beta in the male-only meta-analysis
#male_z: z-score for the male-only meta-analysis
#male_p_value: p-value for the male-only meta-analysis
#male_n_studies: N studies for males
#male_n_samples: N for males
#female_eaf: effect allele frequency for the female-only meta-analysis
#female_beta: beta value for the female-only meta-analysis
#female_se: SE for the female-only meta-analysis
#female_beta_95L: Lower 95% CI for beta in the female-only meta-analysis
#female_beta_95U: Upper 95% CI for beta in the female-only meta-analysis
#female_z: z-score for the female-only meta-analysis
#female_p_value: p-value for the female-only meta-analysis
#female_n_studies: N studies for females
#female_n_samples: N for females
#gender_differentiated_p_value: combined p-value of males and females assuming different effect sizes between genders (2 degrees of freedom)
#gender_heterogeneity_p_value: heterogeneity between genders (1 degree of freedom)
#rsid_ukb: rsID from UK Biobank
#maf: Minor allele frequency for the sex-combined meta-analysis
#mac: Minor allele count for the sex-combined meta-analysis
#CHR: chromosome
#BP: genomic position on GRCh37
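As an illustration of how the sex-differentiated (2 degrees of freedom) and sex-heterogeneity (1 degree of freedom) p-values relate to the male-only and female-only columns, the sketch below recomputes them from male_beta, male_se, female_beta and female_se. It assumes fixed-effects inverse-variance weighting, under which the heterogeneity chi-squared equals the sex-differentiated chi-squared minus the sex-combined chi-squared. The function names are hypothetical, the 95% CI uses the usual normal approximation (beta +/- 1.96*se), and the output is not guaranteed to reproduce the GWAMA values in the file exactly.

```python
import math
from typing import Tuple


def sex_tests(
    male_beta: float, male_se: float, female_beta: float, female_se: float
) -> Tuple[float, float]:
    """Return (sex_differentiated_p, sex_heterogeneity_p) for one variant."""
    z_m = male_beta / male_se
    z_f = female_beta / female_se

    # Sex-differentiated test: sum of the sex-specific chi-squared statistics (2 df).
    chi2_diff = z_m ** 2 + z_f ** 2

    # Heterogeneity between the sexes (1 df), i.e. Cochran's Q for two groups.
    chi2_het = (male_beta - female_beta) ** 2 / (male_se ** 2 + female_se ** 2)

    # Closed-form chi-squared survival functions, so no SciPy is needed.
    p_diff = math.exp(-chi2_diff / 2.0)           # 2 degrees of freedom
    p_het = math.erfc(math.sqrt(chi2_het / 2.0))  # 1 degree of freedom
    return p_diff, p_het


def ci95(beta: float, se: float) -> Tuple[float, float]:
    """Normal-approximation 95% confidence interval for beta."""
    return beta - 1.96 * se, beta + 1.96 * se
```

With identical male and female effects the heterogeneity p-value is 1, while the sex-differentiated p-value still reflects the combined evidence from both sexes.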