Identify significant "lead SNPs" for a trait using the clumping algorithm of the popular PLINK software package. The output is a list of independent SNPs associated with the trait, together with the other SNPs in linkage disequilibrium with each of them.

#Download PLINK1.9 from https://www.cog-genomics.org/plink/1.9/
plink

#Download genotype information for calculating LD from https://cncr.nl/research/magma/
g1000_eur.{bed,bim,fam}

#Download bottom line results for a trait
T2D.sumstats.gz

#Columns (tab-delimited; positions must be on the same genome build as g1000_eur.bim)
zcat T2D.sumstats.gz | head
CHROM POS P N
6 33025792 0.5319 825254
6 31228974 0.3039 123309
17 37868463 0.9932 2630
12 128927440 0.8391 535350
5 143346236 0.6235 8319
1 43912702 0.3543 6658
2 84658819 0.4396 10392

#add rsIDs by matching CHROM:POS against the .bim file; variants without a match are dropped
zcat T2D.sumstats.gz | awk -F"\t" -v OFS="\t" '
    FNR == NR {m[$1":"$4] = $2; next}        # .bim: map CHROM:POS to rsID
    FNR == 1 {print "SNP", $0; next}         # sumstats header
    ($1":"$2) in m {print m[$1":"$2], $0}    # keep variants that have an rsID
' g1000_eur.bim - > T2D.annot.sumstats

#run clumping: index SNPs need P < 1e-5; SNPs with P < 1e-2 that are within 250 kb
#of an index SNP and have r2 > 0.1 with it are assigned to its clump
plink --bfile g1000_eur \
    --clump T2D.annot.sumstats \
    --clump-p1 1e-5 \
    --clump-p2 1e-2 \
    --clump-r2 0.1 \
    --clump-kb 250 \
    --clump-snp-field SNP \
    --out T2D

#output file (PLINK writes the clumps to <out>.clumped)
head T2D.clumped
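
#summarise the lead SNPs
The clumping report is padded with spaces and ends with blank lines, so it is awkward to pass straight to other tools. As a minimal sketch, the helper below turns it into a tab-delimited table with one row per lead SNP. It assumes the standard PLINK 1.9 report layout (CHR F SNP BP P TOTAL NSIG S05 S01 S001 S0001 SP2) and that the report was written to T2D.clumped. The function name lead_snps is hypothetical and not part of PLINK.

```shell
# Hypothetical helper (not part of PLINK): print one tab-delimited row per
# lead SNP with the columns SNP, CHR, BP, P and TOTAL (number of other SNPs
# in the clump). Assumes the 12-column PLINK 1.9 .clumped layout.
lead_snps() {
    # Default whitespace splitting copes with PLINK's space padding.
    # NF >= 12 skips the trailing blank lines; $1 != "CHR" skips the header.
    awk -v OFS="\t" 'NF >= 12 && $1 != "CHR" { print $3, $1, $4, $5, $6 }' "$1"
}

# Usage, once T2D.clumped exists:
#   lead_snps T2D.clumped > T2D.lead_snps.tsv
```

The last column of the report (SP2) lists the SNPs clumped with each lead SNP, or NONE; it is left out here to keep the table to one value per field.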