Summary statistic files accompanying the paper:

Single nucleotide polymorphisms associated with motor recovery in patients with non-disabling stroke: GWAS study

Chad M. Aldridge1,∗, Robynne Braun2, Keith L. Keene3,4, Fang-Chi Hsu5, Michele M. Sale6, Bradford B. Worrall1,6

1 Department of Neurology, University of Virginia, Charlottesville, VA, USA
2 Department of Neurology, University of Maryland, Baltimore, MD, USA
3 Department of Biology, East Carolina University, Greenville, NC, USA
4 Center for Health Disparities, Brody School of Medicine, East Carolina University, Greenville, NC, USA
5 Department of Biostatistics and Data Science, Division of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, NC, USA
6 Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA

∗Corresponding Author
Chad M. Aldridge, PT, DPT, MS-CR, NCS
Department of Neurology
University of Virginia
Charlottesville, VA
cma7n@uvahealth.org

Aldridge CM, Braun R, Keene KL, Hsu FC, Worrall BB. Single Nucleotide Polymorphisms Associated With Motor Recovery in Patients With Nondisabling Stroke: GWAS Study. Neurology. 2023 Oct 9:10.1212/WNL.0000000000207716. doi: 10.1212/WNL.0000000000207716. Epub ahead of print. PMID: 37813584.

Abstract

Background: Despite notable advances in genetic understanding of stroke recovery, most studies focus only on candidate genes. Only two genome-wide association studies (GWAS) are dedicated to stroke outcomes but are limited to the modified Rankin Scale (mRS). The mRS maps poorly to biological processes. Therefore, we performed a GWAS to discover single nucleotide polymorphisms (SNPs) associated with motor recovery post-stroke.

Methods: We used the Vitamin Intervention for Stroke Prevention (VISP) dataset of 2,100 genotyped patients with non-disabling stroke. Of these, 488 patients had motor impairment at enrollment. Genotyped data underwent strict quality control and imputation.
The GWAS utilized logistic regression models with generalized estimating equations (GEE) to leverage the repeated NIH Stroke Scale (NIHSS) motor score measurements spanning six time points over 24 months. The primary outcome was a decrease in the motor drift score of ≥ 1 vs. < 1 at each time point. Our model estimated the odds ratio of motor improvement for each SNP after adjusting for age, sex, race, days from stroke to visit, initial motor score, VISP treatment arm, and principal components.

Results: Although no associations reached genome-wide significance (p < 5 × 10⁻⁸), our analysis detected 115 suggestive associations (p < 5 × 10⁻⁶). Notably, we found multiple SNP clusters near genes with plausible neuronal repair biology mechanisms. The CLDN23 gene had the most convincing association which affects blood-brain barrier integrity, neurodevelopment, and immune cell transmigration.

Conclusion: We identified novel suggestive genetic associations with the first-ever motor-specific post-stroke recovery GWAS. The results seem to describe a distinct stroke recovery phenotype compared to prior genetic stroke outcome studies that use outcome measures, like the mRS. Replication and further mechanistic investigation are warranted. Additionally, this study demonstrated a proof-of-principle approach to optimize statistical efficiency with longitudinal datasets for genetic discovery.

Modeling

The GWAS model estimated the odds ratio of being free of motor deficit using logistic regression fitted with generalized estimating equations (GEE) and an exchangeable working correlation structure.
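In symbols, a GEE logistic model of this kind can be sketched as follows. The notation is introduced here for illustration and does not appear in the paper: Y_ij is 1 if patient i is free of motor deficit at visit j, x_ij is that visit's covariate vector, g_i is the patient's genotype at the tested SNP (in whatever coding the analysis used), and α is the single within-patient correlation parameter implied by an exchangeable structure. The block assumes the amsmath package for \operatorname.

```latex
% Marginal (population-averaged) logistic model for patient i at visit j.
% All symbols are illustrative notation, not taken from the paper.
\[
  \operatorname{logit} \Pr(Y_{ij} = 1)
    = \beta_0 + \mathbf{x}_{ij}^{\top} \boldsymbol{\beta}_x
      + \beta_{\mathrm{SNP}} \, g_i
\]

% Exchangeable working correlation: one common correlation between
% any two visits of the same patient.
\[
  \operatorname{Corr}(Y_{ij}, Y_{ik}) = \alpha \qquad (j \neq k)
\]

% The reported per-SNP effect is the exponentiated SNP coefficient.
\[
  \mathrm{OR}_{\mathrm{SNP}} = \exp(\beta_{\mathrm{SNP}})
\]
```

Under this reading, an odds ratio above 1 for a SNP means higher odds of being free of motor deficit per unit of g_i, averaged over the population rather than conditional on a patient-specific effect.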
Formula

Motor Recovered = Age + Sex + Treatment Group + Principal Components 3 and 5 + spline of stroke onset to follow-up time in days (knot at 250 days, 1 degree) + SNP + PatientID (cluster identifier for the repeated measures)

Model implemented with the "gee" R package [1-3] as below:

    library(gee)      # provides gee()
    library(splines)  # provides bs()

    # SNP is the genotype of the variant being tested; PatientID groups
    # the repeated visits of each patient. gee() expects the rows of each
    # cluster to be contiguous, so sort by PatientID beforehand.
    gee(formula = recov_motor ~ age + sex + treat + PC3 + PC5 +
            bs(stroke2visit, degree = 1, knots = 250) + SNP,
        id = PatientID, maxiter = 1000,
        family = binomial(link = "logit"), corstr = "exchangeable")

References

1. Halekoh U, Højsgaard S, Yan J (2006). The R Package geepack for Generalized Estimating Equations. Journal of Statistical Software, 15/2, 1–11.
2. Yan J, Fine JP (2004). Estimating Equations for Association Structures. Statistics in Medicine, 23, 859–880.
3. Yan J (2002). geepack: Yet Another Package for Generalized Estimating Equations. R-News, 2/3, 12–14.