Valid inference for machine learning-assisted genome-wide association studies

machine learning
GWAS
statistical inference
imputation
false positives
POP-GWAS
genome-wide association study
  • Objective: This paper addressed the critical issue of invalid statistical inference and false-positive risks in Machine Learning (ML)-assisted Genome-Wide Association Studies (GWAS), which use ML-imputed phenotypes. It introduced a new statistical framework called Post-Prediction GWAS (POP-GWAS) to ensure valid results.
  • Problem: The study demonstrated that performing a standard GWAS on an ML-imputed phenotype systematically biases the results, leading to inflated test statistics (p-values that are too small) and a high risk of false-positive associations.
  • Solution (POP-GWAS): POP-GWAS corrects for this bias by redesigning the GWAS test statistic to explicitly account for the variance introduced by the ML imputation model, requiring only GWAS summary statistics and information on the prediction model’s performance.
  • Validation: Application to a GWAS of bone mineral density (BMD) showed that POP-GWAS corrected the miscalibrated p-values and led to the identification of 89 new loci that were missed or deemed unreliable by the naive approach.
Published: 23 January 2026

PubMed: 39349818
DOI: 10.1038/s41588-024-01934-0
Overview generated by: Gemini 2.5 Flash, 28/11/2025

Key Findings: Validating ML-Assisted GWAS

This paper addresses the critical issue of invalid statistical inference and pervasive false-positive risks when performing Genome-Wide Association Studies (GWAS) using Machine Learning (ML)-imputed phenotypes. It introduces a novel statistical framework called Post-Prediction GWAS (POP-GWAS) to resolve this issue and enable valid and powerful inference.

  • The Problem of ML-Assisted GWAS: ML-assisted GWAS uses ML techniques (such as deep learning or penalized regression) to predict or impute a complex phenotype (e.g., from image data or electronic health records) and then runs a GWAS on the predicted phenotype. The study demonstrates that this procedure introduces a systematic bias that invalidates the p-values and consequently inflates the risk of false-positive associations, irrespective of the quality of the ML imputation model.
  • The Solution: POP-GWAS Framework: POP-GWAS is a statistical framework designed to perform GWAS directly on ML-imputed outcomes while maintaining valid statistical inference.
    • Mechanism: The framework redesigns the GWAS association test by explicitly accounting for the covariance structure between the true phenotype, the imputed phenotype, and the genetic markers. It requires only summary statistics from the original GWAS (on the imputed outcome) and knowledge of the prediction model (or its cross-validation performance) as input.
  • Empirical Validation and Novel Loci: The study validated POP-GWAS using simulations and applied it to a GWAS of bone mineral density (BMD) derived from dual-energy X-ray absorptiometry (DXA) imaging, a typical use case for ML-assisted phenotyping.
    • New Discoveries: POP-GWAS successfully identified 89 new loci for BMD at 14 skeletal sites, many of which displayed skeletal site-specific genetic architecture.
    • Bias Correction: The standard (naive) ML-assisted GWAS test statistics were substantially inflated, making the naive p-values too small, whereas POP-GWAS restored valid p-values and type I error control.
  • General Applicability: The POP-GWAS framework is agnostic to the specific ML imputation algorithm used, so it can be applied across different ML techniques (e.g., deep learning, random forest, linear models) and different GWAS settings.
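The false-positive risk described above can be illustrated with a toy simulation. This is a minimal sketch under assumptions that are not taken from the paper: a single SNP `g` affects a predictor `x` but has no effect on the true phenotype `y`, and the "ML model" is a plain linear regression of `y` on `x`. All variable names, effect sizes, and the sample size are hypothetical.

```python
import math
import random

def simple_ols(x, y):
    """Regress y on x; return (intercept, slope, z-score of the slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se

rng = random.Random(0)
n = 20_000

# Genotype: allele count (0, 1, or 2) with allele frequency 0.3 (assumed).
g = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]

# Shared non-genetic component that links the predictor and the phenotype.
e = [rng.gauss(0, 1) for _ in range(n)]

# Predictor is influenced by the SNP; the true phenotype is not.
x = [ei + 0.2 * gi for ei, gi in zip(e, g)]
y = [ei + rng.gauss(0, 1) for ei in e]

# "Imputation model": linear regression of y on x, then predict y_hat.
a, b, _ = simple_ols(x, y)
y_hat = [a + b * xi for xi in x]

# Naive GWAS on the imputed phenotype vs. GWAS on the true phenotype.
_, _, z_naive = simple_ols(g, y_hat)
_, _, z_true = simple_ols(g, y)

print(f"z-score, imputed phenotype: {z_naive:.2f}")
print(f"z-score, true phenotype:    {z_true:.2f}")
```

In this setup the imputed phenotype inherits the SNP's effect on the predictor, so the naive test flags an association that the true phenotype does not have. It shows only one way such false positives can arise and does not reproduce the paper's own simulations.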

Study Design and Methods

Study Design

This was a methodological study focused on statistical genetics and machine learning. The authors used simulation studies and real-world data application to demonstrate the bias in naive ML-assisted GWAS and validate the performance of the proposed POP-GWAS framework.

Methodology: POP-GWAS

  1. Imputed Outcome (\(\hat{Y}\)): The GWAS is initially performed on the ML-imputed phenotype (\(\hat{Y}\)) derived from the original predictors (\(X\)).
  2. Standard GWAS Bias: The authors mathematically derived the source of the bias in the standard GWAS association test when using \(\hat{Y}\) instead of the true phenotype \(Y\).
  3. POP-GWAS Test Statistic: POP-GWAS introduces a new, corrected test statistic that leverages the properties of the imputed phenotype. Its correction factor is derived from the summary statistics of the standard GWAS and the phenotype prediction variance (a measure of the imputation quality, often estimated via cross-validation). \[Z_{\text{POP-GWAS}} = \frac{Z_{\text{naive}}}{\sqrt{\text{Correction Factor}}}\] Rescaling the naive statistic in this way ensures that the variance of the test statistic is correctly estimated, thus producing valid p-values.
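The rescaling in step 3 can be sketched in a few lines. This is a minimal illustration of the formula exactly as written in this overview, not the authors' software. The overview does not give the form of the correction factor, so it is treated as an externally supplied input. The function names `pop_gwas_z` and `two_sided_p` are hypothetical.

```python
import math

def pop_gwas_z(z_naive: float, correction_factor: float) -> float:
    """Rescale a naive z-score as Z_naive / sqrt(correction factor).

    The correction factor is an assumed input here; how it is computed
    from summary statistics and prediction variance is not shown.
    """
    if correction_factor <= 0:
        raise ValueError("correction_factor must be positive")
    return z_naive / math.sqrt(correction_factor)

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical inputs chosen purely for illustration.
z_naive = 6.0
correction_factor = 4.0

z_corrected = pop_gwas_z(z_naive, correction_factor)
print(f"naive:     z = {z_naive:.2f}, p = {two_sided_p(z_naive):.3g}")
print(f"corrected: z = {z_corrected:.2f}, p = {two_sided_p(z_corrected):.3g}")
```

A correction factor above 1 shrinks the z-score and enlarges the p-value, which is the direction needed when the naive statistic is inflated.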

Data Application

  • BMD GWAS: The framework was applied to a large-scale GWAS of Bone Mineral Density (BMD), where BMD phenotypes were often derived from complex imaging data using ML-based prediction methods. The results highlighted the importance of site-specific analysis, with association signals that differed across skeletal sites.

Conclusions and Recommendations

The study provides a critical warning regarding the statistical validity of naive ML-assisted GWAS and offers a concrete, generalized solution through the POP-GWAS framework.

  • Essential for ML-GWAS: The authors strongly recommend that GWAS conducted on ML-imputed phenotypes use the POP-GWAS framework to ensure valid statistical inference and prevent the proliferation of false-positive genetic associations.
  • Enabling Discovery: By correcting for bias and maintaining statistical power, POP-GWAS is poised to accelerate genetic discovery in fields that rely heavily on automated phenotyping, such as medical imaging and electronic health records.
  • Future Work: The authors suggest future developments could extend POP-GWAS to handle complex designs, such as family-based studies or Mendelian randomization using imputed phenotypes.