
This function performs an alignment check for GWAS data by comparing the input data (df) with a reference dataset (reference). It checks if the alleles in the data are aligned and flips the alleles when necessary to ensure consistency with the reference. The function also computes an LD matrix, performs a kriging-based procedure to adjust the z-scores, and generates a series of plots to visualize the alignment.

Usage

alignment_check(df, reference, bfile)

Arguments

df

A data frame containing the GWAS summary statistics. The following columns should be present:

  • CHR: Chromosome number (integer).

  • POS: Position of the SNP (integer).

  • SNP: SNP identifier (character).

  • EA: Effect allele (character).

  • OA: Other allele (character).

  • EAF: Effect allele frequency (numeric).

  • BETA: Effect size estimate (numeric).

  • SE: Standard error of the effect size (numeric).

  • P: P-value for the association (numeric).

  • N: Sample size (integer).

  • phenotype: Phenotype identifier (character).
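As an illustration of the expected layout, the following sketch builds a toy df with the columns listed above and checks that none are missing. All values are made up, and the column check is a hypothetical helper for this example, not something the package is known to provide.

```r
# Toy GWAS summary statistics with the documented columns (values are made up).
df <- data.frame(
  CHR = c(1L, 1L, 1L),
  POS = c(1000L, 2000L, 3000L),
  SNP = c("rs1", "rs2", "rs3"),
  EA = c("A", "C", "G"),
  OA = c("G", "T", "A"),
  EAF = c(0.20, 0.45, 0.70),
  BETA = c(0.05, -0.02, 0.10),
  SE = c(0.01, 0.01, 0.02),
  N = c(10000L, 10000L, 10000L),
  phenotype = "trait_x",
  stringsAsFactors = FALSE
)

# Two-sided p-value derived from the z-score BETA / SE.
df$P <- 2 * pnorm(-abs(df$BETA / df$SE))

# Hypothetical pre-flight check (an assumption, not part of the package):
# stop early if any documented column is absent.
required_cols <- c("CHR", "POS", "SNP", "EA", "OA", "EAF",
                   "BETA", "SE", "P", "N", "phenotype")
missing_cols <- setdiff(required_cols, names(df))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

Running a check like this before calling alignment_check makes a misnamed column fail with a clear message rather than somewhere inside the merge.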

reference

A data frame containing the reference data for comparison. The following columns should be present:

  • Predictor: SNP identifier (character).

  • A1: Allele 1 (character).

  • A2: Allele 2 (character).

  • A1_Mean: Mean value for allele 1 (numeric).

  • MAF: Minor allele frequency (numeric).

  • Call_Rate: Call rate (integer).

  • Info: Information score (integer).
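To make the comparison against the reference concrete, here is a minimal sketch that builds a toy reference with the columns listed above, joins it to a few GWAS rows on the SNP identifier, and labels each SNP by how its alleles relate to A1/A2. The values, the gwas object, and the status labels are assumptions made for illustration; the function's own internal representation is not documented here.

```r
# Toy reference data with the documented columns (values are made up).
reference <- data.frame(
  Predictor = c("rs1", "rs2", "rs3"),
  A1 = c("A", "T", "G"),
  A2 = c("G", "C", "T"),
  A1_Mean = c(0.40, 1.10, 1.40),
  MAF = c(0.20, 0.45, 0.30),
  Call_Rate = c(1L, 1L, 1L),
  Info = c(1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Minimal GWAS side: only the columns needed for the allele comparison.
gwas <- data.frame(
  SNP = c("rs1", "rs2", "rs3"),
  EA = c("A", "C", "G"),
  OA = c("G", "T", "A"),
  stringsAsFactors = FALSE
)

# Join on the SNP identifier (SNP in the GWAS data, Predictor in the reference).
merged <- merge(gwas, reference, by.x = "SNP", by.y = "Predictor")

# Label each SNP: same orientation, swapped orientation, or different alleles.
same <- merged$EA == merged$A1 & merged$OA == merged$A2
swapped <- merged$EA == merged$A2 & merged$OA == merged$A1
merged$status <- ifelse(same, "aligned", ifelse(swapped, "swapped", "mismatch"))

print(merged[, c("SNP", "EA", "OA", "A1", "A2", "status")])
```

In this toy data rs1 is already aligned, rs2 has its alleles in the opposite order, and rs3 carries an allele pair that does not match the reference at all.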

bfile

File path prefix of the PLINK binary fileset for the reference population (designed for use with 1000 Genomes data, e.g., /path/EUR/EUR).

Value

A list containing:

  • plots: A list of plots, including the alignment plot and observed vs expected z-score plots.

  • list_df: The final data frame after allele flipping and adjustments.

  • lambda: A list of lambda estimates for the adjusted z-scores.

Details

The function first merges the GWAS summary statistics with the reference data based on the SNP identifier and flips the effect sizes if the effect allele (EA) does not match allele 1 (A1) in the reference. It then computes an LD matrix using the ieugwasr::ld_matrix_local function, which is used in subsequent analyses. The function also runs a kriging procedure using susieR::kriging_rss to adjust the z-scores based on the LD matrix, and generates plots comparing the observed and expected z-scores before and after allele flipping. The function iteratively flips alleles and updates the data until no more allele flips are needed (based on a log-likelihood ratio test).
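The flipping and expected z-score steps can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the function's implementation: the toy data and the LD matrix R are made up (in the function the LD matrix comes from ieugwasr::ld_matrix_local), and the conditional expectation below is the plain multivariate normal formula, omitting any regularization of the LD matrix and the likelihood ratio test that susieR::kriging_rss may apply.

```r
# Toy merged data: GWAS alleles (EA/OA) next to reference alleles (A1/A2).
merged <- data.frame(
  SNP = c("rs1", "rs2", "rs3"),
  EA = c("A", "C", "G"),
  OA = c("G", "T", "A"),
  A1 = c("A", "T", "G"),
  A2 = c("G", "C", "A"),
  EAF = c(0.20, 0.45, 0.70),
  BETA = c(0.05, -0.02, 0.10),
  SE = c(0.01, 0.01, 0.02),
  stringsAsFactors = FALSE
)

# A SNP needs flipping when its alleles are in the opposite order to the reference.
flip <- merged$EA == merged$A2 & merged$OA == merged$A1

# Flip: negate the effect, complement the frequency, swap the allele labels.
merged$BETA[flip] <- -merged$BETA[flip]
merged$EAF[flip] <- 1 - merged$EAF[flip]
ea_old <- merged$EA
merged$EA[flip] <- merged$OA[flip]
merged$OA[flip] <- ea_old[flip]

# Observed z-scores after flipping.
z <- merged$BETA / merged$SE

# Toy LD (correlation) matrix; an assumption standing in for the real one.
R <- matrix(c(1.0, 0.6, 0.3,
              0.6, 1.0, 0.5,
              0.3, 0.5, 1.0), nrow = 3, byrow = TRUE)

# Under z ~ N(0, R) with precision matrix Omega = solve(R):
#   E[z_i | z_-i]   = -(1 / Omega_ii) * sum_{j != i} Omega_ij * z_j
#   Var[z_i | z_-i] = 1 / Omega_ii
Omega <- solve(R)
off_diag_sum <- as.numeric(Omega %*% z) - diag(Omega) * z
cond_mean <- -off_diag_sum / diag(Omega)
cond_var <- 1 / diag(Omega)

# Standardized difference between observed and expected z-scores.
z_std_diff <- (z - cond_mean) / sqrt(cond_var)

print(data.frame(SNP = merged$SNP, z = z, expected = cond_mean, z_std_diff))
```

A SNP whose observed z-score has the opposite sign to its expected value, with a large standardized difference, is the kind of candidate an iterative flipping procedure would re-examine.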