Papers
Genetics
Post-transcriptional regulation across human tissues
Core Finding: Scaled mRNA levels predict overall mean protein abundance across different genes (mean-level variability) but are poor predictors of the same protein’s level across different tissues (across-tissues variability).
Statistical Insight: The overall high mRNA-protein correlation (\(R_T\)) is misleading, as it represents an instance of Simpson’s paradox where the strong inter-gene variability masks the weak intra-gene/across-tissue correlation (\(R_P\)).
Conclusion: The reproducible, concerted variability in protein-to-mRNA ratios across tissues confirms that post-transcriptional regulation is a substantial and tissue-specific factor, likely contributing approximately 50% of the across-tissues protein variance.
23 January 2026
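The Simpson's paradox described for the mRNA-protein correlation can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the authors' analysis: the spreads, the within-gene slope of 0.2, and the names `simulate`, `r_total`, and `r_within` are all assumptions chosen for illustration. Pooling every gene-tissue pair gives a high \(R_T\), while the per-gene across-tissue correlation \(R_P\) stays low.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate(n_genes=500, n_tissues=12, seed=0):
    """Return (R_T, mean R_P) for synthetic log-scale mRNA and protein levels."""
    rng = random.Random(seed)
    all_mrna, all_protein, within = [], [], []
    for _ in range(n_genes):
        gene_mean = rng.gauss(0.0, 2.0)   # large between-gene spread (assumed)
        ptr_offset = rng.gauss(0.0, 0.5)  # gene-specific protein-to-mRNA ratio
        mrna = [gene_mean + rng.gauss(0.0, 0.3) for _ in range(n_tissues)]
        # Across tissues, protein follows mRNA only weakly (slope 0.2) plus noise.
        protein = [gene_mean + ptr_offset + 0.2 * (m - gene_mean)
                   + rng.gauss(0.0, 0.3) for m in mrna]
        within.append(pearson(mrna, protein))  # R_P for this gene
        all_mrna += mrna
        all_protein += protein
    r_total = pearson(all_mrna, all_protein)   # R_T over all pooled pairs
    r_within = sum(within) / len(within)       # average R_P across genes
    return r_total, r_within

r_total, r_within = simulate()
print(f"R_T (all gene-tissue pairs pooled): {r_total:.2f}")
print(f"mean R_P (within gene, across tissues): {r_within:.2f}")
```

The between-gene variance dominates the pooled correlation, so a high \(R_T\) says little about how well mRNA tracks protein for any single gene across tissues.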
What is a Differentially Expressed Gene?
Central Problem: Traditional binary classification of Differentially Expressed Genes (DEGs) using p-value and \(\log_{2}\) fold change thresholds suffers from reproducibility issues due to biological and technical variation, especially with low replicate numbers.
Findings: The use of rigid \(\log_{2}\) fold change cut-offs leads to false negatives (missing small but significant changes), and inherent gene variability, particularly in small sample sets, leads to false positives (misinterpreting noise as a change).
Recommendation: The authors advocate for rank-based methods grounded in Bayesian statistics (using Bayes factors) over traditional tools like DESeq2 and edgeR to reduce reliance on arbitrary thresholds and better communicate the uncertainty of differential expression.
23 January 2026
Genetic architecture: the shape of the genetic contribution to human traits and disease
This review defines genetic architecture by four components: the number of causal variants (polygenicity), the distribution of their effect sizes, their allele frequency spectrum, and the types of genetic and environmental interactions (dominance, epistasis, GxE).
It highlights that complex traits are highly polygenic and influenced by variants across the entire frequency spectrum, addressing “missing heritability” by pointing to the role of rare variants, non-additive effects, and Gene-by-Environment (GxE) interactions.
The authors emphasize that pleiotropy (one variant affecting multiple traits) is widespread among common variants, discussing how techniques like Mendelian Randomization (MR) are essential for distinguishing causation from pleiotropy in the complex genetic landscape.
23 January 2026
Specificity, length and luck drive gene rankings in association studies
Systematic comparison of GWAS and rare variant burden tests across 209 UK Biobank traits revealing they prioritize different genes through distinct mechanisms
Burden tests favor trait-specific genes while GWAS capture both trait-specific genes and context-specific variants on pleiotropic genes
Gene length and genetic drift are major confounders affecting rankings in burden tests and GWAS respectively
23 January 2026
A phenome-wide comparative analysis of genetic discordance between obesity and type 2 diabetes
Core Finding: The study used genetic data to define two distinct obesity profiles that exhibit highly contrasting, or discordant, effects on the risk of developing type 2 diabetes (T2D).
Phenotypic Differences: The two profiles showed key differences across a wide range of clinical and molecular traits, including cardiovascular mortality, liver metabolism, specific lipid fractions, and blood pressure.
Mechanism: Instrumental analyses highlighted the prominent causal role of factors like waist-to-hip ratio and blood pressure in driving the differences in T2D risk between the two genetic obesity subtypes.
23 January 2026
metaGE: Investigating genotype x environment interactions through GWAS meta-analysis
Novel meta-analysis approach for multi-environment trials (METs) that jointly analyzes GWAS summary statistics while accounting for inter-environment correlations
Controls Type I error effectively (FDR ≤0.05) where competing methods fail severely (METAL FDR >0.84), with computational efficiency enabling analysis of 600K markers × 22 environments in ~2 minutes
Identified novel competition-responsive flowering QTLs in Arabidopsis and heat-stress yield QTLs in maize through contrast tests and meta-regression with environmental covariates
23 January 2026
Joint analysis of GWAS and multi-omics QTL summary statistics reveals a large fraction of GWAS signals shared with molecular phenotypes
Objective: The new method, OPERA, was developed to integrate GWAS and multi-omics QTL (xQTL) summary statistics to quantify the proportion of complex trait genetic signals mediated by molecular phenotypes.
Key Finding: The study found that approximately 50% of genetic signals identified in GWAS are shared with (and likely mediated by) at least one molecular phenotype, with eQTLs (gene expression QTLs) being the dominant mediators.
Impact: OPERA led to the discovery of 89 novel genes for 11 complex traits, supporting the approach’s ability to enhance gene discovery and fine-mapping by linking genetic variants to their underlying molecular regulatory mechanisms.
23 January 2026
Extensive co-regulation of neighboring genes complicates the use of eQTLs in target gene prioritization
Research Problem: Systematic benchmarking showed that using eQTL colocalization methods to prioritize causal genes for GWAS hits is often complicated by extensive co-regulation of neighboring genes and is less effective than simpler heuristics.
Key Result: The simple strategy of assigning fine-mapped pQTLs to the closest protein coding gene significantly outperformed all tested Bayesian colocalization methods, achieving 76.9% recall and 71.9% precision.
Conclusion: Linking GWAS variants to target genes remains challenging using eQTL evidence alone, and robust gene prioritization requires the triangulation of evidence from multiple functional sources to improve confidence.
23 January 2026
The contribution of genetic determinants of blood gene expression and splicing to molecular phenotypes and health outcomes
Topic: Investigating the gene-regulatory mechanisms (eQTLs and sQTLs) of nonprotein-coding genetic variants in blood and their causal contribution to 3,430 molecular phenotypes (proteins, metabolites, lipids) and health outcomes.
Method: Mapped eQTLs and sQTLs in 4,732 individuals and used colocalization and mediation analyses to link these regulatory variants to downstream molecular and disease traits.
Impact: Identified 222 molecular phenotypes significantly mediated by gene expression or splicing, providing mechanistic insights into diseases (e.g., \(WARS1\) in hypertension) and offering a valuable public resource for human genetic etiology.
23 January 2026
A biobank-scale test of marginal epistasis reveals genome-wide signals of polygenic interaction effects
Methodology: The paper introduces FAME (FAst Marginal Epistasis test), a new, computationally efficient statistical method designed to detect marginal epistasis (the aggregate interaction effect between a single SNP and the entire polygenic background) in large biobanks.
Key Finding: Applying FAME to 53 quantitative traits in the UK Biobank, the study identified 16 significant marginal epistasis signals across 12 traits, providing the first systematic, genome-wide evidence of polygenic interaction effects.
Implication: The findings confirm that genetic interactions are a measurable component of complex trait architecture, suggesting that current additive GWAS models are incomplete and that marginal epistasis may contribute to the “missing heritability.”
23 January 2026
Valid inference for machine learning-assisted genome-wide association studies
Objective: This paper addressed the critical issue of invalid statistical inference and false-positive risks in Machine Learning (ML)-assisted Genome-Wide Association Studies (GWAS), which use ML-imputed phenotypes. It introduced a new statistical framework called Post-Prediction GWAS (POP-GWAS) to ensure valid results.
Problem: The study demonstrated that performing a standard GWAS on an ML-imputed phenotype systematically biases the results, leading to inflated test statistics (spuriously small p-values) and a high risk of false-positive associations.
Solution (POP-GWAS): POP-GWAS corrects for this bias by redesigning the GWAS test statistic to explicitly account for the variance introduced by the ML imputation model, requiring only GWAS summary statistics and information on the prediction model’s performance.
Validation: Application to a GWAS of bone mineral density (BMD) revealed that POP-GWAS successfully corrected the inflation and led to the identification of 89 new loci that were missed or deemed unreliable by the naive approach.
23 January 2026
Adjusting for Heritable Covariates Can Bias Effect Estimates in Genome-Wide Association Studies
Adjusting a Genome-Wide Association Study (GWAS) for a heritable covariate (a correlated, genetically influenced trait) introduces an unintended collider bias, which distorts SNP effect estimates and can lead to false positive associations.
The bias is approximately proportional to the product of the genetic effect on the covariate and the phenotypic correlation between the traits, and was empirically confirmed by finding a significant enrichment of SNPs with opposite effects in the WHR adjusted for BMI GWAS (\(p=0.005\)).
The authors strongly caution against interpreting adjusted results as true direct genetic effects, recommending unadjusted GWAS for total effect discovery and bivariate methods for power gains without inducing collider bias.
23 January 2026
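The collider bias from adjusting for a heritable covariate can be reproduced on synthetic data. This is a minimal sketch under assumed settings, not the paper's analysis: the SNP affects only the covariate (effect `b_c`), the trait has no SNP effect at all, and trait and covariate share a non-genetic factor `u`. The function name `snp_effects` and every parameter value are hypothetical. In this setup the covariate-adjusted SNP estimate should land near minus `b_c` times the trait-on-covariate slope (0.5 here), while the unadjusted estimate stays near zero.

```python
import random

def cov(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def snp_effects(n=40000, b_c=0.2, maf=0.3, seed=1):
    """Return (unadjusted, covariate-adjusted) SNP effect estimates on the trait."""
    rng = random.Random(seed)
    g, c, y = [], [], []
    for _ in range(n):
        gi = (rng.random() < maf) + (rng.random() < maf)  # allele count 0/1/2
        u = rng.gauss(0.0, 1.0)                           # shared non-genetic factor
        g.append(gi)
        c.append(b_c * gi + u + rng.gauss(0.0, 1.0))      # heritable covariate
        y.append(u + rng.gauss(0.0, 1.0))                 # trait with NO SNP effect
    unadjusted = cov(g, y) / cov(g, g)                    # slope of y ~ G
    # Closed-form OLS coefficient of G in the two-predictor model y ~ G + C.
    sgg, scc, sgc = cov(g, g), cov(c, c), cov(g, c)
    adjusted = (cov(g, y) * scc - sgc * cov(c, y)) / (sgg * scc - sgc ** 2)
    return unadjusted, adjusted

unadjusted, adjusted = snp_effects()
print(f"unadjusted SNP effect on trait:         {unadjusted:+.3f}")
print(f"covariate-adjusted SNP effect on trait: {adjusted:+.3f}")
```

The adjusted estimate is nonzero even though the SNP has no effect on the trait, which is why adjusted associations should not be read as direct genetic effects.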
Using GWAS summary data to impute traits for genotyped individuals
Novel nonparametric LS-imputation method recovers genetic components of traits from GWAS summary statistics and individual genotypes, enabling nonlinear association analyses impossible with summary data alone
Recovers trait values almost exactly when test genotypes match training genotypes (correlation >0.999), capturing nonlinear SNP-trait information despite using only linear marginal associations
Outperforms PRS-CS for association analyses in UK Biobank HDL data: successfully detects non-additive genetic effects, SNP-SNP interactions, and trains nonlinear prediction models (random forests) while PRS-CS shows severe false positive inflation
23 January 2026
Protein-metabolite association studies identify novel proteomic determinants of metabolite levels in human plasma
Approach: A large-scale multi-omics resource was created by meta-analyzing proteomic and metabolomic data from three cohorts, followed by the use of Mendelian Randomization (MR) with pQTLs to infer causal relationships.
Causal Findings: The study identified 224 putative causal associations between 95 proteins and 96 metabolites, including novel links like the causal role of ADAMTSL3 in regulating BCAA metabolites.
Validation: Over 50% of the top causal findings were experimentally validated through metabolomic profiling of mouse knockout strains, providing strong biological proof for the in-silico MR results.
23 January 2026
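For a single pQTL instrument, the simplest MR estimator is the Wald ratio: the variant's effect on the outcome (metabolite) divided by its effect on the exposure (protein). The summary does not say which MR estimator the study used, so the sketch below is an assumption for illustration only. The function name `wald_ratio` and the effect sizes are hypothetical, and the standard error uses the first-order approximation that ignores uncertainty in the exposure effect.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate of the exposure's effect on the outcome.

    beta_exposure: variant effect on the exposure (e.g. pQTL effect on a protein)
    beta_outcome:  variant effect on the outcome (e.g. a metabolite level)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order standard error: treats beta_exposure as known without error.
    se = se_outcome / abs(beta_exposure)
    return estimate, se

# Hypothetical summary statistics for one pQTL.
estimate, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.2, se_outcome=0.04)
print(f"causal estimate: {estimate:.2f} (SE {se:.2f}, z = {estimate / se:.1f})")
```

The ratio is only interpretable as causal if the variant affects the metabolite solely through the protein, which is the assumption pleiotropy can violate.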
Global quantification of mammalian gene expression control
Core Discovery: This seminal study, using parallel metabolic pulse labeling and absolute quantification in mammalian cells, concluded that the cellular abundance of proteins is predominantly controlled at the level of translation, not transcription.
Quantification: The study found a strong correlation between mRNA and protein abundance (\(R \approx 0.73\)) but no correlation between the half-lives (turnover rates) of corresponding mRNA and proteins.
Mechanism: Protein synthesis rates (translation) were found to be the most variable component, serving as the primary determinant of protein steady-state levels, while degradation rates primarily determine the kinetics and response time of the system.
23 January 2026
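The division of labour between synthesis and degradation can be written as a simple first-order kinetic model. This is a sketch under an assumed formulation, and the symbols are not taken from the summary: \(R\) is the mRNA level, \(P\) the protein level, \(k_{sp}\) the translation rate constant, and \(k_{dp}\) the protein degradation rate constant.

\[
\frac{dP}{dt} = k_{sp} R - k_{dp} P
\qquad\Longrightarrow\qquad
P_{ss} = \frac{k_{sp}}{k_{dp}}\, R ,
\qquad
t_{1/2} = \frac{\ln 2}{k_{dp}} .
\]

Under this model the steady-state level \(P_{ss}\) scales with the translation rate constant, whereas the time needed to approach a new steady state depends only on \(k_{dp}\), which matches the summary's point that degradation mainly sets the kinetics and response time.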
Genome-wide association scans for secondary traits using case-control samples
This statistical methodology paper examines the bias introduced when a case-control GWAS (designed for a primary disease \(D\)) is used to analyze a secondary quantitative trait (\(T\)).
It demonstrates that naïve analysis (ignoring case-control ascertainment) leads to biased effect estimates for the marker-secondary trait association (\(G-T\)) specifically when both the marker \(G\) and the trait \(T\) are independently associated with the primary disease \(D\).
The authors propose Inverse-Probability-of-Sampling-Weighted (IPW) regression as the robust method: it provides unbiased estimates in all scenarios, at the cost of reduced statistical power. For markers not associated with the primary disease, they recommend naïve analysis.
23 January 2026
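The ascertainment bias and its IPW correction can be demonstrated on a synthetic population. This is a minimal sketch, not the paper's simulation design: the logistic disease model, its coefficients, and the names `weighted_slope` and `case_control_estimates` are assumptions. The marker has no true effect on the secondary trait, all cases are sampled, and controls are sampled with a known probability `p_ctrl`. The IPW fit weights each subject by the inverse of that sampling probability.

```python
import math
import random

def weighted_slope(x, y, w):
    """Slope of a weighted least-squares regression of y on x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

def case_control_estimates(n_pop=200000, seed=2):
    """Return (naive, IPW) estimates of the G-T slope; the true slope is 0."""
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n_pop):
        g = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0/1/2
        t = rng.gauss(0.0, 1.0)                          # trait, independent of G
        # Disease risk depends on both G and T (assumed logistic model).
        p_d = 1.0 / (1.0 + math.exp(-(-4.0 + 0.7 * g + 1.0 * t)))
        (cases if rng.random() < p_d else controls).append((g, t))
    p_ctrl = len(cases) / len(controls)  # control sampling probability
    sampled_controls = [gt for gt in controls if rng.random() < p_ctrl]
    sample = cases + sampled_controls    # all cases, roughly as many controls
    g = [s[0] for s in sample]
    t = [s[1] for s in sample]
    naive_w = [1.0] * len(sample)
    # Cases were sampled with probability 1, controls with probability p_ctrl.
    ipw_w = [1.0] * len(cases) + [1.0 / p_ctrl] * len(sampled_controls)
    return weighted_slope(g, t, naive_w), weighted_slope(g, t, ipw_w)

naive, ipw = case_control_estimates()
print(f"naive G-T slope in case-control sample: {naive:+.3f}")
print(f"IPW-weighted G-T slope:                 {ipw:+.3f}")
```

The large control weights shrink the effective sample size, which is the loss of power the summary mentions.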
Causal modelling of gene effects from regulators to programs to traits
Framework: This paper proposes a novel statistical approach to infer causal mechanistic pathways that link genes to traits by combining gene-trait effect sizes (from GWAS LoF burden tests) with gene-regulatory relationships (from Perturb-seq experiments).
Causal Hierarchy: The model establishes a three-step causal graph: Gene \(\longrightarrow\) Regulatory Programs \(\longrightarrow\) Trait, allowing researchers to explain gene-trait associations via intermediate functional steps.
Proof of Concept: Applied to three blood traits (RDW, MCH, IRF) using human HSPC Perturb-seq data, the model successfully identified the regulatory programs (e.g., ribosomal genes) and directionally predicted how gene perturbations causally influence the traits.
23 January 2026
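The Gene, Regulatory Programs, Trait hierarchy can be pictured as a chain of effects. In the sketch below, a gene's predicted effect on a trait is the sum over programs of (gene effect on program) times (program effect on trait). This linear, additive composition is an assumption made for illustration, not the paper's actual model. The function name `gene_to_trait_effect`, the program names, and all numbers are hypothetical.

```python
def gene_to_trait_effect(gene_to_program, program_to_trait):
    """Compose gene->program and program->trait effects into a gene->trait effect.

    gene_to_program:  dict mapping program name -> effect of perturbing the gene
    program_to_trait: dict mapping program name -> effect of the program on the trait
    Programs with no listed trait effect contribute zero.
    """
    return sum(effect * program_to_trait.get(program, 0.0)
               for program, effect in gene_to_program.items())

# Hypothetical effects: knocking down GENE_A lowers a ribosomal program
# and slightly raises a heme-related program.
gene_a = {"ribosomal": -0.8, "heme": 0.1}
trait_effects = {"ribosomal": 0.5, "heme": -0.2}

predicted = gene_to_trait_effect(gene_a, trait_effects)
print(f"predicted effect of perturbing GENE_A on the trait: {predicted:+.2f}")
```

Comparing the sign of such a prediction with the burden-test effect for the same gene is one way to read the directional check mentioned in the summary.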